📖

RAG Technology

How AI Buddha Zen accurately cites from 10,000+ Buddhist scripture verses.
A deep dive into Retrieval-Augmented Generation for religious AI.

10,023 scripture verses in database
18 Buddhist scriptures
20 theme categories
3 confidence levels

1 What is RAG?

RAG (Retrieval-Augmented Generation) is a technique that enhances AI responses by first retrieving relevant information from a curated database, then using that information to generate accurate, grounded answers. Unlike standard AI chatbots that rely solely on training data, RAG ensures every response is backed by verifiable source material.

Why RAG matters for religious AI

Religious AI faces a unique challenge: hallucination in sacred texts. A standard AI might fabricate scripture quotes that sound authentic but don't actually exist. This is unacceptable in a religious context. RAG solves this by constraining the AI to cite only from verified, real scripture verses in our database.

Aspect | Standard AI Chatbot | AI Buddha Zen (RAG)
Knowledge source | Training data (static) | 10,023 verified scripture verses
Citation accuracy | May fabricate quotes | Every quote traceable to source
Verifiability | Difficult to verify | Scripture name, chapter, verse number
Repetition control | None | Last 30 quotes excluded

2 5-Step RAG Pipeline

When you send a message to AI Buddha Zen, it goes through a 5-step pipeline before generating a response. This process takes about 3-5 seconds.

Your message
↓
Step 1: Theme Detection (20 themes + 44 bridge tags)
↓
Step 2: Recently-Seen Exclusion (last 30 quotes)
↓
Step 3: 5-Candidate Retrieval (confidence + priority + randomization)
↓
Step 4: AI Selection (Claude picks the best 1-2 from 5 candidates)
↓
Step 5: Response Generation (empathy + scripture quote + practical advice)

Step 1: Theme Detection

Your message is analyzed against 20 theme categories and 44 "bridge tags" — trigger phrases that help map everyday language to Buddhist concepts.

Example: "I can't sleep because of work stress" → Themes detected: anxiety, work, mindfulness

📋 All 20 themes

suffering · impermanence · anger · attachment · compassion · wisdom · emptiness · karma · mindfulness · relationship · death · anxiety · work · happiness · self · loneliness · craving · gratitude · aging · family
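As a rough illustration of Step 1, the sketch below maps trigger phrases to themes with plain substring matching. The three bridge tags and the `detect_themes` function are hypothetical; the actual 44 bridge tags and the matching logic are not described here, so treat this as one possible approach rather than the production implementation.

```python
# The 20 theme names listed in the text above.
THEMES = {
    "suffering", "impermanence", "anger", "attachment", "compassion",
    "wisdom", "emptiness", "karma", "mindfulness", "relationship",
    "death", "anxiety", "work", "happiness", "self", "loneliness",
    "craving", "gratitude", "aging", "family",
}

# Hypothetical bridge tags (trigger phrase -> themes). The real system
# uses 44 of these; the three below are invented for illustration.
BRIDGE_TAGS = {
    "can't sleep": ["anxiety", "mindfulness"],
    "stress": ["anxiety"],
    "work": ["work"],
}


def detect_themes(message: str) -> list[str]:
    """Return the themes whose trigger phrases occur in the message."""
    text = message.lower()
    found: list[str] = []
    for phrase, themes in BRIDGE_TAGS.items():
        if phrase in text:
            for theme in themes:
                # Keep only known themes and avoid duplicates.
                if theme in THEMES and theme not in found:
                    found.append(theme)
    return found


detected = detect_themes("I can't sleep because of work stress")
print(detected)
```

Substring matching is deliberately naive: "work" would also match "homework". A real matcher would likely need tokenization or a classifier.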

Step 2: Recently-Seen Exclusion

To prevent the same verse from being shown repeatedly, the system checks the last 30 quotes shown to you (stored in rag_usage_log). These are excluded from the candidate pool, ensuring you encounter a wide variety of the 10,000+ verses in our database.
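The exclusion step can be sketched with a bounded queue. The `UsageLog` class below is a hypothetical in-memory stand-in for rag_usage_log; how the real log is stored and queried is not described here, and only the limit of 30 comes from the text.

```python
from collections import deque

RECENT_LIMIT = 30  # from the text: the last 30 quotes are excluded


class UsageLog:
    """Hypothetical in-memory stand-in for one user's rag_usage_log."""

    def __init__(self, limit: int = RECENT_LIMIT) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._recent: deque[int] = deque(maxlen=limit)

    def record(self, verse_id: int) -> None:
        self._recent.append(verse_id)

    def recent_ids(self) -> set[int]:
        return set(self._recent)


def exclude_recent(candidate_ids: list[int], log: UsageLog) -> list[int]:
    """Drop verse ids that appear among the most recently shown quotes."""
    seen = log.recent_ids()
    return [verse_id for verse_id in candidate_ids if verse_id not in seen]


log = UsageLog()
for shown_id in range(1, 36):  # 35 quotes shown; only the last 30 are kept
    log.record(shown_id)

remaining = exclude_recent([1, 5, 6, 35, 36], log)
print(remaining)
```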

Step 3: 5-Candidate Retrieval

From the remaining verses matching the detected themes, 5 candidates are selected using a priority system:

direct: Directly cited from the Pali Canon with a verified source (highest priority)

aligned: Aligned with scripture teaching, paraphrased or summarized

reference: Reference material, cited with cautious language ("it is said that...")

A randomization factor (rand_order shift) ensures that even within the same theme and confidence level, different verses appear each time.
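One way to combine confidence priority with randomization is to shuffle the pool and then apply a stable sort by confidence. This is an assumption for illustration: the actual rand_order shift mechanism is not specified here, and `pick_candidates` and `CONFIDENCE_RANK` are hypothetical names. The sketch also matches on the primary theme field only.

```python
import random

# Lower rank means higher priority, following the order in the text.
CONFIDENCE_RANK = {"direct": 0, "aligned": 1, "reference": 2}


def pick_candidates(verses, themes, excluded_ids, k=5, rng=None):
    """Pick up to k verses: confidence first, random order within a level."""
    rng = rng or random.Random()
    pool = [
        v for v in verses
        if v["theme"] in themes and v["id"] not in excluded_ids
    ]
    # Shuffle first, then sort: Python's sort is stable, so verses that
    # share a confidence level keep their shuffled (random) order.
    rng.shuffle(pool)
    pool.sort(key=lambda v: CONFIDENCE_RANK[v["confidence_type"]])
    return pool[:k]


sample_verses = [
    {"id": 1, "theme": "anxiety", "confidence_type": "reference"},
    {"id": 2, "theme": "anxiety", "confidence_type": "direct"},
    {"id": 3, "theme": "work", "confidence_type": "aligned"},
    {"id": 4, "theme": "karma", "confidence_type": "direct"},
]
picked = pick_candidates(sample_verses, {"anxiety", "work"}, excluded_ids={2})
print([v["id"] for v in picked])
```

Passing a seeded `random.Random` as `rng` makes the selection reproducible, which is convenient for testing.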

Step 4: AI Selection

The 5 candidates, complete with Pali text, Japanese/English translation, source information, and confidence level, are injected into the AI's prompt. Claude (Anthropic) then selects the 1-2 verses that best resonate with your specific concern.

━━━ Scripture Reference Data (RAG) ━━━
Below are 5 candidate verses. Select the 1-2 most relevant.
【STRICT】Quote the English translation directly.
【STRICT】Copy the source name exactly for the Reference line.
【STRICT】Do NOT fabricate verses not listed below.
【STRICT】You MUST cite at least one verse.

【Scripture 1】Dhammapada, Chapter 1, Twin Verses (Verse 1)
Confidence: direct
Pāli: Manopubbaṅgamā dhammā...
English translation (QUOTE THIS): "All things are preceded by mind..."
...
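A prompt section like the one above could be assembled from the candidate records roughly as follows. The `build_prompt` and `format_candidate` helpers are hypothetical, and the sketch assumes each candidate is a dict using the field names from the per-verse data structure (`source_ja`, `pali`, `original`, `confidence_type`).

```python
STRICT_RULES = [
    "Quote the English translation directly.",
    "Copy the source name exactly for the Reference line.",
    "Do NOT fabricate verses not listed below.",
    "You MUST cite at least one verse.",
]


def format_candidate(index: int, verse: dict) -> str:
    """Render one candidate verse in the layout shown in the prompt excerpt."""
    return "\n".join([
        f"【Scripture {index}】{verse['source_ja']}",
        f"Confidence: {verse['confidence_type']}",
        f"Pāli: {verse['pali']}",
        f'English translation (QUOTE THIS): "{verse["original"]}"',
    ])


def build_prompt(candidates: list[dict]) -> str:
    """Assemble the scripture reference section injected into the AI prompt."""
    lines = [
        "━━━ Scripture Reference Data (RAG) ━━━",
        f"Below are {len(candidates)} candidate verses. Select the 1-2 most relevant.",
    ]
    lines += [f"【STRICT】{rule}" for rule in STRICT_RULES]
    lines.append("")
    lines += [format_candidate(i, v) for i, v in enumerate(candidates, start=1)]
    return "\n".join(lines)


example = [{
    "source_ja": "Dhammapada, Chapter 1, Twin Verses (Verse 1)",
    "confidence_type": "direct",
    "pali": "Manopubbaṅgamā dhammā...",
    "original": "All things are preceded by mind...",
}]
prompt = build_prompt(example)
print(prompt)
```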

Step 5: Response Generation

The AI generates a response following a structured format:

Empathy — Acknowledge the seeker's feelings (1-2 sentences)

Wisdom — Scripture quote + explanation (3-5 sentences)

Practice — Concrete suggestion (1-2 sentences)

Reference — Scripture name, chapter, verse number

3 Scripture Database

AI Buddha Zen's database contains 10,023 verses from 18 Buddhist scriptures, primarily from the Pali Canon, which preserves some of the oldest surviving Buddhist texts.

Scripture | Verses | Source
Dhammapada | 423 | Khuddaka Nikāya
Theragāthā | 1,279 | Khuddaka Nikāya
Therīgāthā | 494 | Khuddaka Nikāya
Sutta Nipāta | 1,149 | Khuddaka Nikāya
Aṅguttara Nikāya | 855 | Sutta Piṭaka
Saṃyutta Nikāya | 1,132 | Sutta Piṭaka
+ 12 more scriptures...
Total | 10,023

Data Structure Per Verse

{
  "id": 1,
  "canon": "dhammapada",
  "source_ja": "Dhammapada Ch.1 Twin Verses (Verse 1)",
  "pali": "Manopubbaṅgamā dhammā, manoseṭṭhā manomayā...",
  "original": "All things are preceded by mind, led by mind, created by mind...",
  "japanese": "(Japanese translation)",
  "theme": "wisdom",
  "sub_themes": "self,mindfulness",
  "confidence_type": "direct",
  "keywords": "mind,heart,thought,creation"
}

Source: SuttaCentral (CC0 license) — Bhikkhu Sujato English translations + Mahāsaṅgīti Pāli text.
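For readers who want to work with records of this shape, the sketch below loads one into a typed structure. The `Verse` dataclass and `parse_verse` function are hypothetical. The field names come from the sample record; the comma-splitting of `sub_themes` and `keywords` and the confidence check are assumptions about how the data might be consumed.

```python
import json
from dataclasses import dataclass

CONFIDENCE_TYPES = {"direct", "aligned", "reference"}


@dataclass(frozen=True)
class Verse:
    id: int
    canon: str
    source_ja: str
    pali: str
    original: str
    japanese: str
    theme: str
    sub_themes: tuple[str, ...]
    confidence_type: str
    keywords: tuple[str, ...]


def _split(value: str) -> tuple[str, ...]:
    """Split a comma-separated field, dropping empty entries."""
    return tuple(part.strip() for part in value.split(",") if part.strip())


def parse_verse(raw: str) -> Verse:
    """Parse one JSON record and reject unknown confidence labels."""
    data = json.loads(raw)
    if data["confidence_type"] not in CONFIDENCE_TYPES:
        raise ValueError(f"unknown confidence_type: {data['confidence_type']!r}")
    data["sub_themes"] = _split(data["sub_themes"])
    data["keywords"] = _split(data["keywords"])
    return Verse(**data)


record = json.dumps({
    "id": 1,
    "canon": "dhammapada",
    "source_ja": "Dhammapada Ch.1 Twin Verses (Verse 1)",
    "pali": "Manopubbaṅgamā dhammā, manoseṭṭhā manomayā...",
    "original": "All things are preceded by mind, led by mind, created by mind...",
    "japanese": "(Japanese translation)",
    "theme": "wisdom",
    "sub_themes": "self,mindfulness",
    "confidence_type": "direct",
    "keywords": "mind,heart,thought,creation",
})
verse = parse_verse(record)
print(verse.theme, verse.sub_themes)
```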

4 Hallucination Prevention

AI Buddha Zen employs 5 layers of hallucination prevention to ensure no fabricated scripture quotes reach users:

🔒

Closed Canon RAG

The AI may cite only from the 10,023 verified verses in our database. It cannot search the internet, and it is instructed not to generate quotes from its training data.

📋

Verbatim Quoting

The prompt instructs: "Quote the Japanese/English translation EXACTLY as provided. Do NOT paraphrase or re-translate."

⚠️

Confidence Labels

Each verse is tagged as "direct", "aligned", or "reference". Lower-confidence verses are cited with hedging language ("it is said that...").

🚫

Fabrication Block

The prompt explicitly states: "Do NOT fabricate verses not listed in the 5 candidates below."

Mandatory Citation

The AI is required to cite at least 1 verse from the candidates. If it cannot find a relevant verse, it says "this is not in our database" rather than inventing one.
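The layers above are described as prompt-level instructions. As a complementary illustration only, the sketch below shows how a verbatim-quoting rule could be checked after generation by comparing quoted passages against the candidate texts. Nothing in this section states that such a check exists; `quoted_passages` and `uses_only_candidate_quotes` are hypothetical names, and the sketch assumes quotes are wrapped in straight double quotation marks.

```python
import re


def quoted_passages(response: str) -> list[str]:
    """Extract text wrapped in straight double quotes."""
    return re.findall(r'"([^"]+)"', response)


def uses_only_candidate_quotes(response: str, candidates: list[dict]) -> bool:
    """True if the response quotes at least once and every quote is verbatim."""
    allowed = [c["original"] for c in candidates]
    quotes = quoted_passages(response)
    if not quotes:
        return False  # the mandatory-citation rule requires at least one quote
    return all(any(quote in text for text in allowed) for quote in quotes)


candidates = [
    {"original": "All things are preceded by mind, led by mind, created by mind."},
]
grounded = (
    'That sounds exhausting. The Dhammapada says: '
    '"All things are preceded by mind, led by mind, created by mind."'
)
invented = 'It is written: "Peace comes from within."'

print(uses_only_candidate_quotes(grounded, candidates))
print(uses_only_candidate_quotes(invented, candidates))
```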

5 Safety Framework (CAP-SRP v2.0)

Beyond RAG, AI Buddha Zen implements a multi-layered safety framework based on clinical psychology and religious AI ethics research.

Risk Category | Detection | Action | Reference
🚨 Suicide Risk (3-tier) | C-SSRS Tier 1-3 | Tier 2-3: Crisis intervention; Tier 1: Empathetic response | Posner et al. (2011)
🧘 Spiritual Bypassing | Distress + avoidance co-occurrence | Suppress superficial spiritual comfort | SBS-13 (Fox et al. 2017)
🔗 AI Dependency | Message + usage pattern | Encourage human connections | AMDF (2026)
🔐 Privacy | SHA-256 + HMAC | Message text never stored | CAP-SRP Spec
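The privacy row names SHA-256 and HMAC without further detail. A minimal sketch of the general idea, storing a keyed fingerprint instead of the message text, might look like this. The `message_fingerprint` function and the key handling are assumptions; the CAP-SRP specification itself is not reproduced here.

```python
import hashlib
import hmac


def message_fingerprint(message: str, secret_key: bytes) -> str:
    """Return an HMAC-SHA-256 hex digest so the raw text need not be stored."""
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration; a real key would come from a secret store.
key = b"example-key-not-for-production"

fingerprint = message_fingerprint("I can't sleep because of work stress", key)
same_again = message_fingerprint("I can't sleep because of work stress", key)
other_key = message_fingerprint("I can't sleep because of work stress", b"another-key")

# compare_digest performs a constant-time comparison of the two digests.
print(hmac.compare_digest(fingerprint, same_again))
print(len(fingerprint))
```

Using a keyed HMAC rather than a bare SHA-256 hash means someone without the key cannot confirm a guessed message by hashing it themselves.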

→ Learn more about safety features

6 Comparison with BuddhaBot-Plus (Kyoto University)

Kyoto University's BuddhaBot-Plus, led by Professor Seiji Kumagai, pioneered the "source-first architecture" for Buddhist AI. AI Buddha Zen shares this RAG-based approach while adding consumer-facing features and safety layers.

Aspect | BuddhaBot-Plus | AI Buddha Zen
Developer | Kyoto University | VeritasChain Inc.
Architecture | Source-first RAG | Source-first RAG
Scripture DB | ~3,000 verses | 10,023 verses
Platform | Research prototype | LINE Bot + iOS App
Safety framework | ELSI | CAP-SRP v2.0 (C-SSRS + SBS-13 + AMDF)
Access | Not public | Free, public

Note: BuddhaBot-Plus is an academic research project with different goals. This comparison is for technical context, not competitive positioning.

🪷 Experience RAG-Powered Buddhist Wisdom

Try AI Buddha Zen for free. Every response cites the actual scripture name, chapter, and verse number.
