Rice Knowledge Graph RAG QA System (Enhanced)
A GraphRAG system for rice pest and disease management with advanced retrieval features.
Knowledge Graph: ~11,000 triples | Embedding: BGE-small-zh-v1.5 | Vector Search: FAISS | LLM: DashScope (qwen-turbo)
Retrieved Knowledge Sources
Ask a question to see retrieved knowledge.
Knowledge Graph Entity Explorer
Enter an entity name (disease, pest, treatment, or variety) and click Explore to view its knowledge graph connections.
Ablation Study
Measures the contribution of each RAG component by running 5 test queries under each of 6 configurations (30 evaluations total). An LLM judge scores every answer on relevance, completeness, and accuracy (1-5 scale).
Test queries: single entity lookup, treatment inquiry, comparison, complex multi-hop, multi-aspect.
Configurations:
- Baseline (Vector Only)
- +Re-ranking
- +Multi-hop
- +HyDE
- +KG Reasoning
- Full (All Features)
⚠️ This will take 2-3 minutes and consume API calls.
Click 'Run Ablation Study' to start the evaluation.
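The evaluation loop described above can be sketched as follows. This is a minimal illustration, not the app's actual code: `run_pipeline` and `judge` are hypothetical callables standing in for the RAG pipeline and the LLM judge, and the feature-flag names in `CONFIGS` are assumptions made for the example.

```python
from statistics import mean

# Hypothetical feature flags per configuration; the flag names are illustrative.
CONFIGS = {
    "Baseline (Vector Only)": set(),
    "+Re-ranking": {"rerank"},
    "+Multi-hop": {"multihop"},
    "+HyDE": {"hyde"},
    "+KG Reasoning": {"kg"},
    "Full (All Features)": {"rerank", "multihop", "hyde", "kg"},
}

CRITERIA = ("relevance", "completeness", "accuracy")


def run_ablation(queries, run_pipeline, judge):
    """Run every query under every configuration and average the judge scores.

    Assumed interfaces:
      run_pipeline(query, features) -> answer string
      judge(query, answer) -> dict with a 1-5 score for each name in CRITERIA
    """
    results = {}
    for name, features in CONFIGS.items():
        scores = [judge(q, run_pipeline(q, features)) for q in queries]
        results[name] = {c: mean(s[c] for s in scores) for c in CRITERIA}
    return results
```

With 5 queries and 6 configurations, `run_pipeline` and `judge` are each called 30 times, which is where the API cost comes from.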
Architecture
User Query
|
v
[Query Decomposition] -- optional, LLM-driven
|
v
[Stage 1: Bi-Encoder Retrieval] -- BGE-small-zh-v1.5 + FAISS (top-20)
|
v
[Stage 2: Cross-Encoder Re-ranking] -- BGE-reranker-v2-m3 (top-5)
|
v
[Prompt Construction] -- Retrieved knowledge + user question
|
v
[LLM Generation] -- Qwen-Turbo / configurable
|
v
Answer + Sources
Features
1. Two-Stage Retrieval (Bi-encoder + Cross-encoder)
- Stage 1: Fast bi-encoder retrieval (FAISS) returns top-20 candidates
- Stage 2: Cross-encoder (BGE-reranker-v2-m3) re-scores and selects top-5
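- Intended to improve retrieval precision over single-stage retrieval: the cross-encoder scores each query-chunk pair jointly instead of comparing precomputed vectors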
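The two stages can be sketched with the standard library alone. In this illustration the vectors are plain lists and `rerank_score` is a hypothetical callable standing in for the cross-encoder; the real system uses FAISS and BGE-reranker-v2-m3, which are not called here.

```python
import heapq


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def two_stage_retrieve(query, query_vec, chunks, chunk_vecs, rerank_score,
                       first_k=20, final_k=5):
    """Stage 1 narrows by vector similarity; stage 2 re-scores the survivors.

    rerank_score(query, chunk_text) -> float is an assumed interface for a
    cross-encoder; any pairwise scoring function can be plugged in.
    """
    # Stage 1: cheap similarity over all chunks (the role FAISS plays).
    candidates = heapq.nlargest(
        first_k,
        range(len(chunks)),
        key=lambda i: inner_product(query_vec, chunk_vecs[i]),
    )
    # Stage 2: expensive pairwise scoring, applied only to the candidates.
    reranked = sorted(
        candidates,
        key=lambda i: rerank_score(query, chunks[i]),
        reverse=True,
    )
    return [chunks[i] for i in reranked[:final_k]]
```

Keeping `first_k` larger than `final_k` gives the re-ranker room to promote chunks that the bi-encoder ranked low.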
2. Multi-hop Query Decomposition
- LLM decomposes complex queries into simpler sub-questions
- Each sub-query is retrieved independently; the results are merged and deduplicated
- Example: "稻瘟病和纹枯病怎么防治?" ("How are rice blast and sheath blight controlled?") -> ["稻瘟病怎么防治?", "纹枯病怎么防治?"] (one sub-question per disease)
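A minimal sketch of the merge-and-deduplicate step, assuming hypothetical `decompose` and `retrieve` callables. How the app scores a chunk returned by several sub-queries is not stated; keeping the highest score is an assumption made here.

```python
def multi_hop_retrieve(query, decompose, retrieve, top_k=5):
    """Retrieve per sub-question, then merge and deduplicate by chunk id.

    Assumed interfaces:
      decompose(query) -> list of sub-question strings (may be empty)
      retrieve(sub_query) -> list of (chunk_id, score) pairs
    """
    # Fall back to the original query when no decomposition is produced.
    sub_queries = decompose(query) or [query]
    best = {}
    for sub_query in sub_queries:
        for chunk_id, score in retrieve(sub_query):
            # Deduplicate: keep each chunk once, with its highest score (assumption).
            if chunk_id not in best or score > best[chunk_id]:
                best[chunk_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```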
3. Knowledge Graph Explorer
- Interactive entity lookup to explore KG connections
- Entity-level text chunking: triples grouped by head entity
- Treatment entities auto-enriched with reverse links (what they can treat)
4. Entity-Level Text Chunking
- Raw triples are grouped by head entity and converted to natural language
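- Treatment entities are auto-enriched with the targets they control: "可用于防治: 稻瘟病(化学防治)、纹枯病(生物防治)..." ("Can be used to control: rice blast (chemical control), sheath blight (biological control)...")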
- Preserves semantic coherence for better embedding quality
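The grouping and reverse-link enrichment can be illustrated as below. This is a sketch under assumptions: triples are `(head, relation, tail)` tuples, `treatment_relations` is a hypothetical set of relation names whose tail is a treatment, and the English sentence template only approximates the app's Chinese output.

```python
from collections import defaultdict


def build_chunks(triples, treatment_relations):
    """Render one text chunk per entity from (head, relation, tail) triples."""
    outgoing = defaultdict(list)  # head -> [(relation, tail)]
    treats = defaultdict(list)    # treatment -> [(target, relation)]
    for head, rel, tail in triples:
        outgoing[head].append((rel, tail))
        if rel in treatment_relations:
            # Reverse link: remember what each treatment can be used against.
            treats[tail].append((head, rel))

    chunks = {}
    # dict.fromkeys keeps first-seen order and drops duplicate entity names.
    for entity in dict.fromkeys([*outgoing, *treats]):
        parts = [f"{rel}: {tail}" for rel, tail in outgoing.get(entity, [])]
        if entity in treats:
            targets = ", ".join(f"{head} ({rel})" for head, rel in treats[entity])
            parts.append(f"Can be used to control: {targets}")
        chunks[entity] = f"{entity}. " + "; ".join(parts)
    return chunks
```

Each chunk is then embedded as a single unit, so a query about one entity matches text that mentions all of its relations together.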
5. Conversation Memory + Context Rewriting
- Maintains multi-turn dialogue history (up to 5 turns)
- LLM rewrites context-dependent follow-ups into standalone queries
- Example: "用量呢?" ("What about the dosage?") -> "三环唑的用量是多少?" ("What is the dosage of tricyclazole?"), based on the previous conversation
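One way to bound the history and rewrite follow-ups is sketched here. The class name and the `rewrite` callable (standing in for the LLM rewriting call) are hypothetical.

```python
from collections import deque


class ConversationMemory:
    """Keeps the most recent turns and rewrites follow-up questions."""

    def __init__(self, rewrite, max_turns=5):
        # deque(maxlen=...) silently drops the oldest turn once full.
        self.turns = deque(maxlen=max_turns)
        # rewrite(history, question) -> standalone question (assumed interface).
        self.rewrite = rewrite

    def standalone_query(self, question):
        # With no history there is nothing to resolve, so skip the LLM call.
        if not self.turns:
            return question
        return self.rewrite(list(self.turns), question)

    def add_turn(self, question, answer):
        self.turns.append((question, answer))
```

The rewritten standalone query, not the raw follow-up, is what gets passed to retrieval.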
6. HyDE (Hypothetical Document Embeddings)
- Generates a hypothetical answer to the question via LLM
- Uses the hypothetical answer as a search query for better retrieval
- Bridges the gap between short queries and verbose knowledge chunks
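In code, HyDE amounts to swapping the text that gets embedded. The sketch below assumes three hypothetical callables supplied by the caller: `generate` (LLM), `embed` (embedding model), and `search` (vector index). The fallback to the raw question is an added assumption, not something the source describes.

```python
def hyde_retrieve(question, generate, embed, search, top_k=5):
    """Search with the embedding of an LLM-written hypothetical answer.

    Assumed interfaces:
      generate(question) -> hypothetical answer text
      embed(text) -> vector
      search(vector, k) -> list of retrieved items
    """
    hypothetical = generate(question).strip()
    # Fall back to the raw question if generation yields nothing (assumption).
    search_text = hypothetical or question
    return search(embed(search_text), top_k)
```

The hypothetical answer may be factually wrong; it is used only to land near relevant chunks in embedding space, and the final answer is still generated from the retrieved knowledge.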
7. KG Path Reasoning
- BFS-based multi-hop path finding in the knowledge graph
- Discovers entity connections: disease -> treatment -> dosage
- Appends reasoning paths to the retrieval context for structured answers
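A breadth-first search over the triples might look like the following sketch. It treats edges as directed from head to tail and caps the search with a hypothetical `max_hops` parameter; neither choice is confirmed by the source.

```python
from collections import defaultdict, deque


def find_path(triples, start, goal, max_hops=3):
    """Return the shortest path from start to goal as a list of triples.

    Returns [] when start == goal and None when no path exists within max_hops.
    """
    graph = defaultdict(list)  # head -> [(relation, tail)]
    for head, rel, tail in triples:
        graph[head].append((rel, tail))

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for rel, nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None
```

Because BFS explores level by level, the first path that reaches the goal uses the fewest hops; each triple in the result can be rendered as one line of reasoning context.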
Tech Stack
| Component | Technology |
|---|---|
| Knowledge Graph | ~11,000 triples (rice diseases, pests, treatments) |
| Embedding | BAAI/bge-small-zh-v1.5 (512-dim, Chinese-optimized) |
| Re-ranker | BAAI/bge-reranker-v2-m3 (cross-encoder) |
| Vector Search | FAISS (IndexFlatIP, inner product) |
| LLM | DashScope Qwen-Turbo (configurable) |
| UI | Gradio |
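An inner-product index such as IndexFlatIP ranks by the raw dot product, so embeddings are commonly L2-normalized first, which makes the score equal to cosine similarity. Whether this app normalizes its vectors is not stated; the stdlib sketch below only demonstrates the equivalence.

```python
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    # Leave an all-zero vector unchanged to avoid dividing by zero.
    return [x / norm for x in vec] if norm else list(vec)


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    denom = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return inner_product(a, b) / denom


a, b = [3.0, 4.0], [1.0, 2.0]
# After normalization, the inner product matches cosine similarity.
assert math.isclose(inner_product(l2_normalize(a), l2_normalize(b)), cosine(a, b))
```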
Dataset
The knowledge graph contains triples covering:
- Diseases: Rice blast, sheath blight, bacterial leaf blight, etc.
- Pests: Rice stem borer, planthopper, leaf roller, etc.
- Treatments: Chemical, biological, and agricultural control methods
- Properties: Symptoms, dosage, application timing, pathogen info