Questro — Document Q&A Chatbot with LlamaIndex

2024–2025 | Python, Streamlit, LlamaIndex, OpenAI API

Questro is an AI-powered document Q&A chatbot built with Streamlit and LlamaIndex that uses the OpenAI API to answer questions from PDF, DOCX, and TXT files. See the Documentation.

Interactive document question-answering chatbot that allows users to query PDF, DOCX, and TXT documents using a retrieval-augmented generation (RAG) workflow.

Problem / Motivation

Finding specific information in documents can be tedious, especially in research papers, textbooks, or legal documents. Users often waste time searching manually or context-switching between multiple resources.

Questro addresses this by providing an AI-powered chatbot that can:

Answer questions directly from uploaded documents
Maintain context to provide relevant answers
Persist document indices across sessions for continuous use

This project demonstrates practical applications of RAG, vector indexing, and full-stack LLM integration for document understanding.

Overview

Questro is a Streamlit-based chatbot powered by LlamaIndex:

Pre-loaded document: Users can start asking questions immediately using a default document
Document upload: Supports multiple PDFs, DOCX, or TXT files
Index creation: LlamaIndex builds a vector index for efficient semantic search
Query processing: Retrieves relevant content and generates contextual answers using an OpenAI LLM
Persistent storage: Document indices are saved to allow continuity across sessions

The system ensures answers are contextually accurate, informing users if queries fall outside the scope of the indexed content.

Tech Stack & Frameworks

Languages / Frameworks: Python, Streamlit
AI / ML: OpenAI API, LlamaIndex, cosine similarity for semantic search
Storage: LlamaIndex persistent storage mechanism
Deployment: Vercel serverless hosting (for demo)

Features / Capabilities

User-facing

Ask questions from pre-loaded or uploaded documents
Support multiple document formats (PDF, DOCX, TXT)
Persistent index storage for continuous usage
Immediate contextual feedback

Backend / Engineering

Vector index creation using LlamaIndex
Semantic search for accurate retrieval
RAG-based LLM responses
Streamlit interface for lightweight deployment and demo

Potential Applications

Education: Quickly retrieve information from textbooks or research papers
Customer Support: Answer questions based on internal documentation or product manuals
Legal Research: Assist in finding relevant content in contracts or case studies
Knowledge Management: Turn personal notes or document collections into an interactive knowledge assistant

Future Enhancements

Support multiple indices for different topics
Advanced relevance checking (passage ranking, query expansion)
Fine-tuning LLMs on domain-specific content
User authentication for private document management
Enhanced UI for multi-document search and visualization

Learning Outcomes

Implemented RAG workflows for document Q&A
Built Streamlit-based chatbot UI
Integrated LlamaIndex vector indices with OpenAI LLMs
Learned persistence and session management for document indices
Practiced end-to-end AI product development from document ingestion to query responses

Links

Documentation: https://questro.vercel.app