
Questro — Document Q&A Chatbot with LlamaIndex

2024–2025 | Python, Streamlit, LlamaIndex, OpenAI API

Questro is an AI-powered document Q&A chatbot built with Streamlit and LlamaIndex. It uses the OpenAI API and a retrieval-augmented generation (RAG) workflow to answer questions about PDF, DOCX, and TXT files. See the documentation at https://questro.vercel.app.

Problem / Motivation

Finding specific information in documents can be tedious, especially in research papers, textbooks, or legal documents. Users often waste time searching manually or context-switching between multiple resources.

Questro addresses this by providing an AI-powered chatbot that can:

  • Answer questions directly from uploaded documents
  • Maintain context to provide relevant answers
  • Persist document indices across sessions for continuous use

This project demonstrates practical applications of RAG, vector indexing, and full-stack LLM integration for document understanding.

Overview

Questro is a Streamlit-based chatbot powered by LlamaIndex:

  • Pre-loaded document: Users can start asking questions immediately using a default document
  • Document upload: Supports uploading multiple PDF, DOCX, or TXT files
  • Index creation: LlamaIndex builds a vector index for efficient semantic search
  • Query processing: Retrieves relevant content and generates contextual answers using an OpenAI LLM
  • Persistent storage: Document indices are saved to allow continuity across sessions
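
The indexing, querying, and persistence steps listed above can be sketched with LlamaIndex roughly as follows. This is a minimal sketch, not Questro's actual code: the directory names DOCS_DIR and PERSIST_DIR and the helper names are hypothetical. It assumes a recent llama-index release (the llama_index.core namespace), an OPENAI_API_KEY in the environment, and the optional reader dependencies needed for PDF and DOCX parsing.

```python
from pathlib import Path

# Hypothetical locations; the project description does not name them.
DOCS_DIR = Path("docs")
PERSIST_DIR = Path("storage")


def has_persisted_index(persist_dir: Path) -> bool:
    """Return True if a previously persisted index appears to exist."""
    # LlamaIndex's default persist() writes docstore.json among other files.
    return (persist_dir / "docstore.json").is_file()


def load_or_build_index(docs_dir: Path = DOCS_DIR, persist_dir: Path = PERSIST_DIR):
    """Reload a saved index if present, otherwise build one and persist it."""
    # Imported lazily so has_persisted_index() works without llama-index installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if has_persisted_index(persist_dir):
        storage = StorageContext.from_defaults(persist_dir=str(persist_dir))
        return load_index_from_storage(storage)

    # Read every supported file in docs_dir, embed it, and save the result.
    documents = SimpleDirectoryReader(str(docs_dir)).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=str(persist_dir))
    return index


def answer(question: str) -> str:
    """Retrieve relevant chunks and let the configured LLM generate an answer."""
    index = load_or_build_index()
    return str(index.as_query_engine().query(question))
```

Checking for the persisted files before rebuilding avoids re-embedding the same documents on every session, which is the usual reason for persisting an index at all.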

The system grounds its answers in the retrieved content and tells users when a query falls outside the scope of the indexed documents.
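
One common way to implement this kind of scope check is to drop retrieved chunks whose similarity score falls below a cutoff and return a fixed message when nothing survives. The sketch below illustrates that idea only; the source does not say how Questro detects out-of-scope queries, and MIN_SCORE, OUT_OF_SCOPE_MESSAGE, and the function names are all hypothetical.

```python
from typing import Callable, List, Tuple

OUT_OF_SCOPE_MESSAGE = "I could not find that in the indexed documents."
MIN_SCORE = 0.75  # hypothetical cutoff; would need tuning against real queries


def select_context(
    scored_chunks: List[Tuple[float, str]], min_score: float = MIN_SCORE
) -> List[str]:
    """Keep chunk texts whose similarity score meets the cutoff, best first."""
    kept = [(score, text) for score, text in scored_chunks if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept]


def respond(
    scored_chunks: List[Tuple[float, str]],
    generate: Callable[[List[str]], str],
) -> str:
    """Call the generator only when some retrieved chunk is relevant enough."""
    context = select_context(scored_chunks)
    if not context:
        return OUT_OF_SCOPE_MESSAGE
    return generate(context)
```

Here `generate` stands in for whatever function sends the selected context to the LLM, so the guard can be exercised without any API calls.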

Tech Stack & Frameworks

  • Languages / Frameworks: Python, Streamlit
  • AI / ML: OpenAI API, LlamaIndex, cosine similarity for semantic search
  • Storage: LlamaIndex persistent storage mechanism
  • Deployment: Vercel serverless hosting (for demo)
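
To make the cosine-similarity step concrete, the following standard-library sketch ranks a few toy chunk vectors against a query vector. The vectors and chunk names are made up for illustration; in practice the embeddings would come from an embedding model and the ranking would be handled by the vector index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy three-dimensional "embeddings" (illustrative values only).
chunks = {
    "intro": [1.0, 0.0, 0.0],
    "methods": [0.0, 1.0, 0.0],
    "results": [1.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]
best = top_k(query, chunks, k=2)  # "intro" ranks first, then "results"
```

Because cosine similarity depends only on direction, a long chunk and a short chunk about the same topic can score similarly even though their vectors differ in magnitude.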

Features / Capabilities

User-facing

  • Ask questions from pre-loaded or uploaded documents
  • Support multiple document formats (PDF, DOCX, TXT)
  • Persistent index storage for continuous usage
  • Immediate contextual feedback

Backend / Engineering

  • Vector index creation using LlamaIndex
  • Semantic search for accurate retrieval
  • RAG-based LLM responses
  • Streamlit interface for lightweight deployment and demo
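
The "RAG-based LLM responses" item comes down to placing retrieved passages in the prompt ahead of the user's question. LlamaIndex's query engine does this internally with its own templates, so the sketch below is only an illustration of the idea; the prompt wording and the build_prompt name are assumptions, not Questro's actual prompt.

```python
from typing import List


def build_prompt(question: str, chunks: List[str]) -> str:
    """Combine retrieved chunks and the user's question into one grounded prompt."""
    # Number the chunks so an answer can refer back to a specific passage.
    numbered = "\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )
```

Instructing the model to admit when the context lacks an answer is a simple way to discourage responses that go beyond the indexed documents.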

Potential Applications

  • Education: Quickly retrieve information from textbooks or research papers
  • Customer Support: Answer questions based on internal documentation or product manuals
  • Legal Research: Assist in finding relevant content in contracts or case studies
  • Knowledge Management: Turn personal notes or document collections into an interactive knowledge assistant

Future Enhancements

  • Support multiple indices for different topics
  • Advanced relevance checking (passage ranking, query expansion)
  • Fine-tuning LLMs on domain-specific content
  • User authentication for private document management
  • Enhanced UI for multi-document search and visualization

Learning Outcomes

  • Implemented RAG workflows for document Q&A
  • Built Streamlit-based chatbot UI
  • Integrated LlamaIndex vector indices with OpenAI LLMs
  • Learned persistence and session management for document indices
  • Practiced end-to-end AI product development from document ingestion to query responses

Documentation: https://questro.vercel.app