Customer Support Chatbot using FastAPI and Groq


Overview

This project builds a customer support chatbot for the company "&ai" that interacts with users, answers questions based on provided company documents, and gives relevant, conversational responses. It is built with FastAPI, the Groq API for response generation, and Jina embeddings for converting text to vectors. The project employs a Retrieval-Augmented Generation (RAG) approach, in which document data grounds the model's answers so they stay accurate and contextually relevant.
The system integrates several key components: document processing, embedding generation, vector database storage, conversational retrieval, and user interaction through a frontend built with Typebot, a visual chatbot builder.

Key Features

  1. Document-based Question Answering (QA):
    The chatbot answers user questions by retrieving relevant data from company documents.
  2. Conversational Memory:
    The system maintains context across conversations using a memory buffer, ensuring fluid and coherent exchanges.
  3. Embeddings and Vector Database:
    Jina embeddings are used to convert document text into vectors, and Supabase is used to store and retrieve these vectors efficiently.
  4. RAG (Retrieval-Augmented Generation):
    Combines document retrieval and Groq's language model to generate detailed responses.
  5. Frontend Integration:
    Typebot provides the user interface for interacting with the chatbot, calling the FastAPI backend to process each request.

System Architecture

  1. FastAPI Backend:
    The backend is built with FastAPI, a modern Python web framework for building APIs. It exposes a /chat endpoint that accepts user questions and returns chatbot responses grounded in the document data.
  2. Document Processing:
    Company documents are stored as PDFs in a folder named pdfs and contain the text the chatbot refers to. PyPDF2 extracts the text from each PDF, and the extracted text is then split into smaller chunks for more effective embedding.
    Functionality:
    - get_pdf_text: extracts the text from the PDF documents.
    - get_text_chunks: splits the extracted text into smaller chunks for processing and vectorization.
    Both functions appear in the ingestion sketch after this list.
  3. Jina Embeddings:
    To convert the text chunks into a vector format that can be used for retrieval, the project uses JinaEmbeddings. This converts textual data into vector representations, enabling the chatbot to find the most relevant information when answering user questions.
  4. Supabase Vector Storage:
    The system uses pgvector, a Postgres extension for vector similarity search, on a Supabase-hosted database to store and retrieve the text vectors. This allows efficient similarity search between the user query and the stored document data.
  5. Groq API for LLM-based Responses:
    The chatbot uses Groq's language model (LLM) to generate responses. Following the RAG approach, the system first retrieves relevant document data from the vector database, then passes the retrieved information to Groq's model, which generates a human-like response.
  6. Conversational Memory:
    The chatbot maintains conversation history using LangChain's ConversationBufferMemory. This allows the chatbot to remember previous interactions and respond in a more contextually aware manner.
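
Taken together, items 2 through 6 amount to a short ingestion-and-chain script. The following is a minimal sketch rather than the project's actual code: it assumes the LangChain community integrations for Jina embeddings and PGVector plus the langchain-groq package, and the chunk sizes, collection name, Groq model name, and environment variable names are all illustrative.

```python
import os

from PyPDF2 import PdfReader
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import JinaEmbeddings
from langchain_community.vectorstores import PGVector
from langchain_groq import ChatGroq


def get_pdf_text(pdf_dir: str = "pdfs") -> str:
    """Extract the raw text from every PDF in the folder."""
    text = ""
    for name in os.listdir(pdf_dir):
        if name.lower().endswith(".pdf"):
            reader = PdfReader(os.path.join(pdf_dir, name))
            for page in reader.pages:
                text += page.extract_text() or ""
    return text


def get_text_chunks(text: str) -> list[str]:
    """Split the extracted text into overlapping chunks for embedding."""
    splitter = CharacterTextSplitter(separator="\n", chunk_size=1000, chunk_overlap=200)
    return splitter.split_text(text)


# Convert the chunks to vectors with Jina and store them in the
# Supabase Postgres database via the pgvector extension.
embeddings = JinaEmbeddings(
    jina_api_key=os.environ["JINA_API_KEY"],
    model_name="jina-embeddings-v2-base-en",
)
vectorstore = PGVector.from_texts(
    texts=get_text_chunks(get_pdf_text()),
    embedding=embeddings,
    collection_name="company_docs",                    # illustrative name
    connection_string=os.environ["SUPABASE_PG_CONN"],  # assumed env var
)

# Wire Groq's LLM, the vector-store retriever, and a conversation
# buffer into a single conversational RAG chain.
chat_chain = ConversationalRetrievalChain.from_llm(
    llm=ChatGroq(model="llama-3.1-8b-instant"),  # reads GROQ_API_KEY; model assumed
    retriever=vectorstore.as_retriever(),
    memory=ConversationBufferMemory(memory_key="chat_history", return_messages=True),
)
```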

Endpoint

  • POST /chat:
    This endpoint accepts a user question, retrieves relevant information from the document vectors, and returns a response generated by Groq's model. A minimal sketch follows.
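
The source doesn't show the request or response schema, so the sketch below assumes a JSON body with a single question field; chat_chain is the conversational RAG chain from the architecture sketch above.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    question: str  # field name assumed; the source doesn't show the schema


class ChatResponse(BaseModel):
    answer: str


@app.post("/chat", response_model=ChatResponse)
def chat(request: ChatRequest) -> ChatResponse:
    # Run the conversational retrieval chain built at startup; the
    # memory carries prior turns, so only the new question is sent.
    result = chat_chain.invoke({"question": request.question})
    return ChatResponse(answer=result["answer"])
```

Note that a single module-level ConversationBufferMemory is shared across all callers; a production deployment would key the memory per user session.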

Workflow

  1. User Interaction: The user interacts with the chatbot through Typebot, which serves as the frontend.
  2. Question Submission: The question is sent to the FastAPI backend via the /chat endpoint.
  3. Document Processing: If the documents have not been indexed yet, the text from the company documents is extracted and chunked.
  4. Embedding Generation: The text chunks are converted into vectors using Jina embeddings.
  5. Vector Store Retrieval: The user's question is compared against the vectors stored in the Supabase database to find relevant information.
  6. Response Generation: Groq's LLM generates a response based on the retrieved information, and conversational memory ensures that the system remembers previous exchanges.
  7. Response Delivery: The answer is returned to the user through the Typebot interface. An end-to-end client sketch follows.
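
Assuming the backend runs locally on port 8000 (both the URL and the questions below are illustrative), the whole workflow can be exercised with a plain HTTP client; the second question only works because of the conversational memory from step 6.

```python
import requests

BASE_URL = "http://localhost:8000"  # wherever the FastAPI app is served

# First turn: a standalone question answered from the document vectors.
r1 = requests.post(f"{BASE_URL}/chat", json={"question": "What services does &ai offer?"})
print(r1.json()["answer"])

# Second turn: "them" is only resolvable because the buffer memory
# keeps the previous exchange in context.
r2 = requests.post(f"{BASE_URL}/chat", json={"question": "How do I get started with them?"})
print(r2.json()["answer"])
```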

Conclusion

This chatbot project combines FastAPI with modern AI components such as the Groq API, Jina embeddings, and a vector database to provide a scalable and intelligent customer support solution. By grounding answers in document data and maintaining conversational memory, the chatbot can handle complex queries while preserving conversational flow, giving users a highly interactive and informative experience.
