Ragie Connect: Build RAG Apps Over Users' Data in No Time

Setting up a RAG infra from scratch can be an absolute nightmare. Here’s what you must build:

  • OAuth for third-party data integrations.
  • Designing UIs for ingesting user data.
  • Continuous sync for data updates.
  • Enabling user data partitioning
ragie.ai
ragie.ai

Ragie Connect provides the complete infrastructure to handle authentication, authorization, and syncing for your users’ data from several sources:

  • Google Drive
  • Salesforce
  • Notion
  • Jira
  • etc.

If you’re looking to integrate RAG into your platform in no time, start with Ragie.

RAG over GitHub repos

Continuing the discussion from Ragie...

Today, let's show you how we built a ​​​RAG app over GitHub repos using:

  • GitIngest to parse the entire repo in markdown.
  • Llama-3.2 as the LLM.
RAG over Github Repos
RAG over Github Repos

Let's build it now.

Step 1) Parse GitHub

GitIngest turns any Git repository into a simple text digest of its codebase. It provides the summary, the directory structure and parses all files to markdown.

We use this as follows:

Parse Github Repo
Parse Github Repo

We also save the above content to a local file, which is then loaded with LlamaIndex's SimpleDirectoryReader as follows:

Load Documents
Load Documents

Step 2) Set up LLM and embedding model

Since our knowledge base is ready, we set up the LLM and embedding model.

Set up llm model
Set up llm model
  • We use Ollama to run the LLM locally.
  • And an embedding model from HuggingFace.

Step 3) Embed data, create an index and chat

We pass the documents loaded documents (docs = loader.load_data()), create a vector store below, and then chat with the repo as follows:

Generate Response
Generate Response

Done!

There’s some streamlit part (full code here) we haven't shown here, but after building it, we get this clear and neat interface:

Final product
Final product