
What is RAG (retrieval-augmented generation)?

RAG — retrieval-augmented generation — is when an AI looks up relevant info from your documents before answering, so its replies are grounded in your actual content instead of just its training data.

In more detail: RAG is the technique of giving an AI access to your documents at the moment it answers a question, instead of relying only on what it learned during training. The agent searches your documents, pulls the most relevant snippets, and uses them to compose its answer.

Three pieces make up a RAG system. First, an indexed knowledge base — your documents broken into chunks and stored so they can be searched by meaning, not just by keyword. Second, a retriever that, given a question, finds the most relevant chunks. Third, the generator (the LLM) that uses those chunks plus the question to write the answer.
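The three pieces can be sketched in a few lines. This is a toy illustration, not a production design: the chunks, the bag-of-words "embedding", and the cosine scorer are all stand-ins for the learned embeddings and vector index a real system would use, and the generator step just shows the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter

# Piece 1: the indexed knowledge base -- documents pre-split into chunks.
chunks = [
    "To export your data, go to Settings > Data and click Export CSV.",
    "Billing happens monthly on the date you first subscribed.",
    "You can invite teammates from the Members page in Settings.",
]

def vectorize(text):
    # Bag-of-words vector; real systems use learned embeddings instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, k=1):
    # Piece 2: the retriever -- score every chunk against the question,
    # return the best k.
    q = vectorize(question)
    return sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)[:k]

def answer(question):
    # Piece 3: the generator -- a real system sends the retrieved chunks
    # plus the question to an LLM; here we just show the assembled prompt.
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(answer("How do I export my data?"))
```

Searching "by meaning, not just by keyword" is exactly what the learned embeddings replace here: two texts with no words in common can still land close together in embedding space.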

RAG is the standard answer to two of the biggest LLM weaknesses: hallucination (making up plausible-sounding facts) and stale knowledge (the model doesn't know about anything that happened after its training cut-off). With RAG, the model only answers from sources you control, and those sources can be updated whenever you want.

A simple example

A customer-support agent at a SaaS company has access to all the company's help docs, release notes, and historical support tickets. When a customer asks "how do I export my data?" the agent searches the knowledge base, finds the relevant help article, and answers using its content — and cites the article so the customer can read more. No hallucination, current info, transparent source.

Why it matters

Without RAG, AI agents that need to know about your business have only two options: be trained on your data (expensive, slow, leaky) or guess. Both fail in different ways. RAG threads the needle — current, grounded, private.

For non-technical operators, RAG is the difference between "the agent gives generic answers" and "the agent answers like someone who actually knows our business". The quality of your knowledge base directly determines the quality of the agent.

The risk is bad retrieval. If the system pulls the wrong snippets, the agent's answer will be confidently wrong. Strong RAG systems use re-rankers (a second-pass model that reorders search results) and citation systems so the user can verify what the agent based its answer on.

How Squidgy handles it

RAG (retrieval-augmented generation) on Squidgy.

Squidgy gives every agent a knowledge base. You drop in URLs, PDFs, Notion pages, or CSV exports — Squidgy indexes them and the agent uses them at answering time. You can update the knowledge base at any point and the agent uses the new content immediately.

Citations are on by default — when an agent answers from your knowledge base, it shows the customer which document the answer came from. That keeps the agent honest and the customer confident.

Frequently asked

Common questions about RAG.

RAG vs fine-tuning — which is better?

Different jobs. Fine-tuning teaches a model new behaviours or styles. RAG gives a model new information at answer time. For business knowledge that changes (prices, policies, product details) RAG wins because it's instantly updatable. Use fine-tuning for stable behaviours; use RAG for facts.

How big can my knowledge base be?

On Squidgy, large — millions of words is fine. The retriever only sends the relevant chunks to the model, so the model itself doesn't need to fit the whole knowledge base. The practical limit is more about quality than size: a 50-page well-curated knowledge base usually beats a 5,000-page messy one.
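The reason size isn't the limit is chunking: the model only ever sees a handful of chunks per question, however large the corpus. A minimal word-window chunker (parameters and approach are illustrative; real indexers often split on sentences, headings, or tokens rather than words) looks like this. The overlap keeps a sentence that straddles a boundary recoverable from either side.

```python
def chunk(text, size=200, overlap=40):
    # Split a document into overlapping windows of `size` words,
    # stepping forward by (size - overlap) words each time.
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]

# A 500-word document becomes three chunks; the retriever later sends
# only the best-matching one or two to the model.
doc = " ".join(f"word{i}" for i in range(500))
pieces = chunk(doc)
```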

Does RAG completely stop hallucinations?

It dramatically reduces them but doesn't eliminate them. If your knowledge base is missing relevant info, a model can still make something up to fill the gap. Strong systems detect this and say "I don't have information about that" instead of guessing. Evals — systematic testing of the agent against questions with known answers — are how you measure how often that happens.
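One common way systems detect a gap is a retrieval-confidence threshold: if even the best-matching chunk scores poorly against the question, refuse rather than hand the model nothing to ground its answer in. A toy sketch (the scoring and threshold are illustrative; production systems tune this against real traffic):

```python
import math
import re
from collections import Counter

chunks = [
    "To export your data, go to Settings > Data and click Export CSV.",
    "Billing happens monthly on the date you first subscribed.",
]

def vec(text):
    # Bag-of-words stand-in for a learned embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def guarded_answer(question, threshold=0.2):
    # If even the best chunk scores below the threshold, refuse instead
    # of letting the model improvise an unsupported answer.
    q = vec(question)
    score, best = max((cosine(q, vec(c)), c) for c in chunks)
    if score < threshold:
        return "I don't have information about that."
    return f"Based on: {best}"
```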

Can the agent cite its sources?

Yes — and on Squidgy this is on by default. Every answer the agent draws from your knowledge base shows the source document, so the customer can read the original.

Build your own AI agent.

No code. Hands-on onboarding from the team in your first cohort.