# RAG (Retrieval Augmented Generation)

### The concept of RAG <a href="#the-concept-of-the-rag" id="the-concept-of-the-rag"></a>

With vector retrieval at its core, the RAG architecture has become a leading framework for addressing two major challenges of large models: accessing up-to-date external knowledge and reducing hallucinations. It is now widely used in practical applications.

Developers can use this technology to cost-effectively build AI-powered customer service bots, corporate knowledge bases, AI search engines, and similar systems that let users query various forms of organized knowledge in natural language. A representative RAG application works as follows:

In the diagram below, when a user asks, "Who is the President of the United States?", the system does not pass the question directly to the large model. Instead, it first runs a vector search over a knowledge base (such as Wikipedia, as shown in the diagram) using the user's query. It finds relevant content through semantic similarity matching (for instance, "Biden is the current 46th President of the United States…"), then passes both the user's question and the retrieved content to the large model. With this context, the model can give a more reliable answer than it could from its training data alone.

<figure><img src="/files/wuVht3v4pisAiLBymiAM" alt=""><figcaption></figcaption></figure>
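The flow in the diagram can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not how any particular RAG system is implemented: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the three-entry `knowledge_base` is made up for the example, and `retrieve` and `build_prompt` are hypothetical helper names. The final step of sending the prompt to a large model is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, knowledge_base: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k passages most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        knowledge_base,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the user's question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical knowledge base, invented for this example.
knowledge_base = [
    "Biden is the current 46th President of the United States.",
    "Paris is the capital of France.",
    "The Pacific is the largest ocean on Earth.",
]

question = "Who is the President of the United States?"
passages = retrieve(question, knowledge_base, top_k=1)
prompt = build_prompt(question, passages)
print(prompt)  # this prompt, not the bare question, would be sent to the model
```

A real system would replace `embed` with a learned embedding model and the sorted list with a vector index, but the shape of the pipeline (embed the query, rank by similarity, prepend the retrieved text to the question) stays the same.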

### Why is this necessary? <a href="#why-is-this-necessary" id="why-is-this-necessary"></a>

We can liken a large model to a super-expert knowledgeable in many human domains. However, this expert has limitations. For example, it doesn't know your personal situation: that information is private, is not publicly available on the internet, and so was never part of what the expert learned.

If you want to hire this super-expert as your family financial advisor, you need to let them review your investment records, household expenses, and other relevant data before they respond to your questions. Only then can they provide professional advice tailored to your circumstances.

**This is what a RAG system does: it helps the large model temporarily acquire external knowledge it doesn't possess, allowing it to look up relevant information before responding to a question.**

As this example shows, the most critical part of a RAG system is retrieving the right external knowledge. The expert's ability to provide professional financial advice depends on accurately finding the necessary information. If the expert instead retrieves information unrelated to financial investments, such as a family weight loss plan, even the most capable expert will be ineffective.
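One simple way to keep unrelated material away from the model is to drop any document whose similarity to the question falls below a cutoff. The sketch below assumes a toy bag-of-words `embed` function in place of a real embedding model, three invented family documents, a hypothetical `retrieve_relevant` helper, and an arbitrary `min_score` of 0.2 chosen only for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_relevant(query: str, documents: list[str], min_score: float = 0.2) -> list[str]:
    """Return documents scoring at least min_score, most similar first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    kept = [(score, doc) for score, doc in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in kept]


# Hypothetical private documents, invented for this example.
personal_documents = [
    "Investment records: 60 percent stocks, 40 percent bonds in the family portfolio.",
    "Household expenses: rent, groceries, and utilities for each month.",
    "Family weight loss plan: run three times a week and eat fewer sweets.",
]

question = "How should we rebalance the stocks and bonds in our investment portfolio?"
relevant = retrieve_relevant(question, personal_documents)
print(relevant)  # only the investment records pass the cutoff
```

With these toy inputs, the weight loss plan and the expense list score well below the cutoff and never reach the model. A question that matches nothing returns an empty list, which a system could treat as a signal that the knowledge base has no answer.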

