RAG (Retrieval Augmented Generation)
最后更新于
最后更新于
With vector retrieval at its core, the RAG architecture has become the leading technological framework for addressing two major challenges of large models: acquiring the latest external knowledge and mitigating issues of generating hallucinations. This architecture has been widely implemented in numerous practical application scenarios.
Developers can utilize this technology to cost-effectively build AI-powered customer service bots, corporate knowledge bases, AI search engines, etc. These systems interact with various forms of organized knowledge through natural language input. A representative example of a RAG application is as follows:
In the diagram below, when a user asks, "Who is the President of the United States?", the system doesn't directly relay the question to the large model for an answer. Instead, it first conducts a vector search in a knowledge base (like Wikipedia, as shown in the diagram) for the user's query. It finds relevant content through semantic similarity matching (for instance, "Biden is the current 46th President of the United States…"), then provides the user's question and the found knowledge to the large model. This gives the model sufficient and complete knowledge to answer the question, yielding a more reliable response.
We can liken a large model to a super-expert knowledgeable in various human domains. However, this expert has its limitations; for example, it doesn't know your situation, as such information is private and not publicly available on the internet, and therefore, it hasn't had the opportunity to learn it beforehand.
When you want to hire this super-expert as your family financial advisor, you need to allow them to review your investment records, household expenses, and other relevant data before they can respond to your inquiries. This enables them to provide professional advice tailored to your circumstances.
This is what the RAG system does: it helps the large model temporarily acquire external knowledge it doesn't possess, allowing it to search for answers before responding to a question.
Based on this example, the most critical aspect of the RAG system is retrieving external knowledge. The expert's ability to provide professional financial advice depends on accurately finding the necessary information. If the expert retrieves information unrelated to financial investments, like a family weight loss plan, even the most capable expert would be ineffective.