Rerank

Why is Rerank Necessary?

Hybrid Search combines the strengths of several search technologies to achieve better recall. However, the results from different search modes must be merged and normalized (converted to a uniform range or distribution so they can be compared, analyzed, and processed together) before they are passed to the large model as a single list. This requires a scoring system: the Rerank Model.
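
To illustrate why merging requires normalization, here is a minimal sketch. It assumes two search modes that each return a `{doc_id: score}` mapping on a different scale; the function names, document ids, and score values are hypothetical and not part of DIN's implementation. Min-max scaling is used only as one simple way to bring scores into a common range.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are identical, so treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def merge_candidates(keyword_scores, vector_scores):
    """Union both result sets, keeping each document's normalized score per mode."""
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    return {
        doc_id: {"keyword": kw.get(doc_id, 0.0), "vector": vec.get(doc_id, 0.0)}
        for doc_id in set(kw) | set(vec)
    }


# Hypothetical raw scores: keyword scores are unbounded, vector scores are similarities.
keyword_scores = {"doc_a": 12.4, "doc_b": 7.1, "doc_c": 3.0}
vector_scores = {"doc_b": 0.91, "doc_c": 0.88, "doc_d": 0.52}

merged = merge_candidates(keyword_scores, vector_scores)
```

The merged candidate set is the kind of input a rerank model would then score against the user's question.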

The Rerank Model reorders the list of candidate documents according to how well each one semantically matches the user's question, which improves the quality of semantic ranking. It calculates a relevance score between the user's question and each candidate document, then returns the documents sorted by relevance from high to low. Common rerank models include Cohere Rerank and bge-reranker.
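
The score-then-sort behavior described above can be sketched as follows. The scorer here, `toy_relevance`, is a hypothetical stand-in based on term overlap, used only so the example runs without a model; in a real system it would be replaced by a call to an actual rerank model.

```python
import re


def toy_relevance(query, document):
    """Hypothetical stand-in for a rerank model: share of query terms found in the document."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    doc_terms = set(re.findall(r"\w+", document.lower()))
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rerank(query, candidates, score_fn=toy_relevance):
    """Score every candidate against the query and sort from most to least relevant."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    "Staking rewards are paid out every epoch.",
    "A rerank model scores each document against the query.",
    "The rerank step sorts documents by relevance score.",
]
ranked = rerank("how does a rerank model score each document", candidates)
```

Passing the scorer as `score_fn` keeps the sorting logic independent of whichever model produces the relevance scores.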

An initial search usually runs before rerank, because calculating a relevance score between a query and millions of documents is inefficient. Rerank is therefore typically placed at the end of the search process, which also makes it well suited to merging and sorting results from different search systems.

However, rerank is not limited to merging results from different search systems. Even with a single search mode, adding a rerank step, such as a semantic rerank after a keyword search, can effectively improve the quality of the recalled documents.
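
As a rough sketch of this two-stage pattern, the example below runs a cheap keyword search first and applies a costlier scorer only to the surviving candidates. Everything in it is an assumption made for illustration: the corpus, the `SYNONYMS` table, and `toy_semantic_score`, which merely imitates semantic matching and is not a real rerank model.

```python
import re

# Hypothetical synonym table that lets the toy scorer imitate semantic matching.
SYNONYMS = {"reorder": {"rerank", "sort"}}


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def keyword_search(query, corpus, limit):
    """Cheap first stage: rank documents by how many distinct query terms they contain."""
    query_terms = set(tokenize(query))
    hits = {}
    for pos, doc in enumerate(corpus):
        overlap = len(query_terms & set(tokenize(doc)))
        if overlap:
            hits[pos] = overlap
    # Highest overlap first; ties keep corpus order.
    return sorted(hits, key=lambda pos: (-hits[pos], pos))[:limit]


def toy_semantic_score(query, document):
    """Stand-in for a rerank model: a query term also matches through its synonyms."""
    doc_terms = set(tokenize(document))
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    matched = sum(
        1 for term in query_terms
        if term in doc_terms or SYNONYMS.get(term, set()) & doc_terms
    )
    return matched / len(query_terms)


def search_then_rerank(query, corpus, first_stage_limit, top_k, score_fn=toy_semantic_score):
    """Score only the first-stage candidates, then return the top_k by relevance."""
    positions = keyword_search(query, corpus, first_stage_limit)
    scored = [(corpus[pos], score_fn(query, corpus[pos])) for pos in positions]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


corpus = [
    "cache search results for faster search",
    "rerank models sort search results by relevance",
    "node operators earn rewards for uptime",
    "reorder your node list in the dashboard",
]
results = search_then_rerank("reorder search results", corpus, first_stage_limit=3, top_k=2)
```

In this toy data the keyword stage ranks the caching document first, while the rerank stage promotes the document about rerank models because the stand-in scorer also matches "reorder" through its synonyms.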

In practice, besides normalizing results from multiple queries, we usually limit the number of text chunks passed to the large model (the TopK value, which can be set in the rerank model parameters). The limit exists because the large model's input window has a fixed size (commonly 4K, 8K, 16K, or 128K tokens), so you need to choose a segmentation strategy and a TopK value that fit the input window of the chosen model.
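
One way to reason about TopK against the input window is sketched below. It assumes the chunks are already sorted by rerank score, estimates tokens by splitting on whitespace (a real tokenizer would give different counts), and reserves part of the window for the prompt and the answer. The function names and the `reserved_tokens` parameter are hypothetical.

```python
def count_tokens(text):
    """Rough token estimate by whitespace split; real tokenizers count differently."""
    return len(text.split())


def select_chunks(ranked_chunks, top_k, context_window, reserved_tokens):
    """Keep at most top_k chunks, in rank order, that fit the remaining token budget."""
    budget = context_window - reserved_tokens
    selected, used = [], 0
    for chunk in ranked_chunks[:top_k]:
        cost = count_tokens(chunk)
        if used + cost > budget:
            # Stop at the first chunk that does not fit so rank order is preserved.
            break
        selected.append(chunk)
        used += cost
    return selected, used


# Four hypothetical chunks of 1,500 whitespace tokens each, already sorted by relevance.
ranked_chunks = ["alpha " * 1500, "beta " * 1500, "gamma " * 1500, "delta " * 1500]

small_window, small_used = select_chunks(
    ranked_chunks, top_k=3, context_window=4096, reserved_tokens=1024
)
large_window, large_used = select_chunks(
    ranked_chunks, top_k=3, context_window=8192, reserved_tokens=1024
)
```

With the same TopK of 3, the 4K window holds only two of these chunks while the 8K window holds all three, which is why the segmentation strategy and TopK need to be chosen together with the model.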

Note that even when the model's context window is large enough, too many recalled chunks can introduce less relevant content and degrade the quality of the answer. A larger TopK value is therefore not necessarily better.

Rerank is not a substitute for search technology but an auxiliary tool that enhances existing search systems. Its greatest advantage is that it offers a simple, low-complexity way to improve search results, allowing users to integrate semantic relevance into existing search systems without significant infrastructure changes.

Last updated 4 months ago
