DIN: AI Agent Blockchain
Data Collection


Last updated 10 months ago


In AI, gathering data is a major hurdle slowing down progress. Much of the work in machine learning projects goes into getting the data ready: collecting it, cleaning it, analyzing it, visualizing it, and preparing features. Of all these steps, data collection is the toughest, for a few reasons.

First, when machine learning is applied to new areas, there often isn't enough data to train models. Mature fields like language translation or object recognition have large datasets collected over the years, but new areas don't have this advantage.

Also, with deep learning becoming more popular, the need for data has gone up. In traditional machine learning, much effort goes into feature engineering, which requires domain knowledge to select and create features for training. Deep learning eases this by learning features on its own, which means less work in preparing data. But this ease comes with a trade-off: deep learning usually needs more data to work well. So finding effective and scalable ways to collect data is now more critical than ever, especially for large language models (LLMs).

Fig.1 shows a high-level landscape of data collection for machine learning. The sub-topics that the community can contribute to in a decentralized way are highlighted in green text.

The network rewards data collection nodes based on data quality; the quality assessment standard is determined automatically by the network, with the help of validator nodes.

Validator nodes are permissionless, which ensures that the more participants join in building the network, the more robust the entire network becomes.
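The reward flow described above can be sketched in a few lines. This is a minimal illustration, not DIN's actual on-chain logic: the node names, the 0-1 score scale, the median aggregation rule, and the proportional payout are all assumptions made for the example.

```python
from statistics import median

def assess_quality(scores):
    """Aggregate scores from permissionless validators with the median,
    which resists a single outlier or malicious vote.
    (Illustrative rule; the real standard is set by the network.)"""
    return median(scores)

def distribute_rewards(submissions, pool):
    """Split a reward pool among data collection nodes in proportion
    to their aggregated quality scores."""
    quality = {node: assess_quality(scores) for node, scores in submissions.items()}
    total = sum(quality.values())
    if total == 0:
        return {node: 0.0 for node in quality}
    return {node: pool * q / total for node, q in quality.items()}

# Example: two hypothetical collectors, each scored by three validators
submissions = {
    "nodeA": [0.9, 0.8, 0.85],
    "nodeB": [0.4, 0.5, 0.45],
}
rewards = distribute_rewards(submissions, pool=100.0)
```

Because payouts are proportional to the validator-assessed quality, a node submitting higher-quality data earns a larger share of the same pool, which is the incentive the paragraph above describes.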

Anyone can help the DIN network collect on-chain and off-chain data through the two dApps in the ecosystem, Analytix and xData.
Fig.1: Landscape of data collection