DIN: Data Intelligence Network

Data Flow of AI

Last updated 1 year ago

Data Flow is a machine learning pattern that describes the sequence in which data moves through the AI engineering life cycle.

First, data is processed layer by layer, as shown in Fig. 1, to prepare it for storage, training, and other downstream uses.

Then, data passes through further processing layers as it is stored, refined, and prepared for use in machine learning models and applications. From a more functional perspective, the data is then used by different machine learning function groups, as shown below:

Each layer in the above chart is detailed as follows:

Sources

Data sources include:

  • Company Internal Databases

  • Company Internal Files

  • Websites

  • Public Data

  • Smartphone Apps

  • IoT Devices

  • Commercial Data Aggregators

  • Point of Sale

  • Corporate Internal Processes

  • Social Media

  • Data Streams

Capture

Capture mechanisms include:

  • Website Scraping

  • Website and Smartphone Chat Dialogues

  • Website and Smartphone Form Submissions

  • IoT Device Interfaces

  • Commercial Data Aggregator Feeds

  • Corporate Internal Process Feeds

Pipeline

Pipeline processes include:

  • Data Ingestion

  • Data Temporary Storage

  • Data Subscription

  • Data Publication
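The four pipeline processes above can be illustrated with a minimal in-memory sketch. It is not DIN's implementation: the `MiniPipeline` class, its method names, the `"iot"` topic, and the sample records are all hypothetical, chosen only to show how ingestion, temporary storage, subscription, and publication relate to one another.

```python
from collections import defaultdict, deque
from typing import Any, Callable, Deque, Dict, List

Record = Dict[str, Any]


class MiniPipeline:
    """Hypothetical toy pipeline: ingest -> temporary storage -> publish."""

    def __init__(self) -> None:
        # Data Temporary Storage: one FIFO buffer per topic.
        self._buffers: Dict[str, Deque[Record]] = defaultdict(deque)
        # Data Subscription: callbacks registered per topic.
        self._subscribers: Dict[str, List[Callable[[Record], None]]] = defaultdict(list)

    def ingest(self, topic: str, record: Record) -> None:
        """Data Ingestion: accept a record and buffer it under its topic."""
        self._buffers[topic].append(record)

    def subscribe(self, topic: str, callback: Callable[[Record], None]) -> None:
        """Register a consumer that wants every record published on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str) -> int:
        """Data Publication: drain the buffer and deliver each record to all
        subscribers of the topic. Returns the number of records drained."""
        drained = 0
        buffer = self._buffers[topic]
        while buffer:
            record = buffer.popleft()
            for callback in self._subscribers[topic]:
                callback(record)
            drained += 1
        return drained


pipeline = MiniPipeline()
received: List[Record] = []
pipeline.subscribe("iot", received.append)
pipeline.ingest("iot", {"device": "sensor-1", "temp_c": 21.5})
pipeline.ingest("iot", {"device": "sensor-2", "temp_c": 19.0})
count = pipeline.publish("iot")
```

In a production setting, a message broker or streaming platform would typically take the place of the in-memory buffers. The sketch only shows the order of operations.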

Databases

Databases include:

  • Data Lakes

  • SQL (Relational) Databases

  • Document Databases

  • Graph Databases

ETLs

ETLs include:

  • Extract Functions: pulling data from selected sources

  • Transform Functions: normalization, regularization, aggregation

  • Load Functions: saving data in formats for use in modeling processes
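As a rough illustration of the three ETL function types, the sketch below extracts rows from an inline CSV string, applies min-max normalization and a per-store aggregation, and then serializes the result as JSON. The sample data, the column names (`store`, `amount`), and the choice of JSON as the load format are assumptions made for the example, not details from DIN.

```python
import csv
import io
import json
from statistics import mean
from typing import Any, Dict, List

# Hypothetical source data standing in for a real database or file.
RAW_CSV = """store,amount
north,10.0
north,30.0
south,20.0
"""


def extract(source: str) -> List[Dict[str, str]]:
    """Extract: pull rows from the selected source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(source)))


def transform(rows: List[Dict[str, str]]) -> Dict[str, Any]:
    """Transform: min-max normalize amounts and aggregate the mean per store."""
    amounts = [float(r["amount"]) for r in rows]
    lo, hi = min(amounts), max(amounts)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    normalized = [
        {"store": r["store"], "amount": a, "amount_norm": (a - lo) / span}
        for r, a in zip(rows, amounts)
    ]
    by_store: Dict[str, List[float]] = {}
    for row in normalized:
        by_store.setdefault(row["store"], []).append(row["amount"])
    return {
        "rows": normalized,
        "mean_amount_by_store": {s: mean(v) for s, v in by_store.items()},
    }


def load(dataset: Dict[str, Any]) -> str:
    """Load: save the data in a format a modeling process can read (JSON here)."""
    return json.dumps(dataset, sort_keys=True)


dataset = transform(extract(RAW_CSV))
serialized = load(dataset)
```

Keeping each step as a separate function lets the source or the output format be swapped without touching the transformation logic.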

Models

Model-type category examples include:

  • Artificial Neural Networks

  • Decision Trees

  • Probabilistic Graphical Models

  • Cluster Analysis

  • Gaussian Processes

  • Regression Analysis

Applications

Application examples include:

  • Medical Diagnosis

  • Autonomous Vehicles

  • Chatbot Dialog

  • Image Recognition

  • Face Recognition

  • Product Recommendations

  • Churn Prediction

  • Malware Detection

  • Search Refinement

Figure caption: Data Flow and Functional Groups in AI