Existing gaps and opportunities in the market
Across the data sector and its derivatives (indexing, storage, and analysis), it is widely held that significant potential remains untapped in the cryptocurrency domain. This potential is especially pronounced given the proliferation of Layer 2 solutions and modular blockchain architectures, which promise to enrich the ecosystem of applications built upon them. Moreover, advances in AI technology are set to dramatically increase AI Agents' demand for data, especially structured data. The use of data in AI scenarios is expected to grow increasingly complex, for two main reasons:
The Growing Convergence with Traditional Financial Markets: The approval of BTC ETFs signals that on-chain data alone is no longer sufficient for comprehensive analysis. Traditional finance, including stock and bond markets, and even national-level policy are increasingly interlinked with the cryptocurrency sector. Information and data are multidimensional, so diverse sources must be aligned on their timestamps and analyzed together to uncover the signals hidden within them.
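As an illustration of the timestamp alignment described above, here is a minimal sketch of an "as-of" join in plain Python. The data, the event labels, and the function name asof_join are all hypothetical and invented for this example; a real pipeline would likely use a dataframe or database engine instead.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def ts(text):
    """Parse an ISO timestamp and mark it as UTC so all sources share one clock."""
    return datetime.fromisoformat(text).replace(tzinfo=timezone.utc)


# Hypothetical on-chain series: (timestamp, daily transfer volume). Values are invented.
onchain = [
    (ts("2024-01-09T00:00:00"), 100.0),
    (ts("2024-01-10T00:00:00"), 120.0),
    (ts("2024-01-11T00:00:00"), 180.0),
]

# Hypothetical off-chain events, sorted by time: (timestamp, label). Labels are invented.
offchain = [
    (ts("2024-01-10T21:00:00"), "macro_event_A"),
]


def asof_join(left, right):
    """Attach to each left row the latest right row at or before its timestamp.

    Both inputs must be sorted by timestamp. Rows with no earlier event get None.
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, value in left:
        i = bisect_right(right_times, t)  # number of events with time <= t
        latest = right[i - 1][1] if i > 0 else None
        joined.append((t, value, latest))
    return joined


aligned = asof_join(onchain, offchain)
for t, volume, event in aligned:
    print(t.isoformat(), volume, event)
```

Only the last on-chain row falls after the off-chain event, so it is the only one that carries the event label. Joining backward in time, rather than to the nearest event in either direction, avoids leaking future information into earlier observations.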
Enhanced Capabilities of AI Agents: AI Agents can now analyze highly complex data, with the parameter counts of large models reaching into the trillions. Data analysis tasks too complex for humans are now within AI's reach.
This raises two pivotal questions: Is there a repository of data large enough to match the analytical capabilities of AI? And if the data exists, are there AI Agents capable of assembling various data sources on demand, so that developers can write prompts for different scenarios? These questions open a broader discussion of the existing gaps and opportunities in data and AI, across both the cryptocurrency and traditional industries:
Data Silos and Monopolization: While numerous data service companies, such as Dune Analytics, Nansen, Chainbase, and Alchemy, have indexed data from nearly every blockchain, this data remains centralized. Data that could be indexed once for universal use is instead repeatedly indexed, with no interoperability between different entities, leading to inefficient utilization within the industry. Moreover, off-chain data, especially from platform X (formerly known as Twitter), is gradually becoming monopolized, contradicting the principles of decentralization and trustlessness. Consequently, many data projects are labeled as Web2.5, not fully embracing the crypto-native ethos.
Lack of Unified Data Collection Methods and Standards: Although numerous open-source frameworks exist for on-chain data ETL, there is no consensus on data definitions, so the same analysis yields different results on different platforms. For instance, the number of "active users" on a blockchain can vary significantly with the analyst's chosen criteria, which shows how subjective data interpretation can be. The absence of tools for systematically collecting and organizing data from platform X exacerbates the problem.
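To make the "active users" ambiguity concrete, the sketch below counts active users over the same transfers under three different definitions. The addresses, values, and function names are hypothetical, and none of these definitions is attributed to any particular platform.

```python
# Hypothetical transfers for one day: (sender, receiver, value). All values are invented.
transfers = [
    ("alice", "bob", 5.0),
    ("alice", "carol", 0.0),
    ("bob", "alice", 1.0),
    ("dave", "alice", 0.0),
]


def active_senders(rows):
    """Definition A: any address that sent at least one transaction."""
    return {sender for sender, _, _ in rows}


def active_participants(rows):
    """Definition B: any address that sent or received a transaction."""
    users = set()
    for sender, receiver, _ in rows:
        users.update((sender, receiver))
    return users


def active_value_senders(rows):
    """Definition C: any address that sent a transaction with non-zero value."""
    return {sender for sender, _, value in rows if value > 0}


counts = {
    "senders": len(active_senders(transfers)),
    "participants": len(active_participants(transfers)),
    "value_senders": len(active_value_senders(transfers)),
}
print(counts)
```

On this toy dataset the three definitions give 3, 4, and 2 active users respectively, so two dashboards reading the same chain can report different figures without either being wrong.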
Inequitable Valuation of Data: Platforms like Dune and Footprint host community-created data analysis dashboards, centralizing the data and insights generated. However, the analysts and developers who provide these insights often do not receive tangible economic benefits, perhaps only gaining in reputation. Furthermore, platform X's high API usage costs and monopolistic control over data for developing proprietary models like Grok hinder data openness. If platform X were to restrict API access, it would effectively monopolize the data, imposing high costs on all users.
Neglect of Data Needs by Most Agent Projects: Many projects claim to be building Agent platforms, yet Agents are built on vast amounts of data, and without it these efforts are futile. Data sets the upper limit for AI capabilities, while AI determines how efficiently data is used. At this stage, building data moats and data tooling is crucial: algorithms, computing power, and data remain the three pillars of the AI industry.
AI Agents as Next-Generation Assets: A comprehensive infrastructure, including data layers, AgentOps, service layers, and application layers, is essential for AI Agents. Each layer harbors immense opportunities, underscoring the need for a holistic approach to developing the next generation of assets in the AI and cryptocurrency ecosystems.