The Endgame for the Brave: A Crypto Native Foundation Model

PondGNN

Pond is building a decentralized Graph Neural Network (GNN) model for Web 3.0. This model is designed to learn on-chain behaviors and predict future behaviors. Developed by leading data scientists and machine learning researchers, our model is the first of its kind and will support crypto-native use cases such as on-chain trading, leveraging on-chain liquidity, social networking, security, and more.

The on-chain space is a vast graph network. Our vision is to be the first project to integrate an on-chain behavioral model across a wide range of applications to enhance the Web 3.0 user experience, democratize access to machine learning on-chain, and take on-chain liquidity to a whole new level.

Below, we will provide an overview of the challenges of on-chain AI, the need for GNN-specific models, and the extensive use cases possible with Pond.

AI model development has always been a game for the brave. Join our early ecosystem program to ensure that you can start developing with and using the model from the outset, and receive exclusive rewards:
https://cryptopond.xyz/question

What is an AI Model?

An AI model is an algorithm that learns from a given dataset and makes decisions without being explicitly programmed for each one. Such models can analyze vast amounts of data, and their performance improves as training progresses and the amount of input data grows. Through iterative training, human input, careful supervision, and exposure to diverse datasets, these models gradually become more autonomous and capable of making decisions independently based on the patterns and knowledge they have acquired.

Category of AI and How Model Works

What significant improvements can AI models offer Web 3.0?

In Web 3.0, on-chain data is essential for refining AI models. This data primarily consists of complex interactions between user accounts and smart contracts, which naturally lends itself to a graph-based representation. Wallets, smart contracts, DIDs, and other entities form nodes in a graph, connected by various types of edges, whether social, financial, or otherwise. This inherent graph structure reflects the intricate dependencies and interactions within the blockchain, offering a unique opportunity for analysis and prediction.
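The graph view described above can be made concrete with a small sketch. It assumes a hypothetical `Edge` record and made-up addresses (`wallet_a`, `dex_contract`, and so on); none of these names come from Pond's actual data model. It only shows how typed interactions between wallets and contracts might be collected into nodes and an adjacency list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One on-chain interaction (hypothetical schema, for illustration only)."""
    src: str      # sending address
    dst: str      # receiving address
    kind: str     # edge type, e.g. "transfer" or "call"
    value: float  # amount moved, in arbitrary units


def build_graph(edges):
    """Return (nodes, adjacency) where adjacency maps a sender to its outgoing edges."""
    adjacency = defaultdict(list)
    nodes = set()
    for e in edges:
        adjacency[e.src].append(e)
        nodes.add(e.src)
        nodes.add(e.dst)
    return nodes, dict(adjacency)


# Made-up records; real data would come from an indexer or node.
edges = [
    Edge("wallet_a", "dex_contract", "call", 10.0),
    Edge("wallet_b", "dex_contract", "call", 4.0),
    Edge("wallet_a", "wallet_b", "transfer", 1.5),
]
nodes, adjacency = build_graph(edges)
print(sorted(nodes))
print([e.kind for e in adjacency["wallet_a"]])
```

Keeping the edge type on each record is what lets social, financial, and other relationships coexist in one graph.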

The creation of AI models using this data presents a tremendous opportunity. As the information density and richness of these models increase, we will see a proliferation of new AI model-based Web 3.0 applications and integrations.

Pond is developing an industrial-scale GNN model and enabling developers to collaborate on it by merging their models.

On-chain Behaviors and Their Predictions

Graph Neural Networks (GNNs) are a type of neural network architecture specifically designed to work with graph-related data. Just as GPT models are trained on massive language data, GNNs are trained on massive graphs. However, whereas GPT aims to predict the next word, GNNs are aimed at predicting complex relationships. This supports a wide variety of use cases, which we will elaborate on below.
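To give a feel for how a GNN uses graph structure, here is a minimal sketch of one round of mean neighborhood aggregation, the aggregation step at the heart of message passing. It is deliberately simplified: a real GNN layer would also apply learned weights and a nonlinearity, and this is not Pond's architecture. The node names and feature vectors are made up for illustration.

```python
def message_passing_step(features, neighbors):
    """One round of mean aggregation.

    Each node's new vector is the average of its own vector and the
    mean of its neighbors' vectors. No learned parameters are involved.
    """
    updated = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            # Isolated nodes keep their features unchanged.
            updated[node] = list(own)
            continue
        dim = len(own)
        mean = [sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(dim)]
        updated[node] = [(own[i] + mean[i]) / 2 for i in range(dim)]
    return updated


# Hypothetical two-dimensional features per node.
features = {
    "wallet_a": [1.0, 0.0],
    "wallet_b": [0.0, 1.0],
    "dex_contract": [0.0, 0.0],
}
neighbors = {
    "wallet_a": ["dex_contract", "wallet_b"],
    "wallet_b": ["dex_contract", "wallet_a"],
    "dex_contract": ["wallet_a", "wallet_b"],
}
updated = message_passing_step(features, neighbors)
print(updated["wallet_a"])      # [0.5, 0.25]
print(updated["dex_contract"])  # [0.25, 0.25]
```

After one round, each node's vector already reflects its direct neighbors; stacking several rounds lets information travel further across the graph.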

Price Predictions: The most direct use-case for on-chain behavioral models is assessing market sentiment and generating future price predictions.

  • AI automated on-chain trading.
  • Analysis of market sentiment from on-chain movements.
  • Crypto asset price predictions.
  • AI models integrated by leading crypto trading markets to increase efficiency and level the playing field for access to market intelligence.

Artificial Intelligence Finance (AiFi): AI model integration to enliven existing DeFi applications and create new primitives.

  • AI models trained for price prediction to leverage DEX liquidity, helping people add liquidity to DEX LPs to increase yields.
  • AI models trained to develop advanced yield strategies.
  • AI models trained to respond to anomalous on-chain behavior.
  • AI models predicting on-chain events, e.g., airdrops, NFT mints, and token launches.

Security: AI model analysis of on-chain behavior and anomalies for active, rather than reactive, on-chain security.

  • AI-Agents using models to monitor abnormal on-chain behavior.
  • Consumer applications leveraging the best on-chain behavioral models to include built-in security as part of the user experience.
  • Democratize, simplify, and enhance on-chain auditing for individual users to review abnormal transactions and events.
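As a point of reference for what "monitoring abnormal on-chain behavior" can mean in its simplest form, the sketch below flags transfer amounts that sit far from the mean using a z-score. This is a basic statistical baseline, not a GNN and not Pond's method; the function name, the threshold of 2.0, and the sample amounts are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def flag_anomalies(amounts, threshold=2.0):
    """Return indices of amounts whose z-score exceeds the threshold."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        # All values identical: nothing stands out.
        return []
    return [i for i, a in enumerate(amounts) if abs(a - mu) / sigma > threshold]


# Made-up transfer amounts; the last one is far larger than the rest.
amounts = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 50.0]
flagged = flag_anomalies(amounts)
print(flagged)  # [7]
```

A behavioral model would go further by considering who is transacting with whom, which a per-value statistic like this cannot capture.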

On-Chain Marketing: Predicting on-chain behavior is valuable for discovering on-chain marketing and advertising opportunities.

  • Model analysis of on-chain behavior as the foundation of the Web 3.0 marketing stack.
  • Models that market new protocols to users based on their on-chain behavior, e.g., NFTs, trades, presale participation, and yield positions.
  • Models that combine on-chain user behavior with social media and external data to predict future marketing opportunities.
  • Recommendation engines using on-chain data.

Social-Fi and Gaming: Predicting on-chain behavior in social-fi and gaming applications is useful for game developers as well as social applications.

  • Model analysis of on-chain user gaming experiences and purchases of in-game items.
  • Model analysis of user social behavior such as tipping and minting within Social-Fi applications, e.g., Farcaster.

Model development process

AI models consist of a collection of mathematical formulas. Their development requires meticulous design to ensure that inputs yield desired outputs. Each type of AI model has its nuances and trade-offs in terms of complexity, interpretability, scalability, and suitability for different tasks. For example, in some cases, labeled data is not needed, while in others, it is essential. Some models learn optimal decision-making strategies through trial and error, while others focus on learning patterns and relationships.

The goal in model development is to balance functionality with accuracy. Two key elements are essential for model development:

  • a) Data (model inputs)
  • b) Model Architecture

To illustrate the relationship between model design and the development process, consider the analogy of directing a torchlight towards a target in a cave using mirrors. Two important objectives emerge: positioning mirrors at the correct angles to reflect the light and understanding the target's location. Just as achieving the right reflection involves both understanding the target's location and the precise adjustment of mirrors, developing an AI model requires accurate data interpretation and careful design of the model's structure. This is achieved through feedback loops of continuous training (choosing the right torchlight and its angle), testing, and modification (changing the torchlight and its angle).

Analogy of AI Model Development in Web 2.0

Building AI models with on-chain data

Model development is taking precedence over traditional data analytics. The latest @ycombinator batch exemplifies this: many models have been built or fine-tuned rapidly and inexpensively using existing frameworks. This raises two questions: What AI model should crypto use? What is a crypto-native model for blockchain?

Building AI Models is faster and cheaper than you probably think | Y Combinator

First and foremost, native models are built using the inherent data from a specific industry – a robotics model is built with robotics data, and a voice model is built with voice data. Therefore, the foundation model for blockchains must be built from on-chain data. While the pace of model development in crypto has been hampered by the lack of mature frameworks, the key to accelerating progress lies not just in developing such a framework but in reimagining the very essence of the model itself.

The open nature of permissionless networks leads to several challenges when developing AI models to predict on-chain behavior (movements and activities between wallets, wallets to contracts, and contracts to contracts). For example, detecting abnormal behavior, developing advanced yield strategies, and creating on-chain recommendations are all particularly challenging due to the complex interactions between user accounts and smart contracts. Below is a list further describing the challenges researchers face when building a crypto-native model on blockchain data.

Analogy of AI Model Development in Web 3.0

  1. Complex and Noisy Data with High Dimensionality: On-chain data is often unstructured and complex, containing levels of noise that make it difficult to process and analyze. This data can include various types of transactions, smart contracts, market anomalies, and other on-chain activities that need to be organized and structured in a way suitable for modeling.
  2. Lack of Standardized Datasets: The AI industry generally lacks standardized evaluation frameworks and benchmarks. In Web 3.0, this is exacerbated by the fact that most protocols do not publicly share their underlying datasets or code, so each protocol must develop its own evaluation frameworks from scratch.
  3. Evolving Technology: Due to the fast-changing nature of permissionless networks, it is difficult to create models that remain relevant and accurate over time.
  4. On-Chain vs. Off-Chain Data: The strongest models will require considering factors beyond the blockchain, such as social media activity and macroeconomic market influences. In practice, this adds significant complexity and can be challenging to implement.
  5. Scalability: The volume of data generated by blockchain networks can be enormous, especially from popular chains like Solana. This makes it challenging to create and maintain a model that can efficiently handle such large datasets—not including cross-chain or multi-chain datasets.
  6. Limited Resources and Tooling: There are limited tools and resources available for developing AI models for crypto. The lack of resources can make the process time-consuming and expensive. Often, developers are required to create custom solutions or adapt existing tools and data dimensions to fit their needs.

Although developing prediction models is challenging, the good news is that protocols such as Space and Time have emerged to help index on-chain data and smart contract events. Protocols like Space and Time reduce the resources researchers need to train models.

If your model is not first, it's last

The arms race for the top-performing on-chain behavior prediction model will result in a competitive moat for the winning team. At Pond, we recognize the advantage of developing the first (and best) on-chain behavior prediction model.

Here’s why:

  1. Training Data Edge: AI models require substantial amounts of data to learn and enhance their performance, and their development and engineering processes are highly specialized. Extensive training and experimentation on tailored data is a competitive edge that’s hard for competitors to match.

At Pond, we use static data, such as Earliest Activity Time of Address, Token Initial Mint Amount, and Historical DEX Trading Count, and dynamic data, such as Transaction Volume per Address and Trading Volume and Count of Each Pair. Data selection and processing both require time for experimentation and analysis.
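Two of the features named above, Earliest Activity Time of Address and Transaction Volume per Address, can be sketched from raw transaction records. The tuple layout `(timestamp, sender, receiver, amount)`, the function name, and the sample values are assumptions made for this example and are not Pond's actual pipeline.

```python
from collections import defaultdict


def address_features(transactions):
    """Compute per-address features from (timestamp, sender, receiver, amount) tuples."""
    earliest = {}
    volume = defaultdict(float)
    for ts, sender, receiver, amount in transactions:
        # A set avoids double counting if an address sends to itself.
        for addr in {sender, receiver}:
            earliest[addr] = min(ts, earliest.get(addr, ts))
            volume[addr] += amount
    return {
        addr: {"earliest_activity": earliest[addr], "tx_volume": volume[addr]}
        for addr in earliest
    }


# Made-up records with Unix timestamps, deliberately out of order.
txs = [
    (1700000300, "wallet_a", "wallet_b", 2.0),
    (1700000100, "wallet_b", "wallet_c", 0.5),
    (1700000200, "wallet_a", "wallet_c", 1.0),
]
feats = address_features(txs)
print(feats["wallet_a"])  # {'earliest_activity': 1700000200, 'tx_volume': 3.0}
```

Here the earliest-activity value is static once observed, while the volume keeps changing as new transactions arrive, which mirrors the static and dynamic split described above.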

  2. Network Effects: As more users and applications adopt a particular AI model, additional data is generated, which improves the model's performance, attracts more users, and creates a positive feedback loop. Additionally, the model provider gains a deeper understanding of behaviors generated from using the model.

Using a behavioral model itself generates new on-chain behaviors, which may appear chaotic to outside observers. The model provider, however, can recognize these behaviors as consequences of model usage and use them to further improve the model.

  3. First-Mover Advantage: In some cases, the first AI model to achieve significant success in a particular domain will establish a dominant position, making it difficult for subsequent models to gain traction. This can be due to brand recognition, composability, user familiarity, and the friction caused by having to switch to a new model.
  4. Talent: The AI industry is highly competitive, and top AI researchers and engineers are in high demand. Once the first-mover is established, they will have the advantage of attracting top talent.

Join our early ecosystem program to ensure that you can start developing with and using the model from the outset, and receive exclusive rewards: https://forms.gle/9MQuwaBGqy84UnkV8

For more information, please follow us @PondGNN

Special thanks to @lachlanalextodd from @0G_labs and @NFTSWIMM3R for putting everything together.
Huge thanks to @spark_ren, @mheinrich from @0G_labs, @sanlsrni from @ritualnet, @nick_emmons from @AlloraNetwork, @sgershuni from @cyberfund, @tommyeastman21 from @DCGco, @brendanplayford and @calanthiaaa from @getmasafi, @blockhiro from @_inceptioncap, @kkrrayy from @Presto_Labs, @0xMattness from @Dither_Solana, @sicko1993 and @rezoshm for their reviews.