Posthuman AI Marketplace V2 on custom hardware

Name of Project:

Posthuman AI Marketplace V2

Proposal in one sentence:

The aim of this proposal is to develop Posthuman AI Marketplace (v2) as a full fork of Ocean Market, backed by NVIDIA A100 GPU clusters for training and inference of advanced AI models.

Description of the project and what problem is it solving:

A full fork of Ocean Market, which will be hosted as a dapp on www.posthuman.network. Our first priority is providing inference for large AI models, allowing AI engineers from across the world to sell verifiable inference on their models without any risk of model parameter leakage. Currently, each organisation develops AI privately, with very little if any collaboration, because there is no direct way to sell AI inference on the open market without leaking the parameters. As a result, organisations reinvent the wheel again and again, which slows AI innovation drastically. We envision replacing the siloed environment of AI development today with a highly collaborative, incentivised and thriving open marketplace.

Grant Deliverables:

  • Successfully deploy fork of Ocean Market
  • Successfully deploy Ocean Market backend (aquarius/provider) on custom hardware
  • Successfully run inference from GPT-J (6B Parameters) using C2D on the custom marketplace.
  • Publish libraries to allow easy plug-and-play publication of any one of the 150+ transformer model types supported by Hugging Face. [https://huggingface.co/models]

Which Project Category best describes your project? Pick one.

  • Build / improve applications or integrations to Ocean

Are you applying for an Earmark? Pick one.

  • General
    [We will be applying for core-tech - seamless training and inference publication of AI models for the next grant, if approved]

What is the final product?

A full fork of the ocean market, which will be hosted as a dapp on www.posthuman.network.

Which one or more of the criteria will your project focus on? Why do you believe your team will do well on those criteria?

  1. Usage of Ocean — Data Consume Volume, # assets published, TVL.
    We will use these three Ocean usage metrics to track our value-add to the Ocean ecosystem. As a full, separate marketplace, our success can be measured by a combination of the three. The number of AI models published as assets represents model variety and availability. Actual usage of the models is measured by Data Consume Volume. Finally, investment in the AI models, which can be viewed as a projection of potential future earnings, is represented by the TVL in each datatoken representing an AI asset. The cumulative TVL across all assets on the Posthuman AI marketplace will represent the total value-add that Posthuman brings to the Ocean ecosystem. We estimate that between $500,000 and $1,500,000 in OCEAN will be locked in our marketplace in the first 3 months after launch.

Funding Requested: 12000 USD

Proposal Wallet Address:
0x21e06646433954aabace8e3d93d502e423249299

Have you previously received an OceanDAO Grant?

Yes, we’ve received 63,007 OCEAN as grants in OceanDAO R4, R5, R6, and R9.

Team Website (if applicable):

Twitter Handle (if applicable):

@posthumannetwo1 [https://twitter.com/PosthumanNetwo1]

Discord:
discord.gg/F76wFfcn

Part 2 - Team

2.1 Core Team

Team members

Dhruv Mehrotra

Role: Core developer - Python, Solidity, including ML backends.

Relevant Credentials:

GitHub: dhruvluci · GitHub

LinkedIn: https://www.linkedin.com/in/dhruv-mehrotra-luci/

Gitcoin: @dhruvluci | Gitcoin

Background/Experience:

  • Prototyped the first NEAR-ETH bridge in a hackathon in June 2020. The bridge today supports over $27 billion in assets.
  • Built and launched 2 dApps, for NFT-collateralised lending and NFT-fractionalisation, with over $50k in revenue.
  • Co-founder/CEO, LUCI [AI information retrieval for enterprise]
  • Patented Bayesian Answer Encoding, state of the art in Open Domain QA in 2019.
  • Multiple hackathon winner and leading weekly earner, Gitcoin.

Hetal Kenaudekar

Role : Core developer - Python, Solidity, including ML backends.
Gitcoin: @aranyani01 | Gitcoin
GitHub: Aranyani01 · GitHub
LinkedIn: https://www.linkedin.com/in/hetal-kenaudekar-796715178/

Background/Experience:

  • Co-founder/COO, LUCI [AI information retrieval for enterprise]
  • Built and launched 2 dApps, for NFT-collateralised lending and NFT-fractionalisation, with over $50k in revenue.
  • Building an AI Generative Metaverse with NEAR.
  • Solidity dev since early 2020, winner of multiple hackathons and grants.

Joshua N.

  • JS/Frontend dev since 2020. Has built key frontend components for leading mobile delivery apps.

https://in.linkedin.com/in/joshuanazareth

Bhargav C.

  • Former white shoe attorney, now builds creative marketing and community building channels for DeFi products.

https://in.linkedin.com/in/bhargav-chakraborty-036548160

Part 3 - Proposal Details

3.1 Details of the proposal:

We started with the templates provided by ocean-py and customised them to our use case. We completed multiple cycles of testing, using datatokens to run various ML-based compute-to-data operations on our custom GPU-based Kubernetes cluster. We provide scripts to replicate our experiments and to test zero-knowledge fine-tuning, inference, and evaluation in action.

Machine learning Libraries used

Our initial tests were performed using GPT-2, i.e. the transformers.GPT2LMHeadModel class. Note that the ‘LMHead’ extension allows evaluation by computing loss scores when labels are provided. We had to customise the Hugging Face scripts to automatically export the model to an S3 bucket and return its address. This output is used by Ocean to publish the trained model as a new asset.
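To make the "loss scores when labels are provided" point concrete, here is a minimal pure-Python sketch of the quantity a causal language-model head reports: the average cross-entropy of each next token given the scores at the previous position. The function name and the toy inputs are our own illustrative assumptions; this is not code from the Hugging Face library or from Posthuman.py.

```python
import math

def causal_lm_loss(logits, labels):
    """Mean next-token cross-entropy.

    logits: one list of vocabulary scores per position.
    labels: token ids, same length as logits.
    The scores at position t are compared with the token at t + 1,
    so the last position has no target and is skipped.
    """
    if len(labels) < 2 or len(logits) != len(labels):
        raise ValueError("need at least two tokens and matching lengths")
    total = 0.0
    for position_scores, target in zip(logits[:-1], labels[1:]):
        # log-sum-exp with the maximum subtracted for numerical stability
        peak = max(position_scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in position_scores))
        total += log_norm - position_scores[target]
    return total / (len(labels) - 1)

# A model with no preference over a 4-token vocabulary scores log(4).
uniform = [[0.0, 0.0, 0.0, 0.0]] * 3
print(round(causal_lm_loss(uniform, [0, 1, 2]), 4))  # 1.3863
```

A lower value means the model assigns higher probability to the text it is evaluated on, which is why the same number can serve as a quality signal for buyers.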

Testnet Deployment

For convenience, we’ve stored a pretrained GPT-2 model and the WikiText-2 dataset as downloadable assets on Ocean. Anyone can perform zero-knowledge fine-tuning (ZK-FT) by modifying the provided training algorithm to specify the architecture, hyperparameters, and training duration. We then set up a C2D service and an associated datatoken pool, represented by step 1 in the use case. We used the datatoken to purchase training, inference and evaluation, representing use-case #2, #6, and #7 respectively.

Use-Case in Depth

The Posthuman.py deployment, developed last year as part of the Shipyard program, offers the following functionality. We will be developing further on the same proven functionality. More info and tests at: GitHub - dhruvluci/Posthuman.py at updated

  1. Alice publishes a GPT-2 model in a compute-to-data environment.
  2. Bob buys datatokens and runs further training on the WikiText-2 dataset, using the train_lm.py algorithm. [compute_service_train]
  3. The updated model (M2):
    i) remains on Alice’s machine;
    ii) is published as an asset on Ocean;
    iii) earns Bob a reward in datatokens of the newly trained model.
  4. Charlie decides to train the model further, purchasing datatokens from Bob, creating demand.
  5. The second updated model (M3) is likewise published as an asset, and a datatoken reward is issued to Charlie. [compute_service_train]
  6. Derek finds M3 to be sufficiently trained for his commercial use case. He buys access to the inference endpoints using the datatokens in Charlie’s possession, completing the demand loop. [compute_service_inference]
  7. Elena is unsure if the model she is using (M3) is worth what she is paying. She runs an ‘evaluation.py’ C2D request and learns that the model she’s using does indeed have better performance on her dataset than the published SoTA. [compute_service_inference]

Let’s take a closer look at the incentives at play in step-3. The output of the compute-to-data process run by Bob on Alice’s machine is a trained model (model.bin + config.json). These are together published as an asset with compute, owned by Bob. This allows a novel setup of ownership in zero-knowledge. Bob now ‘owns’ the updated model, in that he can mint datatokens that allow further access to this model. However, even Bob cannot see the parameters of this model he ‘owns’, allowing the chain to continue.

There are two other implications of Bob owning a compute asset running on Alice’s Machine:
First, Bob must trust Alice to not leak the model parameters as a separate asset. Second, Alice must somehow be remunerated for the compute requests on Bob’s model, since she is providing the compute.

Both these challenges are solved if Alice is also the Marketplace Owner. First, Bob can trust Alice as her credibility in not leaking models will determine the size of the marketplace and Alice’s eventual earnings. Second, as the marketplace owner, Alice receives a share even when Charlie buys and uses Bob’s datatokens; thus allowing her to continue to provide compute.
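The second point comes down to simple arithmetic, sketched below. The fee rate is a hypothetical value chosen for illustration only; this proposal does not fix a percentage, and the function name is our own.

```python
from fractions import Fraction

def split_payment(amount_ocean, marketplace_fee=Fraction(1, 100)):
    """Split one compute payment between the marketplace owner and the asset owner.

    marketplace_fee is an assumed rate, not a figure from the proposal.
    """
    amount = Fraction(amount_ocean)
    to_marketplace = amount * marketplace_fee  # Alice: runs the market and the compute
    to_asset_owner = amount - to_marketplace   # Bob: owns the datatokens of the model
    return {"marketplace": to_marketplace, "asset_owner": to_asset_owner}

# Charlie pays 200 OCEAN to use Bob's model, which runs on Alice's hardware.
shares = split_payment(200)
print(shares["marketplace"], shares["asset_owner"])  # 2 198
```

Because Alice's cut scales with total volume on the marketplace, leaking a model would reduce her own future earnings.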

Assets

We provide metadata for S3 locations of an original pre-trained GPT-2 model, and for subsequent updated models. These metadata are utilised by the three custom compute_service scripts as needed, to ensure verifiable and blockchain-traceable lineage of any model used.
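One way to picture this lineage is as a chain of records in which each updated model points back to the model it was trained from. The sketch below is an assumption about shape only: the field names, DIDs and S3 paths are placeholders, not the metadata schema used by Ocean or by our scripts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass(frozen=True)
class ModelAsset:
    did: str                   # identifier of the published asset (placeholder values)
    s3_uri: str                # where the weights are stored on the host
    owner: str                 # who may mint datatokens for this model
    parent_did: Optional[str]  # model this one was trained from, None for the original

def lineage(registry: Dict[str, ModelAsset], did: str) -> List[str]:
    """Walk parent links from a model back to the original pre-trained asset."""
    chain = []
    current: Optional[str] = did
    while current is not None:
        chain.append(current)
        current = registry[current].parent_did
    return chain

assets = [
    ModelAsset("did:op:m1", "s3://example-bucket/m1", "alice", None),
    ModelAsset("did:op:m2", "s3://example-bucket/m2", "bob", "did:op:m1"),
    ModelAsset("did:op:m3", "s3://example-bucket/m3", "charlie", "did:op:m2"),
]
registry = {asset.did: asset for asset in assets}
print(lineage(registry, "did:op:m3"))  # ['did:op:m3', 'did:op:m2', 'did:op:m1']
```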

Training

We focused on developing a reliable yet flexible training script that allows further fine-tuning with minimum effort. Our challenge was to ensure that incentives were correctly aligned to encourage people to invest in the development and training of models before they have reached commercial viability. To that end, it is essential that the object of their investment, i.e. the model parameters, cannot be leaked. By enforcing permissioned secrecy, we allow the value of a model to snowball as ever more people participate in improving it.

Our training script:

  1. Loads the latest GPT-2 model
  2. Trains it for 500 further steps
  3. Saves the trained model
  4. Exports it to S3

In combination with compute_service_train.py, it also:

  5. Exports the updated model as an Ocean asset owned by the trainer
  6. Mints 200 datatokens of the new model to the trainer

Evaluation

We developed two algorithms for evaluating the performance of a model:

  1. Algo_evaluation_wikitext.py

  2. Algo_evaluation_custom.py

The first allows users to calculate the loss score on 10 random samples taken from the ‘test’ section of the WikiText-2 dataset. This serves as a general-purpose evaluation of the language capabilities of a model.

The second allows users to pass in their own custom prompt(s) and see how well the model predicts those. We see this being useful to easily test the level of domain adaptation in a model, i.e. how well it is tuned for its specific (commercial) use-case.
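The two evaluation modes can be sketched as follows. This is an illustration under stated assumptions, not the contents of the two algorithm files: the function names are hypothetical, and loss_fn stands in for a forward pass of the model inside the compute environment.

```python
import random
from statistics import mean

def evaluate_on_samples(loss_fn, test_texts, num_samples=10, seed=None):
    """General-purpose check: average loss over a random subset of a test set."""
    rng = random.Random(seed)
    chosen = rng.sample(test_texts, min(num_samples, len(test_texts)))
    return mean(loss_fn(text) for text in chosen)

def evaluate_on_prompts(loss_fn, prompts):
    """Domain check: per-prompt loss on text supplied by the requester."""
    return {prompt: loss_fn(prompt) for prompt in prompts}

def toy_loss(text):
    """Stand-in for a real model: pretends longer texts are harder to predict."""
    return 0.1 * len(text.split())

corpus = ["a short line", "a somewhat longer line of text", "tiny"] * 10
print(evaluate_on_samples(toy_loss, corpus, seed=0))
print(evaluate_on_prompts(toy_loss, ["contract termination clause"]))
```

Only the resulting numbers leave the compute environment, so the requester learns how well the model fits their text without seeing its parameters.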

Inference

We provide a simple inference algorithm, algo_inference.py, that accepts a string as an argument and returns a continuation. Hyperparameters such as top-p, top-k, and temperature can be easily modified. As with the other examples, inference is entirely zero-knowledge: the requester never learns the model parameters.
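For readers unfamiliar with these three hyperparameters, the sketch below shows what each one does when picking the next token. It is a simplified pure-Python illustration with a function name of our own choosing, not the code in algo_inference.py or in the Hugging Face library.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Pick a token id from raw scores using temperature, top-k and top-p.

    temperature must be > 0. top_k=0 and top_p=1.0 disable their filters.
    """
    if rng is None:
        rng = random.Random()
    # Temperature: divide scores before the softmax; <1 sharpens, >1 flattens.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    weights = [math.exp(score - peak) for score in scaled]
    total = sum(weights)
    probs = sorted(((w / total, idx) for idx, w in enumerate(weights)), reverse=True)

    # Top-k: keep only the k most likely tokens.
    if top_k > 0:
        probs = probs[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose share of the
    # remaining probability mass reaches top_p.
    mass = sum(prob for prob, _ in probs)
    kept, cumulative = [], 0.0
    for prob, idx in probs:
        kept.append((prob, idx))
        cumulative += prob / mass
        if cumulative >= top_p:
            break

    # Sample among the survivors; choices() renormalises the weights.
    ids = [idx for _, idx in kept]
    return rng.choices(ids, weights=[prob for prob, _ in kept], k=1)[0]

# With top_k=1 the choice is greedy, so the highest-scoring token always wins.
print(sample_next_token([0.1, 2.0, 0.5], top_k=1))  # 1
```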

An example of using compute-to-data for running zero-knowledge inference is demonstrated in compute_service_inference.py.

Verifiability/Integrity of Models

At any point, any user, say Derek, can make a compute-to-data request to adversarially evaluate the loss of the model on particular string(s). In this fashion, the real quality of the model being used can be confirmed without question - a feat impossible for current AI-as-a-service providers. The fact that the same model (i.e. the same ‘did’) is used at inference can easily be verified on-chain.

This functionality is demonstrated end-to-end by the compute_service_evaluate.py script.

Reward Mechanism

We explored reward mechanisms revolving around charging a higher marketplace fee, and using the funds to reward early training providers. Such an approach would revolve around issuing a fixed amount of ‘Marketplace tokens’ for every fixed improvement in loss score of the ML models. ‘Marketplace tokens’ would then be entitled to a share of the Marketplace revenue.

However, we found that the zk-ownership of datatokens approach allows each trainer’s contribution to be priced automatically at its market value. The Posthuman marketplace then does not have to assign an anticipatory value to each trainer’s contribution, for example by tying rewards to improvements in loss scores.

This also accounts for all scenarios where loss score isn’t fully representative of learning - scenarios such as overfitting, insufficient randomisation, and superficial language tricks can all improve loss score without improving the utility of the model. Thus, in our approach, we allow an end user to adversarially evaluate the loss on any text(s), and use the results to assign a fair market, ‘utility value’ to each model.

Each trainer then gets a share of the utility value to end users, removing any incentives for trying to game the system by overfitting models or reducing loss by other, non-productive means. Thus, the trainer benefits if there is future demand for fine-tuning/inference of the model he trained. This gain happens in zero-knowledge as the model parameters are never revealed to him; thus compounding the value of the model over time.

Please give an overview of the technology stack.

We’re deploying the core Ocean Market stack for running C2D on an Ubuntu PyTorch AMI with GPU-based hardware, whereas the database servers, including Aquarius, will run on simple CPU-based Ubuntu servers.

Along with implementing a separate fork of Ocean Marketplace, we have been developing libraries of algorithms optimised for plug-and-play with any Ocean C2D compatible marketplace. Below we describe the tech stack used in the development of these algorithms so far, as well as the updates we will be providing for the present grant.

Historical Development

We experimented with many reputed libraries for training large transformers, including DeepSpeed2 for very large models, and Reformer for large context sizes. In the end, we decided to utilise the huggingface-transformers library as it is the most versatile, offering hundreds of different kinds of transformer architectures under one library.

Our initial tests were performed using GPT-2, i.e. the transformers.GPT2LMHeadModel class. Note that the ‘LMHead’ extension allows evaluation by computing loss scores when labels are provided. We had to customise the Hugging Face scripts to automatically export the model to an S3 bucket and return its address. This output is used by Ocean to publish the trained model as a new asset.

Current Development

We’re now going to expand our libraries to include DeepSpeed3, as it allows training models with up to 10 trillion parameters, roughly 50 times larger than GPT-3. It would serve as a perfect test case for collaborative training. However, it requires ~100+ GPUs, while our initial deployment will be limited to clusters with 16 GPUs, so its capabilities will be limited at first.
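The scale involved can be checked with a few lines of arithmetic. The sketch assumes GPT-3's publicly reported size of 175 billion parameters and 16-bit (2-byte) weights, and it counts the weights only, leaving out optimizer state and activations.

```python
GPT3_PARAMS = 175e9    # publicly reported size of GPT-3 (assumption of this sketch)
GPTJ_PARAMS = 6e9      # GPT-J, from the grant deliverables
TARGET_PARAMS = 10e12  # upper bound quoted above

def weights_gigabytes(num_params, bytes_per_param=2):
    """Memory for the weights alone, assuming 2-byte parameters by default."""
    return num_params * bytes_per_param / 1e9

scale_vs_gpt3 = TARGET_PARAMS / GPT3_PARAMS
print(f"10T vs GPT-3: {scale_vs_gpt3:.0f}x")                               # 57x
print(f"GPT-J weights: {weights_gigabytes(GPTJ_PARAMS):.0f} GB")           # 12 GB
print(f"10T weights: {weights_gigabytes(TARGET_PARAMS) / 1000:.0f} TB")    # 20 TB
```

GPT-J fits comfortably on a single A100 under these assumptions, while a 10-trillion-parameter model has to be spread across many devices, which is what the later scaling experiments are for.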

3.9 Project Deliverables - Roadmap

Any prior work completed thus far? Details?

Over the previous grants, we’ve published algorithms and assets for use on the existing Ocean Market, to demonstrate our concept of monetising AI models using C2D based inference.

Algorithms

  1. QA-Commercial Algorithms
    Posthuman v1.2 introduced an entirely new library of algorithms, under the title “QA-Commercial”. It is meant to interact with proprietary AI models (developed by Posthuman, LUCI, or other parties) to enable AI question answering across thousands of corporate documents in a matter of seconds. Some of the key features of QA-Commercial:

  2. Integrates with the DrQA library for n-gram and TF-IDF shortlisting of paragraphs, which are then fed into the pipeline with a custom-trained DistilBERT model for question answering on thousands of pages of documents (instead of just 1 paragraph, as enabled by Posthuman v1 models).

  3. Calculates a weighted probability for each pipeline answer to rank them across paragraphs
    into an absolute list of best answers. This required empirical experiments with different weights allotted to the ‘answer_start’, ‘answer_end’, and ‘softmax’ probabilities output by the model for each answer.

  4. Provides enterprise-grade server capabilities for continuous question answering for up to 24 hours at a time. This includes not just a Django server, but also Celery + Redis worker allocation for load balancing, enabling rapid, error-free parallel processing of multiple queries.
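The cross-paragraph ranking in item 3 can be sketched as a weighted sum followed by a sort. The weights below are hypothetical placeholders (the tuned values are not listed in this proposal), and the class and function names are our own.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    paragraph_id: int
    answer_start: float  # probability of the predicted start position
    answer_end: float    # probability of the predicted end position
    softmax: float       # softmax probability reported for the answer

# Hypothetical weights for illustration; the real ones were found empirically.
WEIGHTS = {"answer_start": 0.3, "answer_end": 0.3, "softmax": 0.4}

def score(candidate, weights=WEIGHTS):
    """Weighted combination of the three probabilities for one answer."""
    return (weights["answer_start"] * candidate.answer_start
            + weights["answer_end"] * candidate.answer_end
            + weights["softmax"] * candidate.softmax)

def rank_answers(candidates, top_n=3, weights=WEIGHTS):
    """Merge answers from all paragraphs into one list, best first."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)[:top_n]

candidates = [
    Candidate("30 days", paragraph_id=12, answer_start=0.5, answer_end=0.6, softmax=0.9),
    Candidate("90 days", paragraph_id=4, answer_start=0.9, answer_end=0.8, softmax=0.7),
    Candidate("annually", paragraph_id=7, answer_start=0.2, answer_end=0.3, softmax=0.1),
]
print([c.text for c in rank_answers(candidates)])  # ['90 days', '30 days', 'annually']
```

Scoring every candidate on one scale is what lets answers from different paragraphs be compared directly.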

In addition, we provide custom tokenizers (spaCy etc.) for rapid processing of large volumes of text. With these features, we hope to drive the rapid adoption of these AI abilities by AI companies selling products to end users. The Next Steps section describes our strategies for turning them into revenue.

Assets- If the project has already published data assets:
We’ve posted our initial models (DistilBERT QA + algo) on Polygon Network. Polygon is essential to our plans, as the fees for each AI request must be comparable with those of a typical API call.
DistilBERT QA Model - https://market.oceanprotocol.com/asset/did:op:F5caB01B84fcf4d13a07CFE50d4a15CA87DBa869
DistilBERT QA Algo - https://market.oceanprotocol.com/asset/did:op:cb9BD8313b32298495699d6aAD796643Cb651f7d

What is the project roadmap?

  • API access to AI training and inference using Ocean Stack on custom hardware [Done]
  • Posthuman Market (gui) based Access to AI Inference. [March 2022]
  • Template algorithms for plug-and-play selling of AI models by model providers. [March 2022]
  • Launch documentation and publish relevant materials in the media. [March 2022]
  • Posthuman Market (gui) based access to AI training [April 2022]
  • Direct workflow to publish trained AI models as C2D inference assets in 1 step [April 2022]
  • Useful AI model development & publication Bounties [May 2022]
  • Scaling experiments for models needing more than 8 A100 GPUs (including model parallelism and data parallelism in training and inference algos) [May 2022]

What is the team’s future plans and intentions?

  • Introducing collaborative training as an extension of the C2D core technology stack.
  • When adding AI model training functions, we will also be able to target the earmark “If I use C2D to train an algorithm, there’s an easy flow to re-publish it right away.”
  • Once all the AI training and inference functionality detailed in the roadmap has been implemented, the direction of future development will depend on community demand. For example, if there is more demand for image, video, or code generation, we will be happy to provide targeted and optimised hardware and algorithms.

3.10 Additional Information

Any additional information to add?

Our team has received over $250,000 in funding, hackathon prizes and grants over the last 6 months. The funding was tranched, and given only upon completion of milestones. As a result, we have developed multiple DeFi applications and primitives on various chains, including NFT-Fractionalisation, novel stablecoins, and novel DeFi credit scoring solutions.

Hi @dhruvluci, thanks for your proposal for R15. Everything is in place except that I’m missing:

  • An e-mail address
  • Country of Recipient

Hi @idiom-bytes , here is the requested info:
Email: dhruv.luci9@gmail.com
Country: India

Thank you @dhruvluci, everything is in place now.

All the best in R15!