DataUnion - OceanDAO round 7

Key Project Data

  • Name of project: DataUnion

  • Team Website (if applicable):

  • Proposal Wallet Address (*mandatory): 0x655eFe6Eb2021b8CEfE22794d90293aeC37bb325

  • Current country of residence (*mandatory): Germany

  • Contact Email (*mandatory)

  • Twitter Handle: @DataUnionA

  • Discord Handle: Robin (

  • The proposal in one sentence: The company creates a two-sided market and economy for crowdsourced data to enable long and short term benefits of AI for market and economy participants.

  • Which category best describes your project? Pick one or more.

  • [x] Build / improve applications or integrations to Ocean

  • [x] Unleash data

  • Funding Amount: 32.000 $OCEAN

  • Current Remaining Grant Treasury Balance: 0

  • Have you previously received an OceanDAO Grant?: Yes

Project Overview

Description of the project:

This proposal is for continued funding of DataUnion’s development.
Here is an update for OceanDAO round 7 to explain our current status (links to YouTube):
OceanDAO round 7 update
The funds of the proposal are used to reward our engineering team for their work.

What problem is your project solving?

This is the final app humanity needs - it gives us the ability to contribute to AI and robotics and enable humanity to profit from the data this technology needs. We want to give everyone the ability to use their data for a better future and their own profit.

Here is our complete pitch deck from the Token Engineering course in May 2021, which gives a good impression of our thinking after that month of training.

This project is about creating an image dataset that is created and owned by the people who contribute to it. These contributions include uploading the data, annotating it, and verifying that the data and the annotations are correct. Contributors are motivated to contribute the highest quality content because they are rewarded with shares of the dataset they contributed to. We are building this step by step and releasing the features separately to get as much feedback as possible from our community.

So this covers the creation part of the project. But to refinance this process we have to find customers. The customers can create Data Bounties to increase the incentives for the creation of specific machine learning ready images. Customers can also bring their own datasets and ask only for annotation and/or verification by the contributors.

The most important option for customers is to train algorithms on parts of the data via Ocean Protocol’s compute-to-data technology. In this way they can train object recognition algorithms without the risk of the data leaking, so the project retains full ownership of the data. This is done via a marketplace that we are building on top of Ocean, which contains previews and evaluations of parts of the data (via a search engine on tags and annotations) as well as algorithms. The sale of these assets will be done via the liquidity pool of the whole dataset.

So the contributors benefit not just from their contributions but also from the usage of their contributions. This creates long-term incentives to maintain control over the created tokens. This project is a showcase of the ownership economy, based on the economy creation capabilities of Ocean Protocol.

As a next step on our journey we are now starting to incubate new data unions using our technology and concepts as well as simulation models. VisioTherapy and SenseNation in the Ocean ecosystem are already using our advice and expertise to create their own data unions. Here too we are working bottom-up with them and helping them where they need us.

What is the final product (e.g. App, URL, Medium, etc)?

Our alpha is already online, please go ahead and check it out.

How does this project drive value to the Ocean ecosystem?

Let’s do a ROI calculation - this round we are focusing on facts rather than speculation:

Currently there are 610,553 $OCEAN locked (TVL) in the liquidity pool on the Ocean marketplace. We also have an active Twitter account, a Discord and a Telegram community to bring additional users into the Ocean ecosystem, and we have 203 users contributing on our alpha website. The unleashing of data is happening continuously; check it out in our dashboard.

bang = 610,553 $OCEAN TVL

buck = 119350 $OCEAN - this is explained in more detail in the financial section of the proposal; it includes the 32.000 $OCEAN requested in this proposal

(% chance of success) = 100% as we are talking about the current situation

ROI = 610553 $OCEAN / 119350 $OCEAN * 1.0 ≈ 5.12, which shows that the ROI of the project after this grant is above the demanded ROI of 1.0
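
The calculation above can be reproduced with a few lines of Python. This is only a sketch: the function name `expected_roi` is our own, and the breakdown of buck into earlier funding plus this request is an assumption based on the figures quoted in the funding section of this proposal.

```python
def expected_roi(bang: float, buck: float, chance_of_success: float) -> float:
    """Expected ROI = bang / buck * (% chance of success)."""
    if buck <= 0:
        raise ValueError("buck must be positive")
    return bang / buck * chance_of_success


# Figures quoted in this proposal, all in $OCEAN.
# Assumption: buck is the sum of all funding received plus this request.
previous_grants = 74_915  # OceanDAO funding rounds so far
shipyard = 7_435          # Ocean Shipyard program
hackathon = 5_000         # Ocean Datatoken hackathon
this_request = 32_000     # requested in round 7

buck = previous_grants + shipyard + hackathon + this_request
bang = 610_553            # TVL in the liquidity pool

roi = expected_roi(bang, buck, chance_of_success=1.0)
print(f"buck = {buck} $OCEAN, ROI = {roi:.2f}")
```

Keeping the inputs as named variables makes it easy to rerun the calculation when the TVL or the requested amount changes in a later round.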

Project Deliverables - Category

  • Parts of the software are open-source with a permissive license at:

  • Webapp is available at

  • Data will be made available on Ocean Market through our current dataset and a data sales portal

  • Mobile app will be live in the iOS and Google App Stores. There might be delays due to the app store checks.

Data asset statistics:

End of round 6:

Start of round 7 - we got a beautiful template now as well, thanks @blockchainlugano :

Project Deliverables - Roadmap

Any prior work completed thus far?

Here is a special update for OceanDAO round 7 to explain our current status (links to YouTube):

OceanDAO round 7 update

So far, a lot of the work in the project has been team recruitment, organisational work, and looking for funding. We have now published our alpha and have made significant progress with our mobile application. The team now has 15 members with different strengths to move the project forward. The work is organised via Trello and we have a communication infrastructure via Discord. We also have a social media presence and a website.

What is the project roadmap? That is: what are key milestones, and the target date for each milestone.

  • New annotation tools and mechanisms (ongoing)
  • Onboard new data unions to our tech and concepts (ongoing)
  • Release a mobile application that has the mechanisms of our web app and Swipe-AI (Q3 2021)
  • Release the Data Portal (Q3 2021)
  • Internationalize the mobile app (Q3 2021)
  • Simulate the token value flow using simulation tools (Q3 2021)
  • Include NLP to translate annotations and cater to a worldwide audience (Q4 2021)
  • Increase decentralisation of our solution by moving the control to smart contracts (Q4 2021)
  • Algorithm training and sales via our Data Portal - have the data providers become co-owners of that as well (Q4 2021)
  • Allow addition of data while it resides on different storage (2022)
  • Recruit more people and facilitate development via OceanDAO grants (ongoing)
  • Potentially launch a governance token (Q4 2021)

Please include the milestone: publish an article/tutorial explaining your project as part of the grant (eg medium, etc).

Our alpha website includes tutorial videos explaining the project.

Any maintenance?

These prototypes are the beginning of the journey for the project. There will be a lot more steps after that, but we want to first validate our idea with the crypto enthusiasts of the Ocean community (via the web based version) and after that with other target audiences in Nigeria, South America, India and Thailand (via the mobile version).

For us it is important to proceed in an agile manner and to proceed from working prototype to next working prototype to get validation and feedback from our community.

Foreseen or possible additions?

Big additions in the future will be an independent rating system for the quality of the data via state of the art machine learning algorithms. Another validation system for the quality will be algorithm creation competitions a la Kaggle where the winning algorithms will become part of our marketplace.

The contributors who worked on the data as well as the liquidity providers of our pool will get a share in these algorithms’ liquidity pools to benefit from the results of their hard work.

These algorithms will also enable pre-labeling of freshly uploaded data to reduce annotation times (basically, annotation becomes quality control) and the sale of pretrained models for transfer learning.

After this is set up and working we will move to the next data type - either text or audio data.

Are there any mockups or designs to date?

Check our alpha

An overview of the technology stack?

Frontend: React.JS (web), React Native (mobile)

Backend: Flask + Python libraries, CouchDB (+ PouchDB)

If the project includes community engagement:

Running the campaign on social media for how many weeks?

We will actively promote each phase with datatoken incentives for participants.

For the webapp we will focus on the Ocean community.


At the moment we are including more and more community members in the creation of content and the project itself. We work with an agile work model where tasks are distributed to team members and these are then rewarded. This opens our team up to a large number of participants.

Team members

The team members are listed on our webpage in the order of them joining the project.

Here is an introduction video by our engineering team - “Hey, OceanDAO :slight_smile:” (links to YouTube):
Introduction to the DataUnion engineering team

Additional Information

Market situation?

Currently there are no end-to-end solutions for capturing, annotation and machine learning training available on the market, especially not at this scale and with the involvement of a worldwide workforce.

Any grants or fundraising to date?

We received 74.915 $OCEAN in the funding rounds so far, 7435 $OCEAN from the Ocean Shipyard program, and won 5000 $OCEAN in the Ocean Datatoken hackathon.

Other costs are bootstrapped by Robin, approximately 180.000 $OCEAN as a personal investment into the project. Most of this sits in the datatoken pool but we also invested a good amount to buy 100 QUICRA-0 from the pool to reward the contributors in our initial challenges.

Time and tokens spending

At the moment we do not reward anyone in the management team for their hours. Over the course of the project (we have been working on this for 7 months now), around 1300 hours were spent by the management team.

We do reward the engineering team and there have been around 4000 hours spent by the engineers on the product so far. This means that we are very cost efficient with an average hourly rate of ~20 $OCEAN for our engineers.

We started with the whole team being part-timers, but we now have six full-time engineers working towards our goals. A huge step forward for six months in the project.

Customer acquisition and business model?

Our team member Florian is currently the CTO of DELL Technologies for Unstructured Data and in this position he is very well connected to all the big players in the industry (e.g. Nvidia, Intel, Microsoft, VW, …). We will leverage his connections and skills as well as Robin’s technical sales/stakeholder skills as a product owner to onboard clients in the future. Additionally we also onboarded Mark who will help us with his experience in business development and market analysis.

Social implications of the solution?

If this project succeeds it will change the lives of many people in dire situations on our planet. As soon as they get access to a mobile phone and internet they can start making a living. In Venezuela, 200$ can feed a family for one month; we expect to be able to reward much, much more than that for a full month of contributions. So it is a global game changer that will enable these contributors to take care of their families right now but also retain datatokens for the future to create a passive income for retirement, something that was never available to them before.
We connected to potential users in Caracas, Venezuela in May 2021 to get a first impression of their needs, limitations and willingness to help. The feedback is going to shape the project to fit their needs.


Today we coordinated with the first data union that is building on top/together with DataUnion - VisioTherapy (they got a grant in OceanDAO round 6 - [Proposal] VisioTherapy: Building an exercise quality dataset using a community of physiotherapists at professional rugby and sports clubs).

Our common goal is to release mobile data collection and annotation apps this month and support VisioTherapy in building their own mobile app.

DataUnion has onboarded four more developers this week to push in this direction. This is possible due to the token support by VisioTherapy, which is also DataUnion’s first income from a source other than OceanDAO! A big step forward, and not even for the data but for the technology that we are creating.

We are looking forward to developing a framework that helps more and more data unions come to fruition with our tech, and to learning how to support small teams so they can focus on their core business while still being able to build an ecosystem and community around their data and algorithms.


DataUnion is working with the University of Applied Sciences of The Hague in a UI/UX course to get insights, ideas, and feedback from a group of over ten students. Last week we gave feedback on their ideas and concepts. They were really refreshing and impressive, and will help us to shape the apps (mobile, web) in new ways.

Here is one proposal (LoFi draft version) by Antonina Sadovnikova which adds personal profiles, friends and more community features.

Additional ideas included the following concepts:

  • gamifying the validation process on the mobile phone by combining TikTok and Guitar Hero in one interface
  • adding the option to form friend circles and messaging among them
  • adding an overall championship mode about who contributed how much

And these are just some examples - the final presentations are due at the end of June. We will then be able to share more details about them.

Things are heating up in the project - we just onboarded four new trial developers. Let’s hope that we can use that additional workforce to push out more results!
Very exciting.


Today we applied for a grant from Gitcoin as well - check it out here (you will have déjà vu when you see the content).
We are looking forward to connecting to a new community and to seeing what connections we can make through this grant proposal! (And if you want to chip in, you can do that by funding us directly via the grant proposal.)

Today we had the final presentations of our UI/UX student group from the University of Applied Sciences of The Hague. The ideas point in very interesting directions, but overall we want to continue with two of them to help us shape the next versions of our web and mobile apps. Here is a collection of their proposals:


We published a new version of our website in an updated design and added four more bounties:

  • NFT and Art Bounty
  • Optical Character Recognition Bounty
  • Meme Bounty
  • Products Bounty


And an update on our user statistics - we are now starting a community on Reddit as well. Let’s see if this community of communities can bring more users to our project as well.

Fantastic progress with your initiative – it would be great to explore how we can connect Solipay’s systems with yours via Ocean – perhaps with development work or data integrations!


We probably should have a call then to discuss this further :)
Thank you for scheduling a call!


I really like where this is going as archetype of a Data Union and I will support this project. Thanks for keeping it up and growing each day.


Since undoubtedly you’re the controller of 0x655eFe6Eb2021b8CEfE22794d90293aeC37bb325

and since this address controls 44% of OCEAN in the pool that you’re claiming has 610k OCEAN,

You should lower your BANG number by 44%:

610553 * 0.56 = 341909 OCEAN

I don’t think it makes sense to count your own money as a return of investment bang. It should only be the funds that you’ve managed to gather from the community.
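
As a rough sketch of that adjustment in Python (the 44% share and the 119350 $OCEAN buck are the figures quoted in this thread; the variable names are only illustrative):

```python
pool_tvl = 610_553  # $OCEAN in the pool, as claimed in the proposal
own_share = 0.44    # share of the pool controlled by the proposal wallet
buck = 119_350      # $OCEAN, taken from the proposal's ROI calculation

# Count only the liquidity that came from the community.
community_bang = pool_tvl * (1 - own_share)
adjusted_roi = community_bang / buck * 1.0  # chance of success = 100%

print(f"community bang = {community_bang:.0f} $OCEAN, ROI = {adjusted_roi:.2f}")
```

Even with the deduction, the resulting ROI stays well above 1.0.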

Thank you so much for your wonderful comment @TimDaub, it is great that you care so much about the community to comment on a few of the proposals to try to correct them with your opinions.

TVL means Total Value Locked - the total value locked here is 610.553 $OCEAN.
This is the metric we are going for here. Where the value locked comes from is not important here.

But thank you again for your concern. If you want to establish a new definition of TVL, then there is a plethora of projects out there in the DeFi space that would hopefully be happy to get evangelised by you and change their TVL calculation so that the funds of a pool’s creators are no longer counted.

From my understanding, the bang number in the ROI calculation is supposed to represent the total market value that your project is capable of capturing. The point of the ROI calculation is to give voters an idea of your market’s size and your project’s potential. There are more details on this here.

I’m noticing that you’ve added (TVL) after I’ve written my post. In any case, using TVL is problematic when you count your own investment as return.

Imagine a project that receives an oceanDAO grant of 1000 OCEAN and then proceeds to invest these 1000 OCEAN into their liquidity pool. Then, in the next round they claim to have a:

For the calculation of ROI that is expected ROI = bang / buck * (% chance of success), it’ll always yield ROI >= 1 if bang = buck as 1000/1000 = 1 and chance of success = 1.

We always calculate this ROI taking the current round of funding into account, as you can see in the calculation.

So if a project gains more TVL between one round and the next than it requests in the next round, that would satisfy the condition of having an ROI of >1.0

For the calculation of ROI that is expected ROI = bang / buck * (% chance of success) , it’ll always yield ROI >= 1 if bang = buck as x/x = 1 and chance of success = 1.

The goal is to have an ROI of > 1.0.

Even if our own share of $OCEAN were deducted, the ROI would exceed 1.0 by a huge amount. So I really do not understand your point here. It seems like you are just trying to troll us over and over again.

This is a false accusation and I think you should take it back.

That makes sense and I encourage you to continue doing that. However:

Round 1:

  • Project X raises 1000 OCEAN
  • Project cannot make a ROI calculation based on TVL as project doesn’t exist yet.
  • Puts 1000 OCEAN in own liquidity pool

Round 2:

  • Project X wants to raise another 1000 OCEAN; buck = 1000+1000 = 2000
  • Project starts considering own 1000 OCEAN TVL (bang) from R1 in pool as ROI bang
  • Proposed calculation is: Bang = 1000, Buck = 1000+1000 OCEAN, chance of success = 1, the ROI is 1000/2000 * 1 = 0.5

For ROI >= 1, the project would have to increase their TVL for each round by the amount of buck. Not saying that it’s impossible or whatever. But I think it’s important to understand this detail in this type of calculation.

Edit: In any case, I think the calculation would be significantly stronger if you just deducted your own investments, since as you said (even after -44%) it still returns an ROI >= 1.
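
The round-by-round effect can be sketched in Python. The helper `roi_per_round` is hypothetical and assumes buck is the cumulative sum of all grants requested so far, as in the example above. Round 1 is modelled with a TVL of 0 because the project does not exist yet.

```python
def roi_per_round(tvl_by_round, grant_by_round, chance_of_success=1.0):
    """Return the ROI at each round: current TVL over cumulative grants."""
    rois = []
    cumulative_buck = 0
    for tvl, grant in zip(tvl_by_round, grant_by_round):
        cumulative_buck += grant
        rois.append(tvl / cumulative_buck * chance_of_success)
    return rois


# Project X: 1000 OCEAN requested per round, own grant put into the pool.
print(roi_per_round(tvl_by_round=[0, 1000], grant_by_round=[1000, 1000]))

# To reach ROI >= 1 in round 2, TVL has to grow by the amount of buck.
print(roi_per_round(tvl_by_round=[0, 2000], grant_by_round=[1000, 1000]))
```

In the first scenario the round 2 ROI is 0.5; in the second, where another 1000 OCEAN of TVL comes in, it reaches 1.0.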

Thank you again for your great advice.

We have been doing these calculations already in the last round in the same way. There would be no problem for us to pull out some numbers about a potential ROI or something but we are proud that we have a REAL ROI and not a theoretical ROI.

Round 1:

  • Project X raises 1000 OCEAN
  • Project cannot make a ROI calculation based on TVL as project doesn’t exist yet.
  • Puts 1000 OCEAN in own liquidity pool

Round 2:

  • Project X wants to raise another 1000 OCEAN; buck = 1000+1000 = 2000
  • Project starts considering own 1000 OCEAN TVL (bang) from R1 in pool as ROI bang
  • Proposed calculation is: Bang = 1000, Buck = 1000+1000 OCEAN, chance of success = 1, the ROI is 1000/2000 * 1 = 0.5

Exactly! If a project used that metric, there would be an ongoing increase of TVL on the marketplace, as the project would gather more outside interest for TVL than it asks from the community. It is great that we can math together so effectively.

FYI, an index like the S&P500 is capitalization-weighted and only counts the shares outstanding:

A capitalization-weighted (or cap-weighted) index, also called a market-value-weighted index, is a stock market index whose components are weighted according to the total market value of their outstanding shares.


Shares outstanding are all the shares of a corporation that have been authorized, issued and purchased by investors and are held by them. They are distinguished from treasury shares, which are shares held by the corporation itself, thus representing no exercisable rights.

I doubt that it’s directly applicable in this case - but just to show that it’s being considered in finance.
I’ve come across this information when doing research for RPI and I thought it made sense. I know this is beyond the usual scope of ROI calculations but I guess that more sophisticated investors and analysts appreciate this attention to detail. After all, we’re one of the “older” projects.

But as I said, even with -44% deducted, your ROI is >= 1. So this is merely constructive feedback to improve your calculation.


[Deliverable Checklist]
[X] What has been promised in this proposal has been delivered as we continued to push forward with our project between round 7 and 8.