DataUnion Foundation proposal for OceanDAO round 10

Part 1 - Proposal Submission

Name of project: DataUnion Foundation
The proposal in one sentence: Fuel data collaborations for data-centric AI

Description of the project and what problem is it solving:

Problem

Responsible AI requires high data quality with respect to bias and fairness. Yet the marginal cost of solving the long-tail problem in AI keeps increasing.
Over the past decade, the digital economy has become highly concentrated and prone to monopolization.
SDG 10 - Reduced inequalities
Target 10.1: By 2030, progressively achieve and sustain income growth of the bottom 40 percent of the population at a rate higher than the national average (Growth rates of household income).
From a B2B point of view, collaboration on data asset projects is complicated and ambiguous - new tools are needed.

Solution

Using blockchain technology to establish DataUnions that enable local income, quality data, and data co-ownership.

Grant Deliverables:
[ ] Mobile App V2 & gamification concept
[ ] Ocean provider setup
[ ] Fork & customisation of Ocean marketplace
[ ] Medium blog post about the project and its potential
[ ] Concept for a RPG NFT for the Ocean ecosystem based on Ocean’s creatures
[ ] Payment API for datatokens in our backend

Which category best describes your project? Pick one.
Build / improve applications or integrations to Ocean

Which Fundamental Metric best describes your project?
Total Value Locked

What is the final product (e.g. App, URL, Medium, etc)?

Our alpha is already online, please go ahead and check it out. The mobile app is available in the Google Play store. The Data Portal is available here. There will also be other data unions in our DataUnion ecosystem.

How does this project drive value to the “fundamental metric” and the overall Ocean ecosystem?

Let’s do a ROI calculation - we are using facts rather than speculation as we are ROI positive already:

Currently there are 612,948 $OCEAN locked (TVL) in the DataUnion.app liquidity pool on the Ocean marketplace. Additionally we also have an active Twitter account, a Discord and a Telegram community to bring in additional users into the Ocean ecosystem. And we have 431 users contributing on our alpha website. The unleashing of data is happening continuously, check it out in our dashboard.

bang = 612,948 $OCEAN TVL

buck = 208,780 $OCEAN - this is explained in more detail in the financial section of the proposal; we assume 66,666 $OCEAN for this proposal here

(% chance of success) = 100% as we are talking about the current situation and the TVL is there

ROI = 612,948 $OCEAN / 208,780 $OCEAN * 1.0 ≈ 2.94, which shows that the ROI of the project after this grant is above the demanded ROI of 1.0
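
The arithmetic behind these figures can be reproduced with a short script. This is only a sketch: it assumes that "buck" is the sum of the amounts named in the financial section (funding rounds, Ocean Shipyard, datatoken hackathon) plus the amount assumed for this proposal. The breakdown and all names in the code are illustrative, not part of any Ocean tooling.

```python
# Figures are taken from the proposal text. Treating "buck" as the sum of
# these items is an assumption made for illustration.
PRIOR_FUNDING_OCEAN = {
    "oceandao_rounds": 129_679,
    "ocean_shipyard": 7_435,
    "datatoken_hackathon": 5_000,
}
ASSUMED_THIS_ROUND_OCEAN = 66_666
TVL_OCEAN = 612_948  # "bang": $OCEAN locked in the liquidity pool


def roi(bang: float, buck: float, chance_of_success: float = 1.0) -> float:
    """Return bang / buck * chance_of_success."""
    if buck <= 0:
        raise ValueError("buck must be positive")
    return bang / buck * chance_of_success


buck = sum(PRIOR_FUNDING_OCEAN.values()) + ASSUMED_THIS_ROUND_OCEAN
result = roi(TVL_OCEAN, buck)

print(f"buck = {buck:,} $OCEAN")
print(f"ROI  = {result:.2f}")
```

Under these assumptions the ratio comes out at roughly 2.9, above the demanded ROI of 1.0.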

Funding Amount: 50,000.00 USD
Proposal Wallet Address: 0xF9ac73f30dBe52c10e3d5950db66357f9d0be44D
Have you previously received an OceanDAO Grant?: Yes

Team Website (if applicable): https://dataunion.app
Twitter Handle: @DataUnionA
Discord Handle: DataUnion Foundation#1870
Contact Email: info@dataunion.app
Current country of residence: Singapore

Part 2 - Team:

Core Team:

Robin Lehmann

Dr. Mark Siebert

  • Role: partnerships, business development
  • Relevant Credentials (e.g.):
  • Data publishing (10yrs)
  • Business Development for Data Markets
  • Owning and driving global executive engagements and partnerships, Data and Open Science
  • Web3 experience: < 1 Year
  • Positioning businesses in emerging markets or innovative fields with focus on data and AI-driven solutions.
  • https://www.linkedin.com/in/dsiebert/

Sarah Kay

Akshay Patel

  • Role: developer
  • Relevant Credentials (e.g.):
  • Background/Experience:
    • Backend developer in the finance industry
    • Ocean Protocol Ambassador
    • Ocean Protocol Bounty hunter

Okpo Ekpenyong

Part 3 - Proposal Details

What problem is your project solving?

This is the final app humanity needs - it gives us the ability to contribute to AI and robotics and enables humanity to profit from the data this technology needs. We want to give everyone the ability to use their data for a better future and their own profit.

Here is our new pitch deck for the DataUnion ecosystem.

In our pitch deck we are describing how new data unions can be created using the DataUnion tools and methodology on top of Ocean Protocol. This enables more and more data unions to be created and join forces to develop a common stack of tools and help each other to be successful. This reduces the overhead costs for the individual data union and allows them to focus on their domain specific topics and their community instead of having to develop the same set of tools and processes again.

As an example data union we are building an image dataset that is created and owned by the people that contribute to it. These contributions include uploading the data, annotating it, and verifying that the data and the annotations are correct. Contributors are motivated to contribute the highest-quality content because they are rewarded with shares of the dataset they contributed to. We are building this slowly and releasing the features separately to get as much feedback as possible from our community.

So this covers the creation part of the project. But to refinance this process we have to find customers. The customers can create Data Bounties to increase the incentives for the creation of specific machine learning ready images. Customers can also bring their own datasets and ask only for annotation and/or verification by the contributors.

The most important option for customers is to train algorithms on parts of the data via Ocean Protocol’s compute-to-data technology. In this way they can train object recognition algorithms without the risk of the data leaking, so the project retains full ownership of the data. This is done via a marketplace on top of Ocean that we are creating, which contains previews and evaluations of parts of the data (via a search engine on tags and annotations) as well as algorithms. The sale of these assets will be done via the liquidity pool of the whole dataset.

So the contributors benefit not just from their contributions but also from the usage of their contributions. This creates long term incentives to maintain control over the created tokens. This project is a showcase of the ownership economy based on the economy creation capabilities of Ocean Protocol.

As a next step on our journey we are now starting to simulate the ecosystem and individual pools. There are already VisionTherapy and SenseNation in the Ocean ecosystem that are using our advice and expertise to create their own data unions. Here we are also taking the approach to work bottom up with them and help them where they need us.

Project Deliverables - Category

Data asset statistics:

Start of round 9:


Start of round 10:

Project Deliverables - Roadmap

Any prior work completed thus far?

Check our alpha web app, alpha data portal and alpha mobile app.

A lot of the work done in the project so far was recruitment of the team, organisational work, looking for funding and organising the work. We also published our alpha now and have made significant progress with our mobile application. The team now has 9 team members with different strengths to move the project forward. The work is organised via Trello and we have a communication infrastructure via Discord. We also have a social media presence and a website.

We managed to get into the Celo camp accelerator as one of 31 teams from ~300 applications. This helps us to access the mobile first user base of Celo which is targeted towards unbanked people all around the world - exactly whom we want to contribute to our datasets.

We started prototyping avatars for a role playing game NFT based on Ocean’s creatures for the Ocean ecosystem - items can be earned through actions, e.g. voting, providing liquidity, creating outreach videos, holding tokens or contributing to DataUnions. The purpose of this is to bring a piece of NFT culture into the ecosystem and to enable the Ocean ecosystem as well as DataUnion to reward users with item NFTs for contributing. A full concept will be a deliverable for this month’s grant. Here is a prototype with a sample item set:

What is the project roadmap? That is: what are key milestones, and the target date for each milestone.

  • New annotation tools and mechanisms (ongoing)
  • Onboard new data unions to our tech and concepts (ongoing)
  • Recruit more people and facilitate development via OceanDAO grants (ongoing)
  • Release Swipe-AI (Q4 2021)
  • Simulate the token value flow using simulation tools (Q4 2021)
  • Increase decentralisation of our solution by moving the control to smart contracts (Q4 2021)
  • Algorithm training and sales via our Data Portal - have the data providers become co-owners of that as well (Q4 2021)
  • Allow addition of data while it resides on different storage (2022)
  • Include NLP to translate annotations and cater to a worldwide audience (2022)
  • Launch a governance token (2022)

Please include the milestone: publish an article/tutorial explaining your project as part of the grant (eg medium, etc).

Our web app and Data Portal include tutorial videos explaining the project.

Any maintenance?

These prototypes are the beginning of the journey for the project. There will be a lot more steps after that but we want to first validate our idea via the crypto enthusiasts of the Ocean community (via the web based version) and after that for other target audiences in Nigeria, South America, India and Thailand (via the mobile version).

For us it is important to proceed in an agile manner and to proceed from working prototype to next working prototype to get validation and feedback from our community.

Foreseen or possible additions?

Big additions in the future will be an independent rating system for the quality of the data via state of the art machine learning algorithms. Another validation system for the quality will be algorithm creation competitions a la Kaggle where the winning algorithms will become part of our marketplace.

The contributors that worked on the data as well as the liquidity providers of our pool will get a share in these algorithms’ liquidity pools to benefit from the results of their hard work.

These algorithms will also enable pre-labeling of freshly uploaded data to reduce annotation times (basically annotation becomes quality control) and the sale of pretrained models for transfer learning.

After this is set up and working we will move to the next data type - either text or audio data.

Are there any mockups or designs to date?

Check our alpha web app, alpha data portal and alpha mobile app.

An overview of the technology stack?

Frontend: React.JS + Vue.JS (web), React Native (mobile)
Backend: Flask + Python libraries, CouchDB (+ PouchDB), MongoDB

If the project includes community engagement:

Running the campaign on social media for how many weeks?

We will actively promote each phase with datatoken incentives for participants.

For the web app we will focus on the Ocean community, the mobile app will be promoted more widely. We joined the Celo camp to expand to a mobile first blockchain and get access to a user base in developing countries.

Other?

At the moment we are including more and more community members in the creation of content and the project itself. We work with an agile work model where tasks are distributed to team members and these are then rewarded. This opens our team up to a large number of participants.

Additional Information

Market situation?

Currently there are no end-to-end solutions on the market that cover data capture, annotation and machine learning training, especially not on this scale and with the involvement of a worldwide workforce.

Any grants or fundraising to date?

We have received 129,679 $OCEAN in the funding rounds so far. We also received 7435 $OCEAN from the Ocean Shipyard program and won 5000 $OCEAN in the Ocean datatoken hackathon. We are in the process of acquiring additional funding from other sources but the funds have not arrived with us yet.

Other costs are bootstrapped by Robin, approximately 180,000 $OCEAN as a personal investment into the project. Most of this sits in the datatoken pool but we also invested a good amount to buy 70 QUICRA-0 from the pool to reward the contributors in our initial challenges.

Financial and time spent

At the moment we do not reward anyone in the management team for their hours. We have been working on this project for 11 months now, and the management team has spent around 4500 hours on it.

We do reward the engineering team and there have been around 12000 hours spent by the engineers on the product so far. This means that we are very cost efficient with an average hourly rate of ~10 $OCEAN for our engineers.

To see what the tokens were spent on, please check our wallets here and here - all transactions have been made from there. Except for the purchase of the $QUICRA-0 rewards and some money for Token Engineering courses (~600$), every token has been spent on the engineering team.

Social implications of the solution?

If this project succeeds, it will change the lives of many people in dire situations on our planet. As soon as they get access to a mobile phone and internet they can start making a living. In Venezuela 200$ can feed a family for one month; we expect to be able to reward much, much more than that for a full month of contributions. So it is a global game changer that will enable these contributors to take care of their families right now but also retain datatokens for the future to create a passive income for retirement - something that was never available to them before.
We connected to potential users in Caracas, Venezuela in May 2021 to get a first impression of their needs, limitations and willingness to help. The feedback is going to shape the project to fit their needs.

Steep roadmap, in my opinion adequate for the amount raised! I like the NFT idea, it’s cool & different.

Maybe we can link this with the proposed “Ocean Market Impression Mining” as a top prize for the highest engagements on educational Ocean Market content.

Also, the objective to drive adoption of partnering with other communities is promising. We’ve experienced a great engagement from similar activities and believe it’s the right way forward.

On another note (in my opinion), maybe it’ll be better to include swap fees generated in your pool to the ROI and exclude publisher liquidity. I think that would be a better representation of “network value”.

Cheers!

Thank you for the kind words and suggestions. For some of the items for the RPG NFT we were actually thinking about Data Whale’s involvement in the ecosystem, e.g. a video camera for your character because you are creating outreach videos for all of us.

We are going to add more data assets to Ocean Protocol in the future and are also actively looking for consumers but we don’t think that TVL can be replaced with swap fees in the near future - in the long run it will of course outscale it.

Love to see the progress and what this project is evolving into.
I hope one day when our product is ready for production we can offer SSI users a choice, to either have A.I. come to them, or earn money by feeding into your Unions! (or perhaps even both)

Good to hear extending your team was successful, as it is not always easy as we all know to attract talent.
I like the idea of the NFTs combined with education about the Ocean ecosystem. Is this a game open to anyone or only to those contributing large amounts of data? (would love to get myself a seahorse)

As for a special character for Data Whale I have to say I could not agree more. For the short amount of time we have been part of the Ocean ecosystem he has already challenged us (in a good way) and I see he replies to almost every project out there, keeping the engagement up and increasing the quality of any project he interacts with by giving constructive feedback!

Thank you for your proposal @Robin.

WRT ROI Calculations

I was just reviewing the ROI calculations for projects and am very happy to see you are summing all Ocean granted over time as part of your calculation.

I believe part of the TVL includes funds that have been raised initially by you, and that is great! I know you have also invested a significant amount of money into the project, and I believe this will grow over time.

However, I think it would be constructive to make this information more easily distinguishable. How?

this is explained in more detail in the financial section of the proposal

In your ROI/Fundamental Metric section, you mention there is a financial section of the proposal (which I found at the bottom) that explains how much you’ve raised, invested, etc, but this section isn’t very well labelled. If I search for financial or something similar, I only get 1 hit.

In addition, it doesn’t go into detail WRT the TVL, which I’m very interested in. Mostly because I’m confident you’ll be successful, and am curious to see this growing/tracked.

Perhaps the proposal could be structured a little bit better so I can more easily process this information.

WRT NFTs

I’m excited to see what you’re developing here, especially as we’ve spoken about this.

I understand NFTs can potentially help onboarding many participants, increasing DCV (maybe the NFT is an Ocean ERC-721), and could perhaps drive various metrics forward.

Having said that, I think it would be constructive to very clearly separate the funds that will be used for this project, or perhaps start another project altogether with the team that will help drive this forward.

Otherwise, perhaps a simple plan/estimate for how the NFT project will have an impact could be communicated, so I understand a bit more of the timeline, or how it fits into the overall picture. It seems like there are some mission-critical deliverables in your roadmap (Ocean Provider, Marketplace & Payment API), and I would like to understand a bit better how things fit together.

Finally

I’m really excited to be following your journey, and am very eager about the future of DataUnion Foundation inside of the Ocean Ecosystem!

All the best in R10!

Re: WRT ROI Calculations

We renamed the section “Time and tokens spending” into “Financial and time spent” to make it more clear how the grants are spent. But we also think that it is time to create a spreadsheet overview like e.g. Datawhale is maintaining for their finances as now other expenses are coming into play. So far 99% of the tokens were purely spent on rewarding the collaborators for their time to engineer our products. This will be an additional deliverable.

Robin’s tokens are now invested into DataUnion Foundation which now also owns the liquidity pool. The initial investment provided the initial liquidity of 117,000 $OCEAN; such a large sum was added back then to show that the liquidity pool is serious. None of this liquidity has been withdrawn to show curators that the pool is not at all at risk of being rug pulled.

As the metric we are going for is TVL we see any token in the pool as counting towards total value locked in a pool on Ocean Protocol, no matter where it originates. Removing any tokens would result in impermanent loss for the other pool participants, which we want to avoid. Discounting our tokens does not seem logical: we cannot remove them without hurting the remaining pool participants, so they are truly value locked.

V4 is not backwards compatible with V3, so at that point at the latest a recreation of the pool is required, also because we wanted to switch from dataset to compute-to-data but this enhancement was also not backwards compatible. Robin tried to find a solution for the liquidity migration but paid back the grant after being unable to find a fair solution for pool participants within a six month period. We are still not sure how to migrate to the new pools with additional features without creating a race to leave the pool. Hopefully simulations with tokenSPICE can yield more insights.

Re: WRT NFTs

The concept that we added as a deliverable for this grant will clarify why we think that this can be part of our project and does not require an additional grant/project. But we will clearly state how much time and finances will be required to go from concept to final product. We are happy that you want to know more about it and will provide this information as our deliverable - giving some quick answers here does not do the idea justice; there has to be more thought and more research involved during the upcoming month. But your input is already very valuable and we will share a draft with you as soon as we have one.

Thank you for the kind words and wishes.

Thank you for the response @DataUnion!

Re: WRT ROI Calculations

We renamed the section “Time and tokens spending” into “Financial and time spent” to make it more clear how the grants are spent.

Thank you for the response and quick turn around! I feel it’s much easier now to find that info, and keep track of it.

But we also think that it is time to create a spreadsheet overview like e.g. Datawhale is maintaining for their finances as now other expenses are coming into play. So far 99% of the tokens were purely spent on rewarding the collaborators for their time to engineer our products. This will be an additional deliverable.

I’m fairly confident with how the team is managing expenses, but appreciate the willingness to share more details. I’m sure this will only improve the overall understanding of the finances.

The initial investment provided the initial liquidity of 117,000 $OCEAN; such a large sum was added back then to show that the liquidity pool is serious. None of this liquidity has been withdrawn to show curators that the pool is not at all at risk of being rug pulled.

Awesome to see that number! I actually think it’s awesome to see Robin invest so much of his own assets into the platform, and can’t think of a more engaging metric! Thank you!

As the metric we are going for is TVL we see any token in the pool as counting towards total value locked in a pool on Ocean Protocol, no matter where it originates.

I completely agree! I just think it’s good to make that distinction, especially if Robin or anyone else internally continues to re-invest back into the LP. Again, I can’t think of a more bullish metric than the team investing back on themselves, for the long run

V4 is not backwards compatible with V3, so at that point at the latest a recreation of the pool is required,

Completely understand, and am very patient and understanding of any required steps to eventually resolve the migration. I look forward to the community collaborating around the evolution from V3 to V4.

Re: WRT NFTs

The concept that we added as a deliverable for this grant will clarify why we think that this can be part of our project and does not require an additional grant/project.

Thank you for this comment and for providing more clarity. I feel that it’s good to over-emphasize this, so as not to confuse anyone. My thought here is that this will catch people’s attention, and perhaps confuse/concern them WRT how funds are being utilized.

Although I’m really eager to know more about it, I just wanted to get some clarity WRT how this track overlaps w/ DataUnion app, and to ask the team to perhaps be a bit clearer WRT resources from the grant, so that it doesn’t muddle the very exciting journey that DataUnion Foundation is undertaking.

Thanks again for your response, and for your proposal for R10!

All the best!

Hey DataUnion Team,

We just submitted our vote for this proposal. Good luck on the journey and most definitely we’d love to have you guys over on YouTube very soon.

Thanks also for taking the time to review our Proposal on the Port. Let us know whether you have any comments.

Cheers,
Data Whale

Thumbs up, voted for it and crossing fingers! Keep up the work. :)
