Coral: A self-sovereign back-end for research data-management powered by Ocean Protocol
Part 1 - Proposal Submission
Name of Project
Coral
Proposal in one sentence
Grant funds are requested to support the development of an open-source application for GDPR-compliant self-sovereign scientific data management and peer-to-peer sharing.
Description of the project and what problem is it solving
Thousands of petabytes of data on human health, economic activity, social dynamics, and scientific observations of the universe and our impact on it are siloed in legacy institutional web infrastructure.
Key challenges for unlocking scientific data include workflow gaps, infrastructural capacities, and cultural inertia, such as:
- Expensive ingress/egress fees with traditional cloud storage
- Insufficient tooling for dataset management, preprocessing, and archival
- Lack of easy-to-use interoperable workflows & protocols for sharing data
- Need for dataset provenance that ensures the content received is the content that was requested
- Institutional compliance and regulatory protocols that gate-keep sensitive data based on academic credentials
- Cultural inertia for laboratory procedures and protocols
- No rewards for sharing data, and a heightened risk of being “scooped”
(Reference: Science Research Data Management Needs Report 1)
The emergence of peer-to-peer data storage and standards for decentralised identifiers (DIDs) makes it possible to permanently establish a public records archive in a common web infrastructure accessible to all, regardless of professional/academic status, nationality, language, or age, while respecting the intrinsic self-sovereignty of data providers.
Our team is collaborating with Protocol Labs and Textile to build open source dataset processing and archival tools for archiving up to 250TB of Open Science Data on the decentralised file storage network, Filecoin, free of charge for 1.5 years.
Opscientia plans to establish the largest collection of high-quality open source data that can be found on Web3 - a decentralised data commons that preserves critical knowledge for future generations via sustainable Web3 incentive loops.
Our mission is to make fundamental scientific observations and insights open to global citizens who are united by a vision for collective scientific discovery. A key missing piece of this vision is a data marketplace that allows curious netizens to search, find, and execute computation on datasets defined with standard specifications that support interoperable workflows.
We are requesting the support of the OceanDAO to build Web3’s first Open Science Data Marketplace, Coral. Ocean Protocol will constitute the middleware for publishing, searching, and executing workflows for standard specification datasets.
The base specification for Ocean data tokens provides a flexible scaffold to build elaborations on machine-readable scientific research objects, such as datasets, experiments, and executable notebooks. Ocean Market tokenomics present a potential solution for supporting the overhead of public science goods and may provide an alternative to traditional academic funding schemes. Our team is focusing on building on these blocks, one at a time, while simultaneously assessing the needs and perceptions of the scientists and community members that use the platform.
Aside from product development, funding indirectly supports the following Web3 Open Science programs:
- The Open Web Fellowship - a “permissionless” scientific grant mechanism that rewards curiosity, basic research, and encourages building on Ocean Protocol
- The Opsci DeSci Collider - a DAO-Affiliate program that serves to crowd-source scientific data and expertise, building a stronger network of DAO2DAO relationships that go beyond art and decentralised finance. Some existing affiliates include vitaDAO, Active Inference Lab, Planetary Resilience DAO, dClimate, BrainsAtPlay, and DeSchooled.
Grant Deliverables
Product & Services
- [ ] Version 1 redesign of front-end with a landing page for scientist-friendly documentation
- [ ] Version 1 tutorials for publishing assets on Coral
User Research & Design Requirements
- [ ] Survey materials for scientist feedback on data publishing workflow, user experience, and pain points
- [ ] Continuing user requirement design and feedback on Coral architecture
Technical Research & Design Reports
- [ ] Technical research for custom back-end architecture running on a test network with support for scalable data storage and computation
- [ ] Research and analysis of decentralised file storage integration with Ocean Protocol stack
- [ ] Research report on sustainable tokenomics for an inclusive open data marketplace
Which category best describes your project?
Build/improve applications or integrations to Ocean
Which Fundamental Metric best describes your project?
Other - Our primary goal is to empower the scientific community to share, analyse, and review data with web3 tools. To reflect this goal, we propose the number of scientific research objects that are created on our platform as a new metric. These projects will be composed of datasets, algorithms, models, and other scientific digital objects, some of which are data tokens. We seek to replicate the OceanDAO model for funding scientific projects pre-registered on our platform. Our product directly influences Ocean’s ROI through the Ocean Marketplace because our data objects will utilise the Ocean Protocol ecosystem.
What is the final product?
The Coral Market will be a fully functioning Ocean Market fork with an intrinsic token that is used to create data tokens that correspond to digital research objects such as datasets, pre-computed weights for ML, and scientific protocols/experiments. Our team will expand the data token metadata specification to include compatibility with a wide set of scientific self-descriptions following the Open Science Framework. The final product will allow researchers all over the world to pre-register their hypotheses, and tokenise their intellectual assets.
How does this project drive value to the “fundamental metric” (listed above) and the overall Ocean ecosystem?
The Coral Market is a critical back-end component of the “Opsciverse” - a one-stop-shop for scientists to access cloud services for reproducible open science with big data sets. We will use this time and funding to produce comprehensive research and specifications for the build we started this month.
As part of our grant deliverables, we will generate critical feedback for the OceanDAO to understand current data needs and problems from scientists’ perspectives. We currently have a research agreement with Textile, Filecoin, and MIT/Dartmouth to identify decentralised file storage challenges and solutions for big data neuroscience laboratories. We are building tools to migrate up to 250TB of neuroscience data onto decentralised file storage, and to make this open data directly available to over 20,000 neuroscience researchers around the world. We expect exponential growth of research projects over a period of 5 years beginning in Q1 2022 as we tailor cloud service needs to researchers to scale adoption.
ROI: buck
If awarded, this project will have received a total of 51,320 $OCEAN. Our current deliverables for this round include establishing a Coral data marketplace demonstrator for community and scientist review, continuing our work to establish data bridges with scientific research institutions (MIT/Dartmouth), and expanding user research with potential scientific data providers and consumers.
ROI: bang
We expect our efforts will provide a template for other teams of researchers and scientists looking to build on Ocean. If we capture 10% of the 800+ neurotech labs we have identified, we can expect them to follow our template to unleash their data. Each research project will include multiple data token objects such as data, models, protocols etc. As an example, the average cost of gathering a neuroimaging dataset is ~$800 USD per participant. A research laboratory collects upwards of 100 participants per project (lower bound based on typical statistical power required for neuroimaging sample size analyses), so a single dataset costs roughly $80,000 USD to collect. We can expect each dataset to be worth 133,333 $OCEAN at current prices. If 10% of the 800+ labs identified (80 labs) follow our template to contribute data to the Ocean Market, we can expect a total value of 10,666,640 $OCEAN (bang) in neuroimaging data staked on the marketplace.
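The arithmetic behind the "bang" estimate can be reproduced with a short sketch. The inputs are the figures quoted above; the variable names are illustrative, and the $OCEAN price is not stated in the proposal, so it is only back-calculated here from the quoted per-dataset value.

```python
# Inputs quoted in the proposal text.
COST_PER_PARTICIPANT_USD = 800   # approximate cost per neuroimaging participant
PARTICIPANTS_PER_PROJECT = 100   # lower bound per research project
LABS_IDENTIFIED = 800            # "800+" neurotech labs
CAPTURE_PERCENT = 10             # share of labs assumed to follow the template
DATASET_VALUE_OCEAN = 133_333    # per-dataset value stated in the proposal

# Cost of collecting one dataset, in USD.
dataset_value_usd = COST_PER_PARTICIPANT_USD * PARTICIPANTS_PER_PROJECT

# Assumption: the price is inferred from the two figures above,
# not taken from any market data.
implied_ocean_price_usd = dataset_value_usd / DATASET_VALUE_OCEAN

# Number of labs captured and the total value staked ("bang").
labs_captured = LABS_IDENTIFIED * CAPTURE_PERCENT // 100
bang_ocean = labs_captured * DATASET_VALUE_OCEAN

print(f"Dataset cost: ${dataset_value_usd:,} USD")
print(f"Implied price: ~${implied_ocean_price_usd:.2f} USD per $OCEAN")
print(f"Labs captured: {labs_captured}")
print(f"Bang: {bang_ocean:,} $OCEAN")
```

The estimate counts one dataset per captured lab, which matches the total given in the text; labs that publish several research objects would raise the figure.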
The hypothetical ROI following this model results in a value of >200 (bang/buck) with a 100% chance of success, >100 with a 50% chance, >40 with a 20% chance, >10 with a 10% chance, >1 with a 1% chance. We believe the probability of achieving the outcome described above depends on when success is assessed, and that it increases significantly towards the end of the grant’s roadmap for deliverables.
ROI: [bang / buck] x p(success)
Therefore, bang for buck for this proposal can only stand to benefit the Ocean DAO.
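As a rough check of the ROI figures above, the formula [bang / buck] x p(success) can be evaluated for each probability mentioned. This is a minimal sketch: the function name `expected_roi` is hypothetical, and the bang and buck values are the totals quoted in this proposal.

```python
BANG_OCEAN = 10_666_640  # estimated value staked, from the proposal
BUCK_OCEAN = 51_320      # total $OCEAN received if this round is awarded


def expected_roi(bang: float, buck: float, p_success: float) -> float:
    """Return [bang / buck] x p(success)."""
    if buck <= 0:
        raise ValueError("buck must be positive")
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    return bang / buck * p_success


# Evaluate the scenarios listed in the text.
for p in (1.0, 0.5, 0.2, 0.1, 0.01):
    roi = expected_roi(BANG_OCEAN, BUCK_OCEAN, p)
    print(f"p(success) = {p:.0%}: ROI = {roi:.1f}")
```

Each result clears the corresponding threshold stated above (>200, >100, >40, >10, >1).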
Funding Requested
USD$20,000
Proposal Wallet Address
0x057c9a25f1302484Bb34C9CEB6d3BC69Bd319e01
(opscientia.eth)
Have you previously received an OceanDAO Grant?
Yes
Team Website
Twitter Handle
@opscientia
Discord
Email address
Current Country of Residence
Opscientia LTD. is a Singapore registered company.
Part 2: Team
Core Team:
Shady El Damaty, M.Sc., Ph.D.
- Role: Cognitive Neuroscientist, Project Lead, Opscientia Founder
- Github: https://github.com/seldamat
- Website: https://seldamat.github.io
- Linkedin: https://www.linkedin.com/in/seldamat/
- Past Experience: Neuroscientist & Big Data Engineer at Georgetown University
Fellows:
Kinshuk Kashyap, Fellow
- Role: Software Engineer
- Github: kinshukk
- Linkedin: https://www.linkedin.com/in/kinshuk-kashyap-32a4747b/
- Past Experience: Google Summer of Code Scholar
Achintya Kumar, Fellow
- Role: Software Engineer
- Github: Ackintya
- Linkedin: https://www.linkedin.com/in/achintya-kumar1/
- Past Experience: Opscientia Open Web Fellowship
Caleb Tuttle, Fellow
- Role: Software Engineer
- Github: calebtuttle
- Website: https://calebtuttle.github.io
- Linkedin: https://www.linkedin.com/in/caleb-tuttle-20bbb2126/
- Past Experience: Software Engineer at Startup, TaxSlayer
Jakub Smekal, Fellow
- Role: Researcher
- Github: smejak
Part 3: Proposal Details
Project Deliverables - Category
- The app will be live, at: https://market.opsci.io
- The project is open-source and can be found (with a permissive license if necessary) at: Opscientia (GitHub)
Software overview:
We will be forking Ocean Market, utilising the current middleware. We will also develop documentation to ensure industry best practices are uniform throughout development. The main aim of this project is to research and architect the backend of an Open Data marketplace and carry out user research in order to develop accurate user requirements for our build phase.
Community engagement:
- Qualitative user research will be conducted with 5+ scientists. This will be on-going as we continue to interview scientists and collect feedback.
- Social media (Twitter & Linkedin) will be used to disseminate our message and engage with the wider community.
Project Deliverables - Roadmap
Any prior work completed thus far?
- Preliminary community outreach has been completed. Basic research on architecture, self-descriptions, and marketplace branding was completed in Round 9.
What is the project roadmap?
This grant will bootstrap the October, November, and December phases of our development lifecycle. Articles will be published to the news page of our website to update the community on our happenings.
Team’s future plans and intentions
We plan to request funding from OceanDAO to complete our product development pipeline from September to December. We will post updates and links to deliverables for community feedback at the end of each month. Our goal is to build the critical infrastructure for our Open Science research platform, starting with the Web3 back-end.
Additional Information
This grant is a first step in building a decentralised science platform running self-governed, owned, and automated science activities on-chain.