An Open Science Bay for Self-Sovereign Data Flows from Lab to Market
Website
Proposal Wallet Address
0x33359285f30e7b3386de70ca500f4fe27853765b
(opscientia.eth)
One-Sentence Summary
Grant funds are requested to support development of an open-source application for GDPR-compliant self-sovereign scientific data management and peer-to-peer sharing.
Categories Describing the Project
[ ✓ ] - Build / improve applications or integrations to Ocean
Project Overview
An estimated thousands of petabytes of data on human health, economic activity, social dynamics, and scientific observations of the universe and our impact on it are siloed in legacy institutional web infrastructure.
A key challenge for unlocking scientific data is managing controlled access to sensitive data, such as:
- Personal data provided by a participant in a research study
- Sensitive data that can be used adversarially or for harm
- Proprietary data collected at great expense by an investigator
The emergence of peer-to-peer data storage and standards for decentralized identifiers (DIDs) makes it possible to permanently establish a public records archive in a common web infrastructure accessible to all, regardless of professional/academic status, nationality, language, or age, while respecting the intrinsic self-sovereignty of data providers.
This grant will support the development of a web application for unified permissions management and peer-to-peer sharing of scientific data, directly lowering the barrier to deployment on Ocean markets. The application will be powered by the InterPlanetary File System (IPFS) for content-addressable peer-to-peer data storage, the IDX API for managing DID indices, and the Ceramic protocol for manipulating encrypted records on IPFS.
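To illustrate what content-addressable storage means in practice, the sketch below stores blocks under the hash of their own bytes. It is a toy in-memory model, not an IPFS client: the `ContentStore` class and `content_address` function are hypothetical names, and a plain SHA-256 hex digest stands in for a real IPFS content identifier, which uses a different encoding.

```python
import hashlib
from typing import Dict


def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in content address."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory store keyed by content hash (hypothetical, not IPFS)."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the bytes, so identical content
        # always maps to the same address.
        address = content_address(data)
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Integrity check: the block must hash back to its own address.
        if content_address(data) != address:
            raise ValueError("stored block does not match its address")
        return data


store = ContentStore()
addr_a = store.put(b"encrypted scan, version 1")
addr_b = store.put(b"encrypted scan, version 2")
```

Because the address depends only on the bytes, any change to a dataset yields a new address, and a reader can verify that what was retrieved is what was requested.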
Problem:
The current status of scientific data sharing is largely determined by concerns regarding privacy of research participants and the motivation to protect intellectual property.
No universal permissions management system currently exists that ties scientific data to a persistent decentralized identifier. Access control for scientific data is instead handled by centralized storage systems, which raise the risk of a single point of failure and significantly limit the free flow of information.
Solution:
DID schemas integrated with standard specifications for scientific data interoperability allow for persistent permissions that grant self-sovereignty to the data provider:
- Participants sharing sensitive personal data can choose the terms of their participation in a research study.
- Researchers can choose when to make their findings publicly available.
- Curators can be empowered to assemble custom collections to address challenging scientific problems.
This grant will support the integration of DID schemas with standardized scientific data specifications and peer-to-peer decentralized storage to unlock scientific data while respecting data provider self-sovereignty.
Neuroimaging data management will be used as the first use-case pilot for integrating DID schemas with the Brain Imaging Data Structure (BIDS). The web application accepts new or existing BIDS datasets and assigns DIDs to consenting participants, investigators, or research curators uploading data. Users can see a dashboard of their data and associated permissions, studies they are currently participating in, and open calls for research participation. Consenting research participants retain superseding privileges on their data permissions and can opt in to or out of a study at any time. Researchers and curators can only see datasets from consenting participants and can deploy encrypted datasets to IPFS with preset permissions for data sharing. Any researcher can request access to datasets published by an investigator, using the BIDS specification to build sophisticated queries.
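The consent rules described above can be sketched as a small filter over BIDS-style filenames. This is a minimal illustration under stated assumptions, not the application's design: `ConsentRegistry`, `parse_bids_entities`, and `query` are hypothetical names, the DIDs use the placeholder `did:example` method, and the study label and filenames are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


def parse_bids_entities(filename: str) -> Dict[str, str]:
    """Split a BIDS-style filename into key-value entities plus its suffix."""
    stem = filename.split(".", 1)[0]  # drop the extension, e.g. ".nii.gz"
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    entities["suffix"] = suffix
    return entities


@dataclass
class ConsentRegistry:
    """Maps each study to the participant DIDs that currently consent."""
    consents: Dict[str, Set[str]] = field(default_factory=dict)

    def opt_in(self, study: str, participant_did: str) -> None:
        self.consents.setdefault(study, set()).add(participant_did)

    def opt_out(self, study: str, participant_did: str) -> None:
        # The participant's choice takes effect immediately.
        self.consents.get(study, set()).discard(participant_did)

    def has_consent(self, study: str, participant_did: str) -> bool:
        return participant_did in self.consents.get(study, set())


def query(files: List[str], subject_dids: Dict[str, str],
          registry: ConsentRegistry, study: str, **filters: str) -> List[str]:
    """Return files from consenting participants that match entity filters."""
    visible = []
    for name in files:
        entities = parse_bids_entities(name)
        did = subject_dids.get(entities.get("sub", ""))
        if did is None or not registry.has_consent(study, did):
            continue  # files without current consent are never visible
        if all(entities.get(key) == value for key, value in filters.items()):
            visible.append(name)
    return visible


files = ["sub-01_task-rest_bold.nii.gz",
         "sub-02_task-rest_bold.nii.gz",
         "sub-01_T1w.nii.gz"]
subject_dids = {"01": "did:example:alice", "02": "did:example:bob"}

registry = ConsentRegistry()
registry.opt_in("study-a", "did:example:alice")
registry.opt_in("study-a", "did:example:bob")
before = query(files, subject_dids, registry, "study-a", task="rest")

registry.opt_out("study-a", "did:example:bob")
after = query(files, subject_dids, registry, "study-a", task="rest")
```

In this sketch the consent check runs before any entity filter, so a participant's opt-out removes their files from every query regardless of how the query is phrased.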
A flow chart of the process for deploying a dataset with permissions management on our platform appears below:
What is the Expected ROI?
The proposed application will provide a portal for researchers to onboard participants onto the self-sovereign web, where participants maintain control over their data and how it is used. The end goal of this project is to develop a workflow that runs from data collection through analysis to publication on the Ocean Markets. The ROI is expected to be >1 as scientific data providers are activated to deploy datasets onto Ocean Markets following the Web 3.0 Sustainability Loop.
Grant Deliverables
This grant requests funds to support:
- Cloud services for storage and hosting
- Development bounties for front-end design and the back-end IDX database
- User onboarding & testing
The web application will be live at https://www.openbay.science
Code will be made publicly accessible on GitHub.
Roadmap
Below is a tentative roadmap for the project. Dates should serve as guidance, not as hard deadlines. There are multiple moving parts, and the project will adapt to evolving circumstances. We anticipate building a simple front-end within the first month of funding; integrations will be added as community events dovetail into the development timeline.
April 1-21: Neuroimaging community discussion on BIDS-DID standard
April 14: IDX schema specification and documentation
April 30: Live alpha demo with basic front-end
May 1: Publish documentation + white paper for application
May 14: BIDS-DID Extension Proposal (BEP)
June 25: Google Summer of Code Hack-a-Thon: Ceramic API back-end
July 31: Integrate BIDS-DID spec with identity index schema
August: UX Integration & Testing
September 1: Closed beta-testing with neuroimaging and behavior labs
Team Members
Shady El Damaty, M.Sc., Ph.D.
- Role: Cognitive Neuroscientist, Founder of Opscientia
- GitHub: github.com/seldamat
- Info on projects: seldamat.github.io
- LinkedIn: linkedin.com/in/seldamat/
Achintya Kumar
- Role: Front-end developer
- GitHub: https://github.com/Ackintya
Anibal Solon
- Role: Developer
- GitHub: https://github.com/anibalsolon
Opscientia is a company providing onboarding services and software infrastructure for launching decentralized autonomous organizations on distributed cloud public networks and smart contract blockchains.
The team lead is an active member of and contributor to the INCF, OHBM, and Brainhack communities.