Open Science Bay for Self-Sovereign Data Flows from Lab to Market

Website

https://opscientia.com

Proposal Wallet Address

0x33359285f30e7b3386de70ca500f4fe27853765b
(opscientia.eth)

One-Sentence Summary

Grant funds are requested to support development of an open-source application for GDPR-compliant self-sovereign scientific data management and peer-to-peer sharing.

Categories Describing the Project

[ ✓ ] - Build / improve applications or integrations to Ocean

Project Overview

An estimated thousands of petabytes of data on human health, economic activity, social dynamics, and scientific observations of the universe and our impact on it are siloed in legacy institutional web infrastructure.

A key challenge for unlocking scientific data is managing controlled access to sensitive data, such as:

  1. Personal data provided by a participant in a research study
  2. Sensitive data that can be used adversarially or for harm
  3. Proprietary data collected at great expense by an investigator

The emergence of peer-to-peer data storage and standards for decentralized identifiers (DIDs) makes it possible to permanently establish a public records archive in a common web infrastructure accessible to all, regardless of professional/academic status, nationality, language, or age, while respecting the intrinsic self-sovereignty of data providers.

This grant will support the development of a web application for unified permissions management and peer-to-peer sharing of scientific data, directly lowering the barrier to deployment on Ocean markets. The application will be powered by the InterPlanetary File System (IPFS) for content-addressable peer-to-peer data storage, the IDX API for managing DID indices, and the Ceramic protocol for manipulating encrypted records on IPFS.
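To illustrate what "content-addressable" means here, the following is a minimal sketch of a toy in-memory store in which the address of a piece of data is derived from the data itself. It is not IPFS and does not use any IPFS library: real IPFS splits files into blocks and identifies them with CIDs rather than a bare SHA-256 hex digest. The `ContentStore` class and the sample bytes are hypothetical names chosen for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is a hash of the stored bytes."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address depends only on the content, so identical data
        # always maps to the same address, whoever stores it.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hashing on retrieval lets the reader verify integrity
        # without trusting whoever served the data.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
scan = b"placeholder bytes standing in for an encrypted scan"
address = store.put(scan)
```

Because the address is a function of the content, any peer holding the same bytes can serve them, and any change to the bytes produces a different address.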

Problem:

The current status of scientific data sharing is largely determined by concerns regarding privacy of research participants and the motivation to protect intellectual property.

No universal permissions management system currently exists that ties scientific data to a persistent decentralized identifier. Scientific access control is instead achieved with centralized storage systems, which raise the risk of a single point of failure and significantly limit the free flow of information.

Solution:

DID schemas integrated with standard specifications for scientific data interoperability allow for persistent permissions that grant self-sovereignty to the data provider:

  • Participants sharing sensitive personal data can choose the terms of their participation in a research study.
  • Researchers can choose when to make their findings publicly available.
  • Curators can be empowered to assemble custom collections to address challenging scientific problems.

This grant will support the integration of DID schemas with standardized scientific data specifications and peer-to-peer decentralized storage to unlock scientific data while respecting data provider self-sovereignty.

Neuroimaging data management will be the first use-case pilot for integrating DID schemas with the Brain Imaging Data Structure (BIDS). The web application accepts new or existing BIDS datasets and assigns DIDs to consenting participants, investigators, or research curators uploading data. Users can see a dashboard of their data and associated permissions, studies they are currently participating in, and open calls for research participation. Consenting research participants retain superseding privileges on their data permissions and can opt in or out of a study at any time. Researchers and curators can only see datasets from consenting participants and can deploy encrypted datasets to IPFS with preset permissions for data sharing. Any researcher can request access to datasets published by an investigator, using the BIDS specification to build sophisticated queries.
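The access rules described above can be sketched in a few lines of Python. This is an illustrative model only, not the application's design: the `Study` class, the `did:example:...` identifiers, and the assumption that uploading implies consent at upload time are all hypothetical. The filename parser relies only on the BIDS convention of `key-value` entities joined by underscores, followed by a suffix.

```python
from dataclasses import dataclass, field


def parse_bids_entities(filename: str) -> dict:
    """Split e.g. 'sub-01_task-rest_bold.nii.gz' into its BIDS entities."""
    stem = filename.split(".", 1)[0]  # drop the extension(s)
    entities = {}
    for part in stem.split("_"):
        if "-" in part:
            key, value = part.split("-", 1)
            entities[key] = value
        else:
            entities["suffix"] = part  # e.g. 'bold' or 'T1w'
    return entities


@dataclass
class Study:
    files: dict = field(default_factory=dict)    # participant DID -> filenames
    consent: dict = field(default_factory=dict)  # participant DID -> bool

    def upload(self, did: str, filename: str) -> None:
        self.files.setdefault(did, []).append(filename)
        # Assumption: consent is granted at first upload; a later
        # opt-out is not overwritten by further uploads.
        self.consent.setdefault(did, True)

    def set_consent(self, requester: str, participant: str, value: bool) -> None:
        # Only the participant can change their own consent.
        if requester != participant:
            raise PermissionError("only the participant can change consent")
        self.consent[participant] = value

    def query(self, **filters: str) -> list:
        """Return files from consenting participants matching all entities."""
        matches = []
        for did, names in self.files.items():
            if not self.consent.get(did, False):
                continue  # opted-out data is invisible to researchers
            for name in names:
                entities = parse_bids_entities(name)
                if all(entities.get(k) == v for k, v in filters.items()):
                    matches.append(name)
        return matches


study = Study()
study.upload("did:example:alice", "sub-01_task-rest_bold.nii.gz")
study.upload("did:example:bob", "sub-02_task-rest_bold.nii.gz")
study.upload("did:example:bob", "sub-02_T1w.nii.gz")
study.set_consent("did:example:bob", "did:example:bob", False)  # bob opts out
rest_scans = study.query(task="rest")
```

In this sketch, consent is checked at query time rather than at upload time, so an opt-out takes effect immediately for every later request.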

A flow chart of the process for deploying a dataset with permissions management on our platform appears below:

What is the Expected ROI?

The proposed application will provide a portal for researchers to onboard participants onto the self-sovereign web, where participants maintain control over their data and how it’s used. The end goal of this project is to develop a workflow from data collection through analysis to publication on the Ocean Markets. The ROI is expected to be >1 as scientific data providers are activated to deploy datasets onto Ocean Markets, following the Web 3.0 Sustainability Loop.

Grant Deliverables

This grant requests funds to support:

  • Cloud services for storage and hosting
  • Development bounties for front-end design and the back-end IDX database
  • User onboarding & testing

The web application will be live at https://www.openbay.science

Code will be made publicly accessible on GitHub.

Roadmap

Below is a tentative roadmap for the project; dates should serve as guidance, not as hard deadlines. There are multiple moving parts, and the project will stay flexible to adapt to evolving circumstances. We anticipate building a simple front-end within the first month of funding; integrations will be added as community events dovetail into the development timeline.

April 1-21: Neuroimaging community discussion on BIDS-DID standard

April 14: IDX schema specification and documentation

April 30: Live alpha demo with basic front-end

May 1: Publish documentation + white paper for application

May 14: BIDS-DID Extension Proposal (BEP)

June 25: Google Summer of Code Hack-a-Thon: Ceramic API back-end

July 31: Integrate BIDS-DID spec with identity index schema

August: UX Integration & Testing

Sept 1: Closed beta-testing with neuroimaging and behavior labs

Team Members

Shady El Damaty, M.Sc., Ph.D.

Achintya Kumar

Anibal Solon

Opscientia is a company providing onboarding services and software infrastructure for launching decentralized autonomous organizations on distributed cloud public networks and smart contract blockchains.

The team lead is an active member of and contributor to the INCF, OHBM, and Brainhack communities.

@AlexN

Deliverable Checklist Update
OCEAN DAO ROUND 4

[x] Cloud Services for Storage and Hosting

  • We’ve secured a partnership with Textile to host up to 250TB of data on Filecoin free of charge for up to 1.5 years. Project description here.

[x] Development Bounties for Front-End Design, Back-End IDX database

[x] User onboarding & testing

  • We gathered user-research feedback from 5 scientists at the OHBM Brainhack. From this, we validated current pain points around academia and data, collected initial feedback on our Data Wallet proof-of-concept, and sourced questions and comments to flesh out our FAQs.

  • Following the OHBM hackathon, we carried out user research in the community and gathered 12 responses, which validated pain points, gauged levels of understanding, and gave feedback on our vision. This further helped us build our FAQs and derive high-level user requirements for the Opsciverse.