Datatera Inspector Functions | Round 15

Part 1 - Proposal Submission

Name of Project: Datatera Inspector Functions

Proposal in one sentence: Datatera is a global marketplace that connects Data Providers and Data Consumers by making larger samples of high-quality medical datasets available.

Description of the project and what problem is it solving: HealthTech AI companies struggle to get access to high-quality medical datasets while building AI models. This results in bias and other errors that take a lot of time and money to manage. Datatera will provide a global data computing marketplace where Data Scientists can train their AI models on high-quality, diverse training datasets while preserving privacy.

Grant Deliverables: (Target deliverables for the funding provided.)

  • Sensitive Data Inspector Function powered by AI
    • Output a result listing which columns may contain sensitive data, each with a ratio
  • Qualitative Data Inspector Function powered by AI
    • Output a result showing how high-quality a dataset is, as a ratio based on the KPIs used

Which Project Category best describes your project? Build/improve applications or integrations to Ocean

Are you applying for an Earmark? Yes, new project

What is the final product? A data computing marketplace service powered by blockchain, with an AI-powered inspection module on top, to provide quality training datasets and trustworthy algorithms.

Question on “value add” criteria: which one or more of the criteria will your project focus on? Why do you believe your team will do well on those criteria?

Usage of Ocean and Viability - We believe our add-on module can improve and extend the Compute-to-Data (C2D) concept by providing full awareness of data sensitivity and data quality. We are a pure tech team specialized mainly in system development and data science, so we have the necessary expertise and knowledge to make this happen.

Funding Requested: $3K

Proposal Wallet Address: 0xEB023A03cfebd0a58214CA018c3f25F0c8b96000

Have you previously received an OceanDAO Grant? No

Team Website: http://www.datatera.se

Twitter Handle: https://twitter.com/DatateraTech

Discord Handle: N/A

Project lead full name: Tugce Ozdeger

Project lead email: tugce@datatera.se

Country of Residence: Sweden

Part 2 - Team

2.1 Core Team

Tugce Ozdeger

Role: Developer, CTO, Lead Developer, Architect
Relevant Credentials:
GitHub: TugceOzdeger (Tugce)
LinkedIn: https://www.linkedin.com/in/tugceozdeger
Other:
Background/Experience:
Founder at Datatera
10+ years of professional experience as a senior system developer

Pranav Kumar

Role: Developer, Architect
Relevant Credentials:
GitHub: pranavstark79
LinkedIn: https://www.linkedin.com/in/pranavstark/
Other:
Background/Experience:
Co-Founder at Datatera
6+ years of experience as a software developer & software consultant

Tugrul Bayrak

Role: CPO
Relevant Credentials:
GitHub: tbayrak (Ahmet Tuğrul Bayrak)
LinkedIn: https://www.linkedin.com/in/ahmet-tugrul-bayrak/
Other:
Background/Experience:
Co-Founder at Datatera
10+ years of experience as a software developer and data scientist

2.2 Advisors

Christina Jenkins
Role: Advisor
Relevant Credentials:
GitHub: cejjenkins (CJ)
LinkedIn: https://www.linkedin.com/in/christina-jenkins/
Other:
Background/Experience:
Advisor at Datatera
14+ years of experience in data, covering machine learning, MLOps, statistics, data analytics and visualization, and leadership.

Part 3 - Proposal Details

3.1 Details

Details of the proposal:

We would like to add a feature that inspects the dataset and detects sensitive data by leveraging an AI Rule Engine. The Sensitive Data Inspector Module writes its results as JSON. When we configure the dataset path for a given algorithm, the Compute Job reads those results and ignores the columns of the CSV file that were detected as sensitive. In this way, we will provide complete sensitive data security and also support the “training data” concept. We will also assess the quality of the data by scanning through the data points to make sure that the main dimensions of data quality are met, based on the relevant KPIs used in the AI model.
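
As a rough illustration of the flow described above, the sketch below flags sensitive columns in a CSV, writes the result as JSON, and drops the flagged columns before a compute job would read the data. It is not the Datatera implementation: the proposal does not specify the AI Rule Engine, so two simple regex rules stand in for it, and the names `RULES`, `inspect_sensitive`, `drop_sensitive`, the `threshold` parameter, and the sample data are all hypothetical.

```python
import csv
import io
import json
import re

# Hypothetical rules: plain regexes stand in for the AI Rule Engine,
# which the proposal does not describe in detail.
RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?\d[\d\s-]{6,}\d$"),
}


def inspect_sensitive(rows, threshold=0.5):
    """Return {column: {"ratio": float, "sensitive": bool}}.

    The ratio is the share of non-empty values in a column that match
    at least one rule. The threshold is an assumed cut-off.
    """
    if not rows:
        return {}
    report = {}
    for column in rows[0]:
        values = [row[column] for row in rows if row[column]]
        hits = sum(
            1 for v in values
            if any(pattern.match(v.strip()) for pattern in RULES.values())
        )
        ratio = hits / len(values) if values else 0.0
        report[column] = {"ratio": round(ratio, 2), "sensitive": ratio >= threshold}
    return report


def drop_sensitive(rows, report):
    """Keep only the columns that the report did not flag as sensitive."""
    keep = [column for column, info in report.items() if not info["sensitive"]]
    return [{column: row[column] for column in keep} for row in rows]


# Made-up sample data, for illustration only.
SAMPLE_CSV = """patient_email,age,diagnosis
anna@example.com,54,diabetes
bo@example.com,61,asthma
,47,diabetes
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = inspect_sensitive(rows)
report_json = json.dumps(report, indent=2)  # inspector result as JSON

# A compute job would read the JSON result and ignore the flagged columns.
filtered = drop_sensitive(rows, json.loads(report_json))
print(report_json)
print(filtered[0])
```

Passing the result through JSON mirrors the hand-off described in the proposal, where the inspector and the compute job communicate only through the inspector's output.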

3.2 If in Category “Build/improve applications or integration to Ocean”: App will be live at: SwaggerHub

Is the software open-source? We have commercial intentions for this software.

Project software can be found at https://github.com/DatateraTechnology

3.3 If the project includes software:

Are there any mockups or designs to date? If yes, please share details/links.

Datatera Inspection Module.jpg

Tech Stack:

  • Inspector Module Functions in Python
  • Inspector decision making intelligence by AI Rule Engine
  • PyCharm will be used as IDE
  • Inspector Result will be generated in JSON
  • Functions will be published on SwaggerHub

3.4 Project Deliverables - Roadmap

Any prior work completed thus far? Details?

The system architecture, result data structure, and the tech stack details have been decided.

What is the project roadmap? That is: what are key milestones, and the target date for each milestone. Please make sure that one milestone is about publishing your results, e.g. as a medium blog post.

  • Sensitive Data Inspector Function powered by AI development completed & System test started - Apr 15, 2022
  • Qualitative Data Inspector Function powered by AI development completed & System test started - Apr 29, 2022
  • System test by developer completed & functions published on SwaggerHub - May 13, 2022
  • Test cases and sample datasets will be provided for Acceptance Test - May 20, 2022
  • Acceptance test on Swagger - May 27, 2022
  • Beta release announced on social media - June 30, 2022
  • Beta testers will be informed - July 1, 2022

What are the team’s future plans and intentions? Is there maintenance? Possible extensions to the work?

Yes, we will maintain the module, develop it further, and fix bugs/errors, since it will be part of our Datatera solution.

  • Datasets will be supported in CSV format only at first; we can add more formats later, e.g. XML and XLS.
  • We will probably add more KPIs and metrics to better detect sensitive data and assess data quality.

3.5 Additional Information

We are fundraising at the moment for pre-seed and we are also part of a Swedish VC called Antler.

Hi @Tugce, I have registered your proposal; however, you do not have the 500 OCEAN required for the proposal to be accepted.

You can hold it in BSC or Poly, in addition to ETH Mainnet. You will need this in your 0x address before voting starts on Mar 3rd 23:59 GMT

Your proposal should be automatically accepted once you move 500 OCEAN to the wallet you provided.

Cheers!

I now have 500 OCEAN in the wallet I provided. Can you please confirm?

Hi @Tugce,

I am looking at this wallet:
0x22Ef3F9E2D9cF3f1c237d17E33EDD26cE54A05b2

Eth mainnet - https://etherscan.io/address/0x22Ef3F9E2D9cF3f1c237d17E33EDD26cE54A05b2
BSC - https://www.bscscan.com/address/0x22Ef3F9E2D9cF3f1c237d17E33EDD26cE54A05b2
POLY - PolygonScan, address 0x22Ef3F9E2D9cF3f1c237d17E33EDD26cE54A05b2

I can’t spot the 500 Ocean. Our system isn’t able to find it either.

Can you please let me know what I’m doing wrong? Thank you.

I don’t know what is wrong. I have 500 Ocean tokens on my ERC20 Ocean wallet. Can you try to contact me in private?

I have DM’d you. Please find me on Discord if you do not hear from me.

Idiom | Ocean#8791

Hi!
I provide you with more info about the wallet so please check your DM on Ocean Discord.

Hi @Tugce

Thank you for submitting your proposal for R-15!

I am a Project-Guiding Member and have assigned myself to help you. I also confirmed with @idiom-bytes that the wallet issue is no longer a blocker. Cheers

I have reviewed your proposal and would like to thank you for your participation inside of the Ocean Ecosystem!

Your project looks promising and I believe it’s aligned with our evaluation criteria of generating positive value towards the Ocean Ecosystem and the W3SL. Healthcare data has the most significant impact on the world due to AI/ML. I appreciate your effort to bring features that enhance the Ocean marketplace experience.

The timelines for deliverables are also noted, and I commend the specificity provided on timelines and tech stack, among others. I encourage you to stay active in the Ocean Discord channels during the development process. We are all very excited and looking forward to using the feature!

Based on the reasons above, I am in support of your project and proposal. I look forward to continuing to provide support and feedback to your project.

All the best!
-Trishul, PGWG Guide

Thank you for your message. Looking forward to connecting with you on Ocean Discord.

Here were all the wallets shared, and requested by @Tugce for the sake of transparency:

  1. Original - From Crypto.com - 0x22Ef3F9E2D9cF3f1c237d17E33EDD26cE54A05b2
  2. From Robin - 500 Ocean - 0xF40b005FFE2Db0197b8c301e1C966C2cb3B59A08
  3. Requested from Tugce @ end of R15 - 0xEB023A03cfebd0a58214CA018c3f25F0c8b96000 as funding wallet

I’m updating Airtable w/ #3 to save a tx in the funding workflow and sharing here for posterity.

Project submitted deliverables:

Grant Deliverables:

  • Sensitive Data Inspector Function powered by AI
    • Output a result listing which columns may contain sensitive data, each with a ratio
  • Qualitative Data Inspector Function powered by AI
    • Output a result showing how high-quality a dataset is, as a ratio based on the KPIs used

We have been working on the architecture of the solution and on how we can integrate it with the C2D flow on the backend. We built the sync between SwaggerHub, GitHub, and the Azure Web App where we’ll be hosting the API.

We are building the Sensitive Data Inspector Function in Python. It examines what portion of the dataset will be included in the computation, and the algorithm is granted access based on the Sensitive Data Analysis.

Check the beta branch of DatateraTechnology/Datatera on GitHub for more info and documentation.

Admin: Hi @Tugce,

Thank you for submitting an update for your previous proposal!

Your Grant Deliverables have been reviewed and look to be in good condition. I have also looked at your Project Standing, it looks to be in good condition and ready to apply for another grant.

I would like to thank you for your positive contributions to the Ocean Ecosystem and I look forward to reviewing future proposals from your project.

All the best!

-Your PGWG Guide