Part 1 - Proposal Submission
Name of Project: Datatera Inspector Functions
Proposal in one sentence: Datatera is a global marketplace that connects Data Providers and Data Consumers by making larger samples of high-quality medical datasets available.
Description of the project and what problem is it solving: HealthTech AI companies struggle to get access to high-quality medical datasets while building AI models. This results in bias and other errors that take a lot of time and money to manage and maintain. Datatera will provide a global data computing marketplace where Data Scientists can train their AI models on high-quality, diverse training datasets while preserving privacy.
Grant Deliverables: (Target deliverables for the funding provided.)
- Sensitive Data Inspector Function powered by AI
- Outputs which columns may contain sensitive data, each with a ratio
- Qualitative Data Inspector Function powered by AI
- Outputs how high the quality of a dataset is, as a ratio based on the KPIs used
Which Project Category best describes your project? Build/improve applications or integrations to Ocean
Are you applying for an Earmark? Yes, new project
What is the final product? A data computing marketplace service powered by blockchain, with an AI-powered inspection module on top to provide quality training datasets and trustworthy algorithms.
Question on “value add” criteria: which one or more of the criteria will your project focus on? Why do you believe your team will do well on those criteria?
Usage of Ocean and Viability - We believe we can improve and extend the Compute-to-Data (C2D) concept with our add-on module, which provides full awareness of data sensitivity and data quality. We are a pure tech team specialized mainly in system development and data science, so we have the expertise and knowledge needed to make this happen.
Funding Requested: $3K
Proposal Wallet Address: 0xEB023A03cfebd0a58214CA018c3f25F0c8b96000
Have you previously received an OceanDAO Grant? No
Team Website: http://www.datatera.se
Twitter Handle: https://twitter.com/DatateraTech
Discord Handle: N/A
Project lead full name: Tugce Ozdeger
Project lead email: tugce@datatera.se
Country of Residence: Sweden
Part 2 - Team
2.1 Core Team
Tugce Ozdeger
Role: Developer, CTO, Lead Developer, Architect
Relevant Credentials:
GitHub: TugceOzdeger (Tugce)
LinkedIn: https://www.linkedin.com/in/tugceozdeger
Other:
Background/Experience:
Founder at Datatera
10+ years of professional experience as a senior system developer
Pranav Kumar
Role: Developer, Architect
Relevant Credentials:
GitHub: pranavstark79
LinkedIn: https://www.linkedin.com/in/pranavstark/
Other:
Background/Experience:
Co-Founder at Datatera
6+ years of experience as a software developer & software consultant
Tugrul Bayrak
Role: CPO
Relevant Credentials:
GitHub: tbayrak (Ahmet Tuğrul Bayrak)
LinkedIn: https://www.linkedin.com/in/ahmet-tugrul-bayrak/
Other:
Background/Experience:
Co-Founder at Datatera
10+ years of experience as a software developer and data scientist
2.2 Advisors
Christina Jenkins
Role: Advisor
Relevant Credentials:
GitHub: cejjenkins (CJ)
LinkedIn: https://www.linkedin.com/in/christina-jenkins/
Other:
Background/Experience:
Advisor at Datatera
14+ years of experience in data, covering machine learning, MLOps, statistics, data analytics and visualization, and leadership.
Part 3 - Proposal Details
3.1 Details
Details of the proposal:
We would like to add a feature that inspects the dataset and detects sensitive data by leveraging an AI Rule Engine. Columns in the CSV file that are detected as sensitive will be ignored when we run the Compute Job: when we configure the dataset path for the given algorithm, we read the results of the Sensitive Data Inspector Module in JSON and skip the flagged columns. In this way, we will provide complete sensitive data security and also support the “training data” concept. We will also assess the quality of the data by scanning through the data points to make sure that the main dimensions of data quality are met, based on the relevant KPIs used in the AI Model.
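A minimal sketch of the flow described above, assuming a small regex-based rule set stands in for the AI Rule Engine. The pattern names, the `inspect_sensitive_columns` and `drop_sensitive_columns` helpers, the JSON layout (column name mapped to a ratio), and the 0.5 threshold are all hypothetical and only illustrate the idea; they are not the actual Datatera design.

```python
import csv
import io
import json
import re

# Hypothetical, deliberately naive rules. A real rule engine would use far
# richer checks (these patterns would also flag some dates and long numbers).
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "phone": re.compile(r"\+?\d[\d\s-]{6,}\d"),
}


def inspect_sensitive_columns(csv_text):
    """Return {column: ratio of non-empty values that match any rule}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    rows = list(reader)
    ratios = {}
    for column in columns:
        # Missing cells come back as None from DictReader, so normalise first.
        values = [(row.get(column) or "").strip() for row in rows]
        values = [value for value in values if value]
        hits = sum(
            1
            for value in values
            if any(p.fullmatch(value) for p in SENSITIVE_PATTERNS.values())
        )
        ratios[column] = hits / len(values) if values else 0.0
    return ratios


def drop_sensitive_columns(csv_text, inspector_json, threshold=0.5):
    """Return the CSV without columns whose ratio is at or above threshold."""
    ratios = json.loads(inspector_json)
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [c for c in (reader.fieldnames or []) if ratios.get(c, 0.0) < threshold]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "email,age,diagnosis\nanna@example.com,34,flu\nbob@example.org,51,asthma\n"
    report = json.dumps(inspect_sensitive_columns(sample))
    print(report)
    print(drop_sensitive_columns(sample, report))
```

In this sketch the inspector result is serialized to JSON first and only then used to filter the CSV, mirroring the proposal's idea that the Compute Job reads the inspector output rather than re-running the detection itself.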
3.2 If in Category “Build/improve applications or integration to Ocean”: App will be live at: SwaggerHub
Is the software open-source? We have commercial intentions for this software.
Project software can be found at https://github.com/DatateraTechnology
3.3 If the project includes software:
Are there any mockups or designs to date? If yes, please share details/links.
Datatera Inspection Module.jpg
Tech Stack:
- Inspector Module Functions in Python
- Inspector decision making intelligence by AI Rule Engine
- PyCharm will be used as IDE
- Inspector Result will be generated in JSON
- Functions will be published on SwaggerHub
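To illustrate how a Python inspector function could report a quality ratio in JSON, here is a minimal sketch. It assumes two illustrative KPIs, completeness (share of non-empty cells) and uniqueness (share of distinct rows), plus an unweighted average as the overall score. The KPI choice, the `inspect_quality` name, and the JSON keys are assumptions for illustration; the proposal does not specify them.

```python
import csv
import io
import json


def inspect_quality(csv_text):
    """Score a CSV on two illustrative KPIs, each as a ratio in [0, 1]."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    # Represent each row as a tuple of stripped cell values (missing -> "").
    rows = [tuple((row.get(c) or "").strip() for c in columns) for row in reader]
    total_cells = len(rows) * len(columns)
    if total_cells == 0:
        return {"completeness": 0.0, "uniqueness": 0.0, "overall": 0.0}
    filled = sum(1 for row in rows for cell in row if cell)
    completeness = filled / total_cells      # share of non-empty cells
    uniqueness = len(set(rows)) / len(rows)  # share of distinct rows
    overall = (completeness + uniqueness) / 2
    return {
        "completeness": completeness,
        "uniqueness": uniqueness,
        "overall": overall,
    }


def to_inspector_json(result):
    """Serialize an inspector result in a stable, human-readable form."""
    return json.dumps(result, indent=2, sort_keys=True)


if __name__ == "__main__":
    sample = "id,age\n1,34\n2,\n1,34\n2,51\n"
    print(to_inspector_json(inspect_quality(sample)))
```

Keeping each KPI as a separate key next to the overall score would let a consumer see why a dataset scored low, and further KPIs could be added without changing the shape of the result.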
3.4 Project Deliverables - Roadmap
Any prior work completed thus far? Details?
The system architecture, result data structure, and the tech stack details have been decided.
What is the project roadmap? That is: what are key milestones, and the target date for each milestone. Please make sure that one milestone is about publishing your results, e.g. as a Medium blog post.
- Sensitive Data Inspector Function powered by AI development completed & System test started - Apr 15, 2022
- Qualitative Data Inspector Function powered by AI development completed & System test started - Apr 29, 2022
- System test by developers completed & functions published on SwaggerHub - May 13, 2022
- Test cases and sample datasets will be provided for Acceptance Test - May 20, 2022
- Acceptance test on Swagger - May 27, 2022
- Beta release announced on social media - June 30, 2022
- Beta testers will be informed - July 1, 2022
What are the team’s future plans and intentions? Is there maintenance? Possible extensions to the work?
Yes, we will maintain and further develop the module and fix bugs, since it will be part of our Datatera solution.
- Datasets will be supported in CSV format only at first; we can add support for more formats later, e.g. XML and XLS.
- We will probably add more KPIs and metrics to better detect sensitive data and assess data quality.
3.5 Additional Information
We are currently fundraising for our pre-seed round, and we are also part of a Swedish VC called Antler.