Datatera | Datatera Metadata Functions Powered by AI | Round 23

Project Name

Datatera


Project Category

Build & Integrate


Proposal Earmark

General


Proposal Description

We would like to inspect medical image data (e.g. MR, roentgen (X-ray), and dermatological images) and detect sensitive data in it by leveraging AI, specifically a Convolutional Neural Network (CNN), together with Python libraries such as OpenCV and PyTesseract.

Before we use the provided image data, we want to check whether it is in an appropriate format and of sufficient quality. If the data needs to be converted to an appropriate format, or if it is of low quality, image processing methods will be applied.
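
A format and quality check of this kind could be sketched as follows. This is only an illustration under stated assumptions, not the inspector's actual implementation: the function names and the contrast threshold are hypothetical, and only the file signatures (PNG, JPEG, and the "DICM" marker after the 128-byte DICOM preamble) are standard.

```python
from typing import List, Optional

# Well-known file signatures ("magic bytes").
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"
DICOM_MARKER = b"DICM"      # follows a 128-byte preamble in DICOM Part 10 files
DICOM_PREAMBLE_LEN = 128


def sniff_image_format(header: bytes) -> Optional[str]:
    """Return 'png', 'jpeg', 'dicom', or None based on the leading bytes."""
    if header.startswith(PNG_SIGNATURE):
        return "png"
    if header.startswith(JPEG_SIGNATURE):
        return "jpeg"
    if header[DICOM_PREAMBLE_LEN:DICOM_PREAMBLE_LEN + 4] == DICOM_MARKER:
        return "dicom"
    return None


def is_low_contrast(gray_pixels: List[int], min_range: int = 50) -> bool:
    """Flag an 8-bit grayscale image whose intensity range is very narrow.

    min_range is a hypothetical threshold, chosen only for illustration.
    """
    return (max(gray_pixels) - min(gray_pixels)) < min_range


# Usage: pass the first bytes of a file and a flat list of pixel intensities.
print(sniff_image_format(b"\x00" * 128 + b"DICM"))  # dicom
print(is_low_contrast([100, 110, 120]))             # True
```

A file that fails either check would be a candidate for conversion or enhancement before it reaches the model.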

Image processing means converting an image into digital form and performing operations on it to extract useful information. Images can be RGB (color) or grayscale. In this way, we aim to protect sensitive image content and to support the “training medical image” concept.
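
To illustrate the relationship between RGB and grayscale images mentioned above, the sketch below converts RGB pixels to 8-bit grayscale with the common luma weights 0.299, 0.587, and 0.114 (the same weights OpenCV documents for its RGB-to-gray conversion). The function name and the list-of-tuples image layout are assumptions made for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def rgb_to_gray(pixels: List[Pixel]) -> List[int]:
    """Convert RGB pixels to 8-bit grayscale using the standard luma weights."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]


# Usage: pure red, green, and blue map to different gray levels.
print(rgb_to_gray([(255, 0, 0), (0, 255, 0), (0, 0, 255)]))  # [76, 150, 29]
```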

A CNN extracts features from an image through its layers. CNNs are widely used in image classification, where each input image is passed through a series of layers to produce a probability between 0 and 1.

A CNN is a deep learning network for computer vision that can recognize and classify image features. After processing (if needed), images will be ready to be used in computer vision models.

The CNN model works as follows:
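
As a rough illustration only (this is not Datatera's actual model), the typical stages of a small CNN can be sketched in plain Python: a convolution, a ReLU activation, max pooling, and a sigmoid output that yields a value between 0 and 1. The kernel, weights, bias, and toy image below are made-up values; a real model would learn its parameters from training data.

```python
import math
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap: Matrix) -> Matrix:
    """Non-overlapping 2x2 max pooling; odd trailing rows/columns are dropped."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def forward(image: Matrix, kernel: Matrix,
            weights: List[float], bias: float) -> float:
    """Convolution -> ReLU -> pooling -> dense layer -> sigmoid probability."""
    pooled = max_pool2x2(relu(conv2d(image, kernel)))
    flat = [v for row in pooled for v in row]
    return sigmoid(sum(w * v for w, v in zip(weights, flat)) + bias)


# Toy 5x5 image with a vertical edge, and a 2x2 vertical-edge kernel.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
kernel = [[-1.0, 1.0], [-1.0, 1.0]]
weights = [0.5, 0.5, 0.5, 0.5]   # hypothetical dense-layer weights
probability = forward(image, kernel, weights, bias=-1.0)
print(f"{probability:.3f}")      # prints 0.731
```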

Pytesseract (Python-tesseract) is a Python wrapper for the Tesseract OCR engine. It reads and recognizes text in images and is commonly used for Optical Character Recognition (OCR) image-to-text use cases.

OpenCV is an open-source computer vision library with Python bindings that supports image processing and computer vision tasks.


Grant Deliverables

Grant Deliverable: Sensitive Medical Image Inspector powered by AI (Convolutional Neural Network (CNN), PyTesseract, OpenCV).

Output: a result with a ratio indicating the text data and/or patterns in the image that possibly contain sensitive data.
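
How such a ratio and its JSON result might be produced can be sketched as below. This is a minimal example under assumptions: it takes tokens that an OCR step has already extracted, the regular expressions are hypothetical placeholders rather than a vetted list of sensitive patterns, and the JSON field names are invented for illustration.

```python
import json
import re
from typing import Dict, List

# Hypothetical patterns for illustration; a real inspector needs a vetted list.
SENSITIVE_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}|\d{2}[./]\d{2}[./]\d{4}"),
    "long_number": re.compile(r"\d{6,}"),
    "keyword": re.compile(r"(patient|name|dob|mrn|id):?", re.IGNORECASE),
}


def inspect_tokens(tokens: List[str]) -> str:
    """Return a JSON report with the share of tokens that look sensitive."""
    by_type: Dict[str, int] = {}
    for token in tokens:
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.fullmatch(token):
                by_type[label] = by_type.get(label, 0) + 1
                break  # count each token at most once
    sensitive = sum(by_type.values())
    ratio = sensitive / len(tokens) if tokens else 0.0
    return json.dumps({
        "total_tokens": len(tokens),
        "sensitive_tokens": sensitive,
        "sensitive_ratio": round(ratio, 3),
        "by_type": by_type,
    })


# Usage: tokens as they might come out of an OCR pass over an image overlay.
tokens = ["Patient:", "DOE", "1980-02-14", "T2", "AXIAL", "12345678", "L", "R"]
print(inspect_tokens(tokens))
```

The report deliberately contains only counts per pattern type, not the matched tokens, so the result itself does not repeat the sensitive values.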

Project Roadmap:

Grant Deliverables: Sensitive Medical Image Inspector powered by AI (Convolutional Neural Network (CNN), PyTesseract, OpenCV) - Development completed & System test started - Oct 28, 2022

System test by developer completed & endpoint will be published on SwaggerHub - Nov 10, 2022

Test cases and sample images will be provided for the acceptance test - Nov 17, 2022

Acceptance test on Swagger - Nov 24, 2022

Announcement on social media that the beta version is released - Nov 28, 2022

Beta testers will be informed - Dec 1, 2022

Tech Stack:

Medical Image Inspector in Python (PyTesseract, OpenCV)

GitHub will be used for Code & Version Control

Inspector decision-making intelligence by Convolutional Neural Network (CNN)

Visual Studio Code will be used as IDE

Inspector results will be generated in JSON

The endpoint will be published on SwaggerHub

We will maintain and further develop this module and fix bugs/errors, since it will be part of our Datatera solution.

Initially, only MRI, DICOM, and dermatological images will be supported; we can add support for more formats later, e.g. roentgen (X-ray) and other medical images.

We will add extensions to this work where possible so that the metadata feature can provide more relevant AI insights.


Project Description

Datatera is a global marketplace that connects HealthData Providers with HealthTech companies by making larger samples of high-quality real-world datasets available.


Final Product

HealthTech AI companies struggle to get access to high-quality healthcare datasets when building AI models. This results in bias and other errors, and the models take a lot of time and money to maintain and manage. Datatera will provide a global data computing marketplace where Data Scientists can train their AI models on high-quality, diverse training datasets while preserving privacy.


Value Add Criteria

We intend to enrich the metadata feature with valuable AI insights to help Data Consumers choose the right dataset for their needs. Image processing, especially in medicine, is in high demand and widely used.


Medical image processing aims to build solutions that use computerized technologies to tackle medical diagnosis challenges; at Datatera, we use such technologies to protect sensitive data.


We believe we can improve and extend Ocean Protocol's Compute-to-Data (C2D) concept with a richer metadata feature that provides full awareness of the sensitivity of the image data available on our platform.


It is equally important to ensure that all datasets available on our platform have already been inspected and offer real value to Data Consumers when they choose to train their AI models on them.



Core Team

Tugce Ozdeger

Role: CEO, Acting CTO, Lead Developer, Architect

Relevant Credentials:

GitHub: https://github.com/TugceOzdeger

LinkedIn: https://www.linkedin.com/in/tugceozdeger

Other:

Background/Experience:

Founder at Datatera

10+ years of professional experience as a senior system developer

Marcin Gornicki

Role: CPO, CFO

Relevant Credentials:

GitHub: https://github.com/marcingornicki

LinkedIn: https://www.linkedin.com/in/marcingornicki

Other:

Background/Experience:

Co-Founder and CPO at Datatera

Ayantha Weerathunga

Role: Software Developer, Architect

Relevant Credentials:

GitHub: https://github.com/ayantha80

LinkedIn: https://www.linkedin.com/in/ayantha-weerathunga

Other:

Background/Experience:

CTO at Datatera

Zeki Gultekin

Role: Senior Data Analyst

Relevant Credentials:

GitHub: https://github.com/Gltknzk

LinkedIn: https://www.linkedin.com/in/zeki-gultekin

Other:

Background/Experience:

Head of Data Science at Datatera

Senior Data Analyst

Stanley Udeh

Role: Data Scientist

Relevant Credentials:

GitHub: https://github.com/Standinho-Strapp

LinkedIn: https://www.linkedin.com/in/stanleyudeh

Other:

Background/Experience:

Healthcare Data Strategist at Datatera


Advisors

Ruslan Gasimli

Role: Advisor

Relevant Credentials:

GitHub: https://github.com/RG-911

LinkedIn: https://www.linkedin.com/in/ruslangasimli

Other:

Background/Experience:

Data Advisor at Datatera

Senior BI Data Scientist

Projat Banerjee

Role: Advisor

Relevant Credentials:

GitHub: https://github.com/Pro-Novice

LinkedIn: https://www.linkedin.com/in/projat-banerjee-468b6b125/

Other:

Background/Experience:

Blockchain Specialist at Datatera

Blockchain Researcher

Magnus Fredriksson

Role: Advisor

Relevant Credentials:

GitHub: https://github.com/Magfred

LinkedIn: https://www.linkedin.com/in/magnus-f-447a612

Other:

Background/Experience:

Medical Data Specialist at Datatera


Funding Requested
3000


Minimum Funding Requested
1


Wallet Address
0xEB023A03cfebd0a58214CA018c3f25F0c8b96000