Database for storing and accessing diabetes-related data in a standardized way

Real-world data is of major importance in diabetes-related research, especially given the widespread application of machine learning algorithms. Several real-world datasets exist in the literature; however, each has its own format, measurement devices, and file structure. These differing characteristics make it cumbersome to use multiple datasets for research purposes. We developed a pipeline for efficiently storing and accessing diabetes-related data in a standardized way. We defined a standardized JSON format for the records and stored them in a MongoDB database. The pipeline can upload data from various sources once a corresponding transformer script is implemented; it can also be extended with algorithms that are called automatically for new entries in the database. The source code for constructing the pipeline is available at https://github.com/NeuroDiab/CINTI2023Diabdatabase.
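To make the idea of a transformer script and automatically called algorithms more concrete, the following is a minimal sketch and not the code from the repository. The record fields (`patient_id`, `source`, `type`, `timestamp`, `value`, `unit`), the input column names, the `RecordStore` class, and the 70 mg/dL threshold are all assumptions chosen for illustration; the actual standardized JSON format is defined in the project itself. An in-memory list stands in for the MongoDB collection so that the example runs without a database.

```python
import json
from datetime import datetime, timezone


def transform_cgm_row(row: dict, patient_id: str, source: str) -> dict:
    """Map one source-specific row to an ASSUMED standardized record.

    The input column names and the output fields are hypothetical.
    """
    # Assumption: the source stores timestamps in UTC in this text format.
    timestamp = datetime.strptime(row["Time"], "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return {
        "patient_id": patient_id,
        "source": source,
        "type": "cgm",
        "timestamp": timestamp.isoformat(),
        "value": float(row["Glucose (mg/dL)"]),
        "unit": "mg/dL",
    }


class RecordStore:
    """Hypothetical in-memory stand-in for a MongoDB collection."""

    def __init__(self):
        self.records = []
        self.hooks = []

    def register_hook(self, hook):
        """Register an algorithm to be called for every new record."""
        self.hooks.append(hook)

    def insert(self, record: dict):
        """Store the record, then run every registered hook on it."""
        self.records.append(record)
        for hook in self.hooks:
            hook(record)


store = RecordStore()
flagged = []


def flag_low_glucose(record: dict):
    # Illustrative threshold only; not taken from the source.
    if record["type"] == "cgm" and record["value"] < 70.0:
        flagged.append(record["timestamp"])


store.register_hook(flag_low_glucose)

# Two made-up rows in a hypothetical source format.
rows = [
    {"Time": "2023-05-01 08:00:00", "Glucose (mg/dL)": "112"},
    {"Time": "2023-05-01 08:05:00", "Glucose (mg/dL)": "64"},
]
for row in rows:
    store.insert(transform_cgm_row(row, patient_id="P001", source="example_dataset"))

print(json.dumps(store.records[0], indent=2))
```

With a real database, the body of `insert` could call PyMongo's `Collection.insert_one(record)` instead of appending to a list. Supporting a new dataset would then only require writing another transformer function that produces the same record layout.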