The MIMIC Code Repository: enabling reproducibility in critical care research

AEW Johnson, DJ Stone, LA Celi… - Journal of the American …, 2018 - academic.oup.com
Journal of the American Medical Informatics Association, 2018 - academic.oup.com
Objective
Lack of reproducibility in medical studies is a barrier to the generation of a robust knowledge base to support clinical decision-making. In this paper we outline the Medical Information Mart for Intensive Care (MIMIC) Code Repository, a centralized code base for generating reproducible studies on an openly available critical care dataset.
Materials and Methods
Code is provided to load the data into a relational structure, create extractions of the data, and reproduce entire analysis plans including research studies.
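The load-then-extract workflow can be pictured with a minimal sketch. This is not the repository's code: the table names, column names, and sample rows below are hypothetical stand-ins, and an in-memory SQLite database is assumed purely so the example runs anywhere.

```python
import csv
import io
import sqlite3

# Toy stand-ins for raw data files. Column names and values are hypothetical.
PATIENTS_CSV = """patient_id,birth_year
1,1950
2,1980
"""
STAYS_CSV = """stay_id,patient_id,admit_year,los_days
10,1,2010,3.5
11,1,2012,1.0
12,2,2011,7.25
"""


def load_table(conn, name, csv_text, columns_sql):
    """Create a table and bulk-insert every data row of a CSV string."""
    conn.execute(f"CREATE TABLE {name} ({columns_sql})")
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {name} VALUES ({placeholders})", data)


def build_database():
    """Step 1: load the raw files into a relational structure."""
    conn = sqlite3.connect(":memory:")
    load_table(conn, "patients", PATIENTS_CSV,
               "patient_id INTEGER, birth_year INTEGER")
    load_table(conn, "stays", STAYS_CSV,
               "stay_id INTEGER, patient_id INTEGER, "
               "admit_year INTEGER, los_days REAL")
    return conn


def extract_first_stay(conn):
    """Step 2: an extraction giving age and length of each patient's first stay."""
    query = """
        SELECT p.patient_id,
               s.admit_year - p.birth_year AS age,
               s.los_days
        FROM patients p
        JOIN stays s ON s.patient_id = p.patient_id
        WHERE s.admit_year = (SELECT MIN(admit_year)
                              FROM stays
                              WHERE patient_id = p.patient_id)
        ORDER BY p.patient_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in extract_first_stay(build_database()):
        print(row)
```

Keeping the extraction as a single named query means every analysis that needs "first stay" draws on the same definition rather than re-deriving it.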
Results
Concepts extracted include severity of illness scores, comorbid status, administrative definitions of sepsis, physiologic criteria for sepsis, organ failure scores, treatment administration, and more. Executable documents serve as tutorials and reproduce published studies end-to-end, providing a template that future researchers can follow. The repository’s issue tracker enables community discussion about the data and concepts, allowing users to collaboratively improve the resource.
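As an illustration of what a shared "concept" might look like, the sketch below counts systemic inflammatory response syndrome (SIRS) criteria, a familiar physiologic screen historically associated with sepsis. The abstract does not say which criteria the repository implements, so SIRS is an assumption made here for illustration; the function name and parameters are hypothetical, the thresholds follow the widely cited consensus definition, and the immature band-form criterion is omitted for brevity.

```python
def sirs_score(temp_c, heart_rate, resp_rate, wbc_k_per_ul, paco2_mmhg=None):
    """Count how many of the four SIRS criteria are met (0 to 4).

    Hypothetical helper for illustration only, not the repository's code.
    """
    score = 0
    # Temperature above 38 C or below 36 C
    if temp_c > 38.0 or temp_c < 36.0:
        score += 1
    # Heart rate above 90 beats per minute
    if heart_rate > 90:
        score += 1
    # Respiratory rate above 20 breaths per minute, or PaCO2 below 32 mmHg
    if resp_rate > 20 or (paco2_mmhg is not None and paco2_mmhg < 32):
        score += 1
    # White cell count above 12 or below 4 (thousands per microlitre)
    if wbc_k_per_ul > 12.0 or wbc_k_per_ul < 4.0:
        score += 1
    return score


if __name__ == "__main__":
    # Fever and tachycardia only: two criteria met.
    print(sirs_score(temp_c=38.5, heart_rate=110, resp_rate=18, wbc_k_per_ul=9.0))
```

When a definition like this lives in one place, two studies that both report a score of at least 2 are describing the same cohort rule, which is the comparability the abstract emphasizes.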
Discussion
The centralized repository provides a platform for users of the data to interact directly with the data generators, facilitating greater understanding of the data. It also provides a location for the community to collaborate on necessary concepts for research progress and share them with a larger audience. Consistent application of the same code for underlying concepts is a key step in ensuring that research studies on the MIMIC database are comparable and reproducible.
Conclusion
By providing open source code alongside the freely accessible MIMIC-III database, we enable end-to-end reproducible analysis of electronic health records.
Oxford University Press