MoDRIA Data Dictionary - Medical Imaging & Ophthalmology Research

Dataset Overview

Dataset Description

The Multimodal Database of Retinal Images in Africa (MoDRIA) is a comprehensive dataset developed to drive advancements in machine learning and artificial intelligence for ophthalmological research. It addresses a critical gap in the availability of diverse and representative retinal imaging datasets from African populations, helping to reduce bias and improve the accuracy of AI-driven eye disease diagnosis. MoDRIA combines high-quality retinal images with detailed clinical data, reflecting real-world clinical environments across the African continent. This resource provides a robust foundation for training, validating, and benchmarking AI models, ultimately supporting the delivery of more equitable and effective ophthalmological care.

Study Population

The dataset comprises high-resolution retinal images collected from patients across multiple eye care centers in Africa, accompanied by comprehensive metadata that includes demographics, clinical history, and diagnostic details. Each image has been expertly labelled by certified ophthalmologists, ensuring both diagnostic accuracy and research reliability. By capturing data from diverse patient populations across different African regions, the dataset provides a representative foundation for developing inclusive and unbiased AI models in ophthalmology.

Data Components

Retinal Images: High-resolution fundus photography
Visual Acuity: Comprehensive vision assessment data
Clinical Assessments: Diabetic retinopathy and other eye condition evaluations
Clinical Metadata: Patient demographics and medical history
Laboratory Results: Blood glucose
Treatment Records: Medication history and therapeutic interventions

Quality Assurance

All images undergo rigorous quality control procedures including standardized imaging protocols, expert validation, and comprehensive metadata verification. The dataset adheres to international standards for medical imaging research and patient privacy protection. Quality metrics include image resolution standards, annotation consistency, and clinical data validation protocols.
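The per-image quality flags recorded in the data dictionary (Image_Include?, IMG_Poor_Focus?, IMG_Too_Dark_or_Bright?, IMG_Artifacts?) can be used to select gradable images before analysis. The following is a minimal sketch, not the project's own pipeline. It assumes each record has been read as a dict of strings keyed by variable name (for example with `csv.DictReader`), and it treats any Image_Include? value other than "1", including "nc", as excluded. The helper name `split_by_quality` is hypothetical.

```python
from collections import Counter

# Defect flags named in the data dictionary; "1" means the defect is present.
DEFECT_FLAGS = ["IMG_Poor_Focus?", "IMG_Too_Dark_or_Bright?", "IMG_Artifacts?"]


def split_by_quality(rows):
    """Return (included_rows, defect_counts).

    included_rows: rows whose Image_Include? field is "1".
    defect_counts: for the remaining rows, how often each defect flag is "1".
    """
    included = []
    defect_counts = Counter()
    for row in rows:
        if row.get("Image_Include?") == "1":
            included.append(row)
            continue
        for flag in DEFECT_FLAGS:
            if row.get(flag) == "1":
                defect_counts[flag] += 1
    return included, defect_counts


# Illustrative records only; the IDs and values are made up for the example.
sample = [
    {"IMG_ID": "a", "Image_Include?": "1", "IMG_Poor_Focus?": "0"},
    {"IMG_ID": "b", "Image_Include?": "0", "IMG_Poor_Focus?": "1"},
    {"IMG_ID": "c", "Image_Include?": "nc", "IMG_Artifacts?": "99"},
]
usable, reasons = split_by_quality(sample)
```

Tallying the defect flags among excluded images gives a quick view of why images were rejected, which may help when comparing cameras or sites.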

Research Applications

This dataset supports the development of AI-driven diagnostic tools for diabetic retinopathy, glaucoma, age-related macular degeneration, and other eye diseases. The focus on African populations helps address healthcare disparities and supports more inclusive AI models for global ophthalmological care, particularly in resource-limited settings where early detection can prevent blindness. MoDRIA enables multi-modal research by linking retinal imaging, clinical assessments, laboratory results, and treatment records.

Data Dictionary - All 38 Variables

The table below lists every variable in the MoDRIA dataset with its description and encoding.

Variable Name	Description	Data Encoding/Codes
IMG_ID	Image ID number	Image identifier
Pt_ID	Patient ID number	Patient identifier
Camera	Camera type
IMG_OD_or_OS	Image of the right or left eye	OD = right eye, OS = left eye
IMG_Disc_or_Mac_Center	Image with disc or macula centered	D = disc center, M = macula center
Age	Age
Gender	Gender	0 = male, 1 = female, nc = not collected
VAOD_Numerator	Numerator of visual acuity, right eye, e.g., 6/18	e.g., 6/12 or text: HM = hand motion, PL = perception of light, NPL = no perception of light, nc = not collected
VAOD_Denominator	Denominator of visual acuity, right eye, e.g., 6/18	e.g., 6/12 or text: HM = hand motion, PL = perception of light, NPL = no perception of light, nc = not collected
VAOS_Numerator	Numerator of visual acuity, left eye, e.g., 6/18	e.g., 6/12 or text: HM = hand motion, PL = perception of light, NPL = no perception of light, nc = not collected
VAOS_Denominator	Denominator of visual acuity, left eye, e.g., 6/18	e.g., 6/12 or text: HM = hand motion, PL = perception of light, NPL = no perception of light, nc = not collected
IOP_OD	Intraocular pressure, right eye	ICARE
IOP_OS	Intraocular pressure, left eye	ICARE
Weight	Weight
Height	Height
BMI	BMI
SBP	Systolic blood pressure	type of device
DBP	Diastolic blood pressure	type of device
Random_BS	Random blood sugar	finger stick, device
HTN_Medication	Does patient take medication for hypertension?	0 = No, 1 = Yes
DM_Medication	Does patient take medication for diabetes mellitus?	0 = No, 1 = Yes
HIV_Medication	Does patient take medication for HIV?	0 = No, 1 = Yes
DM_Medication	Does patient take medication for diabetes mellitus?	0 = No, 1 = Yes
No_Medication	Does patient take no medications?	0 = No, 1 = Yes
Current_Smoker	Is patient a current cigarette smoker?	0 = No, 1 = Yes, nc = not collected
Current_Alcohol	Is patient a current drinker of alcohol?	0 = No, 1 = Yes, nc = not collected
Occupation	What is patient's current occupation?	0-10, nc = not collected
Image_Include?	Should image be included or not included based on quality assessment?	0 = No, 1 = Yes, nc = not collected
Img_Quality_Issues?	Does image have quality issues?	0 = No, 1 = Yes
IMG_Poor_Focus?	Does the image have poor focus?	0 = No, 1 = Yes, -99 = intentionally not collected
IMG_Too_Dark_or_Bright?	Does the image have illumination defects?	0 = No, 1 = Yes, 99 = intentionally not collected
IMG_Artifacts?	Does the image have artifacts?	0 = No, 1 = Yes, 99 = intentionally not collected
IMG_ICDR_Score	International Classification of Diabetic Retinopathy Score	0 = no DR, 1 = microaneurysms only, 2 = microaneurysms, cotton wool spots (CWS), or retinal hemorrhages (<25 per quadrant), 3 = >25 hemorrhages per quadrant, 4 = proliferative diabetic retinopathy, 5 = unable to determine
IMG_Macular_Edema	Presence of Diabetic Macular Edema	0 = no DME, 1 = DME, 2 = unable to determine, 99 = intentionally not collected
C_D_Ratio	Cup-to-disc ratio	1 = 0.0-0.6, 2 = 0.7-0.8, 3 = 0.9-1.0, -99 = intentionally not collected
Optic_Nerve_Normal?	Is optic nerve normal?	0 = No, 1 = Yes, -88 = not collected, -99 = intentionally not collected
Quick_Qual_Metric	Image quality metric from Quick Qual, probability of the image being "bad"	link to QQ, 99 = intentionally not collected (image defect)
Fractal_Dimension_Metric	Fractal dimension metric from DART	link to DART, 100 = intentionally not collected (image defect)
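
The missing-value codes in the table differ by column (nc, -88, -99, 99, 100), so a single global replacement could erase legitimate values such as an identifier or measurement that happens to equal 99. The sketch below applies the codes per column. It assumes records are dicts of strings keyed by variable name; the `MISSING_CODES` mapping is transcribed from the table above, and `clean_row` is a hypothetical helper name rather than part of any MoDRIA tooling.

```python
# Per-column missing-value codes, transcribed from the data dictionary.
# Columns not listed here have no documented missing code and are left as-is.
MISSING_CODES = {
    "Gender": {"nc"},
    "VAOD_Numerator": {"nc"},
    "VAOD_Denominator": {"nc"},
    "VAOS_Numerator": {"nc"},
    "VAOS_Denominator": {"nc"},
    "Current_Smoker": {"nc"},
    "Current_Alcohol": {"nc"},
    "Occupation": {"nc"},
    "Image_Include?": {"nc"},
    "IMG_Poor_Focus?": {"-99"},
    "IMG_Too_Dark_or_Bright?": {"99"},
    "IMG_Artifacts?": {"99"},
    "IMG_Macular_Edema": {"99"},
    "C_D_Ratio": {"-99"},
    "Optic_Nerve_Normal?": {"-88", "-99"},
    "Quick_Qual_Metric": {"99"},
    "Fractal_Dimension_Metric": {"100"},
}


def clean_row(row):
    """Return a copy of row with documented missing codes replaced by None.

    Values are assumed to be strings; surrounding whitespace is stripped
    before comparison.
    """
    cleaned = {}
    for column, value in row.items():
        text = value.strip()
        if text in MISSING_CODES.get(column, set()):
            cleaned[column] = None
        else:
            cleaned[column] = text
    return cleaned


# Illustrative record only; the values are made up for the example.
example = clean_row({"IMG_ID": "99", "IMG_Artifacts?": "99", "Gender": "nc"})
# IMG_ID keeps its value because 99 is a missing code only in specific columns.
```

Note that "unable to determine" codes (5 for IMG_ICDR_Score, 2 for IMG_Macular_Edema) are kept as values here, since the table lists them as grading outcomes rather than as not-collected codes.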

Data Dictionary

The table above lists all 38 variables in the MoDRIA dataset, covering retinal imaging, clinical assessments, laboratory results, diabetes management, and research protocols. The variables are designed to support ophthalmological research and clinical care, with particular focus on diabetic retinopathy screening and management in African populations.
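
Because visual acuity is stored as a separate numerator and denominator, a common step before analysis is converting the Snellen fraction to logMAR, defined as log10(denominator / numerator). The sketch below assumes both fields arrive as strings and that a text code (HM, PL, NPL) or nc may appear in either field; how low-vision codes should be scored is left to the analyst, so they are returned as None here. The function name `snellen_to_logmar` is hypothetical.

```python
import math

# Text codes from the data dictionary for acuity not expressed as a fraction.
NON_NUMERIC_VA = {"HM", "PL", "NPL"}


def snellen_to_logmar(numerator, denominator):
    """Convert a Snellen fraction such as 6/18 to logMAR.

    Returns None for "nc", for the text codes HM/PL/NPL, and for any
    value that does not parse as a positive finite number.
    """
    num = str(numerator).strip()
    den = str(denominator).strip()
    if num.upper() in NON_NUMERIC_VA or den.upper() in NON_NUMERIC_VA:
        return None
    try:
        n, d = float(num), float(den)
    except ValueError:
        return None
    if not (math.isfinite(n) and math.isfinite(d) and n > 0 and d > 0):
        return None
    return math.log10(d / n)


# 6/6 corresponds to logMAR 0; 6/18 corresponds to log10(3).
normal_vision = snellen_to_logmar("6", "6")
reduced_vision = snellen_to_logmar("6", "18")
```

Keeping the text codes separate from the numeric conversion avoids assigning them an arbitrary logMAR value without a documented convention.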
