All of Us Research Program, Curated Data Repository (CDR)
All of Us Data and Research Center
Access the All of Us dataset, a uniquely large, multimodal, longitudinal dataset from more than half a million participants across the United States. Verily partners with Vanderbilt University Medical Center (VUMC) as the All of Us Data and Research Center (DRC) to support the National Institutes of Health (NIH) All of Us Research Program’s Researcher Workbench platform.
This resource integrates participant-provided information (surveys), electronic health records, physical measurements, extensive genomic data (whole genome sequences, arrays), and wearable device data. It is designed to support research across nearly all health conditions, enabling exploration of the individual differences in biology, lifestyle, and environment that influence health and disease. Data is accessible through the Researcher Workbench to researchers who meet the All of Us researcher registration requirements.
Datasets
All of Us CDR
Data Collection Time Frame
2018 - Present (EHR records may contain data for significantly longer timeframes)
Participants
Current: 633,000
Goal: 1,000,000
Geographical Coverage
United States
Usage Examples
- Developing precision medicine approaches (e.g., personalized risk prediction, prevention, treatment).
- Studying the combined influences of genetics, environment, lifestyle, and social determinants on health outcomes.
- Identifying novel biomarkers for disease detection, diagnosis, or prognosis.
- Researching factors contributing to wellness, resilience, and healthy aging.
- Conducting pharmacogenomic research to understand variability in drug responses.
- Accelerating discoveries across a wide range of diseases (e.g., cancer, cardiovascular disease, diabetes, neurological disorders).
Data Modalities
Clinical Data
- Electronic Health Record (EHR) Data
- Demographics
- Medical History
- Clinical Laboratory Tests
- Clinical Procedures
Functional Measures
- Physical Measurements
Genomics
- Whole Genome Sequencing (WGS)
- Genotyping Arrays / WGG
- Structural Variant (SV) Data
- Genetic Ancestry Data
- Pharmacogenomics (PGx) Data
- Long-Read Sequencing Data
Wearables / Digital Health
- Sensor / Wearable Device Data
  - Activity
    - Activity daily summary
    - Activity intraday steps (minute-level)
  - Sleep
    - Sleep daily summary
    - Sleep level (sequence by level)
  - Heart Rate
    - Heart rate by zone summary
    - Heart rate (minute-level)
Biosamples (Source of Data)
- Blood
- Saliva
Participant-Reported Data
- Surveys / Questionnaires
  - Demographics
  - Medical History (Personal and Family Health History)
  - Lifestyle and Behavioral Health
  - Mental Health
  - Health Care Access and Utilization
  - Social Determinants of Health
Other
- Linkable External Data
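
The wearable data listed under Wearables / Digital Health comes at two granularities: minute-level records and daily summaries. The sketch below illustrates, on made-up records, how minute-level step counts could be rolled up into per-participant daily totals. The record layout, field order, and the `daily_step_totals` helper are assumptions for illustration only; they are not the CDR schema or a Researcher Workbench API.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical minute-level step records: (participant_id, timestamp, steps).
# The layout and values are illustrative, not the actual CDR schema.
minute_steps = [
    (1, "2021-03-01T08:00:00", 12),
    (1, "2021-03-01T08:01:00", 30),
    (1, "2021-03-02T09:15:00", 45),
    (2, "2021-03-01T07:30:00", 20),
]


def daily_step_totals(records):
    """Sum minute-level step counts into one total per participant per day."""
    totals = defaultdict(int)
    for person_id, timestamp, steps in records:
        day = datetime.fromisoformat(timestamp).date()
        totals[(person_id, day)] += steps
    return dict(totals)


daily = daily_step_totals(minute_steps)
for (person_id, day), steps in sorted(daily.items()):
    print(person_id, day.isoformat(), steps)
```

A roll-up like this may be useful as a sanity check when comparing minute-level data against a provided daily summary, though the two can legitimately differ depending on how the summary was produced.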