Cardiometabolic Cohort

Verily dataset

Go beyond the limitations of traditional real-world data. Standard real-world datasets, derived from fragmented EHRs, leave critical gaps in the patient story. They lack the high-resolution lifestyle, behavioral, and continuous monitoring data needed to truly understand cardiometabolic disease. Verily’s Cardiometabolic Cohort dataset is built differently. It is sourced from our engaged Lifelong Health Study community, where we connect directly with consented participants to fill in those gaps. The result is rich, longitudinal data, from patient-reported GLP-1 side effects to continuous readings from devices (e.g., CGMs and consumer wearables), that EHRs or claims data alone simply cannot provide.

This unified view empowers you to evaluate real-world treatment effectiveness, predict disease progression, derive novel variables from unstructured data, and answer prospective research questions through participant re-contact — ultimately informing the development of more personalized and effective therapies.

Dataset

Cardiometabolic Cohort

Data Collection Time Frame

2025 - Present

Data Latency

Refreshed Monthly

Geographical Coverage

United States

Usage Examples

Use cases unlocked with the foundational Cardiometabolic Cohort dataset

  • Characterize the complete patient journey for those on GLP-1 therapies, including initiation, titration, adherence, persistence, and reasons for switching or discontinuing medications to understand real-world usage in treating cardiometabolic conditions.
  • Identify and analyze distinct patient sub-populations within the broader cardiometabolic spectrum, using deep clinical and lifestyle data to understand the natural history of disease and variations in care needs.
  • Monitor patient-reported symptoms and side effects for cardiometabolic therapies like GLP-1s, linking them to clinical events and wearable data to understand real-world tolerability and barriers to adherence.
  • Develop predictive models for disease progression, identifying patients at high risk of developing new conditions (e.g., progressing from overweight to obesity, or from obesity to diabetes) or worsening after an acute event (e.g., developing heart failure post-myocardial infarction).
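As an illustration of the adherence analysis mentioned above, the sketch below computes proportion of days covered (PDC), a commonly used adherence metric, from medication start/end dates. The function name, the tuple layout, and the example dates are hypothetical; this document does not describe the dataset's actual schema or how adherence is derived from it.

```python
from datetime import date, timedelta


def proportion_of_days_covered(periods, window_start, window_end):
    """Return the fraction of days in the window covered by any period.

    `periods` is a list of (start_date, end_date) tuples, both inclusive.
    This layout is an assumption made for illustration only.
    """
    if window_end < window_start:
        raise ValueError("window_end must not be before window_start")

    covered = set()
    for start, end in periods:
        # Clip each medication period to the observation window.
        day = max(start, window_start)
        last = min(end, window_end)
        while day <= last:
            covered.add(day)  # a set counts overlapping days only once
            day += timedelta(days=1)

    window_days = (window_end - window_start).days + 1
    return len(covered) / window_days


# Hypothetical example: two 28-day periods with a gap between them.
example_periods = [
    (date(2025, 1, 1), date(2025, 1, 28)),
    (date(2025, 2, 5), date(2025, 3, 4)),
]
example_pdc = proportion_of_days_covered(
    example_periods, date(2025, 1, 1), date(2025, 3, 1)
)
```

Counting distinct days keeps overlapping periods from inflating the result. Studies often apply a threshold to PDC to classify participants as adherent, but the appropriate cutoff depends on the research question.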

Use cases unlocked with advanced offerings for patient recontact, data linkage, and/or custom variables

  • Quantify the real-world impact of a new therapy on a patient's quality of life and daily activities by deploying custom surveys to capture treatment satisfaction and other patient-centric endpoints.
  • Perform pharmacogenomic (PGx) analysis, potentially linking sponsor-provided genomic data to the cohort's longitudinal clinical data, to identify genetic markers that predict treatment response or adverse events.
  • Create deep clinical phenotypes by extracting specific, critical data points — such as ejection fraction for heart failure patients — directly from unstructured radiology reports and clinical notes.
  • Assess the total cost of care and healthcare resource utilization for patients with multiple cardiometabolic comorbidities, such as a combination of type 2 diabetes, CKD, and heart failure.

Data Modalities

Clinical Data

  • Diagnoses: Including qualifying cardiometabolic conditions, as well as the participant's broader diagnostic history
  • Encounters: Data on inpatient, outpatient, and emergency room visits, including hospital admission and discharge dates
  • Clinical Laboratory Tests: Standard panels (e.g., chemistry, hematology) and cardiometabolic lab results (e.g., HbA1c, eGFR, lipid panels)
  • Vitals & Measurements: Vitals such as blood pressure, heart rate, height, weight, and BMI
  • Standard Derived Variables: A continuously expanding set of standard variables derived from unstructured clinical notes, including key cardiometabolic data points such as GLP-1 switching reason and waist circumference

Participant-Reported Data

  • Sociodemographics: Race, ethnicity, education, income, and insurance status
  • Lifestyle Factors: Self-reported information on smoking history, alcohol use, and physical activity levels
  • Quality of Life (PROs): Responses to validated clinical measurements assessing general health, pain, fatigue, and physical and mental health

Medication Data

  • Prescription Data: Medication name, start/end dates, and dosage from EHR and pharmacy data sources
  • Participant-Reported Medication Use: Detailed survey data on GLP-1 agonist usage, including adherence and reasons for discontinuation

Device & Wearable Data*

  • Consumer Device Data: Device data from Apple HealthKit, Google Health Connect, and other consumer devices
  • CGM Data: Data from continuous glucose monitors (CGMs), planned for future availability

* Launching in 2026

Advanced Offerings and Custom Research Services

Advanced Data Tiers

  • Raw Unstructured Documents: Clinical documents such as progress notes, discharge summaries, and radiology and pathology reports
  • Medical Images: Raw medical images in DICOM format, including X-rays, CT scans, MRIs, and ECG waveforms

Potential Data Linkage & Enrichment

The dataset can be enriched by linking to customer-provided or other third-party datasets through tokenization. Potential linkages include:

  • Genomics & -Omics Data: Sponsor-provided genomics data (e.g., whole genome/exome sequencing) and other -omics data such as proteomics
  • Sponsor-Specific Clinical Trial Data: Data from a sponsor's own clinical trials, including participant-level data from past or current studies
  • Claims Data: Medical and pharmacy claims data from third-party sources
  • Third-Party Wearable Data: Data from other third-party digital health technologies and wearables
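To make the idea of token-based linkage concrete, here is a minimal sketch in which two parties derive the same keyed hash from normalized identifiers and then join records on that token. Everything in it is an assumption for illustration: the helper names, the choice of HMAC-SHA256, the identifier fields, and the sample records are hypothetical, and this document does not specify which tokenization method or vendor is actually used.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Lowercase, trim, and collapse whitespace so formatting differences do not change the token."""
    return " ".join(value.strip().lower().split())


def make_token(secret: bytes, first_name: str, last_name: str, birth_date: str) -> str:
    """Derive a deterministic token from identifiers using a shared secret key (illustrative only)."""
    message = "|".join(
        normalize(v) for v in (first_name, last_name, birth_date)
    ).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def link_by_token(left_rows, right_rows):
    """Inner-join two lists of dicts on their 'token' field."""
    right_index = {row["token"]: row for row in right_rows}
    return [
        {**row, **right_index[row["token"]]}
        for row in left_rows
        if row["token"] in right_index
    ]


# Hypothetical secret and records, for demonstration only.
SECRET = b"demo-only-shared-secret"

cohort_rows = [
    {"token": make_token(SECRET, "Jane", "Doe", "1980-01-15"), "hba1c": 6.9},
]
sponsor_rows = [
    {"token": make_token(SECRET, " jane ", "DOE", "1980-01-15"), "genotype": "A/G"},
    {"token": make_token(SECRET, "John", "Roe", "1975-06-30"), "genotype": "G/G"},
]

linked = link_by_token(cohort_rows, sponsor_rows)
```

Because only tokens are exchanged, the linked rows carry no direct identifiers. Real-world linkage typically also has to handle misspellings and missing fields, which this sketch does not attempt.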

Custom Data & Derived Variables

Custom data elements can be generated to meet specific sponsor needs beyond the foundational Cardiometabolic Cohort dataset.

  • Derived Variables from Clinical Notes: Custom variables extracted from unstructured clinical notes to capture patient traits and context that are not available in structured data
  • Custom Surveys & ePROs: Custom surveys and ePROs deployed to the cohort to answer specific research questions
  • Custom Algorithms & Phenotyping: Development of bespoke algorithms for applications like digital phenotyping, risk prediction, or identifying specific patient subtypes
  • Targeted Biosample Collection: Prospective collection of biosamples like blood to support personalized pharmacokinetics or biomarker analysis
  • Digital Biomarkers: Existing and novel biomarkers and endpoints derived from sensor data

Data Spotlight: Uncovering the 'Why' Behind Treatment Changes

While standard RWD struggles to capture the ‘why’ behind a therapy change, Verily’s platform can uncover this deep clinical context. We leverage advanced methodologies, like applying large language models (LLMs) to unstructured physician notes, to derive custom variables critical to your research.

The following is an example of the deep insights our platform's AI-powered methodologies can unlock for custom research projects.

Examples of GLP-1 Switching Reasons, Identified and Categorized from Unstructured Data

e.g. Achieved desired effect
“The patient used to take GLP-1 (Ozempic) and switched to another drug (SGLT2 inhibitor, Empagliflozin/Jardiance). The reason for the switch was noted as improvements in weight and diabetes control, potentially indicating achievement of desired effect.”

e.g. Lack of effect
“The patient was taking Ozempic (a GLP-1) and stopped because they "did not see any weight benefit from it and also her sugars are elevated". They switched to Tirzepatide (another GLP-1).”

e.g. Side effect
“The patient stopped taking the GLP-1 medication Ozempic due to significant digestive problems (constipation).”

e.g. Insurance coverage / cost
“The patient was taking the GLP-1 medication Wegovy for weight loss, but switched to or stopped due to the insurance denial of prior authorization, which is a non-trivial reason (insurance coverage/cost-related).”

e.g. Medication availability
“Switched from dulaglutide (a GLP-1 receptor agonist branded as Trulicity) to tirzepatide (Mounjaro), which is another type of GLP-1 medication. The reason for this switch was due to the pharmacy having difficulty obtaining Mounjaro, although the patient was tolerating it.”

This ability to derive data from unstructured notes adds a layer of deep clinical insight beyond what structured data offers. Partner with us to apply the same AI-powered methodology to derive custom variables from clinical notes that are critical to your specific research goals.