Data Catalog

Welcome to the Data Catalog, where you'll find an overview of all datasets currently ingested in the mimi_ws_1 workspace, which operates as our data lakehouse.

In most cases, we have preserved the original table formats from the source systems, with a few adjustments for consistency and clarity. For example, all column names have been transformed to follow the snake case naming convention: spaces and special characters are replaced with underscores, and all letters are lowercase. A column like Provider Name is now standardized as provider_name. Please see the Data Engineering section for more information.
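
The renaming rule described above can be sketched in a few lines of Python. This is an illustration only, not the pipeline's actual code: the function name to_snake_case is hypothetical, and collapsing runs of special characters into a single underscore and trimming leading or trailing underscores are assumptions the catalog does not state.

```python
import re


def to_snake_case(column_name: str) -> str:
    """Approximate the catalog's column-naming rule (hypothetical helper).

    Assumptions not confirmed by the catalog: consecutive spaces or
    special characters collapse into one underscore, and underscores
    at the start or end of the result are dropped.
    """
    # Replace every run of non-alphanumeric characters with one underscore.
    replaced = re.sub(r"[^0-9A-Za-z]+", "_", column_name)
    # Trim stray underscores and lowercase the result.
    return replaced.strip("_").lower()


print(to_snake_case("Provider Name"))  # provider_name
```

If a source column already follows snake case, the function returns it unchanged, so applying it more than once is harmless.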

Data Sources and Tables

Below is a list of the data sources and tables available within the mimi_ws_1 workspace:
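
Each source in this list is paired with a dotted schema path such as mimi_ws_1.cdc, and tables sit one level below that. The sketch below shows how a fully qualified table name could be composed from those parts. The helper name qualified_name is hypothetical, and the assumption that your query engine accepts three-part workspace.schema.table names in SQL is not confirmed by this catalog.

```python
def qualified_name(schema: str, table: str, workspace: str = "mimi_ws_1") -> str:
    """Join workspace, schema, and table into one dotted name (hypothetical helper)."""
    return f"{workspace}.{schema}.{table}"


# Example: the NNDSS table listed under the CDC source.
table_ref = qualified_name("cdc", "nndss")

# Assumed usage: a preview query, if the engine accepts three-part names.
preview_query = f"SELECT * FROM {table_ref} LIMIT 10"

print(table_ref)  # mimi_ws_1.cdc.nndss
```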

AHRQ - mimi_ws_1.ahrq

CMS Blue Button - mimi_ws_1.bluebutton

Canada's Drug Agency - mimi_ws_1.cdaamc

CDC - mimi_ws_1.cdc

  • Description: Datasets from the Centers for Disease Control and Prevention (CDC)
  • Tables:
    • nhanes_demo_demographic_variables_sample_weights: CDC NHANES DEMO Demographic Variables & Sample Weights | resolution: respondent, interval: yearly
    • nhanes_exam_blood_pressure: CDC NHANES EXAM Blood Pressure | resolution: respondent, interval: yearly
    • nhanes_exam_body_measures: CDC NHANES EXAM Body Measures | resolution: respondent, interval: yearly
    • nhanes_lab_albumin_creatinine_urine: CDC NHANES LAB Albumin & Creatinine - Urine | resolution: respondent, interval: yearly
    • nhanes_lab_alpha1acid_glycoprotein_serum_surplus: CDC NHANES LAB Alpha-1-Acid Glycoprotein - Serum (Surplus) | resolution: respondent, interval: yearly
    • nhanes_lab_cholesterol_hdl: CDC NHANES LAB Cholesterol - HDL | resolution: respondent, interval: yearly
    • nhanes_lab_cholesterol_ldl_triglycerides: CDC NHANES LAB Cholesterol - LDL & Triglycerides | resolution: respondent, interval: yearly
    • nhanes_lab_cholesterol_total: CDC NHANES LAB Cholesterol - Total | resolution: respondent, interval: yearly
    • nhanes_lab_fasting_questionnaire: CDC NHANES LAB Fasting Questionnaire | resolution: respondent, interval: yearly
    • nhanes_lab_glycohemoglobin: CDC NHANES LAB Glycohemoglobin | resolution: respondent, interval: yearly
    • nhanes_lab_glyphosate_glyp_urine: CDC NHANES LAB Glyphosate (GLYP) - Urine | resolution: respondent, interval: yearly
    • nhanes_lab_highsensitivity_creactive_protein: CDC NHANES LAB High-Sensitivity C-Reactive Protein | resolution: respondent, interval: yearly
    • nhanes_lab_insulin: CDC NHANES LAB Insulin | resolution: respondent, interval: yearly
    • nhanes_lab_oral_glucose_tolerance_test: CDC NHANES LAB Oral Glucose Tolerance Test | resolution: respondent, interval: yearly
    • nhanes_lab_plasma_fasting_glucose: CDC NHANES LAB Plasma Fasting Glucose | resolution: respondent, interval: yearly
    • nhanes_lab_standard_biochemistry_profile: CDC NHANES LAB Standard Biochemistry Profile | resolution: respondent, interval: yearly
    • nhanes_metadata: National Health and Nutrition Examination Survey (NHANES) Metadata | resolution: variable, interval: yearly
    • nhanes_mortality: CDC NHANES Mortality Dataset | resolution: respondent, interval: yearly
    • nhanes_qre_acculturation: CDC NHANES QRE Acculturation | resolution: respondent, interval: yearly
    • nhanes_qre_air_quality: CDC NHANES QRE Air Quality | resolution: respondent, interval: yearly
    • nhanes_qre_alcohol_use: CDC NHANES QRE Alcohol Use | resolution: respondent, interval: yearly
    • nhanes_qre_blood_pressure_cholesterol: CDC NHANES QRE Blood Pressure & Cholesterol | resolution: respondent, interval: yearly
    • nhanes_qre_bowel_health: CDC NHANES QRE Bowel Health | resolution: respondent, interval: yearly
    • nhanes_qre_cardiovascular_health: CDC NHANES QRE Cardiovascular Health | resolution: respondent, interval: yearly
    • nhanes_qre_diabetes: CDC NHANES QRE Diabetes | resolution: respondent, interval: yearly
    • nhanes_qre_hospital_utilization_access_to_care: CDC NHANES QRE Hospital Utilization & Access to Care | resolution: respondent, interval: yearly
    • nhanes_qre_income: CDC NHANES QRE Income | resolution: respondent, interval: yearly
    • nhanes_qre_kidney_conditions: CDC NHANES QRE Kidney Conditions | resolution: respondent, interval: yearly
    • nhanes_qre_medical_conditions: CDC NHANES QRE Medical Conditions | resolution: respondent, interval: yearly
    • nhanes_qre_preventive_aspirin_use: CDC NHANES QRE Preventive Aspirin Use | resolution: respondent, interval: yearly
    • nhanes_qre_smoking_adult_recent_tobacco_use_youth_cigarettetobacco_use: CDC NHANES QRE Smoking - Adult Recent Tobacco Use & Youth Cigarette/Tobacco Use | resolution: respondent, interval: yearly
    • nhanes_qre_smoking_cigarette_use: CDC NHANES QRE Smoking - Cigarette Use | resolution: respondent, interval: yearly
    • nhanes_qre_smoking_household_smokers: CDC NHANES QRE Smoking - Household Smokers | resolution: respondent, interval: yearly
    • nhanes_qre_smoking_recent_tobacco_use: CDC NHANES QRE Smoking - Recent Tobacco Use | resolution: respondent, interval: yearly
    • nndss: National Notifiable Diseases Surveillance System (NNDSS) | resolution: state, interval: weekly
    • nssp_edvisits: NSSP Emergency Department Visit Trajectories by State and Sub State Regions- COVID-19, Flu, RSV, Combined | resolution: county, interval: weekly
    • nwss_covid: National Wastewater Surveillance System (NWSS) for Covid | resolution: monitoring_site, interval: weekly
    • nwss_mpox: National Wastewater Surveillance System (NWSS) for Mpox | resolution: monitoring_site, interval: weekly
    • places_censustract: PLACES: Local Data for Better Health | resolution: tract, interval: yearly
    • places_county: PLACES: Local Data for Better Health | resolution: county, interval: yearly
    • places_zcta: PLACES: Local Data for Better Health | resolution: zcta, interval: yearly
    • svi_censustract_multiyears: Social Vulnerability Index at Census Tract-Level | resolution: tract, interval: yearly
    • svi_censustract_y2000: Social Vulnerability Index at Census Tract-Level | year: 2000, resolution: tract, interval: snapshot
    • svi_censustract_y2010: Social Vulnerability Index at Census Tract-Level | year: 2010, resolution: tract, interval: snapshot
    • svi_censustract_y2014: Social Vulnerability Index at Census Tract-Level | year: 2014, resolution: tract, interval: snapshot
    • svi_censustract_y2016: Social Vulnerability Index at Census Tract-Level | year: 2016, resolution: tract, interval: snapshot
    • svi_censustract_y2018: Social Vulnerability Index at Census Tract-Level | year: 2018, resolution: tract, interval: snapshot
    • svi_censustract_y2020: Social Vulnerability Index at Census Tract-Level | year: 2020, resolution: tract, interval: snapshot
    • svi_censustract_y2022: Social Vulnerability Index at Census Tract-Level | year: 2022, resolution: tract, interval: snapshot
    • svi_county_multiyears: Social Vulnerability Index at County-Level | resolution: county, interval: yearly
    • svi_county_y2000: Social Vulnerability Index at County-Level | year: 2000, resolution: county, interval: snapshot
    • svi_county_y2010: Social Vulnerability Index at County-Level | year: 2010, resolution: county, interval: snapshot
    • svi_county_y2014: Social Vulnerability Index at County-Level | year: 2014, resolution: county, interval: snapshot
    • svi_county_y2016: Social Vulnerability Index at County-Level | year: 2016, resolution: county, interval: snapshot
    • svi_county_y2018: Social Vulnerability Index at County-Level | year: 2018, resolution: county, interval: snapshot
    • svi_county_y2020: Social Vulnerability Index at County-Level | year: 2020, resolution: county, interval: snapshot
    • svi_county_y2022: Social Vulnerability Index at County-Level | year: 2022, resolution: county, interval: snapshot
    • urbanrural_classification: NCHS Urban-Rural Classification Scheme for Counties | resolution: county, interval: yearly
    • vsrr_drugoverdose: Vital Statistics Rapid Reporting (VSRR) for Drug Overdose | resolution: state, interval: monthly
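
Table entries in this catalog follow the pattern "name: title | key: value, key: value". As a rough sketch, an entry can be split into its parts as shown below. The function name parse_entry is hypothetical, and the code assumes the metadata always follows the last " | " separator and that titles may themselves contain colons or commas.

```python
def parse_entry(line: str) -> dict:
    """Split one catalog table entry into a dict (hypothetical helper)."""
    # Drop indentation and the leading bullet character.
    text = line.strip().lstrip("•").strip()
    # The table name ends at the first ": "; titles may contain colons too.
    name, _, rest = text.partition(": ")
    # Metadata follows the last " | ", so commas in the title are safe.
    title, _, meta = rest.rpartition(" | ")
    # Metadata is a comma-separated list of "key: value" pairs.
    fields = dict(item.split(": ", 1) for item in meta.split(", "))
    return {"table": name, "title": title, **fields}


entry = parse_entry(
    "    • places_county: PLACES: Local Data for Better Health"
    " | resolution: county, interval: yearly"
)
print(entry["table"], entry["resolution"], entry["interval"])
```

Splitting the name at the first colon and the metadata at the last separator keeps titles such as "PLACES: Local Data for Better Health" intact.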

Census - mimi_ws_1.census

ClinicalTrials.gov - mimi_ws_1.clinicaltrialsgov

CMS Coding & Billing Section - mimi_ws_1.cmscoding

CMS Data & Research Section - mimi_ws_1.cmsdataresearch

CMS Innovation Center - mimi_ws_1.cmsinnovation

CMS Information Technology - mimi_ws_1.cmsit

CMS Insurance Marketplace - mimi_ws_1.cmsmarketplace

CMS Office of Minority Health - mimi_ws_1.cmsomh

  • Description: Datasets from the CMS Office of Minority Health site
  • Tables:
    • archive_index: Office of Minority Health reports and URLs | resolution: report, interval: snapshot
    • text_from_pdfs: Parsed Text from PDFs, Office of Minority Health Reports | resolution: report, interval: snapshot

CMS Payment Section - mimi_ws_1.cmspayment

CMS Health & Safety Standards - mimi_ws_1.cmssafetystandards

CSSC Operations - mimi_ws_1.csscoperations

Dartmouth - mimi_ws_1.dartmouth

Data.CMS.gov - mimi_ws_1.datacmsgov

Data Commons - mimi_ws_1.datacommons

  • Description: Datasets from the Data Commons project
  • Tables:

Data.Healthcare.gov - mimi_ws_1.datahealthcaregov

Data.Medicaid.gov - mimi_ws_1.datamedicaidgov

CMS DE-SynPUF - mimi_ws_1.desynpuf

  • Description: CMS Data Entrepreneurs' Synthetic Public Use File (DE-SynPUF)
  • Tables:
    • beneficiary_summary: Beneficiary Summary from 2008 to 2010, DE-SynPUF | resolution: synthetic_beneficiary, interval: snapshot
    • carrier_claims: Carrier Claims from 2008 to 2010, DE-SynPUF | resolution: synthetic_event, interval: snapshot
    • inpatient_claims: Inpatient Claims from 2008 to 2010, DE-SynPUF | resolution: synthetic_event, interval: snapshot
    • outpatient_claims: Outpatient Claims from 2008 to 2010, DE-SynPUF | resolution: synthetic_event, interval: snapshot
    • prescription_drug_events: Prescription Drug Events from 2008 to 2010, DE-SynPUF | resolution: synthetic_event, interval: snapshot

Environmental Protection Agency - mimi_ws_1.epa

Federal Communications Commission - mimi_ws_1.fcc

FDA - mimi_ws_1.fda

Foursquare - mimi_ws_1.foursquare

Graham Center - mimi_ws_1.grahamcenter

Grants.gov - mimi_ws_1.grants

HealthIT - mimi_ws_1.healthit

HHS-OIG - mimi_ws_1.hhsoig

HRSA - mimi_ws_1.hrsa

HUDUser - mimi_ws_1.huduser

Internal Revenue Service - mimi_ws_1.irs

Legacy.com - mimi_ws_1.legacycom

  • Description: Datasets from Legacy.com, Inc.
  • Tables:
    • obituaries: Scraped Obituaries | resolution: person, interval: weekly

CMS Medicare Coverage Database - mimi_ws_1.mcd

State Medical Boards - mimi_ws_1.medicalboard

MedlinePlus - mimi_ws_1.medlineplus

  • Description: Datasets from MedlinePlus by the National Library of Medicine
  • Tables:
    • also_called: Alternative names or terms for health topics from MedlinePlus | resolution: term, interval: latest
    • health_topic: Main information for each individual health topic from MedlinePlus | resolution: health_topic, interval: latest
    • information_category: Categories of information for each site from MedlinePlus | resolution: health_topic, interval: latest
    • language_mapped_topic: Information about topic translations or language variants from MedlinePlus | resolution: health_topic, interval: latest
    • mesh_heading: Medical Subject Headings (MeSH) descriptors for topics from MedlinePlus | resolution: health_topic, interval: latest
    • mesh_qualifier: MeSH qualifiers associated with MeSH descriptors from MedlinePlus | resolution: mesh, interval: latest
    • organization: Organizations associated with each site from MedlinePlus | resolution: organization, interval: latest
    • other_language: Information about the topic in other languages from MedlinePlus | resolution: health_topic, interval: latest
    • primary_institute: Information about the primary institute associated with a topic from MedlinePlus | resolution: health_topic, interval: latest
    • related_topic: Information about topics related to the main topic from MedlinePlus | resolution: health_topic, interval: latest
    • see_reference: Cross-references or "see also" type information from MedlinePlus | resolution: health_topic, interval: latest
    • site: Information about external sites related to the topic from MedlinePlus | resolution: website, interval: latest
    • standard_description: Standard descriptions for each site from MedlinePlus | resolution: website, interval: latest
    • topic_group: Group information associated with health topics from MedlinePlus | resolution: health_topic, interval: latest

mimilabs - mimi_ws_1.mimilabs

National Association of State Budget Officers - mimi_ws_1.nasbo

NBER - mimi_ws_1.nber

Neighborhood Atlas - mimi_ws_1.neighborhoodatlas

National Library of Medicine - mimi_ws_1.nlm

Noridian - mimi_ws_1.noridian

NPPES - mimi_ws_1.nppes

New York University - mimi_ws_1.nyu

opencanadaca - mimi_ws_1.opencanadaca

Open Payments - mimi_ws_1.openpayments

Palmetto GBA - mimi_ws_1.palmettogba

Part C/D - mimi_ws_1.partcd

Payer MRF - mimi_ws_1.payermrf

pan-Canadian Pharmaceutical Alliance - mimi_ws_1.pcpa

Placekey - mimi_ws_1.placekey

Prescription Drug Plan - mimi_ws_1.prescriptiondrugplan

Provider Data Catalog - mimi_ws_1.provdatacatalog

QCOR CMS - mimi_ws_1.qcorcms

ResDAC - mimi_ws_1.resdac

State Government Databases - mimi_ws_1.stategov

Surgo Ventures - mimi_ws_1.surgoventures

CMS Synthetic Medicare PUF - mimi_ws_1.synmedpuf

Synthea - mimi_ws_1.synthea

  • Description: Datasets from the Synthea Project by the MITRE Corporation - 1.1M synthetic patients
  • Tables:
    • allergies: Patient allergy data, MITRE Synthea | resolution: synthetic_patient, interval: snapshot
    • careplans: Patient care plan data, including goals, MITRE Synthea | resolution: synthetic_patient, interval: snapshot
    • conditions: Patient conditions or diagnoses, MITRE Synthea | resolution: synthetic_patient, interval: snapshot
    • devices: Patient-affixed permanent and semi-permanent devices, MITRE Synthea | resolution: synthetic_patient, interval: snapshot
    • encounters: Patient encounter data, MITRE Synthea | resolution: synthetic_event, interval: snapshot
    • imaging_studies: Patient imaging metadata, MITRE Synthea | resolution: synthetic_event, interval: snapshot
    • immunizations: Patient immunization data, MITRE Synthea | resolution: synthetic_event, interval: snapshot
    • medications: Patient medication data, MITRE Synthea | resolution: synthetic_event, interval: snapshot
    • observations: Patient observations including vital signs and lab reports, MITRE Synthea | resolution: synthetic_event, interval: snapshot
    • organizations: Provider organizations including hospitals, MITRE Synthea | resolution: synthetic_organization, interval: snapshot
    • patients: Patient demographic data, MITRE Synthea | resolution: synthetic_patient, interval: snapshot
    • payer_transitions: Payer Transition data (i.e. changes in health insurance), MITRE Synthea | resolution: synthetic_event, interval: snapshot
    • payers: Payer organization data, MITRE Synthea | resolution: synthetic_health_plan, interval: snapshot
    • procedures: Patient procedure data including surgeries, MITRE Synthea | resolution: synthetic_event, interval: snapshot
    • providers: Clinicians that provide patient care, MITRE Synthea | resolution: synthetic_provider, interval: snapshot
    • supplies: Supplies used in the provision of care, MITRE Synthea | resolution: synthetic_patient, interval: snapshot

Tuva Health - mimi_ws_1.tuvahealth

UMN IHDC - mimi_ws_1.umn_ihdc

  • Description: UMN Interdisciplinary Health Data Competition (IHDC)
  • Tables:
    • competition_dataset_2025: UMN Interdisciplinary Health Data Competition Data | resolution: provider, interval: snapshot
    • competition_dataset_2025_archived_20250208: UMN Data Competition Dataset Draft | resolution: provider, interval: snapshot
    • competition_dataset_2025_v2: UMN Data Competition Dataset Draft V2 | resolution: provider, interval: snapshot

United Nations (UN) - mimi_ws_1.unitednations

USA Spending - mimi_ws_1.usaspending

USDA - mimi_ws_1.usda

Veterans Affairs (VA) - mimi_ws_1.va

X12 - mimi_ws_1.x12

Zillow - mimi_ws_1.zillow

  • Description: Datasets from Zillow Group, Inc. (a real-estate marketplace company)
  • Tables:
    • homevalue_zip: Home Values by Zillow | resolution: zip, interval: monthly
    • rent_zip: Rentals by Zillow | resolution: zip, interval: monthly