Data Sources & Methodology

Last updated: March 7, 2026

Where Our Data Comes From

MSHAScan compiles publicly available U.S. mine safety records from the Mine Safety and Health Administration (MSHA) Open Government Data portal. These records include inspection and enforcement data published by the U.S. Department of Labor. All source data is in the public domain.

Datasets We Use

We download and process the following MSHA datasets:

How We Process the Data

  1. Download — Every Saturday, we automatically download the latest pipe-delimited data files from MSHA's open data portal.
  2. Clean — We sanitize text encoding (UTF-8 normalization) and remove non-printable characters to ensure consistent display.
  3. Normalize — We standardize operator names, state codes, county identifiers, and location data. We generate URL-friendly slugs for each mine, operator, and state.
  4. Aggregate — We compute summary statistics at the mine, operator, state, and county level: total violations, S&S violations, accidents, fatalities, and penalties.
  5. Publish — Processed data becomes searchable on the site immediately after import completes.
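
The Clean, Normalize, and Aggregate steps above can be sketched in a few lines of Python. This is a minimal illustration, not the actual MSHAScan pipeline: the column names (MINE_ID, OPERATOR, SIG_SUB, PENALTY), the sample rows, and every function name are hypothetical and chosen only for the example. It also assumes the files parse with Python's default CSV quoting rules.

```python
import csv
import io
import re
import unicodedata
from collections import defaultdict

# Hypothetical pipe-delimited sample; real MSHA column names may differ.
SAMPLE = (
    "MINE_ID|OPERATOR|SIG_SUB|PENALTY\n"
    "0100003|Acme  Coal\x07 Co.|Y|250.00\n"
    "0100003|Acme Coal Co.|N|\n"
    "0200007|Peña Aggregates, LLC|Y|1000.50\n"
)


def clean_text(value: str) -> str:
    """Normalize to NFC, drop non-printable characters, collapse whitespace."""
    normalized = unicodedata.normalize("NFC", value)
    printable = "".join(ch for ch in normalized if ch.isprintable())
    return " ".join(printable.split())


def slugify(value: str) -> str:
    """Build a lowercase, hyphen-separated, ASCII-only slug."""
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def parse_rows(raw: str) -> list[dict[str, str]]:
    """Read pipe-delimited text and clean every field."""
    reader = csv.DictReader(io.StringIO(raw, newline=""), delimiter="|")
    return [
        {key: clean_text(val or "") for key, val in row.items() if key is not None}
        for row in reader
    ]


def aggregate_by_mine(rows: list[dict[str, str]]) -> dict[str, dict]:
    """Count violations, S&S violations, and total penalties per mine."""
    totals = defaultdict(
        lambda: {"violations": 0, "ss_violations": 0, "penalties": 0.0}
    )
    for row in rows:
        mine = totals[row["MINE_ID"]]
        mine["violations"] += 1
        if row["SIG_SUB"] == "Y":  # assumed flag for an S&S violation
            mine["ss_violations"] += 1
        mine["penalties"] += float(row["PENALTY"] or 0)  # blank means no penalty
    return dict(totals)
```

Calling aggregate_by_mine(parse_rows(SAMPLE)) yields one summary per mine ID, and slugify("Peña Aggregates, LLC") produces "pena-aggregates-llc". The same grouping pattern would extend to operator, state, and county summaries by changing the grouping key.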

Update Frequency

MSHA publishes updated datasets every Friday, and MSHAScan downloads and processes them the following Saturday morning. As a result, our records are typically no more than one week behind the official MSHA data.
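
To make the one-week figure concrete, here is a small sketch of the date arithmetic behind a Friday release and a Saturday import. It works at whole-day granularity and assumes the import has already completed on any given Saturday; the function names are hypothetical and exist only for this illustration.

```python
from datetime import date, timedelta

FRIDAY, SATURDAY = 4, 5  # date.weekday() counts Monday as 0


def most_recent(weekday: int, today: date) -> date:
    """Return the latest date on or before `today` that falls on `weekday`."""
    return today - timedelta(days=(today.weekday() - weekday) % 7)


def displayed_release_age(today: date) -> int:
    """Days since the MSHA release currently shown on the site was published."""
    last_import = most_recent(SATURDAY, today)        # our latest weekly import
    shown_release = most_recent(FRIDAY, last_import)  # the release it loaded
    return (today - shown_release).days
```

Under these assumptions the displayed release is one day old right after a Saturday import and seven days old on the following Friday, just before the next import replaces it.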

Data Accuracy

We present the data as published by MSHA. While we apply normalization and cleanup steps, we do not alter the substance of any record. If MSHA corrects or removes a record in a later data release, our next weekly import will reflect that change.
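
One way to picture how a later release supersedes an earlier one is to compare two snapshots keyed by record ID. The sketch below is an assumption about how such a comparison could work, not a description of MSHAScan's actual import code; the record IDs, field names, and the diff_snapshots function are all hypothetical.

```python
def diff_snapshots(
    previous: dict[str, dict], latest: dict[str, dict]
) -> dict[str, list[str]]:
    """List record IDs that were added, removed, or changed between releases."""
    prev_ids, new_ids = previous.keys(), latest.keys()
    return {
        "added": sorted(new_ids - prev_ids),
        "removed": sorted(prev_ids - new_ids),
        "changed": sorted(
            rid for rid in prev_ids & new_ids if previous[rid] != latest[rid]
        ),
    }


# Hypothetical records: V2 is corrected and V3 is withdrawn in the later release.
last_week = {
    "V1": {"penalty": 100},
    "V2": {"penalty": 500},
    "V3": {"penalty": 250},
}
this_week = {
    "V1": {"penalty": 100},
    "V2": {"penalty": 300},
    "V4": {"penalty": 900},
}

changes = diff_snapshots(last_week, this_week)
```

Treating the latest release as authoritative means records in "removed" disappear from the site and records in "changed" take their new values after the next import.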

Important Disclaimer

MSHAScan is a secondary reference tool, not a primary data source. For legal, regulatory, or compliance purposes, always consult the official MSHA records directly. Our data is provided for informational and educational use only.

Questions?

If you have questions about our data or methodology, please contact us.