- To provide daily and annual PM2.5 concentration data in the U.S. at a resolution of 1 km (about 30 arc-seconds) for public health research to respectively estimate short- and long-term effects on human health, and for other related research.
- The Daily and Annual PM2.5 Concentrations for the Contiguous United States, 1-km Grids, v1 (2000 - 2016) data set includes predictions of PM2.5 concentrations in grid cells at a resolution of 1 km for the years 2000 to 2016. A generalized additive model was used that accounted for geographic difference to ensemble daily predictions of three machine learning models: neural network, random forest, and gradient boosting. The three machine learners incorporated multiple predictors, including satellite data, meteorological variables, land-use variables, elevation, chemical transport model predictions, several reanalysis data sets, as well as other predictors. The annual predictions were calculated by averaging the daily predictions for each year in each grid cell. The ensembled model demonstrated better predictive performance than the individual machine learners with 10-fold cross-validated R-squared values of 0.86 for daily predictions and 0.89 for annual predictions.
- Recommended Citation(s)*:
Di, Q., Y. Wei, A. Shtein, C. Hultquist, X. Xing, H. Amini, L. Shi, I. Kloog, R. Silvern, J. Kelly, M. B. Sabath, C. Choirat, P. Koutrakis, A. Lyapustin, Y. Wang, L. J. Mickley, and J. Schwartz. 2021. Daily and Annual PM2.5 Concentrations for the Contiguous United States, 1-km Grids, v1 (2000 - 2016). Palisades, New York: NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/0rvr-4538. Accessed DAY MONTH YEAR.
Di, Q., H. Amini, L. Shi, I. Kloog, R. Silvern, J. Kelly, M. B. Sabath, C. Choirat, P. Koutrakis, A. Lyapustin, Y. Wang, L. J. Mickley, and J. Schwartz. 2019. An Ensemble-based Model of PM2.5 Concentration Across the Contiguous United States with High Spatiotemporal Resolution. Environment International 130: 104909. https://doi.org/10.1016/j.envint.2019.104909.
* When authors make use of data they should cite both the data set and the scientific publication, if available. Such a practice gives credit to data set producers and advances principles of transparency and reproducibility. Please visit the data citations page for details. Users who would like to choose to format the citation(s) for this dataset using a myriad of alternate styles can copy the DOI number and paste it into Crosscite's website.
† For EndNote users, please check the Research Note field for issues with importing authors that are organizations when using the ENW file format.
- Available Formats:
- raster, tabular, vector