- To provide annual PM2.5 component concentration data for the contiguous U.S. at resolutions of 50m in urban areas and 1km in non-urban areas, for public health research estimating effects on human health and for other related research.
- The Annual Mean PM2.5 Components (EC, NH4, NO3, OC, SO4) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., 2000-2019, v1 data set contains annual predictions of chemical component concentrations at a hyper resolution (50m x 50m grid cells) in urban areas and at a high resolution (1km x 1km grid cells) in non-urban areas for the years 2000 to 2019. Exposure to particulate matter with an aerodynamic diameter less than 2.5 µm (PM2.5) increases mortality and morbidity. PM2.5 is a mixture of chemical components that vary across space and time. Because hyperlocal data are limited, less is known about the health risks of individual PM2.5 components, their U.S.-wide exposure disparities, or which species drive the largest intra-urban changes in PM2.5 mass. National super-learned models were developed for hyperlocal estimation of annual mean elemental carbon (EC), ammonium (NH4), nitrate (NO3), organic carbon (OC), and sulfate (SO4) concentrations across 3,535 urban areas at a 50m spatial resolution, and at a 1km resolution for non-urban areas, from 2000 to 2019. The models use Machine-Learning (ML) models combined with either a Generalized Additive Model (GAM) Ensemble Geographically-Weighted-Averaging (GAM-ENWA) or Super-Learning (SL). With approximately 82 billion predictions across 20 years, hyperlocal super-learned PM2.5 component estimates are now available for further research. The overall R-squared values of the 10-fold cross-validated models ranged from 0.910 to 0.970 on the training sets for these components, and from 0.860 to 0.960 on the test sets. Substantial spatiotemporal variability, both intra-urban and inter-urban, was found in PM2.5 components. The Coordinate Reference System (CRS) for the predictions is the World Geodetic System 1984 (WGS84), and the units for the PM2.5 components are µg/m^3. The data are provided in RDS tabular format, a file format native to the R programming language, which can also be read from other languages such as Python.
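Since the abstract notes that the RDS files can be opened from Python, the following is a minimal sketch of one way to do so. It assumes the third-party `pyreadr` package is installed and that each file holds a plain R data frame; neither assumption comes from the data set documentation. The helper `column_mean`, the file name, and any column names are hypothetical, because the actual file layout is not described here.

```python
from statistics import mean


def load_rds_table(path):
    """Read one RDS file into a pandas DataFrame.

    Assumes the third-party `pyreadr` package is available. It is imported
    lazily so the rest of this sketch runs without it.
    """
    import pyreadr  # third-party: pip install pyreadr

    result = pyreadr.read_r(path)  # dict-like: object name -> DataFrame
    # An RDS file stores a single unnamed object, which pyreadr keys as None.
    return result[None]


def column_mean(records, column):
    """Mean of one numeric column over an iterable of row dicts.

    Hypothetical helper for a quick sanity check; missing values are skipped.
    """
    values = []
    for row in records:
        value = row.get(column)
        # Skip missing entries: None, or NaN (the only value not equal to itself).
        if value is None or value != value:
            continue
        values.append(value)
    return mean(values) if values else None
```

With a downloaded file, a call such as `df = load_rds_table("some_file.rds")` followed by `column_mean(df.to_dict("records"), "some_column")` would give a quick check that values fall in a plausible µg/m^3 range. Both names are placeholders to be replaced after inspecting `df.columns`.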
- Recommended Citation(s)*:
Amini, H., M. Danesh-Yazdi, Q. Di, W. Requia, Y. Wei, Y. AbuAwad, L. Shi, M. Franklin, C.-M. Kang, J. M. Wolfson, P. James, R. Habre, Q. Zhu, J. S. Apte, Z. J. Andersen, X. Xing, C. Hultquist, I. Kloog, F. Dominici, P. Koutrakis, and J. Schwartz. 2023. Annual Mean PM2.5 Components (EC, NH4, NO3, OC, SO4) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., 2000-2019 v1. Palisades, New York: NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/7wj3-en73. Accessed DAY MONTH YEAR.
* When authors make use of data they should cite both the data set and the scientific publication, if available. Such a practice gives credit to data set producers and advances principles of transparency and reproducibility. Please visit the data citations page for details. To format the citation(s) for this data set in other styles, copy the DOI and paste it into Crosscite's website.
† For EndNote users, please check the Research Note field for issues with importing authors that are organizations when using the ENW file format.
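As an alternative to pasting the DOI into Crosscite's website by hand, the sketch below builds the same kind of request programmatically and fills in the "Accessed DAY MONTH YEAR" placeholder from the recommended citation. It assumes the DOI resolver's content negotiation service returns formatted citations for this DOI via the `text/x-bibliography` media type. The function names and the default `apa` style are illustrative choices, not part of the data set documentation.

```python
import urllib.request
from datetime import date

DOI = "10.7927/7wj3-en73"  # DOI from the recommended citation

MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def build_citation_request(doi, style="apa"):
    """Build (but do not send) a content-negotiation request for a formatted citation."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": f"text/x-bibliography; style={style}"},
    )


def fetch_formatted_citation(doi, style="apa", timeout=30):
    """Send the request and return the citation text. Requires network access."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def accessed_suffix(day):
    """Format the access date in the 'Accessed DAY MONTH YEAR.' pattern."""
    return f"Accessed {day.day} {MONTHS[day.month - 1]} {day.year}."
```

For example, `fetch_formatted_citation(DOI) + " " + accessed_suffix(date.today())` would produce a styled citation with today's access date appended, provided the service responds as assumed.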
- Available Formats: