Air Quality Data for Health-Related Applications
Daily and Annual PM2.5, O3, and NO2 Concentrations at ZIP Codes for the Contiguous U.S., v1 (2000 – 2016)
- To provide daily and annual Fine Particulate Matter (PM2.5), Ozone (O3), and Nitrogen Dioxide (NO2) concentration data at ZIP Codes for the contiguous U.S. These data support research in environmental epidemiology, environmental justice, and health equity through linkage with ZIP Code-level demographic and medical data sets, as well as other related research.
- The Daily and Annual PM2.5, O3, and NO2 Concentrations at ZIP Codes for the Contiguous U.S., 2000-2016, v1.0 data set contains daily and annual concentration predictions for Fine Particulate Matter (PM2.5), Ozone (O3), and Nitrogen Dioxide (NO2) at the ZIP Code level for the years 2000 to 2016. An ensemble of three machine-learning models (Random Forest, Gradient Boosting, and Neural Network) was used to estimate daily concentrations at the centroids of 1 km x 1 km grid cells across the contiguous U.S. for 2000 to 2016. The predictors included air monitoring data, satellite aerosol optical depth, meteorological conditions, chemical transport model simulations, and land-use variables. The ensemble models demonstrated strong predictive performance, with 10-fold cross-validated R-squared values of 0.86 for PM2.5, 0.86 for O3, and 0.79 for NO2. ZIP Code-level concentrations were derived from these gridded predictions. For general ZIP Codes with polygon representations, pollution levels were estimated by averaging the predictions of the grid cells whose centroids lie inside the ZIP Code polygon. Other ZIP Codes, such as Post Offices or large-volume single customers, were treated as single points and assigned the prediction of the nearest grid cell. The polygon shapes and point locations (latitude and longitude) for ZIP Codes were obtained from Esri and the U.S. ZIP Code Database and were updated annually. The data include about 31,000 general ZIP Codes with polygon representations and about 10,000 ZIP Codes represented as single points. Compared with the 1 km grid data, the ZIP Code-level predictions are much smaller in size and are manageable in personal computing environments, which lowers a key barrier to participation in air pollution research for scientists in other fields. The units are µg/m^3 for PM2.5 and ppb for O3 and NO2.
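The two aggregation rules described in the abstract (averaging grid cells whose centroids fall inside a ZIP Code polygon, and assigning the nearest grid cell to a point ZIP Code) can be illustrated with a minimal Python sketch. This is not the data set producers' code. The function names, the toy coordinates and values, planar Euclidean distance, and returning `None` for a polygon that contains no centroid are all assumptions made for illustration; a real workflow would likely use a GIS library and projected coordinates.

```python
from math import hypot
from statistics import fmean


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_zip_value(polygon, cells):
    """Mean prediction over grid cells whose centroids lie inside the polygon.
    `cells` is a list of (x, y, value). Returns None if no centroid is inside
    (an assumption; the source does not describe this case)."""
    values = [v for x, y, v in cells if point_in_polygon(x, y, polygon)]
    return fmean(values) if values else None


def point_zip_value(px, py, cells):
    """Prediction of the grid cell whose centroid is nearest to (px, py),
    using planar Euclidean distance (an assumption)."""
    nearest = min(cells, key=lambda c: hypot(c[0] - px, c[1] - py))
    return nearest[2]


# Hypothetical grid-cell centroids with made-up PM2.5 values (µg/m^3).
cells = [
    (0.5, 0.5, 8.0),
    (1.5, 0.5, 10.0),
    (0.5, 1.5, 12.0),
    (1.5, 1.5, 14.0),
    (2.5, 0.5, 20.0),
]
# Hypothetical ZIP Code polygon covering the first four centroids.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

polygon_estimate = polygon_zip_value(square, cells)  # mean of 8, 10, 12, 14
point_estimate = point_zip_value(2.4, 0.6, cells)    # value of cell at (2.5, 0.5)
```

In this toy example the polygon estimate is the mean of the four enclosed cells, while the point ZIP Code simply inherits the value of its closest cell.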
- Recommended Citation(s)*:
Wei, Y., X. Xing, A. Shtein, E. Castro, C. Hultquist, M. D. Yazdi, L. Li, and J. Schwartz. 2022. Daily and Annual PM2.5, O3, and NO2 Concentrations at ZIP Codes for the Contiguous U.S., 2000-2016, v1.0. Palisades, New York: NASA Socioeconomic Data and Applications Center (SEDAC). https://doi.org/10.7927/9yp5-hz11. Accessed DAY MONTH YEAR.
ENW (EndNote & RefWorks)†
Wei, Y., X. Qiu, M. D. Yazdi, A. Shtein, L. Shi, J. Yang, A. A. Peralta, B. A. Coull, and J. Schwartz. 2022. The Impact of Exposure Measurement Error on the Estimated Concentration-Response Relationship between Long-Term Exposure to PM2.5 and Mortality. Environmental Health Perspectives 130(7): 077006. https://doi.org/10.1289/EHP10389.
ENW (EndNote & RefWorks)†
* Authors who use the data should cite both the data set and the scientific publication, if available. This practice gives credit to data set producers and advances the principles of transparency and reproducibility. Please visit the data citations page for details. To format the citation(s) for this data set in other styles, copy the DOI and paste it into Crosscite's website.
† For EndNote users, please check the Research Note field for issues with importing authors that are organizations when using the ENW file format.
- Available Formats: