U.S. Census Grids

Summary Files

The U.S. Census Grids are created by taking population and housing counts at the block level and proportionally allocating the counts in the census blocks to a latitude-longitude quadrilateral grid. For example, if a grid cell contains 40% of the area of one census block and 30% of the area of a second census block, the population count for that grid cell is the sum of 40% of the population of the first census block and 30% of the population of the second census block.
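The allocation rule can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name `allocate_to_cell` and the block populations (200 and 100) are made up for the example and do not come from the data set.

```python
def allocate_to_cell(pieces):
    """Sum each block's count weighted by the share of its area inside the cell.

    pieces: list of (block_count, share_of_block_area_in_cell) pairs.
    """
    return sum(count * area_share for count, area_share in pieces)


# Hypothetical blocks: 200 people with 40% of the block's area in the cell,
# and 100 people with 30% of the block's area in the cell.
cell_population = allocate_to_cell([(200, 0.40), (100, 0.30)])
print(cell_population)
```

With these made-up inputs the cell receives 80 people from the first block and 30 from the second. The result is generally fractional, because counts are spread by area rather than assigned person by person.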

U.S. Census Grids uses TIGER/Line files for the census block boundaries and SF1 and SF3 tables for the demographic and socioeconomic characteristics of each census block. SF1 data are based on the census short form and therefore include counts for the total population. The SF3 data are based on the census long form, which is sent to approximately one out of every six households.

30 arc-second grids using SF1 data

The relevant fields are extracted from the SF1 tables. Some of the grids contain data from a single field, such as the number of non-Hispanic whites, which is taken from SF1 table P10, field P10001 for 1990; table P8, field P008003 for 2000; and table P4, field PCT0121001 for 2010. Other grids use data from several fields. This is true of all the age grids, which are derived from SF1 tables P12 and P14 for 2000 and 2010. These tables report the number of people in each age category by gender. The 2000 grid for the population under age one uses SF1 table P14, field P14003 (the number of males under age one) plus SF1 table P14, field P14024 (the number of females under age one). The Variable Catalog contains a list of the fields used for each census grid year.
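A grid variable built from several fields is simply the per-block sum of those fields. The sketch below uses the two field names given above for the 2000 under-age-one grid; the record layout, the helper `sum_fields`, and the counts are hypothetical.

```python
# Fields named in the text for the 2000 population under age one.
UNDER_ONE_2000 = ("P14003", "P14024")  # males under one, females under one


def sum_fields(record, fields):
    """Add up the listed SF1 fields for a single census block record."""
    return sum(record[field] for field in fields)


# Hypothetical SF1 record for one census block (made-up counts).
block_record = {"P14003": 4, "P14024": 5}
under_one = sum_fields(block_record, UNDER_ONE_2000)
print(under_one)
```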

The TIGER/Line files are then joined to the SF1 field variables. The density of the variable being gridded is calculated for each census block (for example, the number of foreign-born residents per square kilometer). A 30 arc-second quadrilateral grid is intersected with the census block coverage. This divides each census block into pieces that fit into the grid cells. The total count for the grid cell is calculated by taking the area of each census block piece within the grid cell, multiplying it by the density of the variable being gridded, and summing these values for all census block pieces in the grid cell.
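The density, intersection, and summation steps can be sketched in a simplified setting. The example below assumes blocks and grid cells are axis-aligned rectangles in planar units, which is a deliberate simplification: the actual processing intersects TIGER/Line polygons with a latitude-longitude grid. The function names and sample values are hypothetical.

```python
import math
from collections import defaultdict


def overlap_area(a, b):
    """Overlap area of two rectangles given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def grid_counts(blocks, cell_size):
    """blocks: list of (rectangle, count). Returns {(col, row): allocated count}."""
    totals = defaultdict(float)
    for rect, count in blocks:
        xmin, ymin, xmax, ymax = rect
        density = count / ((xmax - xmin) * (ymax - ymin))  # count per unit area
        # Visit only the cells that the block's extent can touch.
        for col in range(math.floor(xmin / cell_size), math.ceil(xmax / cell_size)):
            for row in range(math.floor(ymin / cell_size), math.ceil(ymax / cell_size)):
                cell = (col * cell_size, row * cell_size,
                        (col + 1) * cell_size, (row + 1) * cell_size)
                piece_area = overlap_area(rect, cell)
                if piece_area > 0:
                    # Area of the block piece times the block's density.
                    totals[(col, row)] += density * piece_area
    return dict(totals)


# Two hypothetical, non-overlapping blocks with made-up counts.
blocks = [((0.0, 0.0, 2.0, 1.0), 100), ((0.5, 1.0, 1.5, 3.0), 40)]
counts = grid_counts(blocks, cell_size=1.0)
print(counts)
```

Because every piece of every block lands in exactly one cell, the cell totals add up to the total of the block counts, which is a useful sanity check on any implementation of this step.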

30 arc-second grids using SF3 data

The lowest level of geography for which SF3 data are released is the census block group. For the SF3 grids, these data are proportionately allocated to census blocks using the distribution of the underlying SF1 population. For instance, if 35% of the block group’s population aged 25 and older lives in a given census block, as reported in the SF1 tables, 35% of the block group’s population aged 25 and older with a high school diploma, as reported in the SF3 tables, is assigned to that census block. Once the SF3 data have been allocated to the census block level, the gridding process is the same as described above.
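The block group to block allocation amounts to a proportional split. The sketch below is illustrative only: the function name, the block identifiers, and all counts are hypothetical, and the handling of a block group with zero SF1 population is an assumption rather than something the source describes.

```python
def allocate_block_group(sf3_total, sf1_block_counts):
    """Split a block-group SF3 count across blocks in proportion to SF1 counts."""
    sf1_total = sum(sf1_block_counts.values())
    if sf1_total == 0:
        # Assumption: with no SF1 population to weight by, nothing is allocated.
        return {block: 0.0 for block in sf1_block_counts}
    return {block: sf3_total * count / sf1_total
            for block, count in sf1_block_counts.items()}


# Hypothetical SF1 population aged 25 and older, by block, in one block group.
sf1_age25_plus = {"block_a": 350, "block_b": 450, "block_c": 200}

# Hypothetical SF3 block-group count of people aged 25+ with a high school diploma.
allocated = allocate_block_group(400, sf1_age25_plus)
print(allocated)
```

Here block_a holds 35% of the block group's SF1 population aged 25 and older, so it receives 35% of the SF3 count, matching the example in the text.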

Metropolitan Statistical Area grids

The metropolitan statistical area (MSA) grids are created by selecting all census blocks within the MSA and gridding those blocks using a 7.5 arc-second quadrilateral grid.
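For a sense of scale, the two grid resolutions can be compared directly. This small sketch converts arc-seconds to decimal degrees; the statement that MSA cells nest inside the 30 arc-second cells assumes the two grids share an origin, which the source does not state.

```python
from fractions import Fraction

ARCSEC_PER_DEGREE = 3600


def cell_size_degrees(arc_seconds):
    """Cell edge length in decimal degrees, kept exact with Fraction."""
    return Fraction(arc_seconds) / ARCSEC_PER_DEGREE


national = cell_size_degrees(30)   # 30 arc-second grid
msa = cell_size_degrees(7.5)       # 7.5 arc-second MSA grid
cells_per_side = national / msa
cells_per_national_cell = cells_per_side ** 2
print(national, msa, cells_per_national_cell)
```

A 30 arc-second cell is 1/120 of a degree on a side and a 7.5 arc-second cell is 1/480 of a degree, so sixteen MSA cells cover the area of one 30 arc-second cell.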

Notes

1. The grids for Alaska are divided by hemisphere. Most of the state is in the western hemisphere. There are several islands, with a total population of 47, in the eastern hemisphere. These are gridded separately.

2. The initial beta version had an error with the datum used to create the grids for Hawaii. The TIGER/Line files use a local datum for Hawaii, but the beta version dated April 27, 2006 erroneously assumed that the datum was NAD83. The corrected Hawaiian data have a date of June 1, 2006.

3. SF3 data are available only for the years 1990 and 2000. The U.S. Census Bureau discontinued SF3 for the 2010 Census.

Social Vulnerability Index Grids

The CDC's Social Vulnerability Index uses 15 census variables at the census tract level to produce an SVI score, which ranges between 0 and 1, to indicate relatively low or high vulnerability, respectively. The SEDAC Social Vulnerability Index Grids data set rasterizes these census tract-level social vulnerability measures at 1 km spatial resolution to fit to the standard grid of CIESIN’s Gridded Population of the World, Version 4, Revision 11 (GPWv4.11) and U.S. Census Grid families of population products. In addition to the general SVI indicator, gridded layers are provided for four sub-category themes (Socioeconomic, Household Composition & Disability, Minority Status & Language, and Housing Type & Transportation). SEDAC gridded the vector SVI data for each of the five layers for each of the years 2000, 2010, 2014, 2016, and 2018.

30 arc-second grids using CDC SVI data

CDC's Social Vulnerability Index uses 15 variables at the census tract level. The data come from the U.S. decennial census for the years 2000 and 2010, and from the American Community Survey (ACS) for the years 2014, 2016, and 2018. It is a hierarchical additive index (Tate, 2013) whose component elements are grouped into 4 themes: Socioeconomic Status (Below Poverty, Unemployed, Income, No High School Diploma), Household Composition & Disability (Aged 65 or Older, Aged 17 or Younger, Civilian with a Disability, Single-Parent Households), Minority Status & Language (Minority, Speaks English “Less than Well”), and Housing Type & Transportation (Multi-Unit Structures, Mobile Homes, Crowding, No Vehicle, Group Quarters). While state-ranked versions of the indices are available, all of the gridded data are taken from the national U.S. version of the data set.

SEDAC grids the CDC’s preexisting SVI layer to produce overall SVI scores, as well as scores for each sub-category theme, at 1 km resolution for the contiguous United States. SVI data at the census tract level were downloaded from the CDC website for the years 2000, 2010, 2014, 2016, and 2018. The first two files, 2000 and 2010, were converted from Esri Geodatabase files (.mdb) to shapefiles, while later years are directly downloadable as shapefiles. SEDAC uses an R script to grid the vector SVI values and fit the polygon values to the standard 30 arc-second grid of the GPWv4.11 population products. The SVI grids were generated with a polygon rasterization approach that assigns each grid cell the value of the polygon at the cell’s center; if no value is available at the center point, the value is drawn from the polygons that overlap the cell. The SVI values are allocated evenly within each tract because no finer-grained measures were available to allocate or weight the spatial distribution of social vulnerability within tracts. The result is a 1 km gridded raster surface that visually matches the census tract polygons.
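The cell-center rule can be illustrated with a small pure-Python sketch. It is not SEDAC's R script: the function names, the toy tract polygons, and the SVI values are hypothetical, and the fallback to overlapping polygons when the center point has no value is not implemented here (such cells are simply left as None).

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_by_center(tracts, x0, y0, cell_size, n_cols, n_rows):
    """Give each cell the value of the tract containing its center.

    tracts: list of (polygon, value); (x0, y0) is the grid's top-left corner.
    Returns a list of rows, top row first; cells with no tract are None.
    """
    grid = []
    for row in range(n_rows):
        y = y0 - (row + 0.5) * cell_size  # center of this row
        values = []
        for col in range(n_cols):
            x = x0 + (col + 0.5) * cell_size  # center of this column
            value = None
            for polygon, tract_value in tracts:
                if point_in_polygon(x, y, polygon):
                    value = tract_value
                    break
            values.append(value)
        grid.append(values)
    return grid


# Two hypothetical square tracts with made-up SVI scores.
tracts = [
    ([(0, 0), (2, 0), (2, 2), (0, 2)], 0.25),
    ([(2, 0), (4, 0), (4, 2), (2, 2)], 0.75),
]
svi_grid = rasterize_by_center(tracts, x0=0, y0=2, cell_size=1, n_cols=5, n_rows=2)
print(svi_grid)
```

Every cell inside a tract carries the same tract value, which reflects the even allocation of SVI within tracts described above.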

The raster layers store the SVI values, which range from 0 to 1. In the SVI, low values represent low vulnerability while high values represent high vulnerability. The GPWv4.11 Land and Water Area data set was used to mask out grid cells that are water, which creates a no data value in the product for pixels that do not contain land. No data values also occur in grid cells that lack input SVI data; the set of such cells varies slightly by census year. Finally, a second layer option is provided that is otherwise identical but applies an additional mask, based on the GPWv4.11 Population Count data, to areas with no population.
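The two masking options can be sketched on small in-memory grids. This is a hedged illustration: the function name, the use of None as the no data marker, the thresholds (land area or population at or below zero), and all sample values are assumptions made for the example, not details of the actual product.

```python
NODATA = None  # stand-in for the raster's no data value


def apply_masks(svi, land_area, population=None):
    """Set cells to NODATA where there is no land or, optionally, no population."""
    masked = []
    for r, row in enumerate(svi):
        out = []
        for c, value in enumerate(row):
            no_land = land_area[r][c] <= 0
            no_pop = population is not None and population[r][c] <= 0
            if value is NODATA or no_land or no_pop:
                out.append(NODATA)
            else:
                out.append(value)
        masked.append(out)
    return masked


# Made-up 2 x 3 grids: SVI scores, land area per cell, and population counts.
svi = [[0.2, 0.9, None], [0.5, 0.7, 0.4]]
land = [[1.0, 0.0, 0.8], [0.6, 0.9, 0.3]]
pop = [[12, 0, 3], [0, 40, 7]]

water_masked = apply_masks(svi, land)        # water mask only
nopop_masked = apply_masks(svi, land, pop)   # water and no-population masks
print(water_masked)
print(nopop_masked)
```

The second call corresponds to the additional layer option: any cell that is already no data stays no data, and cells with land but no population are masked as well.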

The resulting rasters contain gridded Social Vulnerability Index values at 1 km resolution, fit to the GPW grid. The method was repeated to produce 5 layers for each of the 5 years of data: one layer representing overall SVI and 4 sub-theme layers (Socioeconomic, Household Composition, Minority Status/Language, and Housing/Transportation). A projection specification is appended to each layer name to indicate either NAD83 or WGS84. Finally, NoPop in the layer name indicates rasters with both the water mask and the no-population mask applied. The 1 km raster grid data are downloadable in GeoTIFF format, categorized by year and projection.

For more information, see the data set documentation.