Introduction

Numerous reports indicate that the ongoing COVID-19 epidemic has disproportionately affected traditionally vulnerable communities associated with well-researched social determinants of health and neighborhood attributes [1], such as the proportion of racial and ethnic minorities, migrants, and lower-income households, both globally [2] and domestically [3]. For instance, according to an April 8th Centers for Disease Control and Prevention Morbidity and Mortality Weekly Report, an analysis of 580 hospitalized patients in 99 counties in 14 states found that non-Hispanic (NH) blacks were disproportionately affected [4]. Poverty and low income have also been identified as important issues, not just as a function of potentially classist policy, but also due to the increased difficulty for those in more tenuous financial situations and occupations to implement physical distancing and self-isolation. Physical distancing, sometimes referred to as social distancing, is considered an important non-pharmaceutical intervention in mitigating the spread of COVID-19. It involves keeping distance between oneself and others and avoiding gathering in groups or crowded areas to reduce the likelihood of inhaling the virus or exposing others. The distance recommended by the US Centers for Disease Control and Prevention (CDC) is six feet (1.8 m) [5]. Self-isolation refers to the elimination of human contact, including at home, when someone is believed to be infected or has been exposed to another individual who is infected [6]. Individuals at or below the poverty line are affected by residential overcrowding, increased smoking rates [7], exposures to environmental pollutants [8], and lack of access to healthcare according to a UN report, all of which can increase the spread of the virus or cause adverse outcomes [9]. However, it is important to note that at this point, there are very few data at smaller geographies (e.g., counties or sub-county units) that report on demographic or economic characteristics of those with the virus, hospitalizations, or deaths [10].

These types of health disparities, which are not new to this pandemic, may be a function of elements of the social and physical environments, as well as factors associated with systemic and institutionalized racism and classism resulting in suboptimal access to resources, including health care and support services [11, 12]. Additionally, these vulnerable communities’ risk of having worse outcomes from COVID-19 infection may be increased due to a higher prevalence of comorbidities or underlying conditions such as asthma, diabetes, and other conditions [13,14,15,16]. This suggests that existing health disparities are likely to be rapidly magnified in the context of COVID-19, and potentially extend well beyond the lifespan of the epidemic due to sociodemographic inequity and economic hardships at the global, national, and local levels which themselves have a differential impact on individuals and populations both directly and indirectly [17,18,19,20].

Scale is an important element when examining any phenomenon that has a locational component, and the choice of both unit of analysis and study area can have large effects on the outcomes of the analyses. For instance, when quantifying the association between a health outcome and population density, using a county scale could obscure the heterogeneity of both the outcome of interest (e.g., there may be very high rates in one part of the county, and low rates everywhere else) and the associated variable (e.g., one neighborhood in the county could house a large number of residents in high-rise buildings and the remainder could be low-intensity single-family homes and park land). This issue, often called the modifiable areal unit problem (MAUP), has been documented in multiple research domains, including health [21, 22], exposure estimation [23], measures of access [24], and environmental justice [25, 26]. A related issue is the choice of the overall study area. For instance, based on the American Community Survey 2018 5-year estimates [27], Staten Island, one of the boroughs of New York City, may seem to have very high population density (over 8000 people per square mile) when compared with other counties in the entire USA, some of which have values far below one person per square mile (e.g., Denali, Alaska and Esmeralda, Nevada). However, Staten Island is the least densely populated borough of New York City, with Manhattan having more than 70,000 residents per square mile. As such, it would appear comparatively dense at a national scale, but relatively less dense at a city scale. Since social interaction and physical proximity of infected and susceptible individuals are the basis of community transmission of the virus, it stands to reason that denser areas would be more rapidly and severely impacted than low-population areas. Although this may be true at a national or even global level at this stage of the epidemic [28], where transmission may be easier in urban regions versus rural areas, it is unclear if this is the case at an intra-urban, sub-city scale.
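The aggregation effect described above can be illustrated with a small arithmetic sketch. The county, its two neighborhoods, and all population and area values below are invented for illustration and do not come from the study data; the helper `density` is likewise hypothetical.

```python
def density(population, area_sq_mi):
    """People per square mile."""
    return population / area_sq_mi


# Hypothetical county made of two neighborhoods (values invented for illustration).
neighborhoods = [
    {"name": "high-rise core", "population": 50_000, "area_sq_mi": 1.0},
    {"name": "low-density remainder", "population": 49_500, "area_sq_mi": 99.0},
]

# County-level density pools all residents and all land area.
county_density = density(
    sum(n["population"] for n in neighborhoods),
    sum(n["area_sq_mi"] for n in neighborhoods),
)

# Neighborhood-level densities keep the heterogeneity visible.
neighborhood_density = {
    n["name"]: density(n["population"], n["area_sq_mi"]) for n in neighborhoods
}

print(county_density)
print(neighborhood_density)
```

Under these assumed values, the single county figure sits far from either neighborhood's density, which is the kind of masking that motivates a sub-county unit of analysis.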

Understanding the disease characteristics at national and global scales is unquestionably important; however, it is also necessary to appreciate its dynamics at a more granular, neighborhood level. According to data collected by the New York Times aggregated by county, as of April 13, NYC had the most reported cumulative cases (106,764) and deaths (7154) in the USA. Cook County, Illinois, which contains Chicago, is fifth on the list for that day, with a cumulative 15,474 cases and 543 deaths [29]. However, reported positive cases are not homogeneously distributed across either city.

The goal of this ecological cross-sectional study is to examine the spatial and demographic nature of reported SARS-CoV-2 diagnoses in New York City (NYC) and Chicago (CHI) as of April 13, 2020. Specifically, we examine SARS-CoV-2 diagnosis rates per ZIP code tabulation area (ZCTA) and compare sociodemographic and economic characteristics between spatial hot spots and cold spots. The characteristics of the NYC and CHI hot/cold spots are then compared to reveal differences and similarities between the cities. The rationale behind selecting New York City and Chicago was their overall comparability, particularly with respect to some characteristics of interest including population density and the use of public transportation. New York City and Chicago rank first and third in the USA in terms of overall population and first and second in population density in cities with more than 1,000,000 residents [30, 31]. According to the American Public Transportation Association, New York City and Chicago rank first and second in the annual number of passenger trips using public transportation [32]. Overall, the demographics of New York City and Chicago are very similar as seen in Table 1, especially with respect to race/ethnicity, education, and income. In addition, due to some evidence suggesting COVID transmission could be potentially mitigated by temperature, humidity, or sunlight [33, 34], we selected cities with similar climatic conditions.

Table 1 NYC and Chicago city-wide demographics.

Data

Cumulative counts of SARS-CoV-2 diagnoses (“cases”) for NYC are from the New York City Department of Health and Mental Hygiene’s Incident Command System for COVID-19 Response (4/13/2020) [35]. Of the 211 ZIP code tabulation areas (ZCTAs) in NYC, 34 (16.1%) had no data, likely due to low or no populations (e.g., airport, commercial areas, parks). Reported SARS-CoV-2 diagnoses for CHI (> 5 cases per ZIP code) are from the Illinois Department of Public Health (4/13/2020) [36]. ZCTAs were included in the CHI analysis if their centroid was within the published city boundary or if greater than 20% of the ZCTA area was within the city. Of the 60 ZCTAs, three (5.0%) were excluded due to missing data, likely due to low populations or reporting five or fewer cases. Demographic and economic data are from the American Community Survey (ACS) 2018 5-year estimates via NHGIS.org [27].
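The CHI inclusion rule and the reported missing-data percentages can be sketched as follows. This is only an illustration: the function names, the candidate records, and the assumption that the centroid test and area share have already been computed in a GIS are all hypothetical; only the 20% threshold and the counts 34 of 211 and 3 of 60 come from the text.

```python
def include_zcta(centroid_in_city, share_of_area_in_city, min_share=0.20):
    """Inclusion rule from the text: centroid inside the city boundary,
    or more than 20% of the ZCTA's area inside the city."""
    return centroid_in_city or share_of_area_in_city > min_share


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Hypothetical candidate ZCTAs; both fields are assumed to be precomputed in a GIS.
candidates = [
    {"zcta": "A", "centroid_in_city": True, "share_in_city": 0.95},
    {"zcta": "B", "centroid_in_city": False, "share_in_city": 0.35},
    {"zcta": "C", "centroid_in_city": False, "share_in_city": 0.20},
    {"zcta": "D", "centroid_in_city": False, "share_in_city": 0.05},
]

included = [
    c["zcta"]
    for c in candidates
    if include_zcta(c["centroid_in_city"], c["share_in_city"])
]

nyc_missing_pct = percent(34, 211)  # NYC ZCTAs with no data
chi_missing_pct = percent(3, 60)    # CHI ZCTAs excluded for missing data
print(included, nyc_missing_pct, chi_missing_pct)
```

Because the rule is stated as "greater than 20%", a ZCTA with exactly 20% of its area inside the city and a centroid outside it is treated as excluded in this sketch.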

It is vitally important to note that these data may not necessarily represent the true distribution of SARS-CoV-2 cases because of biases in testing and extremely limited or differential testing and/or access [37]. Testing for SARS-CoV-2 has also resulted in a high percentage of false negatives. This is due to two factors: the rapid development of testing during an epidemic and the need to collect a sample from deep in the pharynx, which can be uncomfortable for the patient and is time consuming for the health care worker. False negatives are especially problematic since the infected individual can continue to spread the virus without being aware. There have also been instances of false positives being reported in health care settings due to the high viral load present in the environment [38]. These factors can decrease the reliability and accuracy of the data and make comparisons within and across geographical regions difficult.

Methods

Crude rates of reported SARS-CoV-2 cases were calculated by normalizing the number of diagnoses by the ZCTA population (cases per 1000 residents) and were mapped and analyzed using ArcGIS 10.7 (ESRI, Redlands, California). The Global Moran’s I based on ZCTA contiguity revealed clustering (z-scores = 5.14 and 13.2 for CHI and NYC, respectively). Both p values (< 0.001) suggest that it is highly unlikely for the clustered pattern to be a result of random chance. Hot spots of rates for each city were then calculated using the Getis-Ord (GI*) statistic parameterized using contiguity (i.e., ZCTAs sharing a boundary or corner). Resulting hot spots represent clusters of contiguous ZCTAs with higher values within the city (GI* was calculated once for NYC and once for CHI), whereas cold spots represent clusters of ZCTAs with low values. Clusters with ≥ 95% confidence were included in the analyses (Fig. 1). It is important to note that identifying statistical hot and cold spots is not the same as simply selecting the ZCTAs with the highest rates (e.g., top quartile). Unlike quantiles or other classification techniques that ignore the geographic proximity of one geographic unit to another, hot (or cold) spots require clusters of ZCTAs to have high (or low) values relative to the study area, and these clusters are accepted or rejected based on a significance value.
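The rate, Global Moran's I, and GI* steps described above can be sketched in plain Python. This is not the ArcGIS workflow used in the study: the 5 × 5 grid stands in for ZCTAs, the case and population values are invented, the function names are hypothetical, and the weights are assumed to be binary contiguity weights (shared edge or corner). The GI* z-score follows the commonly published Getis-Ord formula, in which each unit counts as its own neighbor; the Moran's I significance test is omitted.

```python
from math import sqrt


def queen_neighbors(n_rows, n_cols):
    """Contiguity (shared edge or corner) on a regular grid standing in for ZCTAs."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs[r * n_cols + c] = [
                rr * n_cols + cc
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
                if (rr, cc) != (r, c)
            ]
    return nbrs


def crude_rate(cases, population, per=1000):
    """Cases per 1000 residents."""
    return per * cases / population


def morans_i(x, nbrs):
    """Global Moran's I with binary contiguity weights."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s0 = sum(len(js) for js in nbrs.values())  # sum of all weights
    num = sum(dev[i] * dev[j] for i, js in nbrs.items() for j in js)
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def getis_ord_gi_star(x, nbrs):
    """GI* z-scores with binary weights; each unit is included in its own neighborhood."""
    n = len(x)
    mean = sum(x) / n
    s = sqrt(sum(v * v for v in x) / n - mean ** 2)
    z = []
    for i in range(n):
        idx = nbrs[i] + [i]
        w_sum = len(idx)  # sum of weights (equals sum of squared weights when binary)
        num = sum(x[j] for j in idx) - mean * w_sum
        den = s * sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        z.append(num / den)
    return z


# Invented data: a 3 x 3 block of high-rate cells in one corner of a 5 x 5 grid.
population = [2000] * 25
cases = [40 if (i // 5 < 3 and i % 5 < 3) else 10 for i in range(25)]
rates = [crude_rate(c, p) for c, p in zip(cases, population)]

nbrs = queen_neighbors(5, 5)
moran = morans_i(rates, nbrs)
gi = getis_ord_gi_star(rates, nbrs)
hot = [i for i, z in enumerate(gi) if z >= 1.96]    # >= 95% confidence, high values
cold = [i for i, z in enumerate(gi) if z <= -1.96]  # >= 95% confidence, low values
print(round(moran, 3), hot, cold)
```

In this toy grid, cells on the outer edge of the high-rate block have the same rate as cells in its interior but a lower GI* z-score, because their neighborhoods include low-rate cells. This mirrors the distinction drawn above between statistical hot spots and a simple ranking of rates.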

Fig. 1 Reported SARS-CoV-2 cases per 1000 ZCTA residents. Note: cities are shown at different scales. TOP LEFT: NYC crude rates in quintiles. TOP RIGHT: NYC hot spots and cold spots (GI*) at ≥ 95% confidence. BOTTOM LEFT: CHI crude rates in quintiles. BOTTOM RIGHT: CHI hot spots and cold spots (GI*) at ≥ 95% confidence

American Community Survey (ACS) data were mapped by ZCTA and linked with SARS-CoV-2 case rates and hot/cold spots for exploration and analysis (Fig. 2). ZCTA averages, stratified by hot/cold spot status, were calculated for variables of interest, which include (1) SARS-CoV-2 cases per 1000 residents, (2) total population, (3) population density, (4) average household size, (5) % of housing units with > 1 occupant per room, (6) % NH White, (7) % NH Black, (8) % Latinx/Hispanic, (9) % foreign-born, (10) % 65 years or older, (11) % of workers who commute using public transportation, (12) % of adults without a high school degree, (13) % of adults with a bachelor’s degree or higher, (14) % of residents earning under the federal poverty threshold, (15) median household income, (16) % of the civilian workforce that is unemployed, and (17) % of workers in management, business, science, and arts occupations.

Fig. 2 Selected ZCTA-level demographics for NYC and CHI in quintiles. Note: cities are shown at different scales. TOP ROW: % NH white. SECOND ROW: median household income (in 2018-adjusted dollars). THIRD ROW: population density (people per square mile). FOURTH ROW: % unemployed

SAS statistical software (version 9.4, SAS Institute, NC) was used to conduct all statistical analyses. Means were calculated across ZCTAs for both cities. A Wilcoxon two-sample test was used to calculate differences in demographic variables at the ZCTA level between the two cities. Due to the small sample size, the t-approximation was used to calculate p values.
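For readers without SAS, the Wilcoxon two-sample (rank-sum) comparison can be sketched with the Python standard library. This is an assumption-laden illustration, not the study's procedure: the function names and the household-size values are invented, and the p value here uses a normal approximation with a continuity correction. The t-approximation named above would, as we understand it, refer the same standardized statistic to a Student t distribution instead, which the standard library does not provide.

```python
from math import erfc, sqrt


def midranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_rank_sum(a, b):
    """Return (rank sum of a, z, two-sided normal-approximation p value)."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    ranks = midranks(list(a) + list(b))
    w = sum(ranks[:n1])
    mean_w = n1 * (n + 1) / 2
    # Tie-corrected variance; reduces to n1*n2*(n+1)/12 when there are no ties.
    var_w = n1 * n2 / (n * (n - 1)) * sum((r - (n + 1) / 2) ** 2 for r in ranks)
    diff = w - mean_w
    if diff > 0:       # continuity correction of 0.5 toward zero
        diff -= 0.5
    elif diff < 0:
        diff += 0.5
    z = diff / sqrt(var_w)
    p = erfc(abs(z) / sqrt(2))  # two-sided p value from the standard normal
    return w, z, p


# Invented ZCTA-level household sizes for two groups (not study data).
group_a = [3.0, 2.9, 3.1, 2.8, 3.2]
group_b = [2.0, 2.1, 2.2, 1.9, 2.3, 2.4]

w, z, p = wilcoxon_rank_sum(group_a, group_b)
print(w, round(z, 3), round(p, 4))
```

With samples as small as those in this analysis, the choice of approximation can matter, so results from this sketch should not be expected to match SAS output exactly.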

Results

Demographics in the NYC and CHI study areas are summarized and described using population-weighted averages (Table 1). The overall positive test rate on April 13 for NYC was 12.5 per 1000 residents, which was approximately four times greater than that of CHI. NYC and CHI were comparable in terms of median household income, percent non-Hispanic White, percent with a bachelor’s degree, and percent below the poverty line. The main differences between the two cities were in population density and percent foreign-born. Although NYC and CHI cover roughly the same geographic area (294 mi² and 262 mi², respectively), NYC has almost three times as many people per mi². The higher density is also observed in households, with the number of households with > 1 person per room in NYC being nearly three times that of CHI. The dissimilarities in density and urban landscape are also reflected in the percentage of individuals using public transportation, with 56.3% of NYC residents relying on public transportation compared with only 27.6% in CHI.

Demographics in the NYC and CHI hot spots were summarized and described using ZCTA averages (Table 2). There were notable differences between hot spots and cold spots in each city, as well as differences across study areas. The NYC hot spots included 31 ZCTAs, representing nearly 1.5 million people (~ 17.4% of the population in the NYC study area), and the CHI hot spots consisted of 8 ZCTAs with 445,000 residents (~ 15.8% of the CHI study area population). Hot spot neighborhoods in both cities tended to have lower proportions of non-Hispanic (NH) white residents, higher proportions of NH Black/African-American residents, a greater percentage of older residents, fewer college graduates, and lower proportions of workers in managerial occupations compared to cold spots or the “rest of city”.
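Table 1 uses population-weighted averages while Table 2 uses ZCTA averages, and the two can differ when ZCTA populations are unequal. The sketch below shows the distinction with two invented ZCTAs; the function names and all values are hypothetical.

```python
def zcta_average(values):
    """Unweighted mean: every ZCTA counts equally."""
    return sum(values) / len(values)


def population_weighted_average(values, populations):
    """Weighted mean: each ZCTA counts in proportion to its population."""
    return sum(v * p for v, p in zip(values, populations)) / sum(populations)


# Invented example: percent below the poverty line in two ZCTAs of unequal size.
pct_poverty = [10.0, 30.0]
populations = [30_000, 10_000]

unweighted = zcta_average(pct_poverty)
weighted = population_weighted_average(pct_poverty, populations)
print(unweighted, weighted)
```

Under these assumed values the ZCTA average is higher than the population-weighted average because the smaller ZCTA has the higher poverty rate, so figures in the two tables should be compared with that difference in mind.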

Table 2 Hot/cold spot characteristics of NYC and Chicago (ZCTA-level averages)

Spatial density is an important factor in the spread of communicable diseases. Hot spots in both cities had significantly larger household sizes compared to cold spots (NYC: 3.0 people per household in hot spots and 2.1 in cold spots; CHI: 2.8 people per household in hot spots and 2.0 in cold spots). However, hot spots were located in neighborhoods that were significantly less dense (NYC: 22,900 people per square mile in hot spots and 68,900 in cold spots; CHI: 10,000 people per square mile in hot spots and 23,400 in cold spots), and the proportion of housing units with more than one occupant per room was not significantly different (0.39 and 0.36 in NYC and CHI, respectively) between hot and cold spots. Additionally, there were lower proportions of public transportation commuters in both cities’ hot spots than cold spots, although the difference was statistically significant in NYC (p < 0.01) but not in CHI (p = 0.62). This is not reflective of public transportation use during the outbreak, but rather a pre-pandemic measure of “connectedness” or “centrality” of a neighborhood.

Some variables suggest different patterns between NYC and CHI. Poverty rates, for instance, are lower for both hot and cold spots compared to the rest of the city in NYC, whereas in CHI, the poverty rates are highest in the hot spots. Unemployment follows a similar trend, where the NYC rates are highest in the areas which are neither hot spots nor cold spots, but in CHI, the unemployment rates are by far the highest in the hot spots. Although median household income is highest in cold spots for both cities, in NYC, the median household income in hot spots is higher than the rest of the city, whereas in CHI, hot spot incomes are much lower than the rest of the city. Finally, the proportions of both foreign-born (p < 0.01) and Latinx (p < 0.01) residents are higher in NYC hot spots than cold spots (but hot spot values are similar to the rest of the city), whereas the opposite is true for Chicago, with lower proportions of foreign-born (p < 0.06) and Latinx (p = 0.12) residents in hot spots versus other parts of the city.

Conclusion and Discussion

In both Chicago and New York City, cold spots had a higher prevalence of social determinants of health characteristics typically associated with better health outcomes and the ability to maintain physical distance. These neighborhoods tended to be wealthier and to have higher educational attainment, higher proportions of non-Hispanic white residents, and more workers in managerial occupations. Hot spots in the two cities also had some similarities, such as lower rates of college graduates and higher proportions of people of color. However, there are some other findings which must be highlighted. For instance, in both cities, it is not the densest areas which appear to be most impacted by SARS-CoV-2, but rather the less-centralized, lower-density neighborhoods. In these two large US cities, it appears to be larger households (more people per household), rather than overcrowding or overall population density (which may be reflective of neighborhood socioeconomic status), that may be more strongly associated with geographic hot spots.

Perhaps most striking are the differences in the economic and racial composition of the hot spots between NYC and CHI. At this point in the epidemic, NYC has a mix of racial/ethnic neighborhoods comprising hot spots. For instance, the Staten Island hot spot in NYC is nearly 60% NH white, whereas the hot spot in Eastern Queens is less than 6% NH white (Figs. 1 and 2). In NYC hot spots overall, the ZCTA-level average shows approximately 27% of the population as NH white, 30% NH Black, and 29% Latinx/Hispanic. In Chicago, the hot spot ZCTAs are on average approximately only 4% NH white, 11% Latinx/Hispanic, and nearly 83% NH Black. Although in both cities NH white residents may be underrepresented in hot spots, Chicago shows the inequities much more clearly. Economic distinctions are even more stark. The population in NYC’s hot spots is, overall, middle income, with ZCTA-level average median household incomes around $70,000, which, although lower than the cold spots ($117,000), are higher than the rest of the city ($64,000). Conversely, the average median household income in Chicago’s hot spots is only $35,000, with cold spots and the rest of the city being $97,000 and $63,000, respectively. Poverty rates in NYC hot and cold spots were both around 13%, whereas the rate in the rest of the city was over 18%. Chicago, on the other hand, had hot spots with poverty rates of over 30%, which is higher than both cold spots (10%) and the rest of the city (18%). This is mirrored by unemployment rates, where the NYC hot spots had rates of under 7% compared to Chicago’s nearly 18%.

The NYC hot spots can generally be characterized as working-class and middle-income communities, perhaps indicative of a higher concentration of service workers and other occupations (including those classified as “essential services” during the pandemic) that may not require a college degree but do pay wages above poverty levels. Chicago’s hot spot neighborhoods are among the city’s most vulnerable, low-income neighborhoods with extremely high rates of poverty, unemployment, and NH Black residents.

It is important to note that this represents an ecological analysis and does not use individual-level data. The results characterize the neighborhoods (clusters of ZCTAs) and not necessarily the individuals living in those neighborhoods. Additionally, using hot spots, rather than an analysis of individual ZCTAs, is designed to detect spatially proximal groups of ZCTAs with high or low rates but may not include geographic outliers (e.g., a single ZCTA with high rates surrounded by low-rate ZCTAs). However, the analysis of ZCTA clusters, as opposed to individual ZCTAs, is able to more readily represent the influence of proximal neighborhoods on one another, which is relevant not only with respect to infectious disease but also population characteristics and behaviors [39]. The goal of this project is not to infer causation but rather to describe the nature of SARS-CoV-2 hot and cold spots in two large US cities. The information about the demographic and economic characteristics of hardest-hit areas may help inform more equitable future public health response strategies and direct resources to mitigate the impact of COVID-19 properly and preemptively. However, the differences found, although striking, may be at least partially a function of a number of factors, including possible outmigration (e.g., certain residents may have been able to leave the city as the epidemic began), potential bias in and extremely limited testing/reporting, and possible false positives/negatives. For instance, it is possible that the Staten Island cluster is a result of more aggressive testing practices in those neighborhoods compared to other areas with less social or political capital. The possibility of bias in access to testing is reflected in a number of other studies. For example, a California-based retrospective cohort analysis suggests that African American patients may be tested much later in the progress of the disease when compared to other races or ethnicities, potentially resulting in worse health outcomes [40]. A New York City-based economic study found that testing rates were equitably distributed across income ranges (e.g., the top and bottom 10% of incomes received 11 and 10% of tests, respectively); however, the proportion of those tests which were positive was heavily skewed towards lower-income communities, suggesting a greater need for testing in those areas (i.e., where there are more confirmed cases, there should be more testing) [41]. It is also important to note that this analysis is based on testing results and does not examine COVID-19-related hospitalizations or deaths. This study also conceptualized economic variables as median household income and poverty and did not explicitly examine economic disparities (e.g., the Index of Concentration at the Extremes [42]) within ZCTAs; however, this could be a meaningful variable for future work. Additionally, NYC and CHI are not only different in urban morphology and demographics but also may have been in different stages of the epidemic at the time we undertook our study. The cumulative SARS-CoV-2 rates, particularly when comparatively low as is the case in some Chicago ZCTAs, can change rapidly due to the dynamic nature of infectious disease spread as well as a hoped-for increase in thorough testing. However, it is clear that as of April 13, 2020, Chicago and New York City have some similarities, particularly with respect to possible “protective” factors, as well as important distinctions. Further study will be needed to determine if other cities, domestic or global, have comparable trends, and hypotheses will need to be generated and tested to attempt to identify associations as more complete and reliable data become available.