Evaluating spatial inequity in last-mile delivery: a national analysis

Purpose – Despite large bodies of research related to the impacts of e-commerce on last-mile logistics and sustainability, there has been limited effort to evaluate urban freight using an equity lens. Therefore, this study proposes a modeling framework that enables researchers and planners to estimate the baseline equity performance of a major e-commerce platform and evaluate equity impacts of possible urban freight management strategies. The study also analyzes the sensitivity of various operational decisions to mitigate bias in the analysis.
Design/methodology/approach – The model adapts empirical methodologies from activity-based modeling, transport equity evaluation, and residential freight trip generation (RFTG) to estimate person- and household-level delivery demand and cargo van traffic exposure in 41 U.S. Metropolitan Statistical Areas (MSAs).
Findings – Evaluating 12 measurements across varying population segments and spatial units, the study finds robust evidence for racial and socio-economic inequities in last-mile delivery for low-income and, especially, populations of color (POC). By the most conservative measurement, POC are exposed to roughly 35% more cargo van traffic than white populations on average, despite ordering less than half as many packages. The study explores the model's utility by evaluating a simple scenario that finds marginal equity gains for urban freight management strategies that prioritize line-haul efficiency improvements over those improving intra-neighborhood circulations.
Originality/value – Presents a first effort in building a modeling framework for more equitable decision-making in last-mile delivery operations and broader city planning.


Introduction
Spatial inequities are prevalent in urban freight systems. Satellite-based estimates suggest low-income populations of color (POC) in major U.S. cities are exposed to 28% more diesel traffic-related nitrogen oxide (NOx) emissions than high-income white populations (Demetillo et al., 2021). Despite a steady decrease in tailpipe emissions across high-income countries, diesel exhaust from freight vehicles remains deleterious to respiratory and heart

Spatial equity in last-mile delivery
The literature review draws methodological insights from three bodies of empirical work: activity-based demand modeling, transport equity evaluation, and residential freight trip generation (RFTG) (Section 2). Section 3 describes a four-phased methodology that calibrates a model utilizing public household travel survey data and estimates racial and socioeconomic distributions of last-mile home delivery activity in 41 U.S. Metropolitan Statistical Areas (MSAs). Section 4 presents the results, weighs the sensitivity of different operational decisions to mitigate bias, and demonstrates the model's utility by conducting a simple scenario evaluation across the model's two delivery segments: cargo van line-haul and a traveling salesperson problem (TSP). Section 5 discusses analytical limitations and practical applications for private and public actors, and concludes with how future research can expand inquiry.

Literature review
Modeling freight demand and trip generation can enable researchers to estimate baseline and projected impacts of proposed regulatory and/or operational strategies to better inform equitable transportation planning. It is especially applicable to last-mile delivery given e-commerce's growing influence on urban livability, form, and sustainability (see International Transport Forum, 2022 for an overview). Importantly, e-commerce has made households important freight agents in the supply chain; in particular, high-income and educated households (Newing et al., 2022; Sánchez-Díaz et al., 2021) and white populations, who order packages at greater frequencies (Figliozzi and Unnikrishnan, 2021). The result has shifted some commercial vehicle traffic away from traditional truck corridors and into residential neighborhoods, while simultaneously bringing warehousing and distribution facilities closer to urban centers to facilitate faster and more reliable deliveries (Fried and Goodchild, 2023). However, in their study of metro Seattle, Fried et al. (2024) find that it is not the population groups ordering the most packages that bear the lion's share of delivery's external costs (e.g. pollution), but the populations in proximity to warehouses and distribution centers, which are disproportionately low-income and POC.
As such, strategies seeking to improve urban delivery efficiency will have disparate impacts across population groups. It is, therefore, important that planners and practitioners have methods and data to spatially evaluate the distributional impacts of urban freight management strategies and policies. The following review summarizes the current state of demand modeling techniques in transport and urban freight literature and distills relevant insights for designing a modeling framework for evaluating equity in residential, last-mile delivery.

Activity-based models in transport equity evaluation
The most well-known modeling practice in transportation planning is activity-based travel demand models (ABMs). ABMs are spatial microsimulations that use household travel surveys to represent the effect of transport conditions on individual- or household-level travel choice and activity (Castiglione et al., 2014). ABMs have steadily superseded conventional four-step travel models as a principal tool regional planning agencies implement to forecast travel demand and evaluate impacts of future transport investments, due to ABMs' ability to capture individual travel capabilities that zonal, four-step models otherwise obscure (Martens and Hurvitz, 2011). For this reason, ABMs have been widely adapted to evaluate equity in proposed transport programs, with numerous authors identifying best practices for researchers and practitioners (e.g. Bills and Walker, 2017; Castiglione et al., 2006; Karner and Niemeier, 2013; Martens and Hurvitz, 2011; Williams and Golub, 2017). Distilling these practical guidelines, this study identifies the most relevant "operational decisions" when designing an equity evaluation: (1) Output metrics that define or proxy the distributed cost/benefit; (2) Equity indicators that identify inequitable outcomes, reflecting a reasoned moral standard; (3) Spatial units by which to aggregate output metrics; and (4) Population segmentation that defines socially marginalized "target" groups to compare against a control group.
In terms of output metrics, most equity-oriented ABMs assess socio-spatial distributions and shifts in accessibility (i.e. n opportunities accessed via t time budget and j modal choice) to proxy for activity participation and social inclusion (Allen and Farber, 2020). To a lesser extent, researchers use ABMs to evaluate transport's distributive environmental and safety impacts (Guo et al., 2020). In urban freight research, delivery vehicle miles traveled (VMT) has served as a base proxy for deriving many freight traffic-related externalities such as emissions, road deaths/injuries, or generalized costs (Cárdenas et al., 2017).
Most equity indicators reflect normative preference for absolute or "simple" equality, frequently captured by statistical difference in means tests (e.g. t-tests) between population segments or Gini indices between spatial units. Simple equality is sometimes referred to as the "default criterion for fairness" (Folger, 1996), compelling authors to argue the case when their reasoning normatively or empirically departs. Recent transportation planning literature, however, has adopted more nuanced, liberal egalitarian arguments (Pereira et al., 2017). For instance, Bills (2022) calculates individual difference densities in logsum accessibility (a relative, distribution-based indicator) to evaluate proportional and Rawlsian equity in proposed transportation improvements in the San Francisco Bay Area. Nahmias-Biran and Shiftan (2020) draw on a Capabilities Approach (CA) to transport, which centers freedom of choice and well-being over need-based access to resources, and derive a value of capabilities metric that monetizes accessibility gains, which can then be input into conventional cost-benefit analyses.
Some transport equity evaluations highlight the impact of the modifiable areal unit problem (MAUP), which can generate statistical biases based on the spatial unit selected for GIS analysis. In the U.S., spatial discrepancies manifest in the selection of Census-defined enumeration units, which range from transportation analysis zones (TAZs), to tracts, and then block groups. There is sizable debate within environmental justice research on the most representative spatial or distance-based unit for estimating harmful proximity to environmental hazards (Baden et al., 2007; Chakraborty et al., 2011), with transportation researchers observing statistical discrepancies in the strength, significance, and (occasionally) sign of correlations across scales. One example is the case of disparate socioeconomic and racial exposure to near-roadway traffic (Rowangould, 2013).
Population segmentation represents another source of variability. Analysts typically define target populations using cross-classified "social vulnerability" indicators (e.g. nonwhite, populations of color [POC] and/or low-income) and thresholds (e.g. incomes at x% below the regional mean), against a control group. Rowangould et al. (2016) compared public transport accessibility metrics across relative thresholds, absolute thresholds, population-weighting, and community-defined population segments and concluded that the latter two are the most useful in evaluating equity for current and future transport conditions.
However, in the absence of community knowledge, equity evaluations should at a minimum address geographic and social biases by employing multiple equity indicators, spatial units, and population segments. The process encourages analysts to be deliberate about operational decisions that could otherwise appear arbitrary, or worse, purposely selected to achieve desired results.

Residential freight trip generation (RFTG)
With methodologies stemming largely from passenger ABMs and four-step models, freight trip generation (FTG) models the complex logistical decisions made by commercial agents in response to economic activity (Holguín-Veras et al., 2012). While most FTG models concern business-to-business freight flows, residential FTG (RFTG) models have emerged more recently to understand e-commerce's influence on consumer shopping/travel habits and external net-impacts. Residential delivery constitutes a sizable fraction of total urban commercial freight traffic volume, although modeled estimates range from a quarter (Sakai et al., 2022), a third (Wang and Zhou, 2015), to half (Holguín-Veras et al., 2021) depending on geographic context and/or model sophistication.
Freight models abstract a large degree of heterogeneity in agent behavior (Sánchez-Díaz, 2017). Households are core agents in e-commerce supply chains, with consumer behaviors largely affected by socio-demographics, commodity type, local built environment, and accessibility to shopping opportunities (Beckers et al., 2022; Sakai et al., 2022; Wang and He, 2021). The proliferation of retail omni-channeling and automated collection and delivery points (CDPs) (e.g. electronic parcel lockers) also means many online orders are delivered outside the home, adding a spatial mismatch that is unaccounted for in many RFTGs. As such, freight trip models implemented without ground-truthed calibration and/or supplementary, qualitative knowledge can lead to ecological fallacies (McLeod et al., 2019).
Data availability has also presented challenges (Buldeo Rai and Dablanc, 2022). Many RFTG models rely on extensive field data collection efforts (e.g. Rodrigue, 2017) or proprietary sources, such as GPS traces or travel logs from logistics providers (e.g. Cárdenas et al., 2017). Although these data are important for fine-tuned understanding of local delivery activity, they are difficult to obtain and do not allow for national and comparative analyses.
Nevertheless, RFTG has contributed valuable insights for understanding consumer travel and ordering behavior. Jaller and Pahwa (2020) use publicly available American Time Use Survey data to evaluate emission trade-offs between in-person versus online shopping behaviors and conclude emission mitigation is more sensitive to delivery's operational characteristics than to simple substitution of one shopping behavior for another. Beckers et al. (2022) use socio-economic data and delivery pick-up location preference to forecast delivery trip frequency in a mid-sized Belgian city using an ordered logit model and a spatial microsimulation heuristic. Escand et al. (2018) use U.S. Postal Service (USPS) survey data to evaluate spatio-temporal mismatch in supply/demand of curbside parking in commercial districts and residential neighborhoods in New York City.
However, this study does not observe any freight modeling research that assesses social inequities. Most pertinent to this study is the work of Wang and Zhou (2015), who presented an early effort to model and spatially analyze RFTG using the publicly available National Household Travel Survey (NHTS). The following methodology details how this study supplements the authors' preliminary approach by connecting to ABM techniques and operational decisions embedded in transportation equity evaluation.

Methodology
This paper draws on ABM and RFTG techniques to estimate current and future social inequities in residential, last-mile delivery. As discussed in Section 2.1, operational decisions taken in the analysis can greatly influence observed inequities. Therefore, Figure 1 overviews the four-phased methodology, which builds off preliminary research by Fried et al. (2024). This paper adds to the previous study by formulating a replicable approach using predominantly open data/tools and expanding the analysis across 41 U.S. metropolitan statistical areas (MSAs). The approach additionally focuses on the operational decisions undertaken in the equity evaluation and explores their analytical impact.

Phase 1: delivery demand model calibration
Phase 1 concerns the calibration of the behavioral model that estimates person- and household-level demand for home delivery. The underlying dataset is the NHTS, conducted by the Federal Highway Administration in 2017. The survey asks respondents for the number of home deliveries received in the past 30 days. Wang and Zhou (2015) employed a nested binary choice and negative binomial (negbin) regression model to estimate residential freight trips at the ZIP code level in the New York State Capital District using the same dataset from 2009. Following Wang and Zhou, this study censors ordering frequency greater than 10 packages per month.
When observing monthly delivery frequency in the 2017 NHTS data, the most recent release at the time of the study, the authors find evidence of overdispersion (μ = 2.27; σ² = 8.85), confirming a negative binomial count distribution, and excessive zero frequencies (42.8%). As described by Sánchez-Díaz (2020), excessive zeros stem from two behaviors: a) the respondent does not order any parcels or b) order frequencies are so low that the respondent cannot recall the last delivery. A zero-inflated negative binomial (ZINB) model is, therefore, appropriate. ZINB models differ from conventional count models in that high incidences of zero outcomes are a result of two different processes: the logit-probability of an outcome occurring and its expected count frequency should the non-zero outcome occur. The mathematical relation follows:
Equation 1
where:
y_i = parcels received in last 30 days
xβ + ε_i = the linear predictor, in which β is a coefficient of independent variable x with standard error ε
α = dispersion parameter, with Γ the gamma function
Table 1 overviews the selected independent variables. The authors selected variables based on their prevalence as behavioral factors observed across e-commerce literature (Beckers et al., 2022, p. 301). The model inputs eight independent variables across three categories: household-level demographic constraints (income, size, and vehicle ownership), person-level demographic constraints (working age, sex, white/non-white race, and college-level education attainment), and a dummy indicator signifying high levels of urbanization. High urbanization signifies that a household occupies a tract with population density greater than 3,000 people per square mile, roughly one standard deviation (σ) above the mean (μ) population density in observed MSAs. NHTS also categorizes respondent incomes into 11 groupings at intervals ranging from "less than $10,000" to "$250,000 or more." The model reclassifies income into low (<$50,000), middle (>$50,000 and <$150,000) and high (>$150,000), which most closely resembles one σ below and above the μ household income.
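For reference, a zero-inflated negative binomial relation of the kind described above is commonly written as follows. This is the standard textbook (NB2) parameterization, offered as a sketch rather than as the authors' exact Equation 1. The symbols π_i (probability of an excess zero), z_i and γ (covariates and coefficients of the logit portion), and μ_i (expected count of the negbin portion) are notation assumed here.

```latex
\begin{aligned}
P(y_i = 0) &= \pi_i + (1-\pi_i)\,(1+\alpha\mu_i)^{-1/\alpha},\\
P(y_i = k) &= (1-\pi_i)\,
  \frac{\Gamma(k+1/\alpha)}{\Gamma(k+1)\,\Gamma(1/\alpha)}
  \left(\frac{1}{1+\alpha\mu_i}\right)^{1/\alpha}
  \left(\frac{\alpha\mu_i}{1+\alpha\mu_i}\right)^{k},
  \qquad k = 1, 2, \dots\\
\mu_i &= \exp(x_i\beta + \varepsilon_i), \qquad
\pi_i = \frac{\exp(z_i\gamma)}{1+\exp(z_i\gamma)}
\end{aligned}
```

Under this form the expected number of deliveries is (1 − π_i)μ_i, so a covariate can lower demand either by raising the zero probability π_i or by lowering the count mean μ_i.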
Table 2 compares the coefficients (β) and standard error outputs from both the negbin and ZINB models. Using both maximum log-likelihood (LL) and Akaike Information Criterion (AIC) testing, the authors determine that the ZINB model parameters possess a stronger fit than the negbin model. In the ZINB, all coefficients proved statistically significant at Pr(>|z|) < 0.01, except the high urbanization indicator in the negbin portion, which had weaker significance. The results confirm the inclusion of all person, household, and urbanization variables in both ZINB portions.
Income, race, working age, and college attainment proved particularly strong determinants in both expected frequency and zero probability of package ordering. For instance, the expected log change in ordering frequency would decrease by 0.39 for low-income households compared to high-income households, holding all other variables constant, while the log odds for an excessive zero would increase by 1.3. In other words, wealthier households and white, working-age, and college-educated individuals are more likely to order a package and at higher frequencies. The authors also tested interactions between these four variables, but found nominal LL improvements, low significance in the interaction terms, and no evidence for multicollinearity (using variance inflation factor testing). As such, the model does not include interaction terms.
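The log-scale coefficients quoted above can be easier to read once exponentiated. The short sketch below converts the two reported values (−0.39 and 1.3) into a rate ratio and an odds ratio; the function names are illustrative and not part of the study's code.

```python
import math


def rate_ratio(count_coef: float) -> float:
    """Multiplicative change in expected count implied by a log-scale coefficient."""
    return math.exp(count_coef)


def odds_ratio(logit_coef: float) -> float:
    """Multiplicative change in odds implied by a log-odds coefficient."""
    return math.exp(logit_coef)


# Coefficients reported in the text for low- versus high-income households.
low_income_rate_ratio = rate_ratio(-0.39)
low_income_zero_odds_ratio = odds_ratio(1.3)

print(f"Expected ordering frequency multiplier: {low_income_rate_ratio:.3f}")
print(f"Excess-zero odds multiplier: {low_income_zero_odds_ratio:.3f}")
```

Read this way, the count portion implies roughly one-third fewer expected deliveries for low-income households, while the odds of an excess zero are between three and four times higher, all else held constant.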

Phase 2: population synthesis and demand model application
The ZINB is calibrated using person- and household-level responses, while demographic data are provided zonally. Therefore, this study synthesizes a population to derive residential freight trips from the underlying behavioral model. While there are several heuristic algorithms that synthesize persons within households within zones, this study opts for an entropy maximization-based approach. PopulationSim is an open-source package that synthesizes population weights while preserving the distribution of initial household weights, the base entropy, in the Public Use Microdata Sample (PUMS) data. The tool then employs list balancing to match the geographic constraints. PopulationSim's advantage is that it allows for relatively low deviations between the distribution of synthesized person/household variables, the seeded weights, and marginal controls. Moreover, PopulationSim has been widely adopted by transportation agencies for initializing regional ABMs and informing long-term mobility plans.
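The idea of reweighting seed households so that weighted totals match zonal marginal controls can be illustrated with a much simpler technique than PopulationSim's entropy-maximizing list balancing: plain iterative proportional fitting (raking) over a household list. The sketch below is not PopulationSim's algorithm or API; the seed records, control totals, and function name are hypothetical.

```python
def rake_weights(households, controls, iterations=50):
    """Iteratively scale household weights to match marginal control totals.

    households: list of dicts with a seed 'weight' and categorical attributes.
    controls: {attribute: {category: target_total}} for one zone (hypothetical).
    """
    weights = [h["weight"] for h in households]
    for _ in range(iterations):
        for attr, targets in controls.items():
            # Current weighted total for each category of this attribute.
            totals = {cat: 0.0 for cat in targets}
            for h, w in zip(households, weights):
                totals[h[attr]] += w
            # Scale each household by target / current for its category.
            weights = [
                w * targets[h[attr]] / totals[h[attr]] if totals[h[attr]] > 0 else w
                for h, w in zip(households, weights)
            ]
    return weights


# Hypothetical seed sample: one record per income-by-size combination.
seed = [
    {"weight": 1.0, "income": "low", "size": "1-2"},
    {"weight": 1.0, "income": "low", "size": "3+"},
    {"weight": 1.0, "income": "high", "size": "1-2"},
    {"weight": 1.0, "income": "high", "size": "3+"},
]
# Hypothetical zonal controls (households per category).
zone_controls = {
    "income": {"low": 60.0, "high": 40.0},
    "size": {"1-2": 70.0, "3+": 30.0},
}

weights = rake_weights(seed, zone_controls)
low_income_total = weights[0] + weights[1]
print([round(w, 2) for w in weights], round(low_income_total, 2))
```

Raking matches the marginals but, unlike an entropy-based list balancer, offers no explicit control over how far the final weights drift from the seed distribution, which is the property the text credits to PopulationSim.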

IJPDLM
The population synthesis processes inputs from spatial units (PUMAs, tracts, and block groups) contained within the 41 sampled MSAs' boundaries. The analysis selects MSAs based on the presence of at least one Amazon last-mile delivery station (LMDS) that was open in 2017 (the NHTS sample year).
After dropping NAs [1], the synthesized population contains 107.4 million persons in 52.7 million households. The authors then apply the ZINB to estimate delivery demand at the individual level, which can be assigned and aggregated to various spatial units. On average, white individuals ordered 44% more packages per month than non-white individuals. Additionally, individuals belonging to high-income households ordered 45% and 174% more packages than their middle- and low-income counterparts, respectively.

Phase 3: RFTG and VMT estimation
This methodology expands on Wang and Zhou (2015) by assigning residential freight trips to the road network, which is critical in transport equity analysis given the high correlation between highway proximity and socially marginalized populations (Rowangould, 2013). This study uses the OpenStreetMap network.
Figure 2 overviews the 41 sampled MSAs, including select demographic characteristics and number of Amazon LMDS. LMDS facility location information was the only proprietary data source used, which the authors purchased from MVPWL International, a third-party logistics firm. LMDS are medium-sized facilities (or "depots") that transload inbound trucks and outbound cargo vans that complete home deliveries. Trucks originate further up the urban distribution chain at regional sortation centers, air cargo hubs, or warehouses. Since there is greater uncertainty regarding warehouse origin, this study only analyzes the depot-to-consumer portion of Amazon's distribution chain. Since the company is the preeminent online retailer in the U.S., Amazon's logistical decisions are assumed to be representative across the e-commerce space.

The analysis assumes home demand is fulfilled by the closest depot. As such, LMDS locations serve as the origin points for a local depot vehicle routing problem (LDVRP). Goodchild et al. (2017) provide an approximated LDVRP formula (Equation 2), which is the sum of two segments: (1) A two-way line-haul distance (d) between the depot and spatial unit (i) centroid and (2) A traveling salesperson problem (TSP) distance for a given square-shaped, spatial unit area (A).
Equation 2
where:
k = constant approximating network geometry (0.92 for Manhattan distance)
D = number of customer households or residential freight trips
v = the mean cargo van capacity (i.e. load factor)
VMT is then aggregated by coinciding spatial unit, discussed in the following section. The analysis sets a static van load factor (v) to 175 packages, which is about 70% of the carrying capacity of an Amazon cargo van assuming the maximum permitted weight of large standard boxes (∼14 pounds). While the load factor is likely an underestimation, Fried et al. (2024) note in their analysis that equity indicators are not highly sensitive to fluctuations in cargo van load factors.
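From the definitions above, the two-segment approximation takes the familiar continuous-approximation form sketched below. This reconstruction is an assumption based on the listed symbols (d, k, A, D, v) and the verbal description of the two segments; it is not copied from the source's Equation 2, and the subscript i indexing spatial units is added here for clarity.

```latex
\mathrm{VMT}_i \;\approx\;
  \underbrace{2\, d_i\, \frac{D_i}{v}}_{\text{line-haul}}
  \;+\;
  \underbrace{k \sqrt{A_i\, D_i}}_{\text{TSP}}
```

In this reading, D_i / v is the number of van loads needed to serve unit i, each incurring a round trip of 2d_i between the depot and the unit centroid, while the second term approximates the local tour length among D_i stops spread over area A_i. The stated load factor also implies a full van capacity of about 175 / 0.70 = 250 packages.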
Applying the demand model generates an estimate for deliveries/packages (y), which does not necessarily equate to residential freight trips (D). Delivery drivers make several trips during a tour and may deliver several packages during a trip, e.g. for customers residing in an apartment complex. Therefore, this study uses the urbanization indicator to estimate housing unit size for areas of higher and lower population density, which is then used as a conversion factor between delivery demand and freight trips.
If population density for a spatial unit is less than 3,000 people per sq. mi, the authors assume a one-to-one equivalency between delivery and trip (D = y × 1). That is, most nonurban customers live in single-family homes. Populations living in more urbanized areas are more likely to reside in apartment complexes. However, MSA populations ranged dramatically from New York City (19.9 million) to North Port/Sarasota, FL (732,300), leading to a large variance in average urbanized housing unit size. The study uses the modal urbanized housing complex size in the synthesized population, two units. The conversion factor for urbanized spatial units equals the inverse (D = y × 0.5). However, as Wang and Zhou (2015) note, the usage of conversion factors is likely to underestimate the number of freight trips, and more localized applications of RFTG should adapt the conversion factor to local housing density conditions.
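The conversion rule and the two delivery segments can be combined in a few lines. The sketch below is a minimal illustration under stated assumptions: it uses the constants quoted in the text (3,000 people per sq. mi, a load factor of 175, k = 0.92) and assumes the common continuous-approximation form, line-haul = 2·d·D/v and TSP = k·√(A·D), which may differ in detail from the study's own equation. Function names and the example inputs are hypothetical.

```python
import math

URBAN_DENSITY_THRESHOLD = 3000.0  # people per sq. mi (from the text)
VAN_LOAD_FACTOR = 175.0           # packages per van, v (from the text)
K_MANHATTAN = 0.92                # network geometry constant, k (from the text)


def freight_trips(deliveries: float, pop_density: float) -> float:
    """Convert estimated deliveries (y) into residential freight trips (D).

    Units below the density threshold use D = y * 1; all others use D = y * 0.5.
    Treating a density of exactly 3,000 as urbanized is an assumption.
    """
    factor = 1.0 if pop_density < URBAN_DENSITY_THRESHOLD else 0.5
    return deliveries * factor


def ldvrp_vmt(deliveries, pop_density, linehaul_miles, area_sq_mi):
    """Return (line_haul_vmt, tsp_vmt) under the assumed approximation."""
    trips = freight_trips(deliveries, pop_density)
    line_haul = 2.0 * linehaul_miles * trips / VAN_LOAD_FACTOR
    tsp = K_MANHATTAN * math.sqrt(area_sq_mi * trips)
    return line_haul, tsp


# Hypothetical urbanized block group: 700 monthly deliveries, 10 miles from its depot.
line_haul, tsp = ldvrp_vmt(
    deliveries=700.0, pop_density=5000.0, linehaul_miles=10.0, area_sq_mi=1.0
)
print(f"line-haul VMT: {line_haul:.1f}, TSP VMT: {tsp:.1f}")
```

Keeping the two segments separate, rather than returning only their sum, mirrors how the later evaluation varies line-haul and TSP efficiency independently.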
Finally, the study calculates line-haul distances (d) using R5R, an open source, R-based network analysis package developed by Conveyal.

Phase 4: equity evaluation
Phase 4 relates to the operational decisions this study undertakes to evaluate racial and socioeconomic equity. The study generates measurements across geography and populations while controlling for geographic (i.e. MAUP) and social biases and weighs the sensitivities of the operational decisions against each other. The evaluation makes the following operational decisions:

(1) Output metric is delivery vehicle miles traveled (VMT). This study normalizes VMT by the area of the coinciding spatial unit (mi²).
(2) Equity indicators utilize statistical tests to define inequitable outcomes. This study adopts a default, simple egalitarian ethical standard, i.e. an outcome is inequitable if there exists any statistical difference across space and/or population groups.
(3) Population segmentation defines social marginalization as low-income (low) and nonwhite, POC based on three thresholds. However, rather than compare the target to a singular control group, which obscures ordinal differences between incomes, the evaluation compares population groups with each other. The thresholds include:
    Relative threshold: one σ below the mean (μ) for income and more than the mean share (μ%) of nonwhite population, relative to the MSA. The relative threshold creates six groups (LowPOC, LowW, MidPOC, MidW, HighPOC, HighW).
    Absolute threshold: less than 200% of the federal poverty line for a two-person household ($32,480 in 2017) and with more than 40% POC population (roughly the national μ% in 2017). The poverty threshold is often used to determine qualification for most public welfare programs. The absolute threshold creates four segments (povertyPOC, povertyW, nonpovertyPOC, nonpovertyW).
    Population-weighted: uses the synthesized population and constraint thresholds defined during Phases 1 and 2. Rather than comparing populations between spatial units, as the threshold-based segments do, population weighting compares delivery exposure between individuals. Population-weighting creates similar groupings to the relative threshold.
(4) Spatial units include both tracts and block groups. The evaluation tests two spatial proximity estimates:
    Unit hazard coincidence (UHC): a spatial intersection between delivery routes and coinciding spatial units.
    Areal containment (AC): a distance-based proximity estimate that joins spatial units if 50% of their area is contained within a buffer distance of a delivery route. Mohai and Saha (2015) present evidence for this approach's merits, including using a 3 km (1.86 mile) containment buffer that this study also adopts.
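Two of the rules listed above are simple enough to express directly: the absolute-threshold segmentation and the areal containment test. The sketch below is illustrative only. It assumes a spatial unit is summarized by a single representative income value and a POC population share, and that "50% of their area" means at least 50%; neither detail is specified in the text, and the function names are hypothetical.

```python
POVERTY_INCOME_THRESHOLD = 32480.0  # 200% of the 2017 federal poverty line, two-person household
POC_SHARE_THRESHOLD = 0.40          # roughly the national mean POC share in 2017
AC_MIN_AREA_SHARE = 0.50            # share of unit area that must fall inside the buffer


def absolute_segment(unit_income: float, poc_share: float) -> str:
    """Assign a spatial unit to one of the four absolute-threshold segments."""
    income_label = "poverty" if unit_income < POVERTY_INCOME_THRESHOLD else "nonpoverty"
    race_label = "POC" if poc_share > POC_SHARE_THRESHOLD else "W"
    return income_label + race_label


def areal_containment(area_inside_buffer: float, total_area: float) -> bool:
    """Return True if enough of the unit's area lies within the route buffer."""
    return area_inside_buffer / total_area >= AC_MIN_AREA_SHARE


# Hypothetical block groups.
print(absolute_segment(unit_income=30000.0, poc_share=0.55))  # povertyPOC
print(absolute_segment(unit_income=80000.0, poc_share=0.10))  # nonpovertyW
print(areal_containment(area_inside_buffer=0.6, total_area=1.0))  # True
```

In practice the buffered area would come from a GIS overlay of the 3 km route buffer with each unit's polygon; only the decision rule is shown here.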
Since the methodology's utility derives from its applicability to evaluating equity in urban freight improvement scenarios, the secondary evaluation presents a cross-sectional analysis in which the study tests improvements in efficiencies for both the line-haul and TSP delivery segments. The study selects the most conservative measurement to utilize in the scenario evaluation.

Results
The operational decisions generate 12 measurements based on combinations of three population segments (relative threshold [RT], absolute threshold [AT], and population weighting [PW]) and four spatial units (tracts and block groups [bg] via unit hazard coincidence [uhc] and areal containment [ac] spatial joins). The following section compares the analytical sensitivities of spatial unit and population segment selection. The LDVRP's treatment of line-haul and TSP delivery segments allows for useful applications to estimate the equity impact of urban freight management strategies for both public and private actors. As such, this section also presents a scenario evaluation that compares efficiency improvements to both segments.
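The 12 measurements are the full cross of the three segmentation schemes with the two enumeration units and two join types. A short sketch of that enumeration follows; the dotted label format mirrors names used in the text such as PW.bg.uhc and RT.tract.ac, while the variable names are illustrative.

```python
from itertools import product

segments = ["RT", "AT", "PW"]   # relative threshold, absolute threshold, population weighting
units = ["tract", "bg"]         # Census tracts and block groups
joins = ["uhc", "ac"]           # unit hazard coincidence, areal containment

# One label per combination, e.g. "PW.bg.uhc".
measurements = [f"{s}.{u}.{j}" for s, u, j in product(segments, units, joins)]

print(len(measurements))
print(measurements)
```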

Sensitivity of spatial unit selection
Spatial unit selection affects the observed distribution of cargo van VMT density across delivery segments. Figure 3 presents the histogram of the natural log-scaled output metric across tracts and block groups. AC and UHC spatial join types impact observed exposure to delivery's line-haul segment, whereas spatial geometries do not influence TSP distances. The line-haul's dependence on depot and road network geometries creates a large skew in the observed distribution. Spatial units near depot locations and/or major roads show substantially higher near-route exposure to line-haul VMT, while non-proximal units exhibit zero or near-zero exposure. Line-haul values of zero signify that all delivery trips contained within the spatial unit complete a delivery in said unit.
Aside from skew, line-haul shows substantially higher variance compared to TSP, which presents a normal distribution. As expected, TSP only accounts for a small fraction of total delivery VMT density compared to line-haul. Across spatial units and join types, TSP VMT constitutes less than 0.1% of total delivery VMT, with one exception. For block groups with join type UHC (bg.uhc), TSP constitutes 29.7% of total VMT. The explanation for this higher share is again geometric: bg.uhc spatially captures far fewer line-haul routes than larger tracts and the wider spatial join, AC.
High incidences of zero values and line-haul and TSP discrepancies result from the spatial aggregation of delivery routes. AC catches more VMT than UHC, since under UHC a delivery route can coincide with only one unit at a time, whereas an AC buffer can contain multiple units. However, the empirical gap between UHC and AC joins is substantially larger for block groups than for tracts. The percent difference between the two line-haul averages for tracts and block groups is roughly 75% and 200%, respectively. The mean line-haul VMT density for UHC joins is also substantially lower for block groups than for tracts, and its variance is smaller. The mean line-haul density for bg.uhc is only 110% higher than its respective TSP density, while its median value is much smaller. In other words, bg.uhc mitigates the discrepancy between line-haul and TSP VMT, while presenting fewer extreme outliers and allowing for more stable comparisons across scenarios.

Sensitivity and significance of population segmentation
This subsection introduces population segmentation to the spatial metrics to explore statistical differences between target and control groups. Figure 4 presents the socio-spatial distribution of total VMT per sq. mi across the 12 measurements. The figure orders the boxes from highest to lowest median, given the skew in the data, although the mean is also visualized. The target group median (lowPOC for RT and PW, povertyPOC for AT) is larger than all control groups in 11 out of 12 measurements.
The exception is the population-weighted bg.uhc measurement (PW.bg.uhc), in which lowPOC ranks behind midPOC and then highPOC. This discrepancy links to the higher TSP share inherent to bg.uhc, which this study dissects further in the scenario evaluation. For now, it is important to note that PW.bg.uhc suggests that POC are still exposed to more cargo van VMT density than white populations, regardless of income. For PW.bg.uhc, POC are exposed to roughly 20% and 35% higher median and mean VMT density, respectively, than their white counterparts.
Table 3 displays the percent change in medians between target and control groups and tests its significance. This study uses a non-parametric ANOVA by ranks test (Kruskal-Wallis H-test), in which the null hypothesis suggests no significant difference between group medians. All measurements rejected the null hypothesis, justifying the use of a post-hoc Dunn's test to compare differences between target and control subgroups.
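To make the rank-based test concrete, the sketch below computes the Kruskal-Wallis H statistic, with the usual tie correction, in plain Python. It is an illustrative implementation of the standard formula, not the study's code; the group labels echo those in the text, and the sample values are invented for demonstration. The statistic is compared against a chi-square distribution with (number of groups − 1) degrees of freedom.

```python
from collections import defaultdict


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    groups: {label: list of observations}, each list non-empty.
    Assumes the pooled observations are not all identical.
    """
    pooled = sorted(
        (value, label) for label, values in groups.items() for value in values
    )
    n = len(pooled)
    rank_sums = defaultdict(float)
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for _, label in pooled[i:j]:
            rank_sums[label] += avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[label] ** 2 / len(values) for label, values in groups.items()
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n**3 - n)
    return h / correction


# Invented log-scaled VMT densities for three population groups.
sample = {
    "lowPOC": [9.1, 8.4, 7.7],
    "midW": [5.2, 6.0, 4.9],
    "highW": [2.1, 3.3, 1.8],
}
h_stat = kruskal_wallis_h(sample)
print(f"H = {h_stat:.2f}")
```

With three groups the reference distribution has two degrees of freedom, for which the 5% critical value is about 5.99; a larger H would reject the null and motivate pairwise post-hoc comparisons such as Dunn's test.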
The largest differences occur between the target and highW/nonpovertyW groups. Among RT/PW segments, these percent differences ranged from roughly 735% (RT.tract.ac) to 14.1% (PW.bg.uhc). Meanwhile, AT percent differences ranged from 727% (AT.tract.ac) to 20% (AT.bg.uhc). Generally, AC captured the widest relative differences between target and control groups compared to UHC. The results imply that AC methods are more sensitive to detecting spatial inequities, as observed by other scholars (Mohai and Saha, 2015). Moreover, the post-hoc Dunn's test confirms statistically significant differences between most target and control groups.

Scenario evaluation exploring efficiency improvements in line-haul and TSP delivery segments
As observed spatial inequities varied largely by the share of line-haul and TSP VMT across spatial units and population segments, the study proceeds with a scenario evaluation that explores the equity impacts of altering both delivery segments. The analysis utilizes PW.bg.uhc, as it presents a more conservative estimate across spatial units and population segments. The analysis constructs a simple scenario in which efficiencies of the cargo van TSP and line-haul segments of the LDVRP are improved by 25% (scenario 1 and scenario 2, respectively). Since line-haul distances constitute a substantially higher proportion of total VMT, the evaluation compares percent differences between the two scenarios rather than raw VMT savings. Moreover, comparing individual difference densities grants nuanced insights into the distribution of impacts across target/control groups (Bills, 2022; Bills and Walker, 2017).
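The arithmetic behind the two scenarios can be illustrated with a short sketch. It assumes that a 25% efficiency improvement translates directly into a 25% reduction in that segment's VMT, which is an interpretation rather than something the text specifies; the function name, unit labels and VMT densities are hypothetical. Under that assumption, the percent change in total VMT density for a spatial unit equals 25% multiplied by the improved segment's share of total VMT, which is why units with a larger line-haul share gain more from scenario 2.

```python
def pct_change_total(tsp_vmt, linehaul_vmt, tsp_gain=0.0, linehaul_gain=0.0):
    """Percent change in total VMT density after reducing each segment's VMT.

    Assumes an efficiency gain of g removes a fraction g of that segment's VMT.
    """
    baseline = tsp_vmt + linehaul_vmt
    improved = tsp_vmt * (1 - tsp_gain) + linehaul_vmt * (1 - linehaul_gain)
    return 100 * (improved - baseline) / baseline


# Hypothetical (TSP, line-haul) VMT densities for three spatial units.
units = {
    "unit_a": (20.0, 80.0),  # line-haul dominated
    "unit_b": (40.0, 60.0),
    "unit_c": (50.0, 50.0),  # evenly split
}

results = {}
for name, (tsp, linehaul) in units.items():
    scenario_1 = pct_change_total(tsp, linehaul, tsp_gain=0.25)       # TSP only
    scenario_2 = pct_change_total(tsp, linehaul, linehaul_gain=0.25)  # line-haul only
    results[name] = (scenario_1, scenario_2)
    print(name, scenario_1, scenario_2)
```

Because each unit's two percent changes always sum to -25% under this assumption, comparing the scenarios amounts to comparing segment shares across population groups.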
Figure 5 displays the cumulative distribution of percent change in total VMT density across the two scenarios. Since Section 4.2 finds statistically significant differences between POC and white populations, regardless of income, the evaluation redefines the target group as all POC individuals. Slightly less than half of the target and control populations benefit equally from scenario 1. Since TSP VMT is normally distributed and line-haul VMT constitutes a greater share of total VMT, the control group receives higher marginal benefit from scenario 1. The inverse is true for scenario 2. Roughly half of the target group population receives outsized benefit from line-haul efficiency improvement. Table 4 validates mean differences in percentage change of VMT density across scenarios and groups using a Welch's t-test. The target group's average percentage change is only 3.4% lower than the control group's in scenario 1, but 16.7% higher in scenario 2. Both differences are statistically significant: the test rejects the null hypothesis that VMT savings for target and control populations are equal. However, while the target group's mean percentage change equates to 7.4 delivery miles saved per sq. mile per person per month in scenario 1, scenario 2 equates to 75.7 miles saved. While TSP and line-haul improvements create equal benefits for many, the scenario evaluation confirms that POC receive significant, outsized benefits from strategies that improve delivery line-haul efficiencies compared to those that only improve TSP efficiencies.
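For readers unfamiliar with Welch's t-test, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom for two groups with unequal variances. It is an illustration under stated assumptions rather than the study's code: the function name and the per-person percent changes are hypothetical toy values. The p-value is left out because it requires the Student t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each mean, using sample (n - 1) variances.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical percent changes in VMT density under one scenario (toy values).
target_change = [-18.0, -20.0, -22.0, -19.0, -21.0]
control_change = [-15.0, -17.0, -16.0, -18.0, -14.0]

t_stat, dof = welch_t(target_change, control_change)
print(t_stat, dof)
```

With SciPy available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns the p-value.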

Discussion and conclusion
Evaluating 12 measures across varying population segments and spatial units, the study finds robust statistical evidence for racial and socio-economic inequities in last-mile deliveries for low-income and, especially, POC. By the most conservative estimate, POC are exposed to approximately 35% more cargo van traffic than white populations, despite ordering 44% fewer packages on average. Differences between indicators are substantial enough that researchers should take care when making operational decisions that evaluate equity. For instance, the median difference in observed traffic exposure between low-income POC and high-income white populations is up to 50 times higher for the most sensitive measurement compared to the most conservative one.
While the analysis mitigated typical geographic and social biases found in equity evaluation, such as MAUP and those arising from population segmentation, some limitations persist. First, analyzing inequities in cargo van distribution magnifies a highly visible sliver of a regionally interconnected and often opaque supply chain. Middle-mile commodity flows between regional logistics facilities are largely under-studied in urban freight literature (Tejada and Conway, 2022). Considering many warehouses and distribution centers have historically located and continue to open in socially marginalized neighborhoods (Yuan, 2018a, b), excluding inter-terminal truck flows obscures further observations of spatial inequity.
Data availability also constrained the study to 2017. U.S. online sales have grown by approximately 80% since then (U.S. Department of Commerce, 2023). In addition to pushing consumer shopping toward online and omni-channel retail, COVID-19's lingering behavioral effect on consumption also impacted the spatial ordering of urban logistics land uses (e.g. the emergence of "dark stores" and "micro-fulfillment centers"). Meanwhile, Figliozzi and Unnikrishnan (2021) find online shopping has not increased equally across populations: wealthier and white households have ordered far more since the pandemic's start than low-income households and POCs. This study did not attempt to project demand for future years. As such, it is possible that the model underestimates observed spatial inequities in home delivery. Moreover, future studies can and should test for modeled uncertainties not captured in this study's deterministic approach.

Application to private operations and public policy
The scenario evaluation demonstrates that efficiency improvements to delivery's line-haul segment present outsized or higher marginal equity benefits than improvements to the TSP segment (see Figure 6). Enacting public policy and/or private operations that redistribute or reduce traffic near warehouses and between delivery zones (i.e. distribution-oriented solutions) would have greater benefits for low-income households and POCs than those that consolidate delivery demand inside neighborhoods (i.e. consumer-oriented solutions), such as curbside management and short-range cargo e-bicycles. Since these latter strategies require high consumer density to be cost-effective and operationally efficient (Katsela et al., 2022), they are usually not implemented outside denser residential neighborhoods with more frequent online shoppers. The model also suggests zoning and development of warehouses and other logistics land uses present mixed efficiency and equity concerns based on their proximity to populations. Holguín-Veras et al. (2021) present a case in Albany (New York) where the siting of an e-commerce distribution center in an exurban area added 800,000 more VMT per year than an alternative proposed site located closer to the city center. While it is possible that the exurban location avoided higher population concentrations, the siting accepted operational inefficiencies as a trade-off, creating externalities region-wide and for populations residing along the highway corridor. More centrally located urban distribution centers can build infrastructure to buffer nearby communities from negative externalities associated with "proximity logistics" (Buldeo Rai et al., 2022).
However, if new facilities continue to open in socially marginalized neighborhoods, the model would observe less marginal gain in spatial equity. One application of this model would be to evaluate the distributional impact of changes in logistics land use, such as through proactive zoning policies or by integrating delivery depots into the urban fabric closer to more consumer-dense neighborhoods. Parcel lockers are another example of this geographic trade-off. Parcel lockers reduce delivery VMT by eliminating the door-to-door milk run portion of the delivery tour. However, Schaefer and Figliozzi (2021) find that parcel lockers follow a consumer-oriented distribution pattern that leaves cold spots in Portland's (Oregon) socio-economically marginalized and predominantly Hispanic neighborhoods. Evaluating spatial equity in such a system would show traffic reductions in wealthier and predominantly white neighborhoods, while leaving traffic in the former neighborhoods virtually untouched.

Conclusion
The rise of e-commerce paralleled growing recognition of the need to sustainably manage and decarbonize urban distribution. But by failing to consider spatial equity, companies and cities risk further entrenching historical injustices and excluding socially marginalized communities from the low-emission benefit that many innovative and sustainable urban freight practices intend to provide. While these practices can benefit all populations, solutions that prioritize efficiency improvements in upstream distribution have outsized benefits for marginalized populations. However, the subsequent discussion demonstrates that equitable urban freight practice carries nuanced geographic implications based on where and how logistics land uses spatially organize in metros.
This paper presents a first, systematic modeling approach to informing more equitable and data-driven urban freight policy and operations. The authors designed the methodology with replicability and scenario planning in mind, relying on public household travel data and adapting activity-based modeling (ABM) techniques. However, future work should further define model and evaluation parameters to meet local social conditions and needs. Researchers should calibrate localized models with ground-truthed traffic and survey data to mitigate the occurrence of ecological fallacies. Finally, public participatory approaches and citizen science can help ensure that evaluation outcomes reflect community priorities.
Figure 1. Procedural overview of four-phased methodology
Figure 2. Sampled MSAs including select demographic information and number of Amazon last-mile delivery stations (LMDS)
Figure 3. Histograms of natural log-scaled cargo van VMT density by spatial unit and delivery segment, with smoothed density curve
Figure 4. Boxplot matrix presenting the socio-spatial distribution of output metrics for all 12 measurements
Figure 5. Scenario impact on cumulative frequency of change in VMT density for target and control groups
Figure 6. Hypothetical evaluation framework for select urban freight operations and policy strategies

Note 1. The authors observed roughly 48.0 million person-level NAs in the college attainment variable, approximately 24% of the sampled population. This is primarily because the PUMS dataset codes education level as NA for persons under 3 years old.

IJPDLM
Note(s): *POC includes respondents that respond with one or more Census-defined racial categories that do not include white, non-Hispanic