Urban Data Science & Equitable Cities

By 2050, the UN projects that 68% of the world's population will live in cities. With urban life shaping health and opportunity, using data to guide decisions and reduce inequality is critical. In this EAAMO Bridges working group, we host speakers, study papers, and workshop late-stage work on computational analysis of urban data, emphasizing topics that explore and address inequities in urban life. We meet every other week for a presentation followed by sustained discussion. See our official page for details.

Recent Speakers & Activities

May 19, 2025

Spatial data science for just and sustainable cities

Rafael M. H. Pereira · Guest Speaker

In this presentation, I will give an overview of my research at the intersection of spatial data science, urban analytics and accessibility, and sustainable mobility. Specifically, I will showcase work related to the development of open data science tools and methods for transportation network modeling used to examine spatial accessibility, energy use and the environmental performance of urban mobility systems. These tools contribute to research and planning by aiding researchers, students, and practitioners in effectively handling large-scale geospatial data for the examination of urban transportation networks and mobility futures. I will give particular attention to two projects related to: (1) a new scalable computational model to estimate public transport emissions at high spatial and temporal resolutions; and (2) recent developments of powerful multimodal routing models and their contribution to the analysis of socioeconomic and spatial inequalities in access to opportunities. At the end, I will discuss some of the advantages and limitations of these tools and models, reflecting on new research avenues for using spatial data science for sustainable and inclusive cities.

May 5, 2025

FloodNet

Charlie Mydlarz · Guest Speaker

FloodNet NYC is a sensor network for real-time urban flood monitoring and community flood resilience. Our team develops tools for real-time urban flood monitoring, implements these tools to measure flooding in New York City, and makes flood data and monitoring tools available in a manner that is accessible and useful to stakeholders, including residents, community-based organizations, government agencies, and researchers.

Mar 10, 2025

Using Administrative Datasets to Identify Landowners and Operationalize their Characteristics

Henry Gomory · Guest Speaker

Landowners play central roles in many urban sociological theories, but empirical analysis of these actors has frequently been stymied by insufficient data. Few surveys collect detailed information on landowners, and administrative data present multiple challenges, most importantly that property owners frequently obscure their identities through corporate structures. This paper presents a data construction pipeline for creating linked, longitudinal datasets describing urban properties and the people and companies that own them using widely available tax assessment records and business filings. The author implements this approach in four metropolitan areas (Boston, Massachusetts; Baltimore, Maryland; Miami, Florida; and Houston, Texas) between 2005 and 2020, demonstrating the adaptability of the method to areas with different levels of data quality. The pipeline draws on four methodological innovations. First, it uses internal validation and external harmonization to address biases and inaccuracies within tax assessment records. Second, it presents a network-based entity reconciliation methodology better suited than existing methods to the sparse but linked data contained in the source records. Third, it presents a flexible and comprehensive method for operationalizing landowners’ corporate networks. Finally, it operationalizes multiple sociological characteristics of landowners and estimates their potential bias. The paper concludes by demonstrating several empirical analyses this methodology opens.

Feb 24, 2025

Undermatching Disparities and Portfolio Decisions: Evidence from the New York City High School Match

Kenny Peng · Guest Speaker

In the New York City High School Match, applicants rank programs from over 800 options and are placed through a centralized stable matching process. We analyze individual application ranking (portfolio) behaviors that explain undermatching, defined as the difference in selectivity between where the student matched and where they could have matched had they applied. There are substantial disparities: undermatching is over 50% higher for Black and Hispanic applicants than for Asian or white applicants, with further gaps by income and geography. However, while individual student demographic characteristics and grades alone explain only 3.8% of the variation in undermatching, including individual application behaviors explains 40.9%. Black and Hispanic students are more likely to underreach (by listing only non-selective programs, or inverting the order of selective and non-selective programs), while Asian and white applicants are more likely to overreach (by applying only to selective programs). Finally, we calculate and interpret ex-ante “theoretically optimal” perturbations of each student’s portfolio, using only program-level offer rate information from the previous year. Recommended portfolio changes from this model decrease undermatching by 24%. Our results suggest the benefit and possibility of personalized feedback, and forecast the effects of different types of interventions: some applicants (disproportionately Asian and white) are more likely to benefit from interventions that encourage listing more non-selective programs and from removing list length restrictions, while others (disproportionately Black and Hispanic) are more likely to benefit from interventions that encourage listing more selective programs and avoiding inverting the ranking order of selective and non-selective programs.

Projects

Members