Talks

Biweekly talks, activities, and discussions hosted by the EAAMO UDS Working Group.

2025

Mon, May 19, 2025

Spatial data science for just and sustainable cities

Rafael M. H. Pereira · Guest Speaker

In this presentation, I will give an overview of my research at the intersection of spatial data science, urban analytics and accessibility, and sustainable mobility. Specifically, I will showcase work related to the development of open data science tools and methods for transportation network modeling used to examine spatial accessibility, energy use and the environmental performance of urban mobility systems. These tools contribute to research and planning by aiding researchers, students, and practitioners in effectively handling large-scale geospatial data for the examination of urban transportation networks and mobility futures. I will give particular attention to two projects related to: (1) a new scalable computational model to estimate public transport emissions at high spatial and temporal resolutions; and (2) recent developments of powerful multimodal routing models and their contribution to the analysis of socioeconomic and spatial inequalities in access to opportunities. At the end, I will discuss some of the advantages and limitations of these tools and models, reflecting on new research avenues for using spatial data science for sustainable and inclusive cities.

Mon, May 5, 2025

FloodNet

Charlie Mydlarz · Guest Speaker

FloodNet NYC is a sensor network for real-time urban flood monitoring and community flood resilience. Our team develops tools for real-time urban flood monitoring, implements these tools to measure flooding in New York City, and makes flood data and monitoring tools available in a manner that is accessible and useful to stakeholders including residents, community-based organizations, government agencies, and researchers.

Mon, Mar 10, 2025

Using Administrative Datasets to Identify Landowners and Operationalize their Characteristics

Henry Gomory · Guest Speaker

Landowners play central roles in many urban sociological theories, but empirical analysis of these actors has frequently been stymied by insufficient data. Few surveys collect detailed information on landowners, and administrative data present multiple challenges, most importantly that property owners frequently obscure their identities through corporate structures. This paper presents a data construction pipeline for creating linked, longitudinal datasets describing urban properties and the people and companies that own them using widely available tax assessment records and business filings. The author implements this approach in four metropolitan areas (Boston, Massachusetts; Baltimore, Maryland; Miami, Florida; and Houston, Texas) between 2005 and 2020, demonstrating the adaptability of the method to areas with different levels of data quality. The pipeline draws on four methodological innovations. First, it uses internal validation and external harmonization to address biases and inaccuracies within tax assessment records. Second, it presents a network-based entity reconciliation methodology better suited than existing methods to the sparse but linked data contained in the source records. Third, it presents a flexible and comprehensive method for operationalizing landowners’ corporate networks. Finally, it operationalizes multiple sociological characteristics of landowners and estimates their potential bias. The paper concludes by demonstrating several empirical analyses this methodology opens.

Mon, Feb 24, 2025

Undermatching Disparities and Portfolio Decisions: Evidence from the New York City High School Match

Kenny Peng · Guest Speaker

In the New York City High School Match, applicants rank programs from over 800 options and are placed through a centralized stable matching process. We analyze individual application ranking (portfolio) behaviors that explain undermatching, defined as the difference in selectivity between where the student matched and where they could have matched had they applied. There are substantial disparities: undermatching is over 50% higher for Black and Hispanic applicants than for Asian or white applicants, with further gaps by income and geography. However, while individual student demographic characteristics and grades alone explain only 3.8% of the variation in undermatching, including individual application behaviors explains 40.9%. Black and Hispanic students are more likely to underreach (by listing only unselective programs, or inverting the order of selective and nonselective programs), while Asian and white applicants are more likely to overreach (by applying only to selective programs). Finally, we calculate and interpret ex-ante “theoretically optimal” perturbations of each student’s portfolio, using only program-level offer rate information from the previous year. Recommended portfolio changes from this model decrease undermatching by 24%. Our results suggest the benefit and possibility of personalized feedback, and forecast the effects of different types of interventions: some applicants (disproportionately Asian and white) are more likely to benefit from interventions that encourage listing more nonselective programs and from removing list length restrictions, while others (disproportionately Black and Hispanic) are more likely to benefit from interventions that encourage listing more selective programs and avoiding inverting the ranking order of selective and nonselective programs.

Mon, Feb 10, 2025

Global Rewards in Restless Multi-Armed Bandits: an Application to Urban Food Rescue

Naveen Raman · Guest Speaker

Restless multi-armed bandits (RMABs) extend multi-armed bandits so that pulling an arm impacts future states. Despite the success of RMABs, a key limiting assumption is the separability of rewards into a sum across arms. We address this deficiency by proposing restless multi-armed bandits with global rewards (RMAB-G), a generalization of RMABs to global non-separable rewards. To solve RMAB-G, we develop the Linear- and Shapley-Whittle indices, which extend Whittle indices from RMABs to RMAB-Gs. We prove approximation bounds but also point out how these indices could fail when reward functions are highly non-linear. To overcome this, we propose two sets of adaptive policies: the first computes indices iteratively, and the second combines indices with Monte Carlo Tree Search (MCTS). Empirically, we demonstrate that our proposed policies outperform baselines and index-based policies on synthetic data and real-world data from food rescue.

2024

Wed, Dec 18, 2024

Finding EAAMO-Suitable Projects in UDS

Working Group Activity · UDS & Equitable Cities WG

End-of-term activity to surface project ideas suitable for EAAMO across urban data science topics. Participants propose, discuss, and scope ideas aligned with group interests in equity, urban mobility, housing, policy, and related domains.

Wed, Dec 4, 2024

Guest Speakers: Andrea Vallebueno (Cancelled)

Andrea Vallebueno · Guest Speaker (Cancelled)

This session was cancelled.

Wed, Nov 20, 2024

Reparative Urban Science: Challenging the Myth of Neutrality and Crafting Data-Driven Narratives

Wonyoung So · Guest Speaker

This talk proposes how urban planning should approach technology within the context of systemic racism, advocating for a reparative approach to address issues of urban technology perpetuating today's racial inequality and hindering efforts to redress historical oppression. It identifies three mechanisms—formalization, context removal and legitimization, and penalization and extraction—that illustrate how urban technology perpetuates historical inequalities. It then discusses methodologies of reparative urban science, aiming to use urban technology to challenge race-neutral ideologies and create data-driven narratives for reparations.

Wed, Nov 6, 2024

Who Do Large Language Models Write Like? A Sociolinguistic Perspective on AI-Generated Texts

AJ Alvero · Guest Speaker

When large language models (LLMs) generate "human-like text", which humans do they write like? The answer to this question has strong implications for their use as methodological tools across the social sciences, but it also points to new dynamics to consider in important social processes. In this presentation, we will compare LLM-generated text with personal statements written by applicants to selective colleges and universities using the LIWC dictionary and modern embedding approaches. We find that LLM writing style is distinct from that of humans but closer to that of writers with higher social status, and that LLMs are stylistically narrow but have broad vocabularies. These results motivate two perspectives for future research: a sampling perspective (linguistic closeness) and a distributional perspective (variation in text features).

Wed, Oct 23, 2024

Crafting Research Abstracts: Strava Metro Dataset

Working Group Activity · UDS & Equitable Cities WG

Goal: Gauge interest in collaborating on a project using the Strava Metro dataset and outline a project proposal.

Instructions:
1. Read and familiarize yourself with the Strava Metro dataset.
2. Draft a dream research abstract as if you had published a paper using the dataset:
   - Include motivation, problem statement, and brief methods.
   - Include expected results and (optionally) a short bibliography.

In-session: Breakout groups discuss and analyze selected abstracts and identify shared interests.

Wed, Oct 9, 2024

Models to Correct Under-Reporting in Urban Crowdsourcing Systems

Zhi Liu and Sidhika Balachandar · Guest Speakers

Decision-makers often observe the occurrence of events through a reporting process. City governments, for example, rely on resident reports to find and then resolve urban infrastructural problems such as fallen street trees, flooded basements, or rat infestations. Without additional assumptions, there is no way to distinguish events that occur but are not reported from events that truly did not occur. Because disparities in reporting rates correlate with resident demographics, addressing incidents only on the basis of reports leads to systematic neglect in neighborhoods that are less likely to report events. We show how to overcome this challenge in three different settings and estimate reporting rates. First, we leverage the fact that events are spatially correlated and propose a latent variable Bayesian model. Second, we propose a method to fit graph neural networks with both resident report and city inspection data. And third, we incorporate the report delay and the possibility of multiple reports in a Poisson Bayesian model. Our work lays the groundwork for more equitable proactive government services, even with disparate reporting behavior.

Wed, Sep 25, 2024

Introductions and Defining Urban Data Science

Working Group Kickoff & Discussion · EAAMO Bridges — UDS & Equitable Cities

• Short introductions (1–2 minutes each)
• Discussion on a working definition of urban data science
  Readings:
  - Urban analytics defined
  - Defining urban data science
• Gauging interest in working group research collaborations (Strava Metro dataset)