2025
Mon, May 19, 2025
Spatial data science for just and sustainable cities
Rafael M. H. Pereira · Guest Speaker
In this presentation, I will give an overview of my research at the intersection of spatial data science, urban analytics and accessibility, and sustainable mobility. Specifically, I will showcase work related to the development of open data science tools and methods for transportation network modeling used to examine spatial accessibility, energy use and the environmental performance of urban mobility systems. These tools contribute to research and planning by aiding researchers, students, and practitioners in effectively handling large-scale geospatial data for the examination of urban transportation networks and mobility futures. I will give particular attention to two projects related to: (1) a new scalable computational model to estimate public transport emissions at high spatial and temporal resolutions; and (2) recent developments of powerful multimodal routing models and their contribution to the analysis of socioeconomic and spatial inequalities in access to opportunities. At the end, I will discuss some of the advantages and limitations of these tools and models, reflecting on new research avenues for using spatial data science for sustainable and inclusive cities.
Mon, May 5, 2025
FloodNet
Charlie Mydlarz · Guest Speaker
FloodNet NYC is a sensor network for real-time urban flood monitoring and community flood resilience. Our team develops tools for real-time urban flood monitoring, implements these tools to measure flooding in New York City, and makes flood data and monitoring tools available in a manner that is accessible and useful to stakeholders, including residents, community-based organizations, government agencies, and researchers.
Mon, Mar 10, 2025
Using Administrative Datasets to Identify Landowners and Operationalize their Characteristics
Henry Gomroy · Guest Speaker
Landowners play central roles in many urban sociological theories, but empirical analysis of these actors has frequently been stymied by insufficient data. Few surveys collect detailed information on landowners, and administrative data present multiple challenges, most importantly that property owners frequently obscure their identities through corporate structures. This paper presents a data construction pipeline for creating linked, longitudinal datasets describing urban properties and the people and companies that own them using widely available tax assessment records and business filings. The author implements this approach in four metropolitan areas (Boston, Massachusetts; Baltimore, Maryland; Miami, Florida; and Houston, Texas) between 2005 and 2020, demonstrating the adaptability of the method to areas with different levels of data quality. The pipeline draws on four methodological innovations. First, it uses internal validation and external harmonization to address biases and inaccuracies within tax assessment records. Second, it presents a network-based entity reconciliation methodology better suited than existing methods to the sparse but linked data contained in the source records. Third, it presents a flexible and comprehensive method for operationalizing landowners’ corporate networks. Finally, it operationalizes multiple sociological characteristics of landowners and estimates their potential bias. The paper concludes by demonstrating several empirical analyses that this methodology makes possible.
Mon, Feb 24, 2025
Undermatching Disparities and Portfolio Decisions: Evidence from the New York City High School Match
Kenny Peng · Guest Speaker
In the New York City High School Match, applicants rank programs from over 800 options and are placed through a centralized stable matching process. We analyze individual application ranking (portfolio) behaviors that explain undermatching, defined as the difference in selectivity between where the student matched and where they could have matched had they applied. There are substantial disparities: undermatching is over 50% higher for Black and Hispanic applicants than for Asian or white applicants, with further gaps by income and geography. However, while individual student demographic characteristics and grades alone explain only 3.8% of the variation in undermatching, including individual application behaviors explains 40.9%. Black and Hispanic students are more likely to underreach (by listing only unselective programs, or by inverting the order of selective and nonselective programs), while Asian and white applicants are more likely to overreach (by applying only to selective programs). Finally, we calculate and interpret ex-ante “theoretically optimal” perturbations of each student’s portfolio, using only program-level offer rate information from the previous year. Recommended portfolio changes from this model decrease undermatching by 24%. Our results suggest the benefit and possibility of personalized feedback, and forecast the effects of different types of interventions: some applicants (disproportionately Asian and white) are more likely to benefit from interventions that encourage listing more nonselective programs and from removing list length restrictions, while others (disproportionately Black and Hispanic) are more likely to benefit from interventions that encourage listing more selective programs and avoiding inversions in the ranking order of selective and nonselective programs.
Mon, Feb 10, 2025
Global Rewards in Restless Multi-Armed Bandits: an Application to Urban Food Rescue
Naveen Raman · Guest Speaker
Restless multi-armed bandits (RMABs) extend multi-armed bandits so that pulling an arm impacts future states. Despite the success of RMABs, a key limiting assumption is the separability of rewards into a sum across arms. We address this deficiency by proposing restless multi-armed bandits with global rewards (RMAB-G), a generalization of RMABs to global non-separable rewards. To solve RMAB-G, we develop the Linear-Whittle and Shapley-Whittle indices, which extend Whittle indices from RMABs to RMAB-Gs. We prove approximation bounds but also point out how these indices could fail when reward functions are highly non-linear. To overcome this, we propose two sets of adaptive policies: the first computes indices iteratively, and the second combines indices with Monte-Carlo Tree Search (MCTS). Empirically, we demonstrate that our proposed policies outperform baselines and index-based policies on synthetic data and real-world data from food rescue.
2024
Wed, Dec 18, 2024
Finding EAAMO-Suitable Projects in UDS
Working Group Activity · UDS & Equitable Cities WG
End-of-term activity to surface project ideas suitable for EAAMO across urban data science topics. Participants propose, discuss, and scope ideas aligned with group interests in equity, urban mobility, housing, policy, and related domains.
Wed, Dec 4, 2024
Guest Speakers: Andrea Vallebueno (Cancelled)
Andrea Vallebueno · Session Cancelled
This session was cancelled.
Wed, Nov 20, 2024
Reparative Urban Science: Challenging the Myth of Neutrality and Crafting Data-Driven Narratives
Wonyoung So · Guest Speaker
This talk proposes how urban planning should approach technology within the context of systemic racism, advocating for a reparative approach to address issues of urban technology perpetuating today's racial inequality and hindering efforts to redress historical oppression. It identifies three mechanisms—formalization, context removal and legitimization, and penalization and extraction—that illustrate how urban technology perpetuates historical inequalities. It then discusses methodologies of reparative urban science, aiming to use urban technology to challenge race-neutral ideologies and create data-driven narratives for reparations.
Wed, Nov 6, 2024
Who Do Large Language Models Write Like? A Sociolinguistic Perspective on AI-Generated Texts
AJ Alvero · Guest Speaker
When large language models (LLMs) generate "human-like text", which humans do they write like? The answer to this question has strong implications for the use of LLMs as methodological tools across the social sciences, and it also points to new dynamics to consider in important social processes. In this presentation, we will compare LLM-generated text with personal statements written by applicants to selective colleges and universities using the LIWC dictionary and modern embedding approaches. We find that LLM writing style is distinct from human writing but closer to that of writers with higher social status, and that LLMs are stylistically narrow but have broad vocabularies. These results motivate two perspectives for future research: a sampling perspective (linguistic closeness) and a distributional perspective (variation in text features).
Wed, Oct 23, 2024
Crafting Research Abstracts: Strava Metro Dataset
Working Group Activity · UDS & Equitable Cities WG
Goal: Gauge interest to collaborate on a project using the Strava Metro dataset and outline a project proposal.
Instructions:
1. Read and familiarize yourself with the Strava Metro dataset.
2. Draft a dream research abstract as if you published a paper using the dataset:
- Include motivation, problem statement, brief methods.
- Include expected results and (optional) short bibliography.
In-session: Breakout groups discuss and analyze selected abstracts and identify shared interests.
Wed, Oct 9, 2024
Models to Correct Under-Reporting in Urban Crowdsourcing Systems
Zhi Liu and Sidhika Balachandar · Guest Speakers
Decision-makers often observe the occurrence of events through a reporting process. City governments, for example, rely on resident reports to find and then resolve urban infrastructural problems such as fallen street trees, flooded basements, or rat infestations. Without additional assumptions, there is no way to distinguish events that occur but are not reported from events that truly did not occur. Because disparities in reporting rates correlate with resident demographics, addressing incidents only on the basis of reports leads to systematic neglect in neighborhoods that are less likely to report events. We show how to overcome this challenge and estimate reporting rates in three different settings. First, we leverage the fact that events are spatially correlated and propose a latent variable Bayesian model. Second, we propose a method to fit graph neural networks with both resident report and city inspection data. Third, we incorporate the report delay and the possibility of multiple reports in a Poisson Bayesian model. Our work lays the groundwork for more equitable proactive government services, even with disparate reporting behavior.
Wed, Sep 25, 2024
Introductions and Defining Urban Data Science
Working Group Kickoff & Discussion · EAAMO Bridges — UDS & Equitable Cities
• Short introductions (1–2 minutes each)
• Discussion on a working definition of urban data science
Readings:
- Urban analytics defined
- Defining urban data science
• Gauging interest in working group research collaborations (Strava Metro dataset)