2019 Capstone Projects

Accuracy and Equity in Predictive Hot-Spot Policing

Sponsor: NYU CUSP

CUSP Mentor: Dr. Daniel Neill

CUSP Students: Anupama Santhosh, Devashish Khulbe, Yavuz Sunor & Yuchen Ding

Project Website: https://anupamasanthosh93.wixsite.com/website

Several recent studies have demonstrated the efficacy of proactive policing strategies for crime prevention. By predicting emerging geographic hot-spots of violent crime, we can target police patrols and other interventions. However, predictive policing raises moral and ethical concerns, such as fairness and equity, that are well documented but have not typically been incorporated into the design and evaluation of such systems. In this work we develop machine learning methods to predict hot-spots of crime and present a way to measure equity among those areas. We then adjust the predictions based on the defined equity metric and analyze the trade-offs between accuracy and equity. We also compare the performance of the two models based on the racial distribution of the targeted population.

Do Uber/Lyft Reduce Parking Violations in NYC?

Sponsor: NYU Wagner School of Public Service; Department of Urban Planning

Mentor: NYU Wagner Professor Zhan Guo

CUSP Students: Junru Lu, Junjie Cai, Pranay Anchan, Shijia Gu, Yuxuan Wang

Project Website: https://uberlyftparkingviolation.github.io/

The rapid expansion of Uber and Lyft has led people to question these companies’ effects on our urban system, for example on public transportation and city congestion. However, due to the lack of open data and scientific research, most of these effects have not yet been demonstrated.
This capstone project explores one potential impact of Uber & Lyft: whether daily Uber & Lyft trips affect parking violations. NYC daily Uber & Lyft trip data and parking ticket data are collected and correlated by taxi cab zone. Three models, Fixed Effects, Difference-in-Differences (DID), and Bayesian Network, are applied to the prepared data. The results of these models show a negative correlation and causal effect between the number of Uber & Lyft trips and parking tickets, suggesting that Uber & Lyft help reduce parking violations in NYC. Given the controversy surrounding these companies, this capstone project can assist in understanding the impact of Uber & Lyft and offer policy insight for the regulation of Transportation Network Companies (TNCs).

Getting to Zero: Predicting the Impact of Urban Design on Road Safety

Sponsor: NYU & State of Place

Mentor: Mariela Alfonzo

CUSP Students: Cyrus Blankinship, Xiao Jing, Pablo Mandiola, Eve Marenghi, Wei-Yun Wang 

Project Website: https://www.gettingtozeronyc.com/

Pedestrian-oriented design improves road safety and is an important component of Vision Zero, a global strategy which Mayor Bill de Blasio has adopted with the goal of eliminating all traffic fatalities and serious injuries on New York City streets by 2024. Our research aims to study the relationship between collision likelihood and quality of the built environment, measured by automated surveying techniques using open data sources. We adopt the metrics developed by our project sponsor, which capture ten dimensions of urban design known to impact people’s decision to walk, as our independent variables. We model the relationship between the design dimensions and collision rates using univariate and multivariate Logistic Regression. Our models also consider demographic and environmental factors including median income, population density, traffic volume, pedestrian volume, and speed limit, which have been shown to impact collision rates. Based on our findings, we recommend prioritizing safety enhancements for areas characterized by low income, low aesthetic quality, and low connectivity, and by high pedestrian and vehicular volume. Focusing on safety enhancements for these areas will accelerate the achievement of NYC’s Vision Zero goals.

Mobility Patterns in Major US Cities

Sponsor: Arcadis

CUSP Mentors: Dr. Kaan Ozbay and Dr. Abdullah Kurkcu

CUSP Students: Jianwei Li, Karan Saini, Colin Bradley, Minqi Lu, Yuxuan Cui

Project Website: https://lijianweidesign.com/CapstoneProject/

Understanding mobility and commuting patterns can be valuable for urban policymakers, with implications for a range of urban policy issues such as public transit planning, traffic management, disaster preparedness, and zoning/development decisions. Traditionally, policymakers have relied on data gathered from travel surveys, which have limitations such as cost and frequency, sample size, limited time horizons, and a failure to capture non-resident travel patterns. Our research builds on the existing mobility literature by leveraging cell phone trace data from four US cities to describe origin-destination trip demand, contextualizing trip generation and mobility patterns at a high spatial and temporal granularity. Specifically, network analysis and clustering are used to describe the structure of mobility networks: to identify source and sink locations and to define central business districts and transportation hubs. The research has implications for understanding the unique mobility patterns in each of the four cities, as well as any common patterns across geographies, which may be relevant for urban planners and policymakers in their respective cities.

Remote Sensing and Data Modeling Tools to Help the City Prepare for Future Stormwater Flooding

Sponsors: New York City Department of Environmental Protection, with Mayor’s Office of Recovery and Resiliency, and Town+Gown

CUSP Mentor: Andreas Karpf

CUSP Students: Tarek Arafat, Davey Ives, Vivaldi Rinaldi, Rachel Sim

Project Website: https://dhi211.wixsite.com/nyc-ufsi-map

Following the 2018 enactment of Local Law 172, New York City (NYC) is required to develop a map of areas vulnerable to flooding. The team from the Center for Urban Science + Progress (CUSP) was tasked with identifying areas of inland flooding from extreme rainfall. Using a bivariate method known as Frequency Ratio (FR) and multivariate methods based on Machine Learning (ML) algorithms, the team created an Urban Flooding Susceptibility Index (SI) for each method and produced Susceptibility Index Maps. These maps were cross-validated against remote sensing imagery in the form of Synthetic Aperture Radar (SAR) to assess the accuracy of our SI and evaluate our methods. The Susceptibility Index Maps will contribute to the identification of flood-prone locations in NYC. In doing so, the team has also developed a replicable framework for city planning agencies to characterize and compare flooding using a variety of data sources.
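
For readers unfamiliar with the Frequency Ratio method named above, the following is a minimal sketch of how it is commonly computed for a single factor: the share of flooded cells that fall in a class divided by the share of all cells in that class. The function name, the elevation-band classes, and the toy grid are all hypothetical; the abstract does not describe the team's actual factors, data, or implementation.

```python
from collections import Counter


def frequency_ratio(cell_classes, flooded):
    """Return {class: FR} for one conditioning factor.

    cell_classes: class label of each grid cell (e.g. an elevation band).
    flooded: parallel list of booleans, True where flooding was observed.
    """
    if len(cell_classes) != len(flooded):
        raise ValueError("inputs must have the same length")
    total_cells = len(cell_classes)
    total_flooded = sum(flooded)
    if total_cells == 0 or total_flooded == 0:
        raise ValueError("need at least one cell and one flooded cell")

    cells_per_class = Counter(cell_classes)
    flooded_per_class = Counter(c for c, f in zip(cell_classes, flooded) if f)

    ratios = {}
    for cls, n_cells in cells_per_class.items():
        flood_share = flooded_per_class[cls] / total_flooded  # share of flooding in class
        area_share = n_cells / total_cells                    # share of area in class
        ratios[cls] = flood_share / area_share
    return ratios


# Hypothetical toy grid: 8 cells split evenly between two elevation bands.
classes = ["low", "low", "low", "low", "high", "high", "high", "high"]
flooded = [True, True, True, False, True, False, False, False]
fr = frequency_ratio(classes, flooded)
```

A ratio above 1 indicates that a class is over-represented among flooded cells relative to its area; a susceptibility index is typically built by combining such ratios across several factors.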

Senior Pedestrians in NYC: A Diff-in-Diff Approach to Evaluating Safe Streets for Seniors

Sponsor: NYU CUSP

CUSP Mentor: Dani Hochfellner

CUSP Students: Pengzi Li, Po-Yang Kang, Sam Burns, and Asilayi Bahetibieke

Project Website: https://agingcapstone.github.io/capstone_site/

Urban senior populations are expected to grow significantly in the coming decades. This demographic trend requires adjustments to policy and infrastructure in cities. To address its aging population, New York City has implemented its Safe Streets for Seniors (SSfS) program, which includes identifying Senior Pedestrian Focus Areas (SPFAs) and making structural improvements designed to improve safety for senior pedestrians. This study investigates whether the established SPFAs improved safety for seniors. Using difference-in-differences estimators, we find that setting up SPFA zones in NYC leads to a decrease of about 34 percent in the number of seniors killed in motor vehicle accidents. The number of accidents involving senior pedestrians also decreased relative to non-SPFA zones, but only by 5 percent. Overall, our results show that New York City’s programs addressing senior citizens are successful and suggest that other cities in the US and abroad should adopt transportation policies similar to NYC’s in order to protect senior citizens.
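
As a rough illustration of the estimator named in this abstract, the sketch below computes the simplest two-group, two-period form: the change in the treated group minus the change in the comparison group. The function name and all input values are hypothetical and are not the study's data; the study itself may well have used a regression-based specification with additional controls.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences on group means.

    Subtracting the comparison group's change removes trends that
    affect both groups, under the parallel-trends assumption.
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Hypothetical mean yearly incident counts per zone (illustrative only).
effect = did_estimate(
    treated_pre=10.0,   # focus-area zones before the program
    treated_post=6.0,   # focus-area zones after the program
    control_pre=8.0,    # comparison zones before
    control_post=7.0,   # comparison zones after
)
```

With these illustrative inputs the treated zones fall by 4 and the comparison zones by 1, so the estimated effect is a reduction of 3 incidents per zone.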

More Efficient at Lower Cost: Disrupting The Salt Lake County Vehicle Testing Program

Sponsor: New York University Marron Institute

Mentor: Dr. Kevin Cromar

CUSP Students: Ewelina Marcinkiewicz, Jiawen Liang, Jingxi Zhao, Ursula Kaczmarek, Yunhe Cui

Project Website: https://capstoneproject2019airquality.github.io/capstone-website/

Salt Lake County oversees a vehicle emissions testing program to ensure compliance with federal air quality standards. Motorists annually spend millions of dollars for inspections carried out under this program. However, it is not subject to cost controls and suffers from programmatic inefficiencies wherein a large number of compliant vehicles undergo testing and a small number of high-polluting vehicles forgo timely repair. Using survey-based emissions test pricing data, we assess through empirical and spatial autocorrelation analysis the potential economic impact of instituting both emissions test price caps and subsidies to fund compliance repairs. Our results indicate testing providers have an overall negative view towards regulatory price caps but a majority would voluntarily participate in a subsidized repair program. Our research also suggests imposition of a 30-dollar cap paired with subsidization of an early repair program will lower motorists’ average transactional costs while ensuring adequate geographic availability of testing services, achieving both improved cost effectiveness and pollution reduction. If effectuated in policy, demonstrated local programmatic improvements like these often serve as powerful examples of change other localities are quick to emulate.

VIOWATCH: Predicting Health Code Violations in New York City

Sponsor: Introspective Systems, LLC

CUSP Mentor: Dr. Andy Karpf

CUSP Students: Adley Kim, Qiuyao Liu, Klo’e Ng

Project Website: https://capstone2019-foodweb.github.io/FinalProject/

Restaurant inspections are critical to maintaining public health by identifying unsanitary practices that could lead to an outbreak of foodborne illness. Every year, the Department of Health and Mental Hygiene (DOHMH) conducts random inspections of New York City’s 24,000 restaurants, looking for potential health code violations. These violations vary in severity, with those that are most likely to contribute to foodborne illness being flagged as “critical violations”. Our study employs several predictive classifiers to derive a model that predicts the likelihood that a restaurant will be flagged with a critical violation, based on features such as past violations and restaurant characteristics. The purpose of this study is to help the DOHMH prioritize inspections for restaurants that are at risk of violating more serious health codes.

Pattern and Anomaly Detection in Urban Temporal Networks

Sponsors: NYU CUSP & Lockheed Martin

CUSP Mentor: Stanislav Sobolevsky

CUSP Students: Urwa Muaz, Shivam Kumar Pathak, Jingtian Zhou, Saloni Saini, Mingyi He

Project Website: https://jzhou60.github.io/Capstone-Website/

A broad spectrum of urban activities, including mobility, can be modeled as networks evolving over time, which can potentially capture the changes in urban dynamics caused by protests, sports events, cultural events, national holidays, disasters, weather extremes, and other disruptive events. The presence of large numbers of nodes and edges with varying scales of values increases the noise-to-signal ratio in these temporal networks, making it difficult to detect urban events with them. In this research, we propose to resolve this challenge by learning effective representations from the networks using topological aggregation and dimensionality reduction. Through this research, we have produced methodologies for anomaly detection in urban temporal mobility networks that outperform legacy techniques and generalize to different types of temporal networks. Our motivation for pursuing this problem is our belief that such a system can be used for early detection of potentially unsafe developments and can enable a timely response.

Recovering the Business and Economic History of New York City Through Large-Scale City Directory Data

Sponsor: New York University Division of Libraries, Data Services

Mentor: Dr. Nicholas Wolf, NYU Division of Libraries

CUSP Students: Fekade Brook, Linda Lyu, Matthew Sauter, Vaidehi Thete, Shelly Yin

Project Website: https://fekfekus.wixsite.com/data-recovery

The New York Public Library holds within its archives dozens of directories containing historical New York City business and residential data. Under a previously commissioned project, several of these tomes, dated between 1849 and 1922, were digitized, and the textual data was extracted using Optical Character Recognition (OCR) techniques. Our capstone builds on the previously completed research, with three primary goals: exploring opportunities for increasing the accuracy of the data extraction methodology, improving the reproducibility of the methods previously used, and exploring the extracted data to demonstrate the potential that these data have for expanding our understanding of 19th-century New York City.

Addressing NYC Transit Deserts Through Local Self-Organized Commuter Vans

Sponsor: Dollaride

CUSP Mentor: Joseph Chow

CUSP Students: Evgeniya Bektasheva, Hanxing Li, Kenji Uchimoto, Keundeok Park, Wenjie Zheng

Project Website: https://hl3282.wixsite.com/website

While New York City has well-developed public transportation networks provided by the MTA, a large number of people living in the outer boroughs are left outside these networks. These people live in “transit deserts” and have a strong demand for transportation options. Part of the demand is met by so-called “dollar vans”, a network of thousands of privately owned commuter vans that run across NYC. A startup called “Dollaride” connects drivers and passengers in these marginalized communities using an innovative transportation technology.

The project develops a methodology for identifying transit deserts (areas that are underserved in terms of a specific transit supply-demand equilibrium) in a large city. The methodology is based on the approach developed earlier by Jiao and Dillivan (2013), with some parameters changed to characterize a first-tier American city. The outcome of the project is a map of transit deserts in NYC, based on solid quantitative analysis and data. Another practical outcome of the project is a list of new routes for Dollaride, which would benefit its passengers, drivers, and NYC transportation authorities.

Impact of Manhattan Congestion Surcharge on Commuter Transportation Choices in New York City

Sponsor: NYU CUSP & Arcadis

CUSP Mentor: Stanislav Sobolevsky

CUSP Students: Xiaoning He, Rufei Sheng, Soham Mody, Sam Mazi, Katie Voorhees

Project Website: https://rufei-sheng.github.io/Manhattan_Congestion_Capstone_Website/

A comprehensive understanding of the anticipated impacts of transportation pricing policies in an urban system is crucial for both decision-makers and stakeholders. Estimating these impacts can be challenging due to the complexity of transportation systems like New York City’s and the lack of complete ground truth data. This paper aims to provide a data-driven framework for quantitative evaluation of such impacts, with the recently introduced Manhattan Congestion Surcharge as the case in point. To do so, we will use open data to develop a predictive model to estimate the impact of pricing changes on the demand for public transportation and For-Hire Vehicles. This paper’s results include a thorough analysis of the economic, social, and environmental impacts anticipated as a result of the policy, intended to enable voters and policymakers to quantitatively evaluate the potential outcomes of congestion pricing.

Predicting Violations in NYC’s Child Care Centers

Sponsor: NYC Department of Health and Mental Hygiene

CUSP Mentor: Debra Laefer

CUSP Students: Christine Biddlecombe, Yusu Qian, Marium Sultan, Guanjia Wang

The New York City Department of Health and Mental Hygiene (DOHMH) routinely inspects all child care centers for violations of code. These violations relate to the physical characteristics of the centers or qualities of the staff. This capstone is focused on predicting violations in NYC’s commercial child care centers in order to assist DOHMH in directing inspectors’ time to the centers that are most likely to have high-risk violations. Our deliverable is a random forest classification model that predicts whether an inspection will result in a high-risk violation and reports the factors that are most correlated with this outcome. Our analysis includes data provided by DOHMH as well as data from the U.S. Census Bureau American Community Survey (ACS) and American Housing Survey (AHS), Primary Land Use Tax Lot Output (PLUTO), the New York Police Department (NYPD), and the Centers for Disease Control and Prevention (CDC). After training multiple classifiers using different criteria, we found that a Random Forest Classifier performs best when classifying whether an initial inspection will identify any high-risk violations, achieving an out-of-sample Area Under the Receiver Operating Characteristic Curve (ROC AUC) score of 0.94.
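
To make the ROC AUC figure quoted above easier to interpret, here is a small sketch of what the metric measures: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as half. The function name, labels, and scores are hypothetical and unrelated to the project's data or code; in practice a library routine would normally be used instead of this pairwise loop.

```python
def roc_auc(labels, scores):
    """Pairwise (rank-based) ROC AUC for binary labels 0/1."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical inspections: 1 = high-risk violation found, 0 = none.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so the sketch yields 8/9; a score of 0.5 corresponds to random ranking and 1.0 to a perfect one.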

Tidal Wetland Conservation Options for Long-Term Land Use Planning for Sea Level Rise

Sponsor: Town+Gown @ NYC DDC & NYC Department of Parks and Recreation

Mentor: Dr. Navid Jafari (Louisiana State University)

CUSP Students: Andrew Hill, Chang Du, Haoming Yang, Sean Andrew Chen, Yushi Chen

Project Website: https://cuspcapstone2019.github.io/tidal_wetland_website/

According to the New York State Department of Environmental Conservation, tidal flooding is one of New York City’s most dangerous future problems. With decades separating the present and the inevitable flooding, the city has an opportunity to bolster its resiliency infrastructure, which can include hard infrastructure, soft infrastructure, and rezoning for resiliency. Our project evaluates the economic, social, and environmental costs and benefits of converting vacant lots and bought-out land into soft infrastructure (in this case, salt marshes) to diminish the effects of future tidal flooding in New York City. These costs and benefits will be presented through an interactive data visualization tool that our sponsor, the Department of Parks and Recreation, can use to inform its future conversion decisions. Given enough time, we will also conduct our own analysis, informed by the tool, to recommend conversion sites and other policy strategies. By using the tool ourselves to generate the policy recommendations, we can test the prototype and deliver a refined product to our stakeholders.

Exploring Gentrification and Displacement Through User-Generated Geographic Information

Sponsor: UC-Berkeley – Urban Displacement Project

Mentor: Karen Chapple, UC Berkeley

CUSP Students: Kent Pan, Tiffany Patafio, Manrique Vargas, Jiawen Wan, Tiancheng Yin

Project Website: https://ace-gabriel.github.io/twitter_gentrification/

Gentrification and displacement are pressing issues for many cities today, as urban populations continue to grow and neighborhoods change rapidly in response. While gentrification can bring new businesses, resources, and other positive changes to a neighborhood, the rapid change can be destabilizing to long-term, low-income residents already in the area. Thus, it is important to better understand the activities and behaviors in these changing areas in order to provide insight for the local community and public officials. This study uses two novel data sources, Twitter and Foursquare, to explore gentrification and displacement risk for neighborhoods within the 31-county NY metro region. Using methodology established by the UC Berkeley Urban Displacement Project, and expanding on last year’s CUSP capstone project, we use both administrative census data and user-generated social media data to model gentrification phenomena in changing neighborhoods. We show that Foursquare and Twitter have the potential to improve prediction of gentrification, but on their own their power for modeling gentrification is still weak compared to the benefits derived from Census data. We propose alternative definitions of gentrification related to supergentrification, people gentrification (using college education and income), and place gentrification (using housing and rent price). With these alternative, more specific definitions of change, we were able to see more impact from the Foursquare and Twitter datasets on certain change types. Our results show that it is easier to model ‘people’ or ‘place’ gentrification separately than the two combined.

Hearing Noise Complaints: Data-Driven Optimization of Rapid Response to Urban Noise Complaints

Sponsor: NYU CUSP

CUSP Mentors: Charlie Mydlarz, Mark Cartwright, and Vincent Lostanlen

CUSP Students: Qinyu Goh, Zoe Martiniak, Sam Ovenshine, Siddhanth Shetty, Sung Hoon Yang

Project Website: https://zem232.github.io/NoiseCapstone/

In New York City, the Department of Environmental Protection (DEP) handles outdoor noise complaints from sources ranging from construction activity to the jingle of ice cream trucks. Since 2010, the growing volume of noise complaints has increased the agency’s response times and hindered enforcement of the city’s noise code. This capstone project provides a data-driven approach to optimizing the DEP’s processes to better address noise complaints. To accomplish this, we deployed two machine learning models: an LSTM neural network to predict spatial and temporal complaint volume, and a random forest classifier to predict complaint enforceability. Model features were drawn from 311 complaints, socioeconomic data, PLUTO land use, weather conditions, and construction variances. Based on sponsor feedback, we selected the random forest classifier and optimized its performance for recall. In evaluation, the model’s precision was 9.3% and its recall was 91.1%. The model’s enforceability predictions were incorporated into an interactive data visualization for DEP inspectors to identify clusters of unresolved noise complaints with a high likelihood of enforceability. This work is intended to improve noise code enforcement and reduce the DEP backlog.
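
The precision and recall figures above reflect a deliberate trade-off: a recall-optimized model flags most enforceable complaints at the cost of many false alarms. The sketch below shows how the two metrics are computed from confusion counts. The function name and the counts are hypothetical, chosen only to give a similar low-precision, high-recall shape; they are not the project's evaluation data.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts.

    tp: complaints flagged as enforceable that were enforceable
    fp: complaints flagged as enforceable that were not
    fn: enforceable complaints the model failed to flag
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts (illustrative only): the model flags 1,000
# complaints, 90 of which are truly enforceable, and misses 10.
precision, recall = precision_recall(tp=90, fp=910, fn=10)
```

With these illustrative counts, precision is 9% and recall is 90%: inspectors would review many flagged complaints that turn out not to be enforceable, but very few enforceable ones would be missed.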

Innovation Hubs: A Story of Cities and Patents

Sponsor: NYU CUSP

CUSP Mentor: Dani Hochfellner

CUSP Students: Nathan Caplan, Christine Chang, Rohun Iyer, and Sarah Sachs

Project Website: https://sarahjune1.github.io/CUSP_innovation_capstone/

In recent years there has been great interest in cities and their new product output, with such cities often labeled ‘innovation hubs’ or ‘tech hubs’. This capstone project investigates the factors that facilitate innovation growth within a city using publicly available patent data. To understand how this process develops, we analyze patent data in the United States from 2001-2012. Our regression analysis explores many features that influence the growth of innovation. After running multiple analyses across the years, we find that certain features have a higher influence on patent output among the top cities. We also find that these features are missing among cities with less patent output. Our recommendation to cities desiring greater patent output is to invest in higher education, pursue Small Business Innovation Research (SBIR) grants, and look into becoming empowerment zones.

Relationship Between LinkNYC and NYC Demographic and Household Broadband Access

Sponsor: Intersection

CUSP Mentor: Debra Laefer

CUSP Students: Xurui Chen, Angel Liu, Marvin Mananghaya, Timur Mukhtarov, Manu Pathak

Project Website: https://sites.google.com/view/cusp-capstone

Broadband access is not just a necessity but also a human right, as recently recognized by the United Nations. While internet-based technologies are increasingly ingrained in the lives of city-dwellers, millions of people in major cities like New York lack access to high-speed internet. LinkNYC is an infrastructure project that replaced payphones and provides free broadband-based Wi-Fi connection. This research seeks to start a discussion on the role this new technology plays in democratizing Internet access in New York and its place in the day-to-day activities of residents. We use bivariate analysis, clustering, and time-series methods to assess the relationships between LinkNYC, socio-economic factors, and 311 Service Requests. Additionally, we examine how public perception of the technology has changed over the years by scraping and analyzing web articles mentioning LinkNYC. From our work, we conclude that LinkNYC has played an important role for the city community so far and that there is still room for improvement and further research.

Rich Context Algorithms & Development: Dataset Recommendation System

Sponsor: NYU CUSP

CUSP Mentors: Julia Lane, Jonathan Morgan, Andrew Gordon, and Clayton Hunter

CUSP Students: Haopeng Huang, Songjian Li, Tanya Nabila, Muci Yu

Project Website: https://rich-context-capstone-2019.github.io/Rich-Context-Capstone/

Exploring research datasets and the relationships among them, and between datasets and entities of interest such as research fields, paper titles, authors, and citation counts, is currently inefficient, and there is no integrated online platform that improves this process. This capstone project connects datasets with various entities in the papers that use them and with other related papers. It improves user experience by building a dataset recommendation system based on a graph model. We create this system based on model outputs from the Rich Context Competition, which showed weak mean relationship scores between datasets, publications, and research fields. We developed a novel evaluation metric to assess the performance of our system based on our definition of a good dataset recommendation system. Our previous network version, which added a keywords entity, shows better connectivity based on a similarity matrix that prioritizes the shortest path between two datasets. Our latest knowledge network, which remaps the fields-of-study entity and adds an author entity, shows even better connectivity. After applying different connections of nodes and weighted edges in our network and defining different similarity matrices, we produce the network model with the best performance and build our recommendation system on it. This recommendation system can be very useful when integrated with an interactive search engine and can guide researchers in all domains.