Graduate programs at CUSP offer a unique, interdisciplinary, and cutting-edge approach that links data science, statistics and analytics, and mathematics with complex urban systems, urban management, and policy. The curriculum addresses the necessary technical skills and critical problem-solving frameworks, and provides research opportunities and real-world experiences through internships and practicums that enable students to be successful in a wide range of career trajectories. CUSP students understand how to work with data at all stages of the data lifecycle, from acquisition to visualization. Furthermore, they gain knowledge about cities by using robust and live data in their class projects and applied research activities, and by partnering with companies and NYC agencies addressing existing urban challenges.

Over the course of 1 year (3 semesters), Advanced Certificate students take courses on a concurrent schedule with the MS program. These include non-credit courses in the summer, core classes in the fall and spring, and electives in the spring.


To introduce students to the field of Urban Informatics and help prepare them for their graduate studies, NYU CUSP offers preparatory resources at the beginning of their degree. These resources help students learn the skills necessary for a successful academic career, discover resources at CUSP and NYU, and connect with other students.

Free Online Programs

There are many free online courses that our students take to augment or refresh their technical skills in these topics before applying. We suggest the following free online courses:

City Challenge Week (Pre-Fall)

City Challenge Week is the start of the MS in Applied Urban Science and Informatics program and CUSP’s new student orientation. The intensive 4-day program includes a number of speakers, workshops, academic boot camps, and events that introduce students to CUSP, the field of urban informatics, and NYU resources.

Core Classes

Advanced Certificate students may choose from the following CUSP courses:

Fall Courses (6 Credits or 2 Classes)


This course is the introduction to the core disciplines of data acquisition and management, integration, and analytics. Students will learn the major concepts, tools, and techniques for what informatics can do for cities. It includes background in computation, statistics, error analysis, data acquisition, management, integration, working with large datasets, and understanding data sources. The course covers topics in visualization, data analysis and modeling, and machine learning, focusing on their application to urban problems, and including material not usually covered in computer science courses: how to handle spatial-temporal data, GIS, issues related to citizen science and participatory sensing, instrumentation, physical sensors, imagery, and issues of data ethics and privacy. This class is Python-based.


This course introduces students to computational approaches to urban challenges through the lens of city operations, public policy, and urban planning. Students are exposed to a range of analytical techniques and methods from the perspective of urban decision-making. Issues of city governance, structure, and history are presented to understand how to identify and assess urban problems, collect and organize appropriate data, utilize suitable analytical approaches, and ultimately produce results that recognize the constraints faced by city agencies and policymakers. This is not an easy task, and requires an understanding of urban social and political dynamics and a significant appreciation of data governance, privacy, and ethics. Specific attention is given to domain areas of energy and building efficiency, transportation, public health and emergency response, waste, water, and social connectivity and resilience, as well as the deployment of urban technology at the neighborhood scale. The role of civic engagement and community participation in the context of open data and citizen science is explored, as well as the evolving relationship between, and influence of, informatics on urban governance. Top-down and bottom-up models of innovative service delivery are discussed and debated in the context of public decision-making. Case studies and best practice examples from U.S. and global cities are used extensively, with a particular focus on New York City.


This course introduces students to the theory, principles, and applications of mathematical and computer modeling of data as applied to cities. It will be based on two unified themes: foundations for predictive analytics and decision-making, followed by applications in data science. The first half of the course will cover predictive modeling through a wide array of examples, including an advanced treatment of regression, visualization and graphics, and automated analysis for high-dimensional data. The second half will introduce students to applications in data science such as analytics of images and video as well as subjective data processing and analysis.


This supplementary lab teaches students to recognize where and understand why ethical issues can arise when applying analytics to urban problems. One of the learning objectives is to consider what ethical obligations scientists may have to those who figure in their research, as well as those to whom the lessons are later applied. The lab considers issues across the lifecycle of projects starting with collection and moving through management, sharing, and analysis of data. The goal is to learn about privacy implications, the repurposing of government data for uses not anticipated at the time of collection, and the legal framework covering these. 

Spring & Summer Courses (6 Credits or 2 Classes)


The objective of this course is to familiarize students with modern machine learning techniques and demonstrate how they can be effectively applied to urban data. The course is practice-oriented: concepts and techniques are motivated and illustrated by applications to urban problems and datasets. For that reason, it involves a significant programming component, with Python as the primary programming language. Topics include a variety of supervised and unsupervised learning methods, such as support vector machines, clustering algorithms, ensemble learning, Bayesian networks, Gaussian processes, and anomaly detection. Strategies for effective machine learning and discussion of the limitations of ML as well as a variety of supplementary techniques are also considered.


Visualization and visual analytics systems help people explore and explain data by allowing the creation of both static and interactive visual representations. A basic premise of visualization is that visual information can be processed at a much higher rate than raw numbers and text. Well-designed visualizations substitute perception for cognition, freeing up limited cognitive/memory resources for higher-level problems. This course aims to provide a broad understanding of the principles and designs behind data visualization. General topics include state-of-the-art techniques in both information visualization and scientific visualization, and the design of interactive/web-based visualization systems. Hands-on experience will be provided through popular frameworks such as matplotlib, VTK, and D3.js.

CUSP-GX-6004: Data and Algorithms in the Criminal Justice System

The growing use of data-centric technologies is transforming criminal justice in the United States. These technologies affect the scale and nature of collected data, enable the detection of discriminatory patterns of policing, and influence bail recommendations for pretrial detainees, among other things. Modern computational and statistical methods offer the promise of increased efficiency, equity, and transparency, but their use raises complex legal, social, and ethical questions. In this course, we will discuss the application of techniques from machine learning and statistics to a variety of criminal justice issues, analyze recent court decisions, and examine the relationships between law, public policy, and data. The course will involve readings and class discussion, short assignments, and a data-intensive semester-long project. The course will also feature guest speakers who work in the criminal justice domain. Students should have basic knowledge of statistics, programming, and supervised machine learning, but no prior knowledge of the criminal justice system will be assumed.

CUSP-GX-6004: Foundation Technology for Innovation

Learn how to develop innovative IT systems to meet customer expectations. Many organizations find that developing these systems is very difficult. Late and over-budget systems and cybersecurity breaches undermine customer confidence. Civic technologists can help improve this situation. Better use of foundation technology to support innovation can help organizations work smarter, improve productivity, increase accessibility, reduce cost, and reduce risk. This course will explain best practices in creating and operating IT systems. You will improve your IT IQ. Course discussions will describe the leadership, customer care, project management, and service management needed to create and support quality IT systems. We’ll discuss how to generate the participation throughout organizations needed for improving and sustaining high performance in creating and operating IT systems. Class lectures will include multimedia presentations, references to assigned readings, discussions about the planned course content, weekly assessments, mid-term and final exams, and discussions about new systems planned by students throughout the term.


The goal of this course is to provide students with the tools and methods to understand the basics of traffic flow theory, modeling, and simulation. The emphasis will be on the use of real-world data to supplement the understanding of the theory behind the models. Small-scale simulation models will be developed, tested, and validated against real-world data.


Remote sensing technologies are becoming increasingly available at better resolution levels and lower costs. This course will provide an overview of some of the most common technologies in the areas of imagery, video, sound, and hyperspectral data that can be facilitated through smartphones or other readily accessible means. Students will be given a formal introduction to the aforementioned four areas and then be afforded an opportunity for hands-on training in data collection and data analysis. In the course, students will have the opportunity to work in small groups to investigate an urban problem of interest to them at a site of their choosing. The teams will use these newly learned technologies in tandem with other publicly accessible data (either formally available or also collected by the researchers) to investigate a working hypothesis about their chosen urban problem for their site.


The goal of the Applied Data Analytics class is to develop the key computer science and data science skill sets necessary to harness the wealth of newly available data. Its design offers hands-on training in the context of real microdata. The main learning objective is to apply new techniques to analyze social problems by using and combining large quantities of heterogeneous data from a variety of sources. It is designed for students who are seeking a stronger foundation in data analytics.


Course description coming soon.


The world’s urban population is growing by nearly 60 million per year, equivalent to four cities like New York annually. Monitoring the chronological growth of key attributes of cities, as well as quantifying their current conditions, presents a great potential for positive change. Through the acquisition of new data, there are immediate opportunities to influence the sustainable growth of small and medium-sized cities. There is also the potential for alleviating the extremes in megacities, where conditions have reached a critical and unmanageable state. Looking at cities as interdependent networks of physical, natural, and human systems, this course provides a perspective on how to monitor the function and wellness of these systems. Students obtain an understanding of needs assessment, planning, and technical approaches for instrumenting a city. This includes monitoring patterns of activity, mobility, energy, land use, physical and lifeline infrastructure, urban ecology, vegetation, atmosphere, and air quality. The expected outcome of this course is a comprehensive understanding of what can be instrumented and the monitoring architecture for acquiring and generating new data about cities.


The course aims to provide an understanding of big data and the state-of-the-art technologies used to manage and process it. General topics of this course include big data ecosystems, parallel and streaming programming models, and spatial data processing. Hands-on labs and exercises in MapReduce, Hadoop, Spark, Hive, and Pig will be offered throughout the class to bolster the knowledge learned in each module.
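For prospective students unfamiliar with the MapReduce model named above, the following is a minimal single-machine sketch in plain Python. It is an illustration only, not course material: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are hypothetical, and real frameworks such as Hadoop or Spark distribute these same phases across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit one (word, 1) pair for every word in a record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the grouped values for one key into a total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence on a list of text records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical records standing in for lines of an urban dataset.
    print(word_count(["taxi bus taxi", "bus subway"]))
```

Because the map and reduce steps handle each record or key independently, a framework can run them in parallel on separate partitions of the data.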

CUSP-GX 4147: Large Scale Data Analysis I

CUSP-GX 4148: Large Scale Data Analysis II

The past decade has seen the increasing availability of very large scale data sets, arising from the rapid growth of transformative technologies such as the Internet and cellular telephones, along with the development of new and powerful computational methods to analyze such datasets. Such methods, developed in the closely related fields of machine learning, data mining, and artificial intelligence, provide a powerful set of tools for intelligent problem-solving and data-driven policy analysis. These methods have the potential to dramatically improve the public welfare by guiding policy decisions and interventions, and their incorporation into intelligent information systems will improve public services in domains ranging from medicine and public health to law enforcement and security. The LSDA course series will provide a basic introduction to large scale data analysis methods, focusing on four main problem paradigms (prediction, clustering, modeling, and detection). The first course (LSDA I) will focus on prediction (both classification and regression) and clustering (identifying underlying group structure in data), while the second course (LSDA II) will focus on probabilistic modeling using Bayesian networks and on anomaly and pattern detection. LSDA I is a prerequisite for LSDA II, as a number of concepts from classification and clustering will be used in the Bayesian networks and anomaly detection modules, and students are expected to understand these without the need for extensive review. In both LSDA I and LSDA II, students will learn how to translate policy questions into these paradigms, choose and apply the appropriate machine learning and data mining tools, and correctly interpret, evaluate, and apply the results for policy analysis and decision making. 
We will emphasize tools that can “scale up” to real-world policy problems involving reasoning in complex and uncertain environments, discovering new and useful patterns, and drawing inferences from large amounts of structured, high-dimensional, and multivariate data. No previous knowledge of machine learning or data mining is required, and no knowledge of computer programming is required. We will be using Weka, a freely available and easy-to-use machine learning and data mining toolkit, to analyze data in this course.

CUSP-GX 6003

The complexity of the urban context – defined by a rich, interlacing network of infrastructure, systems, and processes that cuts across all sectors: public, private, and non-profit – requires that any large and/or innovative technology project take into account the many factors that go into developing a feasible, viable, and desirable project scope, as well as planning and managing against it. This course will be a case-based investigation into frameworks, methods, and tools for developing the strategic perspective in which to scope, plan, and manage technology projects in the urban context. Some of the methods and tools we will consider are drawn from different schools of strategy and management. They include: BOSCARD / Project Charter, Three-Point Estimating, Work Breakdown Structures (WBS), PERT, Critical Path Method, RACI Matrix, RAID Logs, MARCI Charts, Product Backlogs, Sprint Backlogs, User Stories, and Scrum Taskboards.
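As a flavor of one of the listed tools, Three-Point Estimating as commonly used with PERT combines an optimistic, a most likely, and a pessimistic duration into a weighted expected value, (O + 4M + P) / 6, with a standard deviation of (P − O) / 6. The sketch below assumes that standard beta-weighted form. The function name `pert_estimate` and the sample durations are hypothetical and not taken from the course.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Return the (expected duration, standard deviation) of a three-point estimate.

    Uses the common PERT weighting, where the most likely value counts four times.
    """
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


if __name__ == "__main__":
    # Hypothetical task: 2 weeks at best, 4 weeks most likely, 12 weeks at worst.
    expected, std_dev = pert_estimate(2, 4, 12)
    print(f"expected: {expected:.2f} weeks, std dev: {std_dev:.2f} weeks")
```

The expected value sits above the most likely duration whenever the pessimistic tail is longer than the optimistic one, which is why this weighting is often used for scoping uncertain projects.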