Sonam Bagul

Walk us through your path before CUSP. What got you interested in urban science?  

I have an engineering background and majored in Computer Science as an undergraduate. I was fortunate to work under the guidance of Prof. Sangeeta on a text analytics research project, and I later published that work. The rigor and thoroughness that research demands shaped my outlook and inspired me to pursue data science and machine learning in college. To practice and master those skills, I spent almost 6 years as a data scientist helping different organizations build better products. To keep learning while working, I collaborated part-time on a neural network research project under the guidance of a Mathematics professor from Delhi University.

Personally, I have a keen interest in policy formulation and administrative services. The idea of formulating data-driven policies is very tantalizing to me as a computer scientist. I even have my own YouTube channel, called “Coursity”, where I talk about Indian policies and help make them palatable through effective visualizations. The urban informatics program at NYU CUSP is a specialization in solving urban challenges through the lens of a data scientist, and I was very glad to find a program that blends all my interests together.

Can you tell us more about your research on natural language processing?

I published a paper as an undergraduate where we developed a system called “TagStack”. Fundamentally, we generate tags automatically for every Stack Overflow question. Stack Overflow is a website where you can post questions about technical issues and code errors. When a question is posted, Stack Overflow checks whether similar questions have been asked before and looks for their answers. So instead of having someone post a new version of an answer, it suggests checking the existing ones. Basically, the recommendation of similar questions is based on the tags, so better tags lead to better recommendations. This process was done manually at first, and after I published the paper, we automated it.

My other research was related to data compression. Since urban data is so complex, the fundamental question was how we can handle that complexity in real-life settings. Established methods such as PCA (Principal Component Analysis) reduce dimensionality, but you lose some information. Newer neural network-based techniques, however, have promising applications. Lower dimensionality makes computation and processing more efficient.
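A minimal sketch of that PCA trade-off, assuming toy 2-D points and plain Python rather than anything from the research itself: it computes how much of the total variance survives when the points are projected onto their first principal component. The function name `explained_variance_2d` and the sample points are hypothetical, chosen only for illustration.

```python
import math


def explained_variance_2d(points):
    """Fraction of total variance kept when 2-D points are projected
    onto their first principal component (a toy stand-in for PCA)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)

    trace = sxx + syy  # total variance
    if trace == 0:
        raise ValueError("all points are identical")

    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    det = sxx * syy - sxy ** 2
    gap = math.sqrt(max(trace ** 2 / 4 - det, 0.0))
    largest = trace / 2 + gap
    return largest / trace


# Points lying exactly on a line lose nothing when reduced to 1-D.
on_a_line = [(0, 0), (1, 2), (2, 4)]
# Points spread equally in both directions lose half their variance.
spread_out = [(1, 0), (-1, 0), (0, 1), (0, -1)]

print(explained_variance_2d(on_a_line))
print(explained_variance_2d(spread_out))
```

The closer the returned fraction is to 1, the less information the reduction discards; real urban datasets have many more dimensions, so this two-dimensional case only illustrates the idea.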

It sounds like you brought a lot of great knowledge with you to NYU! So, how did your first semester at CUSP go?

My overall experience was pretty different from my undergraduate studies, because here I learned that to deepen your knowledge you can approach anyone (e.g., professors, administrators, city planners) and they will help you understand the concepts better. I think that to solve the complex problems of cities we always need different perspectives and varied contributions, and an international institution like this one makes that possible. CUSP at NYU has given me a platform to interact with brilliant minds from different areas of expertise, which is the key to breakthroughs.

For example, in the Civic Analytics and Urban Intelligence course, I got to learn more about policy-related issues. For my course project, I picked a current issue related to electric vehicles (EVs). As the demand for EVs increases, new infrastructure (e.g., curbside charging stations) is required to meet it. I got in touch with the NYC Department of Transportation, as they are the authority responsible for installing curbside EV charging stations in NYC. I created an optimization model that considered supply (data from NYC DOT), demand (public data) and competition (e.g., EV charging stations from other companies like Tesla and Blink). The project was a whole pipeline, from data collection to a visualization dashboard. I was able to deliver it successfully because I had the continuous support of the professor and NYC DOT officials.

In the fall, I completed the “Urban Decision Models” and “Principles of Urban Informatics” courses. These courses helped me brush up on my technical skills and become familiar with urban concepts, which created a foundation for the rest of my classes.

Can you tell us more about your research with the Department of Transportation?

One of the deliverables in the Civic Analytics course was a project in public policy. I chose a topic addressing the ongoing developments in EV infrastructure at DOT. I began by conducting hour-long interviews with people across the hierarchy to understand the problem in depth. That helped me shape a data science solution involving an optimization model.

One of the important challenges in EV infrastructure installation is ensuring equitable access for different population groups and communities. How do we ensure that each borough gets the right number of EV chargers so that as many people as possible can reap the benefits of driving EVs?

There are certain factors to consider, like population demographics, income diversity and the presence of points of interest (POIs) in a location. For example, people visit a mall, grocery store or laundromat on a daily basis, so placing EV chargers in parking garages at these POIs could help more people than placing them in isolated areas without traffic flows.

Another big challenge in EV adoption is EV charging “deserts” on the highways. EVs can generally run only up to 250 miles on a single charge, so you’ll be stuck on the highway if there is no EV charger within that range. This was one of the input factors in the optimization model I developed: it ensures that each installed EV charger has a sister charging location within 100 miles, thus creating a more robust network.
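The sister-location rule can be illustrated with a short check. This is a sketch under stated assumptions, not the optimization model from the project: it uses straight-line (great-circle) distance instead of driving distance, and the site names, coordinates and function names are all hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def chargers_without_sister(chargers, max_miles=100.0):
    """Return the names of chargers that have no other charger within max_miles."""
    isolated = []
    for name, location in chargers.items():
        has_sister = any(
            other != name and haversine_miles(location, other_location) <= max_miles
            for other, other_location in chargers.items()
        )
        if not has_sister:
            isolated.append(name)
    return isolated


# Hypothetical sites (approximate coordinates of New York, Philadelphia, Boston).
sites = {
    "site_a": (40.7128, -74.0060),
    "site_b": (39.9526, -75.1652),
    "site_c": (42.3601, -71.0589),
}

# Lists any site that would violate the 100-mile sister-location rule.
print(chargers_without_sister(sites))
```

In an actual optimization model this rule would more likely appear as a constraint on candidate locations than as a check after the fact; the function above only flags sites that break it.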

You also did some independent research with Professor Debra Laefer. Can you tell us more about that? 

We have been using data collected during the COVID pandemic to predict people’s behavior, so that we can learn from it if this type of event happens again in the future. The NYC GIS website has open-access data depicting individuals’ movement patterns around healthcare facilities during the 2020 and 2021 COVID surges.

We visualized the movement patterns in a point cloud environment, which consists of 3D laser points representing an aerial snapshot of the Earth and the objects on its surface. In this environment, each building or concrete road is represented as a 3D object made up of millions of laser points. It captures the top view of buildings and roads.

The orange and blue hotspots depict the distribution of males and females walking outside hospitals. This data helped us generate valuable insights into people’s behavior and interactions around COVID healthcare facilities. Knowing this, we can make suggestions on how to better prepare infrastructure for the massive need for medical care in this type of situation.

And then there’s your Capstone project. What were you investigating and who were you working with?

A few of my classmates and I are working with Dr. Bello and Dr. Roman to create AR smart assistants that can help train professionals. Let’s suppose a person starts working as a firefighter. This person will need to go through various trainings, videos and shadowing to learn how to do the job. We are trying to replace the shadowing and see whether a person can be trained by a smart assistant and learn in a mixed reality setting using HoloLens.

For now, we are working with Subway data (as in the sandwich chain), specifically on defining a person’s patterns when preparing a Subway sandwich. There is a YouTube channel with openly available videos, posted every day, of a real-life worker at a Subway restaurant making specific menu items as he follows the verbal instructions of customers.

The employee follows a specific pattern as he makes a sandwich, based on what he hears from customers while they place and customize their order. Besides speech recognition there is also object detection, which means, in this scenario, that if we see an object with a red color and smooth texture, we detect that it is a tomato. So, if we create a data set with all the speech annotations and video analytics, we can then automate the process. Here it is specialized to Subway, and we are using Subway videos as a proof of concept, but it can also be extended to other kinds of tasks such as firefighting, surgical training and so on.

So, instead of shadowing a real person, the trainee will learn in a virtual world. For example, in a firefighting scenario, there would be a virtual trainer who explains in real time what to do and which steps can be taken next to extinguish the fire.

It sounds like you’ve learned a lot through all of these experiences! What would be your advice for students who are starting their CUSP journey soon?

I would suggest staying open-minded. You will gain a lot of knowledge and learn a ton, even though it might not be in the direction you expected at first. There are elements that will help your overall development and not just your technical growth. The program is a specialization, and the learning curve rises exponentially given its fast-paced nature. For electives, it is important to take courses aligned with your sphere of interest, depending on what you want to do in the future and your career goals. CUSP provides a platform that promotes technology and urban development at the same time, so take your time and maximize your learning experience. Lastly, take a breath and enjoy the CUSP kitchen area chats while having lunch or coffee with friends!