Using Telematics Data to Evaluate Breakdown Risk for NYC School Buses

Project Sponsor

  • Varun Adibhatla, Head of Data Science and Analytics, NYCSBUS
  • Holly Orr, Chief Information Officer, NYCSBUS
  • Cristopher Soto, Data Engineer, NYCSBUS 
  • Ed Driscoll, Chief Fleet and Facilities Officer, NYCSBUS 


NYCSBUS is a non-profit school bus company that seeks to provide outstanding service to families and schools in New York City while innovating in student transportation. NYCSBUS operates a fleet of 1,000 school buses transporting 9,000+ students across the NYC Metro Area. A majority of these students have special needs. NYCSBUS vehicles are equipped with GEOTAB, a device that tracks vehicle location, engine diagnostics, and other telematics. NYCSBUS wants to use telematics data to detect patterns and evaluate vehicle breakdown risk in order to develop proactive maintenance programs. The overall goal is to reduce vehicle breakdowns.

Category: Urban Infrastructure

Project Description & Overview

CUSP Students are expected to:

1) Develop and document the operational context with input from subject matter experts. Organize and get familiar with the data. (4 weeks)

2) Perform exploratory data analysis (EDA) on data provided by NYCSBUS using reproducible code and prepare a report on patterns observed for vehicles that experienced a breakdown. (3 weeks)

3) Train a machine learning model to predict the risk of a vehicle breakdown. A high risk score means that a breakdown is highly likely and that the vehicle needs to come in for repairs. (4-6 weeks)

4) Validate results of the model. (2 weeks)

5) Retrain the model using updated data (new telematics and breakdown data). (2 weeks)

6) Present the final model, code, and documentation. (3 weeks)
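
To make the validation and retraining steps more concrete, here is a minimal sketch of one way to check risk scores against later outcomes. It is an illustration only, not a method prescribed by NYCSBUS. The record layout, the field names (`day`, `risk`, `label`), and the sample values are all hypothetical. The idea is to hold out the most recent data by date, then ask what share of the highest-scored vehicle-days actually ended in a breakdown.

```python
from datetime import date

def temporal_split(records, cutoff):
    """Split records by date so the model is validated on later data only."""
    train = [r for r in records if r["day"] < cutoff]
    holdout = [r for r in records if r["day"] >= cutoff]
    return train, holdout

def precision_at_k(scored, k):
    """Fraction of the k highest-risk records that were real breakdowns."""
    top = sorted(scored, key=lambda r: r["risk"], reverse=True)[:k]
    return sum(r["label"] for r in top) / len(top) if top else 0.0

# Hypothetical scored vehicle-days: "risk" is a model output, "label" is 1
# if a breakdown followed. None of these values come from NYCSBUS data.
records = [
    {"vehicle": "B101", "day": date(2023, 1, 10), "risk": 0.2, "label": 0},
    {"vehicle": "B202", "day": date(2023, 1, 12), "risk": 0.7, "label": 1},
    {"vehicle": "B101", "day": date(2023, 2, 3), "risk": 0.9, "label": 1},
    {"vehicle": "B202", "day": date(2023, 2, 4), "risk": 0.6, "label": 0},
    {"vehicle": "B303", "day": date(2023, 2, 5), "risk": 0.8, "label": 1},
    {"vehicle": "B404", "day": date(2023, 2, 6), "risk": 0.1, "label": 0},
]

train, holdout = temporal_split(records, cutoff=date(2023, 2, 1))
top2 = precision_at_k(holdout, k=2)
top3 = precision_at_k(holdout, k=3)
```

A date-based split is used here rather than a random one because the model would be applied to future vehicle-days; the same split can be moved forward when the model is retrained on updated data.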


For this project, the NYCSBUS team will provide:

a) Vehicle breakdown events (vehicle number, location of breakdown, and date of breakdown)
b) Telematics data for *all* vehicles (telematics includes speed, distance traveled, accelerometer readings, engine fault codes, and other driving behaviors)
c) Access to subject matter experts (Head of Fleet, Head of Data Science, Mechanics)
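
As a rough sketch of how items (a) and (b) might be combined into a training set, the example below labels each vehicle-day of telematics with whether the same vehicle broke down shortly afterward. The field names, the 7-day horizon, and the sample records are assumptions made for illustration; the actual data layout and a sensible horizon would need to be agreed with the NYCSBUS team.

```python
from datetime import date, timedelta

def label_vehicle_days(telematics_days, breakdowns, horizon_days=7):
    """Label each telematics record 1 if its vehicle broke down within
    `horizon_days` days after the record's day, else 0."""
    # Index breakdown dates by vehicle number for quick lookup.
    by_vehicle = {}
    for vehicle, when in breakdowns:
        by_vehicle.setdefault(vehicle, []).append(when)

    labeled = []
    for record in telematics_days:
        day = record["day"]
        window_end = day + timedelta(days=horizon_days)
        hits = by_vehicle.get(record["vehicle"], [])
        # Only breakdowns strictly after the telematics day count, so the
        # label never uses information from the day being scored.
        label = int(any(day < b <= window_end for b in hits))
        labeled.append({**record, "label": label})
    return labeled

# Hypothetical daily aggregates and breakdown events (not real data).
telematics_days = [
    {"vehicle": "B101", "day": date(2023, 3, 1), "fault_codes": 4, "miles": 62.0},
    {"vehicle": "B101", "day": date(2023, 3, 20), "fault_codes": 0, "miles": 58.5},
    {"vehicle": "B202", "day": date(2023, 3, 1), "fault_codes": 1, "miles": 40.2},
]
breakdowns = [("B101", date(2023, 3, 5))]

labeled = label_vehicle_days(telematics_days, breakdowns)
labels = [r["label"] for r in labeled]
```

Because telematics is provided for all vehicles, not only those that broke down, the vehicle-days labeled 0 serve as the comparison group.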


Students should have a good understanding of exploratory data analysis, text pattern mining, and supervised machine learning. While the data contains location elements, we do not anticipate requiring any spatial data to understand signals from the telematics data and determine the risk of breakdown.

Learning Outcomes & Deliverables

Machine learning to detect risk of vehicle breakdown
Feature engineering
Text pattern mining
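
One simple form of pattern mining on engine fault codes, which could also feed feature engineering, is to compare how often each code appears in the period before a breakdown against how often it appears otherwise. The sketch below assumes fault codes arrive as plain strings grouped into per-vehicle time windows; the code values ("A1", "B2", "C3") are placeholders, and the function name `code_lift` is hypothetical.

```python
from collections import Counter

def code_lift(pre_breakdown_logs, baseline_logs):
    """Ratio of each code's window share before breakdowns to its share in
    baseline windows. Each log is a list of fault-code strings for one
    vehicle-window; only codes seen in both groups are returned."""
    def share(logs):
        # Count each code at most once per window.
        counts = Counter(code for log in logs for code in set(log))
        return {code: n / len(logs) for code, n in counts.items()}

    pre = share(pre_breakdown_logs)
    base = share(baseline_logs)
    return {code: pre[code] / base[code] for code in pre if code in base}

# Placeholder fault-code windows (not real data).
pre_breakdown_logs = [["A1", "B2"], ["A1"], ["A1", "C3"], ["B2"]]
baseline_logs = [["B2"], ["C3"], ["B2", "A1"], ["B2"]]

lift = code_lift(pre_breakdown_logs, baseline_logs)
```

A ratio well above 1 suggests a code worth reviewing with the mechanics as a candidate model feature; with small samples these ratios are noisy and should be treated as leads, not conclusions.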