
Predicting Car Accidents with Machine Learning

Jonathan Bass
January 14, 2025

In Jacksonville, Florida, as in any other mid-sized city, car accidents are a problem. On my daily commute to and from work, I see one or two accidents a day. I began to notice that the crash sites were in the same spots around town: the congested off-ramps, the stretch between two crowded exits, or near the hill with an ignored caution sign. These accidents ranged from simple fender benders to potentially fatal crashes. Regardless of the severity, my drive home would be significantly lengthened.

Predicting Accidents: The Expected vs. Unexpected

At Urban SDK, we were asked by the Florida Highway Patrol (FHP) to predict future car accidents so that it could hire, staff, and dispatch troopers based on the predicted time and location of incidents.

As data scientists and Jacksonville locals, we began searching for factors that could potentially contribute to an incident. We broke the problem down into two categories: the expected and the unexpected. Weather, road conditions, and daily traffic are expected conditions for a daily drive. However, road curvature and other mathematically derived features are unexpected variables that should also be accounted for, so we put those into the model as well.
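As an illustration of what a mathematically derived road feature might look like, here is a minimal sketch that scores how curvy a road segment is. It uses sinuosity (path length divided by straight-line distance), which is an assumption chosen for simplicity and not necessarily the curvature measure used in our model. The function name and the coordinates are hypothetical, and the points are assumed to be in a planar (projected) coordinate system.

```python
import math


def sinuosity(points):
    """Ratio of the length along a polyline to the straight-line distance
    between its endpoints. 1.0 means perfectly straight; larger is curvier.

    `points` is a list of (x, y) pairs in a planar coordinate system.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Sum the lengths of consecutive pieces of the polyline.
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    # Straight-line distance from the first point to the last.
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        return math.inf  # the segment loops back onto its starting point
    return path_length / chord


# Made-up coordinates for illustration only.
straight_segment = [(0, 0), (50, 0), (100, 0)]
curved_segment = [(0, 0), (30, 40), (60, 0)]

straight_score = sinuosity(straight_segment)
curved_score = sinuosity(curved_segment)
```

A score like this could sit alongside weather and traffic as one more column per road segment.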

Photo by João Silas on Unsplash

Using Historical Data to Segment Roadways

Looking outside the box, we went on a ride-along with FHP troopers to understand what they witness every day. We observed the troopers' intuition and decided to design the model around replicating the decisions a trooper would make.

Admittedly, finding appropriate data sets for the features we hypothesized proved difficult. We pulled from various data repositories, including FDOT, NOAA weather data, OpenStreetMap, and other GIS sources. Our first roadblock was finding a way to segment the interstates and state roadways.

From the data collected, we separated the records into two groups: incident and no incident. Since FHP provided historical accident data, we applied negative sampling by creating a sequence of road segments for each hour. A positive case indicated that an incident had occurred on a segment in that hour, whereas a negative case indicated no incident. After joining the positive samples to the negative samples, we were able to build a model that tells us, hour by hour, which road segments have a high probability of an incident.
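The sampling step described above can be sketched roughly as follows. This is a simplified illustration and not our production pipeline: the function name, field names, segment IDs, and timestamps are all hypothetical. It builds one row for every combination of road segment and hour, then labels a row 1 if an incident was recorded there and 0 otherwise.

```python
from datetime import datetime, timedelta
from itertools import product


def build_samples(segment_ids, incidents, start, hours):
    """Return one row per (segment, hour) slot, labeled 1 if an incident
    was recorded in that slot and 0 otherwise.

    `incidents` is an iterable of (segment_id, timestamp) pairs.
    """
    # Truncate each incident timestamp to the top of its hour so it can be
    # matched against the hourly slots.
    positives = {
        (seg, ts.replace(minute=0, second=0, microsecond=0))
        for seg, ts in incidents
    }
    hour_slots = [start + timedelta(hours=h) for h in range(hours)]
    # Every segment-hour combination becomes a sample; most are negatives.
    return [
        {"segment_id": seg, "hour": slot, "label": int((seg, slot) in positives)}
        for seg, slot in product(segment_ids, hour_slots)
    ]


# Made-up inputs for illustration only.
segments = ["seg-A", "seg-B"]
incidents = [("seg-A", datetime(2020, 1, 6, 17, 42))]
samples = build_samples(segments, incidents, datetime(2020, 1, 6, 16), hours=3)
```

With two segments and three hours, this yields six rows, only one of which is positive. On real data the negatives outnumber the positives by a much wider margin, so some form of downsampling or class weighting is commonly applied before training.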

A map of Jacksonville showing which roadways are accident-prone.
A chart showing each hour of the day. Hovering over a bar shows how many incidents have occurred or are predicted to occur, depending on the date picked. On Friday, May 29th, 5 accidents were predicted to occur at 5 pm.

Sharing Data Across Organizations

Our product was never meant to stop after the initial model was built. Urban SDK is a full-stack company. From one year of crash data, we were able to build a multifaceted application. Each client receives a personalized interactive dashboard for the data they are interested in. The application can now be used by supervisors, troopers, TPOs, or any DOT employee.

A demo of the product dashboard

Predictive Analytics to Aid Staffing

Supervisors, who are in charge of trooper staffing, can now view the count of incidents by trooper zone and date. This feature lets them direct their troopers toward high-probability areas.
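A zone-level view like this can be thought of as a roll-up of per-segment predictions. The sketch below sums predicted probabilities to get an expected number of incidents per zone and date. That is one reasonable way to aggregate, assumed here for illustration, and not necessarily how the dashboard computes its counts. The record fields, zone names, and probabilities are hypothetical.

```python
from collections import defaultdict
from datetime import date


def expected_incidents_by_zone(predictions):
    """Sum predicted incident probabilities per (zone, day).

    `predictions` is an iterable of dicts with "zone", "day", and
    "probability" keys, one per road segment and hour.
    """
    totals = defaultdict(float)
    for p in predictions:
        # The sum of probabilities is the expected number of incidents.
        totals[(p["zone"], p["day"])] += p["probability"]
    return dict(totals)


# Made-up predictions for illustration only.
day = date(2020, 1, 6)
predictions = [
    {"zone": "zone-1", "day": day, "probability": 0.5},
    {"zone": "zone-1", "day": day, "probability": 0.25},
    {"zone": "zone-2", "day": day, "probability": 0.125},
]
zone_totals = expected_incidents_by_zone(predictions)
```

Sorting the totals in descending order would give a supervisor a ranked list of zones to prioritize for a given date.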

Interactive chart showing predicted incidents in each zone on the map. The zone numbers that correspond to each color follow the zoning scheme used by FDOT. In zone 02P4, there are 36 potential cases.

Incident Prevention Through Forecasting

Troopers can station themselves based on where and when incidents are predicted to occur. This lets them route themselves strategically for quicker response times, or perhaps even prevent an incident through their presence.

Hovering over a zone shows its probable number of incidents. Each area of Jacksonville is distinguished on an easy-to-read, color-coded map. For May 29th, 2020, the west side of Jacksonville was a high-probability area with 36 potential crashes.

Evidence-Based Safety Decisions

For TPOs, the biggest benefit is the ability to differentiate between roadways that are safe and those that are not. For example, if a certain part of I-95 proves to be a hotspot, and we can discern that the most likely cause is the lack of a shoulder for cars to pull onto following an accident, we can recommend that TPOs add a shoulder to the road.

TPOs can now see which roadways troopers should be stationed on. If a roadway is considered a hotspot, they can investigate it to find a potential fix for the problem.

Evaluating the feature importance from our model allows adjustments and repairs to be made on the basis of data. Future car accidents can be prevented through road alterations and efficient trooper deployment. We are able to dig deeper into accident prevention.

Urban SDK is a software development company that helps government agencies make better decisions using artificial intelligence and machine learning. Schedule a demo to see how Urban SDK can save you time and improve your data confidence.
