Webinar Presentation

HTML version of the presentation
Image descriptions are contained in brackets. [ ]

A Digital Twin for Traffic Monitoring & Proactive Incident Management
(May 11, 2021)

Predicting Traffic Crashes Using Probe Data
Presenter: Zihe Zhang
Presenter’s Org: College of Engineering, University of Alabama

T3 and T3e webinars are brought to you by the Intelligent Transportation Systems (ITS) Professional Capacity Building (PCB) Program of the U.S. Department of Transportation (USDOT)’s ITS Joint Program Office (JPO). References in this webinar to any specific commercial products, processes, or services, or the use of any trade, firm, or corporation name is for the information and convenience of the public, and does not constitute endorsement, recommendation, or favoring by the USDOT.


[The slides in this presentation contain the University of Alabama’s logo.]

Slide 1: Student Presentation 3

Title: Predicting Traffic Crashes Using Probe Data

Students: Zihe Zhang, PhD Candidate; Qifan Nie, Postdoctoral Researcher

Advisors: Jun Liu, PhD, Assistant Professor; Alex Hainen, PhD, Associate Professor

  • The goal of the research is to develop models to predict the risk of an incident or crash
  • Hypothesis ‐ Crash risk is a function of traffic flow dynamics, network location, and physical road environments along with other factors

Scope: crashes only, not all types of traffic incidents.

Key Work:

  • HERE probe data extraction
  • Data integration (traffic, incident, crash)
  • Data mining ‐ Machine learning

[This slide contains a photo of two cars, one has crashed into the back of the other car.]

Slide 2: Predicting Traffic Crashes Using Probe Data

  • Sub‐objective 1 ‐ Predict crashes before they are reported:
    • All crashes
    • Rear‐end crashes
  • Sub‐objective 2 ‐ Identify how prediction accuracy varies using different machine learning models given varying temporal ranges
  • Sub‐objective 3 ‐ Interpret the machine learning model

[This slide contains three images: (1) a photo of the damaged back end of a car, (2) a computer‐generated image of a brain with many icons overlaying the brain, and (3) a grid arrangement of four screenshots of graphs.]

Slide 3: Current Practices

  • New York City, NY
    • Highway Emergency Local Patrol (HELP) program, Citywide Incident Management System (CIMS)
  • Florida
    • 911 calls, TMCs’ cameras and sensors, Computer‐aided dispatch (CAD) system, Public‐facing mobile roadway navigation applications
  • Wisconsin
    • CCTV Cameras, Dynamic messaging signs (DMS), Broadcast, WisDOT 511 Mobile App, Twitter, Computer‐aided dispatch (CAD) system
  • Los Angeles, CA
    • The Automated Traffic Surveillance and Control (ATSAC) Center, Los Angeles Regional Transportation Management Center (LARTMC)

[This slide contains two photos: (1) a photo of a utility vehicle and worker on a road, with an arrow light on top of the truck directing traffic and with a person in a safety vest laying out traffic cones and (2) a photo of an emergency response van in front of a building.]

Slide 4: Scholarly Research

Model | Author, Year | Study Area | Data Used
Support Vector Machine (SVM) | Xiao et al., 2021 | I‐880 Freeway data in California | Detector data (speed, occupancy, volume)
 | Gakis et al., 2014 | |
 | Yuan and Chen, 2013 | |
Convolutional neural network (CNN) | Huang et al., 2020 | I‐235 in Des Moines, IA | TMC data, GPS data
Generative adversarial networks (GANs) | Cai et al., 2020 | SR 408 expressway in Orlando, FL | Spot speed, volume, and occupancy of each lane in real‐time, and crash data
 | Lin et al., 2020 | I‐880 Freeway data in California | Detector data (speed, occupancy, volume)

[This slide contains three diagrams: (1) a point plot showing vectors on a plane, (2) a diagram that visualizes the steps between an input and receiving an output, and (3) a flowchart.]

Slide 5: Available data from ALDOT

HERE Traffic Data

  • Speed data

ALGO Incident Data

  • Incident information

CARE Crash Data

  • Tabulated crash report data

Highway Performance Monitoring System (HPMS) Data

  • Freeway segments and ramps characteristics

[This slide contains four logos: (1) the logo of the Alabama Department of Transportation, (2) the logo of HERE, (3) the logo of ALGO, (4) the logo of CARE.]

Slide 6: ALGO Incident Data

  • Incident data from ALGO Traffic
    • Owned and maintained by ALDOT
    • Stores incident information such as incident time, location, and response agencies
  • The University of Alabama (UA), through the Center for Advanced Public Safety (CAPS), was involved in the development of the ALGO Traffic platform

[This slide contains three images and one logo: (1) one image is comprised of four snapshots of highways, (2) a digital map showing several roads, (3) a diagram of an overhead view of a road showing the direction of travel of different lanes in the road, and (4) the ALGO logo.]

Slide 7: HERE Traffic Data

  • ALDOT recently purchased mobility data from HERE
    • Real‐time speed observations based on probe vehicles
    • Over 10,000 traffic message channels (TMCs)
    • Dynamic sub‐segments updated every minute

[This slide contains two images: (1) the logo of HERE and (2) a table showing the TMC Segments and Dynamic Sub‐segments.]

Slide 8: CARE Crash Data

  • The key crash information includes
    • Location and time
    • Crash types and injury severities
    • Other standard crash‐related attributes

[This slide contains two images: (1) the logo of CARE and (2) a screenshot showing a spreadsheet, two bar graphs, and a pie graph.]

Slide 9: Data Processing Framework

[This slide contains a diagram that shows four logos and arrows that lead to labels of different data types they provide and how they lead to incident/queue prediction. The logos are, HERE, ALGO, CARE, and The Highway Performance Monitoring System.]
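
One step implied by this framework is linking each crash or incident record to the traffic segment that contains it. The following is a minimal sketch of that join, assuming segments are described by a route name and a mile-marker range. The function name `match_to_tmc`, the dictionary keys, and the segment IDs are hypothetical and are not taken from the HERE, ALGO, or CARE data formats.

```python
def match_to_tmc(route, mile_marker, tmc_segments):
    """Return the ID of the TMC segment containing a mile marker, or None.

    tmc_segments: list of dicts with hypothetical keys
    "tmc_id", "route", "start_mm", "end_mm" (start_mm <= end_mm).
    """
    for seg in tmc_segments:
        if seg["route"] == route and seg["start_mm"] <= mile_marker < seg["end_mm"]:
            return seg["tmc_id"]
    return None


# Illustrative segments and crash record; the IDs are made up.
segments = [
    {"tmc_id": "TMC-A", "route": "I-65 N", "start_mm": 145.0, "end_mm": 147.5},
    {"tmc_id": "TMC-B", "route": "I-65 N", "start_mm": 147.5, "end_mm": 150.0},
]
crash = {"route": "I-65 N", "mile_marker": 146.16}
crash["tmc_id"] = match_to_tmc(crash["route"], crash["mile_marker"], segments)
```

Each segment is treated as a half-open range so that a record on a segment boundary matches exactly one segment.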

Slide 10: Traffic Speed Extraction Tool

Inputs

  • Database
    • Incident route name
    • Incident mile marker
    • Incident time
    • TMC information
    • Speed based on dynamic sub‐segment
  • Parameter setting
    • Spatial range
    • Temporal range

Outputs

  • Speed matrix
  • Spatial‐temporal speed graph
  • Speed drop plot

[This slide contains three graphs: (1) a speed matrix, (2) a spatial‐temporal speed graph, and (3) a speed drop plot.]
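
The inputs and outputs listed for this tool suggest a windowing step: keep the speed observations that fall within the spatial and temporal range of the incident, then arrange them by location and time. A minimal sketch of that idea follows. It assumes observations arrive as `(mile_marker, minute, speed_mph)` tuples and that the window covers the minutes before the incident; the function name and record layout are hypothetical and do not describe the actual tool.

```python
from collections import defaultdict


def build_speed_matrix(records, incident_mm, incident_minute,
                       spatial_range_mi, temporal_range_min):
    """Return (mile_markers, minutes, matrix) for the window before the incident.

    records: iterable of (mile_marker, minute, speed_mph) tuples (assumed layout).
    matrix[i][j] is the mean speed at mile_markers[i] during minutes[j],
    or None when no observation exists for that cell.
    """
    cells = defaultdict(list)
    for mm, minute, speed in records:
        in_space = abs(mm - incident_mm) <= spatial_range_mi
        in_time = incident_minute - temporal_range_min <= minute < incident_minute
        if in_space and in_time:
            cells[(mm, minute)].append(speed)

    mile_markers = sorted({mm for mm, _ in cells})
    minutes = sorted({minute for _, minute in cells})
    matrix = [
        [
            sum(cells[(mm, t)]) / len(cells[(mm, t)]) if (mm, t) in cells else None
            for t in minutes
        ]
        for mm in mile_markers
    ]
    return mile_markers, minutes, matrix
```

Rows of the matrix correspond to locations and columns to minutes, which is the layout a spatial-temporal speed graph would plot.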

Slide 11: Spatial‐temporal Speed Graph

[This slide contains two images: (1) a spatial‐temporal speed graph titled “Major Crash at Mile Marker 146.16.” (2) a spatial‐temporal speed graph titled “Major Crash at Mile Marker 255.0.”]

Slide 12: Data for Modeling

  • Route
    • I‐65 North in AL
  • Time Range
    • 2019 January–2020 June
  • Crash records
    • 2,586 crash records
    • 790 rear‐end crash records
  • Non‐crash records
    • Same amount, same spatial locations
    • Time varies
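
The non-crash records could be drawn along the following lines: for every crash, pick a different time at the same location when no crash occurred. This is only a sketch under stated assumptions. The slides do not say how the non-crash times were chosen, so the random draw, the 60-minute exclusion buffer, and the `(mile_marker, minute)` record layout are all hypothetical.

```python
import random


def sample_non_crash_records(crashes, period_start, period_end,
                             buffer_min=60, seed=0):
    """Pair every crash with one non-crash record at the same location.

    crashes: list of (mile_marker, minute) tuples (assumed layout).
    A candidate time is rejected if any crash at that mile marker falls
    within buffer_min minutes of it (assumed rule, not from the slides).
    """
    rng = random.Random(seed)
    crash_times = {}
    for mm, minute in crashes:
        crash_times.setdefault(mm, []).append(minute)

    controls = []
    for mm, _minute in crashes:
        for _attempt in range(1000):  # bounded retries so the loop always ends
            t = rng.randrange(period_start, period_end)
            if all(abs(t - c) > buffer_min for c in crash_times[mm]):
                controls.append((mm, t))
                break
        else:
            raise ValueError(f"no crash-free time found at mile marker {mm}")
    return controls
```

The result has one non-crash record per crash, which keeps the two classes balanced as the slide describes.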

[This slide contains a satellite map of Alabama showing its main roads.]

Slide 13: Variable Creation

Speed‐related variables generated in various time ranges (60‐min, 30‐min, 15‐min, 10‐min, 5‐min)

Variable Name | Description
spddropmean60 | Mean of speed drop within 1 hour at crash location
spddropvc60 | Variance of speed drop within 1 hour at crash location
spddropmax60 | Max speed drop within 1 hour
spddropup60 | 75% speed drop within 1 hour
spddroplow60 | 25% speed drop within 1 hour
spddropmin60 | Min speed drop within 1 hour
tmaxspddrop60 | Timestamp of max speed drop within 1 hour
spmean60_5m | Mean of speed within 1 hour within 5 miles
spvc60_5m | Variance of speed in 1 hour within 5 miles
Tmcave60 | Mean of #TMC/min in 60 minutes
Tmcvar60 | Variance of #TMC/min in 60 minutes
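
The speed-drop variables above could be derived roughly as follows. This is a sketch, not the presenters' code: the slides do not define "speed drop", so it is assumed here to be the decrease in speed from one minute to the next, and the 75% and 25% variables are assumed to be the upper and lower quartiles. The function name and the `suffix` parameter are hypothetical.

```python
import statistics


def speed_drop_features(speeds, suffix="60"):
    """Summarize minute-to-minute speed drops for one location.

    speeds: per-minute speeds (mph) in the window before the event, oldest first.
    A "speed drop" is assumed to be speeds[t-1] - speeds[t]; treat this
    definition as a placeholder. At least three speeds are required.
    """
    drops = [prev - cur for prev, cur in zip(speeds, speeds[1:])]
    q1, _median, q3 = statistics.quantiles(drops, n=4)
    return {
        f"spddropmean{suffix}": statistics.mean(drops),
        f"spddropvc{suffix}": statistics.pvariance(drops),
        f"spddropmax{suffix}": max(drops),
        f"spddropup{suffix}": q3,    # assumed: 75th percentile
        f"spddroplow{suffix}": q1,   # assumed: 25th percentile
        f"spddropmin{suffix}": min(drops),
        # minutes from the window start at which the largest drop occurs
        f"tmaxspddrop{suffix}": drops.index(max(drops)) + 1,
    }
```

Calling the function with the last 30, 15, 10, or 5 speeds and a matching suffix would produce the shorter-range versions of the same variables.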

[This slide contains three graphs: (1) a speed matrix, (2) a spatial‐temporal speed graph, and (3) a speed drop plot.]

Slide 14: Other Variables

Crash‐related

Variable Name | Description
Time of day | a.m. peak (06:00–10:00), daytime (10:00–16:00), p.m. peak (16:00–20:00), night (20:00–06:00)
Day of week | Weekdays, weekend
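
The time-of-day and day-of-week categories above can be derived from a timestamp as in the sketch below. The category labels and function names are hypothetical, and each period is assumed to include its start time and exclude its end time, which the slide does not specify.

```python
from datetime import datetime


def time_of_day(ts):
    """Map a datetime to one of the four periods listed on the slide."""
    hour = ts.hour
    if 6 <= hour < 10:
        return "am_peak"
    if 10 <= hour < 16:
        return "daytime"
    if 16 <= hour < 20:
        return "pm_peak"
    return "night"  # 20:00 to 06:00, wrapping past midnight


def day_type(ts):
    """Return "weekend" for Saturday and Sunday, otherwise "weekday"."""
    return "weekend" if ts.weekday() >= 5 else "weekday"
```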

HPMS

Variable Name | Description
AADT | AADT for location, and Ramp AADT
Nlanes | Number of lanes
RampDis | Distance to the nearest ramp
Ramp | Ramp or not
Urban | Rural/Urban
AccessCon | Access control

Slide 15: Modeling

  • Logistic regression
  • Random Forest
  • Support Vector Machine

[This slide contains three diagrams: (1) a flowchart titled “Test Sample Input.” (2) a scatterplot, and (3) a scatterplot.]

Slide 16: Crash Risk Modeling

  • All Crash Models
    • To predict the risk of any type of crash at a time and location, given the instantaneous traffic flow dynamics
  • Rear‐End Crash Models
    • To predict the risk of having a rear‐end crash at a time and location, given the instantaneous traffic flow dynamics

[This slide contains a background image of an aerial image showing several levels of roads with cars on them.]

Slide 17: All Crash Model Results

  • Accuracy
  • F1 score
  • 30‐min before crash
  • Random Forest

Accuracy = Number of correct predictions ÷ Total number of predictions

F1 = 2 × (Precision × Recall) ÷ (Precision + Recall)
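
As a worked illustration of the two metrics, the sketch below computes them from binary labels, where 1 means crash and 0 means non-crash. The function name is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something stated in the slides.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, f1) for binary labels where 1 = crash, 0 = non-crash."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # crashes predicted as crashes
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed crashes
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1
```

With a balanced sample of crash and non-crash records, accuracy and F1 tend to move together; F1 becomes the more informative of the two when the classes are uneven.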

[This slide contains two bar graphs: (1) a graph at left compares accuracy and the time before the crash and (2) a graph that compares the F1 score and the time before the crash.]

Slide 18: All Crash Model Results

  • Feature importance (permutation importance)
    • Describes which features are relevant
    • Provides a highly compressed, global insight into the model’s behavior

Variable Name | Description
spvc30_5m | Variance of speed within 30 mins within 5 miles
spddropup30 | 75% speed drop within 30 mins
logaadt | AADT for location, and Ramp AADT
spddropmean30 | Mean of speed drop within 30 mins at crash location
spddropmax30 | Max speed drop within 30 mins
tmaxspddrop30 | Timestamp for max speed drop within 30 mins
Tmcvar30 | Variance of #TMC/min in 30 mins
spddroplow30 | 25% speed drop within 30 mins
Tmcave30 | Mean of #TMC/min in 30 mins
spddropmin30 | Min speed drop within 30 mins

Weight | Feature
0.0951±0.0082 | spvc30_5m
0.0477±0.0044 | spddropup30
0.0420±0.0031 | logaadt
0.0387±0.0055 | spddropmean30
0.0330±0.0041 | spddropmax30
0.0309±0.0025 | tmaxspddrop30
0.0305±0.0059 | Tmcvar30
0.0301±0.0038 | spddroplow30
0.0298±0.0035 | Tmcave30
0.0281±0.0041 | spddropmin30
0.0231±0.0024 | spddropvc30
0.0218±0.0032 | spmean30_5m
0.0155±0.0027 | Near_Dist_FT
0.0109±0.0033 | timeind_1
0.0097±0.0009 | Through_La
0.0045±0.0023 | RuralUrban
0.0044±0.0009 | timeind_3
0.0043±0.0012 | weekday
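
Permutation importance can be sketched in a few lines: shuffle one feature at a time and measure how much the model's accuracy falls. The version below is a generic illustration, not the presenters' implementation. It assumes the model is available as a `predict(row)` callable and that each row is a list of feature values; the function name, the number of repeats, and the use of accuracy as the score are all assumptions.

```python
import random
import statistics


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Return one (mean_drop, std_drop) pair per feature column.

    predict: callable mapping a feature row (list) to a predicted label.
    The drop is baseline accuracy minus accuracy after shuffling one column.
    """
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == label for r, label in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    results = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(shuffled))
        results.append((statistics.mean(drops), statistics.pstdev(drops)))
    return results
```

A feature the model ignores yields a drop of zero, while a feature the model relies on yields a positive drop, which gives a ranking of the kind shown in the weight table.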

Slide 19: All Crash Model Results

Partial dependence (PD) plots

[This slide contains four line graphs: (1) shows the variance of speed within 30 minutes within 5 miles, (2) shows the 75% speed drop within 30 minutes, (3) the Log (AADT), and (4) the mean of speed drop within 30 minutes at crash location.]
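
A partial dependence curve for one feature can be approximated by fixing that feature at each value on a grid, for every record, and averaging the predicted crash risk. The sketch below illustrates the idea under assumptions: `predict_proba` is a hypothetical callable returning a risk between 0 and 1 for a single row, and the function name and grid are placeholders rather than the presenters' setup.

```python
def partial_dependence(predict_proba, X, feature_index, grid):
    """Return the average predicted risk for each grid value of one feature.

    predict_proba: callable mapping a feature row (list) to a risk in [0, 1].
    For each grid value, the chosen feature is overwritten in every row
    while all other features keep their observed values.
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)
            modified[feature_index] = value
            total += predict_proba(modified)
        curve.append(total / len(X))
    return curve
```

Plotting the returned values against the grid gives one line graph per feature, similar in form to the four graphs described for this slide.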

Slide 20: Rear‐End Crash Model Results

  • Accuracy
  • F1 score
  • 15‐min before crash
  • Random Forest

[This slide contains two bar graphs: (1) a graph compares accuracy and the time before crash and (2) a graph compares the F1 score and the time before crash.]

Slide 21: Rear‐End Crash Model Results

Variable Name | Description
spvc15_5m | Variance of speed within 15 mins within 5 miles
spmean15_5m | Mean of speed within 15 mins within 5 miles
Tmcave15 | Mean of #TMC/min in 15 mins
logaadt | AADT for location, and Ramp AADT
spddropup15 | 75% speed drop within 15 mins
spddropmax15 | Max speed drop within 15 mins
spddropmin15 | Min speed drop within 15 mins
spddroplow15 | 25% speed drop within 15 mins
spddropmean15 | Mean of speed drop within 15 mins at crash location
spddropvc15 | Variance of speed drop within 15 mins at crash location

Weight | Feature
0.2110±0.0102 | spvc15_5m
0.0478±0.0127 | spmean15_5m
0.0474±0.0056 | Tmcave15
0.0469±0.0058 | logaadt
0.0441±0.0043 | spddropup15
0.0438±0.0081 | spddropmax15
0.0438±0.0083 | spddropmin15
0.0431±0.0070 | spddroplow15
0.0422±0.0072 | spddropmean15
0.0398±0.0038 | spddropvc15
0.0348±0.0069 | timeind_1.0
0.0290±0.0039 | Near_Dist_FT
0.0276±0.0055 | Near_Dist_FT
0.0248±0.0047 | timeind_1
0.0150±0.0021 | Through_La
0.0138±0.0049 | RuralUrban
0.0076±0.0020 | timeind_2.0
0.0040±0.0030 | weekday
0.0033±0.0007 | timeind_3.0
0.0024±0.0013 | RampNot

Slide 22: Rear‐End Crash Model Results

Partial dependence (PD)

[This slide contains four line graphs: (1) the variance of speed within 15 minutes within 5 miles, (2) the mean of speed within 15 minutes within 5 miles, (3) the mean of #TMC/min in 15 minutes, and (4) shows Log(AADT).]

Slide 23: Takeaways

Rear‐end crashes are more predictable than crashes overall.

Traffic flow speeds (reductions and variances) prior to a crash are highly related to the crash risk.

Random forest performed better than the other models tested (logistic regression and support vector machine).

For the rear‐end crash model, the variance of traffic speed in the 15 minutes prior to a crash (from HERE data) appears to be critical for predicting crash risk.

[This slide contains a background image of two cars in a rear‐end collision.]
