Project Members: Patrick Maedgen & Chirag Karkera

Project Sponsor: Dr. Ali Mostafavi

The goal of our project was to investigate the connection between flooding and congestion on a traffic network using crowd-sourced data. We were given two data sets: street data containing the names and coordinates of street segments, and crowd-sourced alert data consisting of traffic alerts uploaded by users. The alert data spanned a month, but we only looked at 10 days in total, split into two five-day periods (Before Flooding and After Flooding).


The first step was to shrink the alert data set by removing unnecessary alerts. We took out alerts that did not indicate flooding or a traffic jam, as well as alerts logged outside both time periods. This step removed about two thirds of the alert data set.
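
A minimal sketch of this filtering step is shown here. The field names ("type", "timestamp"), the alert type labels, and the two date windows are all placeholders chosen for illustration; the actual data set's labels and dates are not given in this summary.

```python
from datetime import datetime

# Hypothetical alert type labels; the real data set may use different names.
KEEP_TYPES = {"FLOOD", "JAM"}

# Placeholder five-day windows (start inclusive, end exclusive), not the real dates.
BEFORE = (datetime(2000, 1, 1), datetime(2000, 1, 6))
AFTER = (datetime(2000, 1, 11), datetime(2000, 1, 16))


def period_of(timestamp):
    """Return 'before', 'after', or None if the timestamp is in neither window."""
    if BEFORE[0] <= timestamp < BEFORE[1]:
        return "before"
    if AFTER[0] <= timestamp < AFTER[1]:
        return "after"
    return None


def filter_alerts(alerts):
    """Keep only flood/jam alerts logged inside one of the two windows."""
    kept = []
    for alert in alerts:
        if alert["type"] not in KEEP_TYPES:
            continue  # not a flooding or traffic jam alert
        period = period_of(alert["timestamp"])
        if period is None:
            continue  # logged outside both time periods
        kept.append({**alert, "period": period})
    return kept
```

Tagging each kept alert with its period at this stage would make it easy to separate the Before Flooding and After Flooding groups later on.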

Next, we needed to “link” each alert to a street segment so that we knew which segment it was logged on. We used a divide-and-conquer approach: we split the map into a grid of smaller subsections and then, within each subsection, measured the distance from each alert to the nearby street segments rather than to every segment on the map. This drastically reduced processing time. What had been an all-day process could now be completed in 30 minutes.
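
The grid idea can be sketched roughly as follows. This is an illustration under stated assumptions, not our exact implementation: coordinates are treated as planar, each segment is a straight line between two endpoints, the cell size CELL is an arbitrary placeholder, and only the alert's own cell and its eight neighbors are searched, so an alert with no segment in that neighborhood is left unlinked.

```python
import math
from collections import defaultdict

CELL = 0.01  # hypothetical cell size, in the same units as the coordinates


def cell_of(x, y):
    """Grid cell (column, row) containing the point (x, y)."""
    return (math.floor(x / CELL), math.floor(y / CELL))


def point_segment_distance(px, py, ax, ay, bx, by):
    """Planar distance from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project P onto AB and clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def build_grid(segments):
    """Index segments {id: (ax, ay, bx, by)} by every cell their bounding box covers."""
    grid = defaultdict(list)
    for seg_id, (ax, ay, bx, by) in segments.items():
        cx0, cy0 = cell_of(min(ax, bx), min(ay, by))
        cx1, cy1 = cell_of(max(ax, bx), max(ay, by))
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                grid[(cx, cy)].append(seg_id)
    return grid


def nearest_segment(point, segments, grid):
    """Closest segment among the alert's cell and its 8 neighbors, or None."""
    px, py = point
    cx, cy = cell_of(px, py)
    best_id, best_dist = None, math.inf
    for nx in range(cx - 1, cx + 2):
        for ny in range(cy - 1, cy + 2):
            for seg_id in grid.get((nx, ny), ()):
                dist = point_segment_distance(px, py, *segments[seg_id])
                if dist < best_dist:
                    best_id, best_dist = seg_id, dist
    return best_id
```

The saving comes from each alert being compared against only the handful of segments indexed near it, instead of against every segment in the street data.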

We then took these results and recorded where and when each flooding and traffic jam alert was logged, storing this information as a vector for each street segment. We measured the similarity between street segments for the different alert types, before flooding and after flooding, by taking the cosine similarity of every pair of vectors. This allowed us to see whether there was a connection between the times and places of the different types of alerts.
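
For reference, the cosine similarity of two vectors u and v is their dot product divided by the product of their lengths. The sketch below assumes each segment's vector holds alert counts per time bin, which is one plausible encoding rather than a description of our exact vectors; the function names and the choice to score an all-zero vector as 0.0 are likewise illustrative.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a segment with no alerts matches nothing
    return dot / (norm_u * norm_v)


def pairwise_similarity(vectors):
    """Cosine similarity for every unordered pair of segments in {id: vector}."""
    return {
        (a, b): cosine_similarity(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }
```

Because cosine similarity ignores vector length, two segments with the same timing pattern score close to 1 even if one received many more alerts than the other.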

We found evidence of a relationship between flooding and traffic jams, and also determined that findings such as this can be drawn from crowd-sourced data.