Data visualization plays a vital role in data science. It lets you investigate the relationships among many variables and gain a deeper understanding of the data. Effective visualizations help present ideas and conclusions to stakeholders clearly and compellingly. Techniques such as bar charts, line graphs, and scatter plots make it easy to spot patterns, trends, and outliers that may not be immediately obvious from the raw data. Data visualization is therefore a critical step in the data science pipeline for developing insights, testing hypotheses, and arriving at informed decisions.
What is the overall trend in the number of traffic accidents over recent years?
The annual accident counts for recent years were analyzed, and the plot below illustrates the trend.
Fig 1
The plot shows that the annual number of accidents has been increasing year over year. This rise highlights the need for a comprehensive approach to improving road safety. Its consequences are significant, not just in terms of property damage and physical injury but also in terms of increased traffic congestion, longer emergency response times, and economic costs.
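The yearly counts behind a trend plot like this can be sketched as follows. This is a minimal illustration, not the analysis code used here: the `Start_Time` column name, the timestamp format, and the sample records are all assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records; "Start_Time" is an assumed column name.
records = [
    {"Start_Time": "2019-03-14 08:15:00"},
    {"Start_Time": "2020-07-02 17:40:00"},
    {"Start_Time": "2020-11-23 09:05:00"},
    {"Start_Time": "2021-01-09 16:30:00"},
    {"Start_Time": "2021-06-18 15:10:00"},
    {"Start_Time": "2021-12-24 18:45:00"},
]

def accidents_per_year(rows):
    """Count accidents per calendar year, in chronological order."""
    counts = Counter(
        datetime.strptime(r["Start_Time"], "%Y-%m-%d %H:%M:%S").year for r in rows
    )
    return dict(sorted(counts.items()))

def year_over_year_change(yearly):
    """Percentage change in the count relative to the previous year."""
    years = list(yearly)
    return {
        cur: 100.0 * (yearly[cur] - yearly[prev]) / yearly[prev]
        for prev, cur in zip(years, years[1:])
    }

yearly = accidents_per_year(records)
print(yearly)
print(year_over_year_change(yearly))
```

Plotting the yearly counts as a line or bar chart gives the trend view, and the percentage change makes the size of each yearly increase explicit.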
Do extreme weather conditions account for the highest number of accidents?
To investigate whether harsh weather conditions contribute to a higher number of accidents, the number of recorded accidents was plotted by weather condition. The prior expectation was that harsh weather conditions, such as snow or rain, would lead to more accidents.
Fig 2
The radial plot indicates that the highest number of accidents occurred during clear weather conditions, which challenges the idea that accidents primarily happen during harsh weather. However, it is crucial to acknowledge that this information does not establish a direct causal relationship. It is possible that various other factors, like increased traffic volume or reckless driving, could be contributing to the high number of accidents recorded during clear weather.
Are accidents more likely to occur on weekends compared to weekdays?
An analysis was conducted on the spread of accidents by month, day of the week, and hour to explore any potential insights.
Fig 3
Fig 4
The visuals reveal that Friday has the highest number of accidents among all weekdays, and accidents are more likely to occur during the week than on weekends. The trend can be attributed to factors such as increased traffic volume during weekdays due to commuting and distractions, as well as reckless and impulsive driving behavior caused by the rush to start the weekend. Social gatherings and events could also be a factor for Friday leading the other weekdays. The peak time for accidents is between 3 PM and 6 PM, likely due to a combination of factors such as decreased visibility, fatigue from commuting, and reckless driving behavior.
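The weekday and hour breakdown can be derived from each accident's timestamp. The sketch below assumes timestamps in a `YYYY-MM-DD HH:MM:SS` format and uses made-up values; it only shows how the day-of-week, weekend, and 3 PM to 6 PM counts could be computed.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical accident start times (assumed format).
timestamps = [
    "2021-06-18 15:10:00",  # Friday
    "2021-06-18 17:45:00",  # Friday
    "2021-06-19 11:00:00",  # Saturday
    "2021-06-21 16:20:00",  # Monday
    "2021-06-23 08:05:00",  # Wednesday
]

parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]

# Accidents per day of the week (weekday() returns 0 for Monday).
by_day = Counter(DAY_NAMES[d.weekday()] for d in parsed)

# Weekend vs. weekday totals (Saturday = 5, Sunday = 6).
weekend_count = sum(1 for d in parsed if d.weekday() >= 5)
weekday_count = len(parsed) - weekend_count

# Accidents per hour, and the total in the 3 PM to 6 PM window (hours 15 to 17).
by_hour = Counter(d.hour for d in parsed)
afternoon_peak = sum(by_hour[h] for h in range(15, 18))

print(by_day.most_common(1), weekday_count, weekend_count, afternoon_peak)
```

Because weekdays outnumber weekend days five to two, comparing the average count per day rather than the raw totals gives a fairer weekday versus weekend comparison.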
Fig 5
December records the most accidents of any month. There could be several reasons for this. A few possible factors to consider include:
Weather conditions: December is often associated with winter weather, which can lead to hazardous driving conditions such as snow, ice, and rain.
Holiday travel: December is a busy travel month, with many people traveling to visit family and friends for the holidays. This increases traffic on the roads, which can increase the likelihood of accidents.
Increased alcohol consumption: The holiday season is also associated with increased alcohol consumption, which can lead to impaired driving and a higher risk of accidents.
Reduced visibility: The shorter days in December can also reduce visibility on the roads, making it more difficult for drivers to see other vehicles and potential hazards.
Which states and cities have the highest number of traffic accidents?
The packed circle plot and choropleth map below display the cities and states with the most recorded accidents.
Fig 6
Miami has the highest accident count, followed by Los Angeles.
Further analysis would be necessary to understand why Miami and Los Angeles have higher counts of accidents and what factors are contributing to this trend. This could involve exploring additional variables such as population density, traffic volume, weather conditions, and road infrastructure, among others. Understanding the underlying causes of accidents can help inform decision-making and contribute to the development of solutions to reduce accidents in the future.
Fig 7
California has the highest accident count, with over 700k recorded accidents, which is not surprising given its large population and extensive road system. However, this also means that there is significant scope for the state to improve road safety. The high count emphasizes the need for an all-encompassing strategy that focuses on reducing the number of accidents and encouraging safe driving practices. Florida and Texas rank second and third, respectively.
What is the distribution of accidents by their severity level?
A pie chart was used to answer this question since there were only four levels of severity in which an accident could be classified.
Fig 8
The plot shows that most accidents are categorized as severity 2 on a four-level scale, where 4 denotes the greatest impact of the accident on traffic (i.e., a long delay).
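The slices of a pie chart like this are each level's share of the total. A minimal sketch of that calculation, using hypothetical severity values rather than the real data:

```python
from collections import Counter

# Hypothetical severity labels (1 to 4), one per accident.
severities = [2, 2, 3, 2, 1, 4, 2, 3]

def severity_shares(values):
    """Return each severity level's share of all accidents, in percent."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        level: round(100.0 * counts[level] / total, 1)
        for level in sorted(counts)
    }

print(severity_shares(severities))
```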
After discovering the higher frequency of severity 2 incidents, a word cloud map was used to examine their impact on traffic.
Fig 9
We can clearly see that in a majority of cases, the traffic had to slow down or exit the road as a result of the accident. Stationary traffic is also observed in many instances.
What are some of the top accident-prone roads in the country?
Fig 10
Interstate 5 appears to have the most accidents, closely followed by Interstate 95.
Interstates are high-speed highways that are designed for efficient travel and high volumes of traffic. However, high speeds can also increase the likelihood of accidents, especially if drivers are not driving safely or if road conditions are not ideal.
Is there any significant difference between accidents on interstate highways and accidents on other roads?
The first step involved identifying and labeling interstate highways from the Description and Street Name fields. The plot below visualizes all the accidents that occurred on interstate highways.
Fig 11
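The exact labeling rule is not given here, so the following is only one plausible way to flag interstate accidents from the street name and description. The regular expression and the helper names `interstate_number` and `is_interstate` are illustrative assumptions.

```python
import re

# Assumed pattern: matches forms such as "I-5", "I 95", or "Interstate 10".
INTERSTATE_RE = re.compile(r"\b(?:I-|I\s|Interstate\s)(\d{1,3})\b", re.IGNORECASE)

def interstate_number(street, description):
    """Return the interstate number found in either field, or None."""
    for text in (street, description):
        match = INTERSTATE_RE.search(text or "")
        if match:
            return int(match.group(1))
    return None

def is_interstate(street, description):
    """Label an accident as interstate if either field names an interstate."""
    return interstate_number(street, description) is not None

print(is_interstate("I-5 N", ""))                                    # street name match
print(is_interstate("Main St", "Accident on Interstate 95 near exit 4"))  # description match
print(is_interstate("Wilshire Blvd", "Lane blocked due to accident"))     # no match
```

Checking the street name first and falling back to the description keeps the label usable when only one of the two fields mentions the highway. A pattern like this can still mislabel unusual text, so a manual spot check of the labeled rows is worthwhile.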
A violin plot with density estimation was used to explore if there is a notable difference in accident severity and distance based on whether the accident occurred on an interstate or not.
Fig 12
Fig 13
Figure 12 shows very little difference in accident severity between interstates and other roads. Figure 13, on the other hand, shows accident distance by road type: the violin plot confirms that the mean distance of an accident on an interstate (1.248 miles) is longer than on other road types (0.459 miles).
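The group means quoted above come down to averaging distance within each road type. The sketch below shows that calculation on made-up values that do not reproduce the reported means; the `is_interstate` and `distance_mi` field names are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: a road-type flag and the accident distance in miles.
rows = [
    {"is_interstate": True, "distance_mi": 1.0},
    {"is_interstate": True, "distance_mi": 2.0},
    {"is_interstate": True, "distance_mi": 0.75},
    {"is_interstate": False, "distance_mi": 0.25},
    {"is_interstate": False, "distance_mi": 0.5},
    {"is_interstate": False, "distance_mi": 0.75},
]

def mean_distance_by_road_type(data):
    """Average accident distance for interstate and non-interstate roads."""
    groups = defaultdict(list)
    for row in data:
        label = "Interstate" if row["is_interstate"] else "Other"
        groups[label].append(row["distance_mi"])
    return {label: mean(values) for label, values in groups.items()}

print(mean_distance_by_road_type(rows))
```

Distance values tend to be heavily skewed, so comparing medians alongside means is a reasonable extra check before drawing conclusions.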
What is the variation in driver involvement in fatal traffic accidents by gender?
Fig 14
Looking at the plot, we can see that female drivers are less likely than male drivers to be involved in fatal traffic crashes, which suggests that women tend to be safer drivers by this measure.
Is alcohol the reason for higher traffic fatality rates?
Fig 15
The plot shows that although Mississippi has fewer drunk drivers than New York, its total fatalities per 100K drivers are significantly higher. This suggests that alcohol alone does not explain why some states have higher traffic fatality rates.
Further investigation suggested that pickup trucks are one of the reasons why some states have higher fatality rates.
Fig 16
The higher fatality rates for pickup trucks are likely due to a combination of factors, including the vehicle's design, the type of roads and driving conditions in areas where pickups are more prevalent, and the behaviors and demographics of the people who drive them. For example, pickup truck drivers may be more likely to engage in risky behaviors such as speeding or not wearing a seatbelt, and they may also be more likely to drive on rural roads with higher speeds and more obstacles. Additionally, the fact that many pickups are used for work or recreation in rural areas means that they may be more likely to be involved in accidents while carrying heavy loads or towing trailers. All of these factors contribute to the higher fatality rates for pickup trucks.
Spatial Analysis
To analyze and visualize the data at a spatial level, a folium map was built from a sample of the data for a single county, Ventura County in California. This allowed a deeper analysis at the community and street level, revealing where accidents occur and where the hotspots are.
Fig 17
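The filtering and sampling step that precedes a map like this can be sketched as follows. The column names (`State`, `County`, `Start_Lat`, `Start_Lng`), the coordinates, the sample size, and the `sample_county` helper are all assumptions for illustration; the resulting coordinate pairs are what would then be passed to a mapping library such as folium.

```python
import random

# Hypothetical records; column names and coordinates are made up.
records = [
    {"State": "CA", "County": "Ventura", "Start_Lat": 34.28, "Start_Lng": -119.29},
    {"State": "CA", "County": "Ventura", "Start_Lat": 34.20, "Start_Lng": -119.18},
    {"State": "CA", "County": "Ventura", "Start_Lat": 34.27, "Start_Lng": -118.78},
    {"State": "CA", "County": "Los Angeles", "Start_Lat": 34.05, "Start_Lng": -118.24},
    {"State": "FL", "County": "Miami-Dade", "Start_Lat": 25.76, "Start_Lng": -80.19},
]

def sample_county(rows, state, county, k, seed=42):
    """Keep rows for one county and draw at most k of them at random."""
    subset = [r for r in rows if r["State"] == state and r["County"] == county]
    if len(subset) <= k:
        return subset
    # A fixed seed keeps the sample, and therefore the map, reproducible.
    return random.Random(seed).sample(subset, k)

sampled = sample_county(records, "CA", "Ventura", k=2)
points = [(r["Start_Lat"], r["Start_Lng"]) for r in sampled]
print(points)
```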
The following Tree Plot illustrates the number of accidents at the county level. Due to the vast size of the data, a sample was selected for visualization purposes.