NHTSA Vehicle Safety Risk Analysis

Project Overview

This project analyzes 1M+ NHTSA vehicle safety complaints to identify the highest-risk vehicle components, recurring mechanical failure conditions, and manufacturer-level defect trends using severity-weighted risk scoring.

Rather than relying on raw complaint volume alone, this analysis incorporates injuries, fatalities, crashes, and fires to surface true safety risk and guide targeted interventions.

Business Problem

Consumer vehicle complaints contain critical safety signals, but complaint volume alone does not reflect true risk. High-frequency issues may be low severity, while rare issues may pose extreme danger.

Automakers and regulators need a way to prioritize defects that cause the most harm, not just the most noise.

Problem Statement

How can we pinpoint the components, failure conditions, and vehicle populations driving the most severe safety complaints so automakers can reduce monthly NHTSA complaints by 10 percent within one year?

Data

Source: National Highway Traffic Safety Administration (NHTSA)
Records: 1M+ consumer vehicle complaints
Time Range: 2015–2025

Key Fields

Manufacturer
Model
Model Year
Component
Complaint Description
Injuries
Deaths
Crashes
Fires

Methodology

Cleaned and standardized 1M+ NHTSA complaint records using Python
Engineered a severity-weighted risk index incorporating deaths, injuries, crashes, and fires
Normalized results to avoid bias from vehicle age and raw complaint volume
Performed component-level, model-year, and manufacturer-level risk analysis
Extracted recurring mechanical failure themes using keyword analysis on complaint text

Analytical Framework

Branch 1 – Component Risk Interpretation

Identifies the highest-risk vehicle components using a weighted severity index that blends complaint volume with injuries, deaths, crashes, and fires.

Key Findings

Air Bags show the highest severity per complaint due to injury and fatality concentration
Engines have the highest overall risk due to massive complaint volume and chronic failures
Electrical Systems consistently rank high across multiple severity dimensions

Branch 2 – Recurring Mechanical Failure Conditions

Analyzes complaint text to identify recurring mechanical failure themes across high-risk components.

Key Findings

Fail, Oil, Power, Stall, and Brake are the most common recurring failure themes
The same failure conditions appear across multiple components, suggesting shared design or engineering weaknesses
Normalized keyword rates reveal which failures persist beyond high complaint volume

Branch 3 – Model Year Defect Patterns

Identifies model years and brand-model-year combinations with disproportionately high defect risk after normalizing for vehicle age.

Key Findings

Normalization reveals true problem years, not just older vehicles with more time on the road
Certain production years repeatedly surface across engine, electrical, and powertrain failures
Highlights the worst model year for each high-risk component and the top 15 highest-risk combinations overall

Branch 4 – Manufacturer Defect Trends

Determines the dominant defect component for each manufacturer and how much of the brand’s total complaint volume it represents.

Key Findings

Ford, Hyundai, and Chevrolet are dominated by Engine-related defects
Honda, Jeep, Volkswagen, and Mercedes-Benz show high dependency on Electrical System issues
Subaru stands out with Visibility/Wiper dominance
Tesla shows elevated Forward Collision Avoidance complaints, reflecting software-heavy defect patterns
Many brands have a single component responsible for 20–30 percent of total complaints

Overall Insight

Together, the four analytical branches form a unified strategy for reducing national complaint volume:

Identify the highest-risk components
Identify recurring mechanical failure conditions
Identify high-risk model years
Identify each manufacturer’s dominant defect category

This layered approach enables targeted, high-impact interventions rather than broad, unfocused recalls.

Tools Used

Python (Pandas, NumPy)
Jupyter Notebook
Data cleaning and feature engineering
Exploratory analysis and visualization

Outcome

This analysis produces ranked, severity-weighted safety risk insights suitable for:

Executive and regulatory review
Manufacturer defect prioritization
Model-year-specific recalls
Supplier and engineering corrective actions
Downstream dashboard and reporting applications

By focusing on the right components, vehicles, and failure conditions, automakers can realistically reduce monthly NHTSA complaints by 10 percent within one year.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md
nhtsa_vehicle_safety_risk_analysis.ipynb		nhtsa_vehicle_safety_risk_analysis.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NHTSA Vehicle Safety Risk Analysis

Project Overview

Business Problem

Problem Statement

Data

Key Fields

Methodology

Analytical Framework

Branch 1 – Component Risk Interpretation

Branch 2 – Recurring Mechanical Failure Conditions

Branch 3 – Model Year Defect Patterns

Branch 4 – Manufacturer Defect Trends

Overall Insight

Tools Used

Outcome

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

NHTSA Vehicle Safety Risk Analysis

Project Overview

Business Problem

Problem Statement

Data

Key Fields

Methodology

Analytical Framework

Branch 1 – Component Risk Interpretation

Branch 2 – Recurring Mechanical Failure Conditions

Branch 3 – Model Year Defect Patterns

Branch 4 – Manufacturer Defect Trends

Overall Insight

Tools Used

Outcome

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages