A comprehensive analysis of the Titanic disaster, focusing on identifying key factors that influenced passenger survival rates using machine learning and data visualization techniques. Date: 17/Jun/2024

Title: Analyzing Titanic Survival Variables
Industry_Focus: Maritime Safety and Transportation
Problem Statement: To identify the key factors that influenced passenger survival during the Titanic disaster.
Business Use-Case: Improving safety measures and emergency protocols for maritime travel by understanding the variables that affected survival rates in a historical context. This analysis aims to provide insights that can be applied to modern safety regulations and passenger management.
Goals Metrics:
Deliverables:
Dataset List:
Websites to scrape data_needed:
The main dataset is the train.csv file, which is used to build the machine learning models.
Input
import pandas as pd
# Load the CSV file using a raw string
file_path = r"train.csv"
data = pd.read_csv(file_path)
# Display the headers of the table
print(data.columns)
Output
Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')
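Beyond the header names, it can help to check how each column was parsed and how many values are missing before any modelling. The following is a minimal sketch, not part of the original analysis. It uses a hypothetical three-row sample with the same headers as train.csv (the passenger values are made up for illustration), so that it runs without the file. The names sample_csv, sample and summary are assumptions introduced here.

```python
import io

import pandas as pd

# Hypothetical three-row sample with the same headers as train.csv.
# The values are invented for illustration and are not real records.
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked\n"
    "1,0,3,Passenger A,male,22,1,0,T1,7.25,,S\n"
    "2,1,1,Passenger B,female,,1,0,T2,71.28,C85,C\n"
    "3,1,3,Passenger C,female,26,0,0,T3,7.92,,S\n"
)
sample = pd.read_csv(sample_csv)

# One row per column: the dtype pandas inferred and the count of missing values
summary = pd.DataFrame({
    "dtype": sample.dtypes.astype(str),
    "missing": sample.isna().sum(),
})

print(sample.shape)  # (rows, columns)
print(summary)
```

To run the same check on the real data, replace sample with the data frame loaded from train.csv above. Columns with many missing values are the ones that would need a decision (drop or impute) before they can be used as model inputs.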
Some column headers are not self-explanatory, so the data dictionary was consulted. It gives the following information on the column headers.