Analysis of Titanic Disaster with Random Forest Classification

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 of the 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships.

This case study divides the entire passenger list into a training set and a test set. The training set is used to train a statistical model, which then predicts whether each person in the test set survived, based on attributes such as passenger age, sex, class of travel, and port of embarkation. We achieved above 80% accuracy in predicting survival.

In today's world, markets are shrinking and many businesses are becoming irrelevant to their clients. Prediction techniques like these can go a long way in helping a business decide which future technology to choose over others, how to stay relevant longer, and thus how to avoid a Titanic-like disaster.

Training Data: To Train the Prediction Model

Training Data: used to discover potentially predictive relationships. It includes the response variable (survived or not survived).

titanic training data

Test Data: data with no information on the response variable (survived or not survived). Model accuracy is tested on the predictions made for the test data.

titanic training data
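To make the training/test distinction concrete, here is a minimal sketch of how such a split and an accuracy score might be computed. The records are made-up toy rows, not real Titanic data, and the names `train_test_split` and `accuracy` are hypothetical helpers written for this illustration. The 70/30 ratio is also an assumption, not a figure from this case study.

```python
import random

# Hypothetical toy records (NOT real Titanic data): (age, sex, pclass, survived)
passengers = [
    (22, "male", 3, 0),
    (38, "female", 1, 1),
    (26, "female", 3, 1),
    (35, "male", 1, 0),
    (54, "male", 1, 0),
    (4, "female", 3, 1),
    (58, "female", 1, 1),
    (20, "male", 3, 0),
    (14, "female", 2, 1),
    (40, "male", 2, 0),
]


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true response."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


train, test = train_test_split(passengers)

# The model only sees the features of the test rows; the response
# variable is held back and used afterwards to score the predictions.
test_features = [row[:-1] for row in test]
test_labels = [row[-1] for row in test]
```

A model trained on `train` would produce one prediction per row of `test_features`, and `accuracy(predictions, test_labels)` would then give the kind of score reported in this study.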

Prediction Based On Decision Tree Model

decision tree model

Prediction Based On Feature Engineering


Random Forest Model

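A Random Forest model builds many decision trees rather than a single one. Each tree is trained on a bootstrap sample of the training data, that is, rows drawn at random with replacement, and considers only a random subset of the features at each split. The final prediction is the majority vote across all of the trees, which makes the model less prone to overfitting than one very deep decision tree.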
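The usual formulation of a random forest trains many trees on bootstrap samples and combines their votes. The following is a simplified from-scratch sketch of that idea, not the implementation used in this case study. It assumes made-up toy rows (not real Titanic data), and each "tree" is reduced to a single split (a decision stump) to keep the example short. The names `fit_stump`, `fit_forest`, and `predict` are hypothetical.

```python
import random
from collections import Counter

# Hypothetical toy rows (NOT real Titanic data): [is_female, pclass, age]
X = [[0, 3, 22], [1, 1, 38], [1, 3, 26], [0, 1, 35], [0, 1, 54],
     [1, 3, 4], [1, 1, 58], [0, 3, 20], [1, 2, 14], [0, 2, 40]]
y = [0, 1, 1, 0, 0, 1, 1, 0, 1, 0]  # 1 = survived, 0 = did not survive


def majority(labels, default):
    """Most common label, or default when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(X, y, features):
    """Find the (feature, threshold) split with the fewest training errors."""
    overall = majority(y, 0)
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            left_pred = majority(left, overall)
            right_pred = majority(right, overall)
            errors = (sum(l != left_pred for l in left)
                      + sum(r != right_pred for r in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_pred, right_pred)
    return best[1:]  # (feature, threshold, left_pred, right_pred)


def fit_forest(X, y, n_trees=25, n_features=2, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(X) row indices with replacement
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        X_boot = [X[i] for i in idx]
        y_boot = [y[i] for i in idx]
        # Random subset of features this stump is allowed to split on
        feats = rng.sample(range(len(X[0])), n_features)
        forest.append(fit_stump(X_boot, y_boot, feats))
    return forest


def predict(forest, row):
    """Majority vote over the predictions of all stumps."""
    votes = [lp if row[f] <= t else rp for f, t, lp, rp in forest]
    return Counter(votes).most_common(1)[0][0]


forest = fit_forest(X, y)
predictions = [predict(forest, row) for row in X]
```

In practice one would reach for an established library implementation, which grows full-depth trees and resamples the candidate features at every split. With stumps there is only one split per tree, so the per-tree feature subset used here plays the same role.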

How Businesses Can Learn From the Titanic Disaster

Comparison Of Results




