Analysis of Titanic Disaster with Random Forest Classification
The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 of the 2224 passengers and crew. The tragedy shocked the international community and led to better safety regulations for ships. This case study divides the passenger list into a training set and a test set, then uses the training set to fit a statistical model that predicts whether each person in the test set survived, based on attributes such as age, sex, class of travel, and port of embarkation. We achieved above 80% accuracy in predicting survival. In today's world, markets are shrinking and many businesses are becoming irrelevant to their clients; prediction techniques like these can help a business decide which future technologies to choose and how to stay relevant longer, and thus avoid a Titanic-like disaster.
Training and Test Data
Training Data: includes the response variable (survived or not survived) and is used to train the prediction model by discovering potentially predictive relationships.
Test Data: carries no information on the response variable (survived or not survived); model accuracy is assessed on the predictions made for this data.
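As a rough illustration of the split described above, the sketch below shuffles a small list of passenger records and holds part of it back for testing. The records, the field names, the 30% test fraction, and the helper names `split_passengers` and `accuracy` are all assumptions made for this example, not the actual data or code behind this case study. The outcome of each held-back passenger is kept aside for scoring only and is never shown to the model.

```python
import random


def split_passengers(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of predictions that match the known outcomes."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)


# Hypothetical toy records, not the real passenger list.
passengers = [
    {"sex": "female", "pclass": 1, "age": 29, "survived": 1},
    {"sex": "male", "pclass": 3, "age": 22, "survived": 0},
    {"sex": "female", "pclass": 2, "age": 35, "survived": 1},
    {"sex": "male", "pclass": 1, "age": 54, "survived": 0},
    {"sex": "female", "pclass": 3, "age": 18, "survived": 0},
    {"sex": "male", "pclass": 2, "age": 8, "survived": 1},
    {"sex": "female", "pclass": 1, "age": 41, "survived": 1},
    {"sex": "male", "pclass": 3, "age": 30, "survived": 0},
    {"sex": "female", "pclass": 2, "age": 27, "survived": 1},
    {"sex": "male", "pclass": 3, "age": 45, "survived": 0},
]

train, test = split_passengers(passengers)

# A deliberately simple baseline rule: predict that women survived
# and men did not, then score it against the held-back outcomes.
predictions = [1 if p["sex"] == "female" else 0 for p in test]
truth = [p["survived"] for p in test]
print("train size:", len(train), "test size:", len(test))
print("baseline accuracy:", accuracy(predictions, truth))
```

Fixing the seed keeps the split reproducible, so different models can be compared on the same held-back passengers.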
Prediction Based On Decision Tree Model
Prediction Based On Feature Engineering
Random Forest Model
A Random Forest model builds many decision trees rather than one. Each tree is grown on a bootstrap sample of the training data, that is, rows drawn at random with replacement, and considers only a random subset of the features at each split. The individual trees are allowed to grow deep and detailed, and the forest's prediction is the majority vote across all of its trees.
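The idea of growing trees on data drawn with replacement and combining them can be sketched in a few lines. What follows is a toy illustration under stated assumptions, not the model used for the results in this case study: the records are invented, the trees split only on equality tests and are capped at depth 3, and every function name (`bootstrap_sample`, `build_tree`, `fit_forest`, and so on) is hypothetical. A real analysis would normally rely on an established library implementation.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random WITH replacement."""
    return [rng.choice(rows) for _ in rows]


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def gini(rows):
    """Gini impurity of the 'survived' labels in rows."""
    counts = Counter(r["survived"] for r in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())


def build_tree(rows, features, rng, n_try=2, depth=3):
    """Grow a small tree; a leaf is a label, a node is a tuple."""
    labels = [r["survived"] for r in rows]
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    # Only a random subset of the features is considered at each split.
    for f in rng.sample(features, min(n_try, len(features))):
        for v in sorted({r[f] for r in rows}):
            left = [r for r in rows if r[f] == v]
            right = [r for r in rows if r[f] != v]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, v, left, right)
    if best is None:  # no usable split among the sampled features
        return majority(labels)
    _, f, v, left, right = best
    return (f, v,
            build_tree(left, features, rng, n_try, depth - 1),
            build_tree(right, features, rng, n_try, depth - 1))


def predict_tree(tree, row):
    """Follow equality tests down to a leaf label."""
    while isinstance(tree, tuple):
        f, v, if_equal, if_not_equal = tree
        tree = if_equal if row[f] == v else if_not_equal
    return tree


def fit_forest(rows, features, n_trees=25, seed=0):
    """Grow each tree on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    return [build_tree(bootstrap_sample(rows, rng), features, rng)
            for _ in range(n_trees)]


def predict_forest(forest, row):
    """The forest's prediction is the majority vote of its trees."""
    return majority([predict_tree(t, row) for t in forest])


# Hypothetical toy records, not the real passenger list.
passengers = [
    {"sex": "female", "pclass": 1, "embarked": "C", "survived": 1},
    {"sex": "female", "pclass": 2, "embarked": "S", "survived": 1},
    {"sex": "female", "pclass": 3, "embarked": "Q", "survived": 1},
    {"sex": "female", "pclass": 1, "embarked": "S", "survived": 1},
    {"sex": "male", "pclass": 1, "embarked": "S", "survived": 0},
    {"sex": "male", "pclass": 2, "embarked": "C", "survived": 0},
    {"sex": "male", "pclass": 3, "embarked": "S", "survived": 0},
    {"sex": "male", "pclass": 3, "embarked": "Q", "survived": 0},
]

forest = fit_forest(passengers, ["sex", "pclass", "embarked"])
new_passenger = {"sex": "female", "pclass": 2, "embarked": "C"}
print("predicted survival:", predict_forest(forest, new_passenger))
```

Because every tree sees a different resample and a different subset of features, the individual trees disagree in different ways, and the vote tends to be more stable than any single deep tree.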
Comparison Of Results