ML Titanic Challenge - 4 Classification Task


Lecture 76:- ML Titanic Challenge - 4 Classification Task

In the Titanic challenge, you're dealing with a binary classification task where the goal is to predict whether a passenger survived (1) or not (0) based on various features. In this example, we'll use the Random Forest classifier as the machine learning algorithm. Here's how you can perform the classification task using the Titanic dataset:

 


    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

    # Load the dataset
    train_df = pd.read_csv('train.csv')

    # Drop columns that are not used as features
    drop_columns = ['PassengerId', 'Name', 'Ticket', 'Cabin']
    train_df = train_df.drop(columns=drop_columns)

    # Handle missing values. Assign the result back to the column instead of
    # calling fillna(..., inplace=True) on a column selection, which recent
    # pandas versions warn about and which may not modify train_df.
    train_df['Age'] = train_df['Age'].fillna(train_df['Age'].median())
    train_df['Embarked'] = train_df['Embarked'].fillna(train_df['Embarked'].mode()[0])

    # Encode categorical variables with one-hot encoding
    train_df = pd.get_dummies(train_df, columns=['Pclass', 'Sex', 'Embarked'], drop_first=True)

    # Split the data into features (X) and target variable (y)
    X = train_df.drop(columns=['Survived'])
    y = train_df['Survived']

    # Split the dataset into training and validation sets (80% / 20%)
    X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=42)

    # Initialize and train the Random Forest classifier
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, y_train)

    # Make predictions on the validation set
    y_pred = clf.predict(X_valid)

    # Evaluate the model
    accuracy = accuracy_score(y_valid, y_pred)
    conf_matrix = confusion_matrix(y_valid, y_pred)
    class_report = classification_report(y_valid, y_pred)

    print(f'Model Accuracy: {accuracy:.2f}')
    print('Confusion Matrix:')
    print(conf_matrix)
    print('Classification Report:')
    print(class_report)

Here's what the code does:

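  1. Load the dataset and preprocess it: drop the columns that are not used as features (PassengerId, Name, Ticket, Cabin), fill missing Age values with the median, and fill missing Embarked values with the most frequent value.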

  2. Encode the categorical variables (Pclass, Sex, Embarked) using one-hot encoding, dropping the first category of each to avoid redundant columns.

  3. Split the data into features (X) and the target variable (y).

  4. Split the dataset into training (80%) and validation (20%) sets.

  5. Initialize and train a Random Forest classifier using the training data.

  6. Make predictions on the validation set.

  7. Evaluate the model's performance by calculating accuracy, generating a confusion matrix, and printing a classification report.
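To make the one-hot encoding in step 2 more concrete, here is a minimal sketch on a small hand-made DataFrame. The three rows are invented for illustration and are not taken from the Titanic data; only the column names match the code above.

```python
import pandas as pd

# Three made-up passengers (illustrative values only, not real Titanic rows)
sample_df = pd.DataFrame({
    'Age': [22, 38, 26],
    'Pclass': [3, 1, 2],
    'Sex': ['male', 'female', 'female'],
    'Embarked': ['S', 'C', 'Q'],
})

# drop_first=True removes the first category of each encoded column
# (Pclass 1, Sex 'female', Embarked 'C'); that category is implied
# when all remaining indicator columns for the feature are 0/False.
encoded = pd.get_dummies(sample_df, columns=['Pclass', 'Sex', 'Embarked'], drop_first=True)

print(list(encoded.columns))
print(encoded)
```

Each categorical column is replaced by one indicator column per remaining category, so the classifier only ever sees numeric or boolean inputs.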

The classification report lists precision, recall, F1-score, and support (the number of validation samples) for each class: 0 (did not survive) and 1 (survived). The confusion matrix shows how many predictions were true negatives, false positives, false negatives, and true positives. In scikit-learn's layout, rows are the actual classes and columns are the predicted classes.
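As a rough illustration of how these metrics relate to the confusion matrix, the sketch below computes them by hand in plain Python for the positive class (survived = 1). The label lists are made up for the example, and the helper names `confusion_counts` and `precision_recall_f1` are hypothetical, not part of scikit-learn.

```python
def confusion_counts(y_true, y_pred):
    """Count true negatives, false positives, false negatives, true positives."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tn, fp, fn, tp


def precision_recall_f1(tn, fp, fn, tp):
    """Metrics for the positive class; returns 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels for illustration (1 = survived, 0 = did not survive)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tn, fp, fn, tp)
accuracy = (tp + tn) / len(y_true)

print(f'TN={tn} FP={fp} FN={fn} TP={tp}')
print(f'precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.2f}')
```

Precision answers "of the passengers predicted to survive, how many did?", while recall answers "of the passengers who survived, how many were found?". Accuracy alone hides that distinction.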

Keep in mind that this is a basic example. To improve your model's performance on the Titanic challenge, you can tune the Random Forest's hyperparameters, experiment with different algorithms, and apply techniques such as feature selection and cross-validation.
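One possible way to combine cross-validation with hyperparameter tuning is sketched below using scikit-learn's `cross_val_score` and `GridSearchCV`. To keep the example self-contained it runs on synthetic data from `make_classification` rather than `train.csv`, and the parameter grid values are arbitrary assumptions chosen only for illustration; with the Titanic data you would pass the preprocessed `X` and `y` instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

# Synthetic stand-in for the preprocessed Titanic features and labels
X, y = make_classification(n_samples=200, n_features=8, random_state=42)

# 5-fold cross-validation gives a less split-dependent accuracy estimate
# than a single train/validation split.
baseline = RandomForestClassifier(n_estimators=50, random_state=42)
cv_scores = cross_val_score(baseline, X, y, cv=5, scoring='accuracy')
print(f'Cross-validated accuracy: {cv_scores.mean():.2f} (+/- {cv_scores.std():.2f})')

# Small, arbitrary grid (an assumption for this sketch, not a recommendation)
param_grid = {
    'n_estimators': [25, 50],
    'max_depth': [3, None],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring='accuracy',
)
search.fit(X, y)

print('Best parameters:', search.best_params_)
print(f'Best cross-validated accuracy: {search.best_score_:.2f}')
```

After fitting, `search.best_estimator_` holds a model refit on all the supplied data with the best parameter combination found in the grid.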
