Here's a simple example of how you can implement the Naive Bayes algorithm using Python and the scikit-learn
library for a text classification task:
```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# Load the 20 Newsgroups dataset (you can replace this with your own dataset)
newsgroups = fetch_20newsgroups(subset='all',
                                categories=['alt.atheism', 'soc.religion.christian'])

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    newsgroups.data, newsgroups.target, test_size=0.2, random_state=42)

# Convert text data to numerical features using CountVectorizer
vectorizer = CountVectorizer()
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)

# Create and train the Naive Bayes classifier
model = MultinomialNB()
model.fit(X_train_vectorized, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test_vectorized)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
```
This code snippet demonstrates the following steps:
1. Loading the 20 Newsgroups dataset (using the sklearn datasets module).
2. Splitting the data into training and test sets.
3. Converting the raw text into numerical feature vectors with CountVectorizer.
4. Training and evaluating a MultinomialNB (Multinomial Naive Bayes) model from scikit-learn.

Please note that in practice, you might need to preprocess your text data by removing punctuation, converting to lowercase, and applying other techniques to clean the text. Additionally, you can explore different variants of Naive Bayes or other text vectorization methods like TF-IDF.
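As a quick sketch of the TF-IDF alternative mentioned above, you can swap CountVectorizer for TfidfVectorizer; everything else stays the same. The tiny dataset below is made up purely for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy dataset, just to keep the example self-contained
texts = [
    "God religion church faith",
    "atheism science reason evidence",
    "prayer christian bible gospel",
    "secular skeptic logic debate",
]
labels = [1, 0, 1, 0]  # 1 = religion, 0 = atheism

# TF-IDF down-weights terms that appear in many documents,
# which often works better than raw counts for longer texts
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

print(model.predict(["bible faith gospel"]))
```

Because the pipeline bundles vectorization and classification, you can call fit and predict directly on raw strings.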
Replace the dataset loading and preprocessing steps with your own data and preprocessing pipeline if you're working with a different dataset.
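For instance, a minimal sketch of such a custom pipeline, assuming a small hand-labelled sentiment dataset (the texts, labels, and the clean helper below are all hypothetical):

```python
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-in for your own labelled data
texts = [
    "Great product, works well!",
    "Terrible, broke after a day.",
    "Really happy with this.",
    "Awful quality, do not buy.",
    "Works perfectly, love it!",
    "Broke immediately, very bad.",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

def clean(text):
    # Lowercase the text and replace punctuation with spaces
    return re.sub(r"[^\w\s]", " ", text.lower())

# Plug the cleaning step into the vectorizer itself
vectorizer = CountVectorizer(preprocessor=clean)
X = vectorizer.fit_transform(texts)

model = MultinomialNB()
model.fit(X, labels)

print(model.predict(vectorizer.transform(["love this product"])))
```

Passing the cleaning function via the preprocessor parameter keeps preprocessing and vectorization in one place, so the same transformation is applied consistently at training and prediction time.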