Twitter Sentiment Analysis using Python

Twitter Sentiment Analysis is the process of using Python to understand the emotions or opinions expressed in tweets automatically. By analyzing the text we can classify tweets as positive, negative or neutral. This helps businesses and researchers track public mood, brand reputation or reactions to events in real time. Python libraries like TextBlob, Tweepy and NLTK make it easy to collect tweets, process the text and perform sentiment analysis efficiently.
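
As a quick illustration of the idea, here is a minimal sketch using TextBlob (an assumption on this page, installed with pip install textblob; the tutorial below builds a scikit-learn pipeline instead) that scores the polarity of a piece of text directly:

Python
# Minimal TextBlob sentiment check (assumes: pip install textblob).
from textblob import TextBlob

for text in ["I love this!", "I hate that!"]:
    # polarity ranges from -1 (negative) to +1 (positive)
    polarity = TextBlob(text).sentiment.polarity
    label = "positive" if polarity > 0 else "negative" if polarity < 0 else "neutral"
    print(f"{text!r}: polarity={polarity:.2f} -> {label}")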

How is Twitter Sentiment Analysis Useful?

It lets businesses and researchers track public mood, brand reputation and reactions to products or events in near real time, without having to read through thousands of tweets manually.

Step by Step Implementation

Step 1: Install Necessary Libraries

This block installs and imports the required libraries. It uses pandas to load and handle the data, TfidfVectorizer to turn text into numbers and scikit-learn to train the models.

Python
# Install the dependencies first (run in a terminal or notebook cell):
# pip install pandas scikit-learn

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score, classification_report

Step 2: Load Dataset

  • Here we load the Sentiment140 dataset from a zipped CSV file; you can download it from Kaggle.
  • We keep only the polarity and tweet text columns, rename them for clarity and print the first few rows to check the data.
Python
df = pd.read_csv('training.1600000.processed.noemoticon.csv.zip',
                 encoding='latin-1', header=None)
df = df[[0, 5]]
df.columns = ['polarity', 'text']
print(df.head())

Output:

(screenshot: first five rows of the polarity and text columns)

Step 3: Keep Only Positive and Negative Sentiments

  • Here we remove neutral tweets (polarity 2) and map the labels so that 0 stays negative and 4 becomes 1 for positive.
  • Then we print how many positive and negative tweets are left in the data.
Python
df = df[df.polarity != 2]
df['polarity'] = df['polarity'].map({0: 0, 4: 1})
print(df['polarity'].value_counts())

Output:

(screenshot: counts of negative (0) and positive (1) tweets)

Step 4: Clean the Tweets

  • Here we define a simple function that converts all text to lowercase for consistency and apply it to every tweet in the dataset.
  • Then we show the original and cleaned versions of the first few tweets. A slightly richer cleaning variant is sketched after the output below.
Python
def clean_text(text):
    return text.lower()

df['clean_text'] = df['text'].apply(clean_text)
print(df[['text', 'clean_text']].head())

Output:

(screenshot: original text alongside the lowercased clean_text column)
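
The clean_text function above only lowercases the text. As an optional extension (a sketch, not part of the original pipeline), tweets could also be stripped of URLs, @mentions and extra whitespace using Python's built-in re module:

Python
import re

def clean_text_extended(text):
    # Lowercase, then strip URLs, @mentions and redundant whitespace.
    text = text.lower()
    text = re.sub(r"http\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)              # remove @mentions
    text = re.sub(r"\s+", " ", text).strip()       # collapse whitespace
    return text

# Applied the same way as clean_text:
# df['clean_text'] = df['text'].apply(clean_text_extended)
print(clean_text_extended("LOVED it!! see http://example.com @friend"))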

Step 5: Train Test Split

  • This code splits the clean_text and polarity columns into training and testing sets using an 80/20 split.
  • random_state=42 ensures reproducibility.
Python
X_train, X_test, y_train, y_test = train_test_split(
    df['clean_text'], df['polarity'], test_size=0.2, random_state=42
)
print("Train size:", len(X_train))
print("Test size:", len(X_test))

Output:

Train size: 1280000
Test size: 320000

Step 6: Perform Vectorization

  • This code creates a TF-IDF vectorizer that converts text into numerical features using unigrams and bigrams, limited to 5000 features.
  • It fits on the training data, transforms both the training and test sets and then prints the shapes of the resulting TF-IDF matrices. A quick way to inspect the learned features is sketched after the output below.
Python
vectorizer = TfidfVectorizer(max_features=5000, ngram_range=(1, 2))
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
print("TF-IDF shape (train):", X_train_tfidf.shape)
print("TF-IDF shape (test):", X_test_tfidf.shape)

Output:

TF-IDF shape (train): (1280000, 5000)
TF-IDF shape (test): (320000, 5000)
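
To sanity-check what the vectorizer learned, a few of the 5000 unigram and bigram features can be printed (a small sketch; get_feature_names_out assumes scikit-learn 1.0 or newer):

Python
# Peek at a handful of the learned unigram/bigram features.
feature_names = vectorizer.get_feature_names_out()
print("Number of features:", len(feature_names))
print("Sample features:", feature_names[:10])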

Step 7: Train Bernoulli Naive Bayes model

  • Here we train a Bernoulli Naive Bayes classifier on the TF-IDF features from the training data.
  • It predicts sentiments for the test data and then prints the accuracy and a detailed classification report.
Python
bnb = BernoulliNB()
bnb.fit(X_train_tfidf, y_train)
bnb_pred = bnb.predict(X_test_tfidf)
print("Bernoulli Naive Bayes Accuracy:", accuracy_score(y_test, bnb_pred))
print("\nBernoulliNB Classification Report:\n", classification_report(y_test, bnb_pred))

Output:

(screenshot: Bernoulli Naive Bayes accuracy and classification report)

Step 8: Train Support Vector Machine (SVM) model

  • This code trains a Support Vector Machine (SVM) with a maximum of 1000 iterations on the TF-IDF features.
  • It predicts test labels, then prints the accuracy and a detailed classification report showing how well the SVM performed.
Python
svm = LinearSVC(max_iter=1000)
svm.fit(X_train_tfidf, y_train)
svm_pred = svm.predict(X_test_tfidf)
print("SVM Accuracy:", accuracy_score(y_test, svm_pred))
print("\nSVM Classification Report:\n", classification_report(y_test, svm_pred))

Output:

(screenshot: SVM accuracy and classification report)

Step 9: Train Logistic Regression model

  • This code trains a Logistic Regression model with up to 100 iterations on the TF-IDF features.
  • It predicts sentiment labels for the test data and prints the accuracy and a detailed classification report for model evaluation.
Python
logreg = LogisticRegression(max_iter=100)
logreg.fit(X_train_tfidf, y_train)
logreg_pred = logreg.predict(X_test_tfidf)
print("Logistic Regression Accuracy:", accuracy_score(y_test, logreg_pred))
print("\nLogistic Regression Classification Report:\n", classification_report(y_test, logreg_pred))

Output:

(screenshot: Logistic Regression accuracy and classification report)
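
Since all three classifiers are trained on the same TF-IDF features, it can be handy to collect their test accuracies in one place (a small sketch that reuses bnb_pred, svm_pred and logreg_pred from the steps above):

Python
# Summarize the test accuracy of the three models side by side.
results = {
    "BernoulliNB": accuracy_score(y_test, bnb_pred),
    "Linear SVM": accuracy_score(y_test, svm_pred),
    "Logistic Regression": accuracy_score(y_test, logreg_pred),
}
for name, acc in results.items():
    print(f"{name:>20}: {acc:.4f}")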

Step 10: Make Predictions on Sample Tweets

  • This code takes three sample tweets and transforms them into TF-IDF features using the same vectorizer.
  • It then predicts their sentiment using the trained BernoulliNB, SVM and Logistic Regression models and prints the results for each classifier.
  • Here 1 stands for Positive and 0 for Negative.
Python
sample_tweets = ["I love this!", "I hate that!", "It was okay, not great."]
sample_vec = vectorizer.transform(sample_tweets)
print("\nSample Predictions:")
print("BernoulliNB:", bnb.predict(sample_vec))
print("SVM:", svm.predict(sample_vec))
print("Logistic Regression:", logreg.predict(sample_vec))

Output:

(screenshot: sample predictions from the three models)

We can see that our models are working well and give the same predictions even though they use different approaches.
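
If readable labels are preferred over 0 and 1, a small helper can wrap any of the trained models (a sketch; predict_sentiment is a name introduced here for illustration only):

Python
# Map the 0/1 predictions back to human-readable labels.
def predict_sentiment(texts, model=logreg):
    vec = vectorizer.transform(texts)
    return ["Positive" if p == 1 else "Negative" for p in model.predict(vec)]

print(predict_sentiment(["I love this!", "It was okay, not great."]))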

You can download the source code from here: Twitter Sentiment Analysis using Python

