[NLP] Unsupervised User Stance Detection on Twitter.
elaaf/stance-detect
A Python implementation of the paper "Unsupervised User Stance Detection on Twitter" by Darwish et al. (arXiv). This unofficial repo consolidates the code used in the paper for detecting the stance of prolific Twitter users with respect to controversial topics.
Given a Twitter dataset containing tweets about a divisive or controversial topic:
1. Construct feature vectors for each user (hashtags, retweeted accounts, unique tweets).
2. Apply dimensionality reduction (t-SNE, UMAP).
3. Cluster the low-dimensional data (Mean-Shift, DBSCAN).
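As a rough illustration of the first step, the sketch below builds one count-based feature vector per user from hashtags and retweeted accounts. The toy records and the helper name `build_user_vectors` are assumptions made for this example and are not taken from the repo's code.

```python
from collections import Counter

# Toy data (hypothetical): each record is (user, hashtags, retweeted_accounts).
records = [
    ("alice", ["#yes", "#vote"], ["@news_a"]),
    ("alice", ["#yes"], []),
    ("bob", ["#no"], ["@news_b"]),
]

def build_user_vectors(records):
    """Return (vocabulary, {user: count vector}) over hashtags and retweeted accounts."""
    counts = {}
    for user, hashtags, retweeted in records:
        counts.setdefault(user, Counter()).update(hashtags + retweeted)
    # A shared, sorted vocabulary gives every user a vector of the same length.
    vocab = sorted({token for c in counts.values() for token in c})
    vectors = {user: [c[token] for token in vocab] for user, c in counts.items()}
    return vocab, vectors

vocab, vectors = build_user_vectors(records)
print(vocab)
print(vectors)
```

Using one shared vocabulary keeps the vectors comparable across users, which the later dimensionality-reduction step relies on.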
The low-dimensional clusters can be visualized to reveal well-separated groups of users, which can then be assigned "Stance" labels based on their original descriptors/features.
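One way to inspect a cluster before assigning it a stance label is to list its most frequent hashtags. The following is a minimal sketch under that assumption; the input dictionaries and the function name `top_hashtags_per_cluster` are hypothetical and not part of the repo.

```python
from collections import Counter

# Hypothetical inputs: a cluster id per user and the hashtags each user posted.
user_cluster = {"alice": 0, "carol": 0, "bob": 1}
user_hashtags = {
    "alice": ["#yes", "#yes", "#vote"],
    "carol": ["#yes"],
    "bob": ["#no", "#no", "#vote"],
}

def top_hashtags_per_cluster(user_cluster, user_hashtags, top_n=2):
    """Return the top_n most frequent hashtags for each cluster."""
    cluster_counts = {}
    for user, cluster in user_cluster.items():
        cluster_counts.setdefault(cluster, Counter()).update(user_hashtags.get(user, []))
    return {
        cluster: [tag for tag, _ in counts.most_common(top_n)]
        for cluster, counts in cluster_counts.items()
    }

cluster_descriptors = top_hashtags_per_cluster(user_cluster, user_hashtags)
print(cluster_descriptors)
```

A person would still read these descriptors and decide which stance each cluster represents; the code only summarizes the cluster.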
Create a Python 3.6+ virtual environment and run:

    pip install -r requirements.txt

Clone this repo.

    git clone https://github.com/elaaf/stance-detect.git

Place your Twitter dataset CSV in the ./datasets/ folder.
Set the data pipeline parameters in main.py.
For a standard Twitter API dataset CSV, simply run:

    python3 stance_detect/main.py

Data Loading
    from data_loading.load_data import load_dataset

    load_dataset(
        dataset_path="./datasets/twitter_dataset.csv",
        features=["user_id", "username", "tweet", "mentions", "hashtags"],
        num_top_users=1000,
        min_tweets=0,
        random_sample_size=0,
        rows_to_read=None,
        user_col="user_id",
        str2list_cols=["mentions", "hashtags"],
    )
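The parameter names suggest that list-valued columns are stored as strings and that only the most active users are kept. The sketch below illustrates that reading with the standard library only. How `load_dataset` actually interprets `str2list_cols`, `num_top_users`, and `min_tweets` is an assumption here, and the helpers `load_rows` and `keep_top_users` are hypothetical.

```python
import ast
import csv
import io
from collections import Counter

# Hypothetical CSV content; list columns are stored as strings such as "['#a', '#b']".
raw_csv = '''user_id,tweet,hashtags
1,hello,"['#yes']"
1,again,"['#yes', '#vote']"
2,hi,"[]"
3,hey,"['#no']"
1,third,"[]"
3,more,"['#no']"
'''

def load_rows(text, str2list_cols):
    """Parse CSV text and convert stringified lists into real Python lists."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        for col in str2list_cols:
            row[col] = ast.literal_eval(row[col])
    return rows

def keep_top_users(rows, user_col, num_top_users, min_tweets=0):
    """Keep rows from the num_top_users most active users with at least min_tweets tweets."""
    tweet_counts = Counter(row[user_col] for row in rows)
    top = {u for u, n in tweet_counts.most_common(num_top_users) if n >= min_tweets}
    return [row for row in rows if row[user_col] in top]

rows = load_rows(raw_csv, str2list_cols=["hashtags"])
top_rows = keep_top_users(rows, user_col="user_id", num_top_users=2)
print(len(top_rows), top_rows[1]["hashtags"])
```

`ast.literal_eval` is used instead of `eval` so that only Python literals are accepted when parsing the list columns.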
Feature Extraction
    from feature_extraction.feat_extract import FeatureExtraction

    FEATURES_TO_USE = ["T", "R", "H"]
    ft_extract = FeatureExtraction()
    user_feature_dict = ft_extract.get_user_feature_vectors(
        FEATURES_TO_USE,
        users_list,
        tweets_list,
        mentions_list,
        hashtags_list,
        feature_size=None,
        relative_freq=True,
    )
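The `relative_freq=True` flag presumably scales each user's counts by that user's total activity, so that very active and less active users become comparable. That interpretation is an assumption, and `to_relative_freq` below is a hypothetical helper shown only to illustrate it.

```python
def to_relative_freq(count_vector):
    """Scale a count vector so its entries sum to 1 (all-zero vectors are returned unchanged)."""
    total = sum(count_vector)
    if total == 0:
        return list(count_vector)
    return [value / total for value in count_vector]

counts = [0, 1, 2, 1, 0]  # e.g. hashtag counts for one user
rel = to_relative_freq(counts)
print(rel)
```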
Dimensionality Reduction
    from dimensionality_reduction.umap import get_umap_embedding

    low_dim_user_feature_dict = get_umap_embedding(
        user_feature_dict,
        n_neighbors=20,
        n_components=3,
        min_distance=0.1,
        distance_metric="correlation",
    )
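For intuition about `distance_metric="correlation"`: correlation distance is commonly defined as one minus the Pearson correlation of two vectors. The sketch below computes it in plain Python, assuming the repo passes this option through to a metric with that standard definition.

```python
import math

def correlation_distance(u, v):
    """Return 1 - Pearson correlation between two equal-length vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    norm = math.sqrt(sum(x * x for x in du)) * math.sqrt(sum(y * y for y in dv))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return 1.0 - sum(x * y for x, y in zip(du, dv)) / norm

a = [1.0, 2.0, 3.0]
same = correlation_distance(a, [2.0, 4.0, 6.0])      # vectors rise together
opposite = correlation_distance(a, [3.0, 2.0, 1.0])  # vectors move in opposite directions
print(same, opposite)
```

Because the vectors are mean-centred first, this distance compares the shape of two users' feature profiles rather than their overall magnitude.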
Clustering
    from clustering.mean_shift import mean_shift_clustering

    user_feature_label_dict = mean_shift_clustering(low_dim_user_feature_dict)
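To show the idea behind mean-shift clustering, here is a small pure-Python sketch with a flat kernel. It is not the repo's `mean_shift_clustering`, which may well wrap a library implementation; the function name `mean_shift`, the bandwidth, and the toy points are all assumptions for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_shift(points, bandwidth, n_iter=50):
    """Flat-kernel mean shift: move each point to the mean of its neighbours, then group modes."""
    modes = [list(p) for p in points]
    for _ in range(n_iter):
        new_modes = []
        for m in modes:
            neighbours = [p for p in points if euclidean(m, p) <= bandwidth]
            new_modes.append([sum(c) / len(neighbours) for c in zip(*neighbours)])
        modes = new_modes
    # Points whose modes ended up within one bandwidth of each other share a label.
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if euclidean(m, c) <= bandwidth:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels

points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
labels = mean_shift(points, bandwidth=1.0)
print(labels)
```

Unlike k-means, mean shift does not need the number of clusters in advance, which suits a setting where the number of stances is not known beforehand.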
Get User Labels for Interactive Plot (Optional)
    user_info_label_dict = ft_extract.get_user_info_labels(
        users_list,
        user_info_list=hashtags_list,
        top_n=5,
    )
    user_hover_labels = list(user_info_label_dict.values())
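A hover label of this kind can be thought of as each user's most frequent items joined into one string. The sketch below shows that idea; it assumes the user list and the info list are row-aligned (one entry per tweet), and `top_n_labels` is a hypothetical stand-in rather than the repo's `get_user_info_labels`.

```python
from collections import Counter

def top_n_labels(users, info, top_n=5):
    """Map each user to a comma-separated string of their top_n most frequent items."""
    per_user = {}
    for user, items in zip(users, info):
        per_user.setdefault(user, Counter()).update(items)
    return {
        user: ", ".join(tag for tag, _ in counts.most_common(top_n))
        for user, counts in per_user.items()
    }

# Hypothetical row-aligned inputs: one user id and one hashtag list per tweet.
users = ["alice", "bob", "alice"]
hashtags = [["#yes", "#vote"], ["#no"], ["#yes"]]
hover = top_n_labels(users, hashtags, top_n=2)
print(list(hover.values()))
```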
Interactive Scatter Plot
    from graph_plots.plot_3d import scatter_plot_3d

    scatter_plot_3d(
        user_feature_label_dict,
        title="Twitter Users Scatter Plot",
        hover_info=user_hover_labels,
        plot_save_path="./stance_detect/results/3d_scatter_plot.html",
    )
Each data point in the scatter plot represents a Twitter user, with their top 5 most-used hashtags displayed as hover labels.
Darwish, K., Stefanov, P., Aupetit, M., & Nakov, P. (2020). Unsupervised User Stance Detection on Twitter. Proceedings of the International AAAI Conference on Web and Social Media, 14(1), 141-152. Retrieved from https://www.aaai.org/ojs/index.php/ICWSM/article/view/7286
