Visualization Module for Natural Language Processing
takapy0210/nlplot
nlplot: Analysis and visualization module for Natural Language Processing 📈
Facilitates the visualization of natural language processing and provides quicker analysis
You can draw the following graphs:
- N-gram bar chart
- N-gram tree Map
- Histogram of the word count
- wordcloud
- co-occurrence networks
- sunburst chart
(Tested in English and Japanese)
pip install nlplot
I've written about specific usage on this blog (in Japanese).
Sample code is also available in a Kaggle kernel (in English).
The column to be analyzed must be a space-delimited string
    import pandas as pd

    # sample data
    target_col = "text"
    texts = [
        "Think rich look poor",
        "When you come to a roadblock, take a detour",
        "When it is dark enough, you can see the stars",
        "Never let your memories be greater than your dreams",
        "Victory is sweetest when you’ve known defeat",
    ]
    df = pd.DataFrame({target_col: texts})
    df.head()
| | text |
|---|---|
| 0 | Think rich look poor |
| 1 | When you come to a roadblock, take a detour |
| 2 | When it is dark enough, you can see the stars |
| 3 | Never let your memories be greater than your dreams |
| 4 | Victory is sweetest when you’ve known defeat |
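Because the analyzed column has to be a space-delimited string, raw English text usually benefits from a small normalization pass first. The following is a minimal sketch, not part of nlplot: `to_space_delimited` is a hypothetical helper, and lowercasing plus punctuation removal is an assumed preprocessing choice. Japanese text has no spaces between words, so it would need a tokenizer first; that case is not covered here.

```python
import re


def to_space_delimited(raw_text: str) -> str:
    """Lowercase the text, drop punctuation, and join tokens with single spaces.

    Hypothetical helper for illustration; nlplot does not ship this function.
    """
    # \w+ matches runs of letters, digits and underscores. The optional group
    # keeps apostrophes inside a word, so "you've" stays a single token.
    tokens = re.findall(r"\w+(?:['’]\w+)*", raw_text.lower())
    return " ".join(tokens)


print(to_space_delimited("When you come to a roadblock, take a detour"))
```

With a data frame like the one above, the helper could be applied with `df[target_col] = df[target_col].map(to_space_delimited)` before constructing `NLPlot`.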
    import nlplot
    import pandas as pd
    import plotly
    from plotly.subplots import make_subplots
    from plotly.offline import iplot
    import matplotlib.pyplot as plt
    %matplotlib inline

    # target_col must hold either a list of tokens or a space-separated string.
    npt = nlplot.NLPlot(df, target_col='text')

    # Stopword calculations can be performed.
    stopwords = npt.get_stopword(top_n=30, min_freq=0)

    # 1. N-gram bar chart
    fig_unigram = npt.bar_ngram(
        title='uni-gram',
        xaxis_label='word_count',
        yaxis_label='word',
        ngram=1,
        top_n=50,
        width=800,
        height=1100,
        color=None,
        horizon=True,
        stopwords=stopwords,
        verbose=False,
        save=False,
    )
    fig_unigram.show()

    fig_bigram = npt.bar_ngram(
        title='bi-gram',
        xaxis_label='word_count',
        yaxis_label='word',
        ngram=2,
        top_n=50,
        width=800,
        height=1100,
        color=None,
        horizon=True,
        stopwords=stopwords,
        verbose=False,
        save=False,
    )
    fig_bigram.show()

    # 2. N-gram tree Map
    fig_treemap = npt.treemap(
        title='Tree map',
        ngram=1,
        top_n=50,
        width=1300,
        height=600,
        stopwords=stopwords,
        verbose=False,
        save=False
    )
    fig_treemap.show()

    # 3. Histogram of the word count
    fig_histgram = npt.word_distribution(
        title='word distribution',
        xaxis_label='count',
        yaxis_label='',
        width=1000,
        height=500,
        color=None,
        template='plotly',
        bins=None,
        save=False,
    )
    fig_histgram.show()

    # 4. wordcloud
    fig_wc = npt.wordcloud(
        width=1000,
        height=600,
        max_words=100,
        max_font_size=100,
        colormap='tab20_r',
        stopwords=stopwords,
        mask_file=None,
        save=False
    )
    plt.figure(figsize=(15, 25))
    plt.imshow(fig_wc, interpolation="bilinear")
    plt.axis("off")
    plt.show()

    # 5. co-occurrence networks
    npt.build_graph(stopwords=stopwords, min_edge_frequency=10)

    # The output below is the number of nodes and edges that will be plotted.
    # If this number is too large, plotting will take a long time,
    # so tune min_edge_frequency accordingly.
    # >> node_size:70, edge_size:166
    fig_co_network = npt.co_network(
        title='Co-occurrence network',
        sizing=100,
        node_size='adjacency_frequency',
        color_palette='hls',
        width=1100,
        height=700,
        save=False
    )
    iplot(fig_co_network)

    # 6. sunburst chart
    fig_sunburst = npt.sunburst(
        title='sunburst chart',
        colorscale=True,
        color_continuous_scale='Oryel',
        width=1000,
        height=800,
        save=False
    )
    fig_sunburst.show()

    # other
    # The original data frames of the co-occurrence network can also be accessed.
    display(
        npt.node_df.head(), npt.node_df.shape,
        npt.edge_df.head(), npt.edge_df.shape
    )
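To make the `ngram` and `stopwords` parameters of the bar chart and treemap more concrete, here is a rough sketch of the kind of counting those plots summarize. It is an illustration only, not nlplot's implementation: `count_ngrams` is a hypothetical function, and removing stopwords before forming n-grams is an assumption about the order of operations.

```python
from collections import Counter


def count_ngrams(texts, n=1, stopwords=()):
    """Count n-grams over space-delimited strings, skipping stopwords.

    Hypothetical illustration; not taken from nlplot's source.
    """
    blocked = set(stopwords)
    counts = Counter()
    for text in texts:
        # Assumption: stopwords are dropped before n-grams are formed.
        tokens = [t for t in text.split() if t not in blocked]
        # Slide a window of length n across the remaining tokens.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


texts = ["think rich look poor", "look rich think poor", "think rich"]
unigrams = count_ngrams(texts, n=1)
bigrams = count_ngrams(texts, n=2)
print(unigrams.most_common(1))
print(bigrams.most_common(1))
```

A `top_n`-style cut then amounts to calling `most_common(top_n)` on the resulting counter.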
TBD
    cd tests
    pytest
Plotly is used to plot the figure
NetworkX is used to calculate the co-occurrence network
wordcloud uses the following fonts
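As background for the co-occurrence network and its `min_edge_frequency` setting, the sketch below shows one simple way such edges can be counted and thresholded. It is an assumption-laden illustration rather than nlplot's actual code: `cooccurrence_edges` is a hypothetical function, co-occurrence is taken to mean "appears in the same text", and the threshold is assumed to be inclusive.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(texts, min_edge_frequency=1):
    """Count word pairs that share a text and keep the frequent ones.

    Hypothetical illustration; nlplot's own definition may differ.
    """
    pair_counts = Counter()
    for text in texts:
        # Sorting the unique words gives each unordered pair one canonical
        # form, and each pair is counted at most once per text.
        words = sorted(set(text.split()))
        pair_counts.update(combinations(words, 2))
    # Assumption: an edge is kept when its count is >= the threshold.
    return {
        pair: count
        for pair, count in pair_counts.items()
        if count >= min_edge_frequency
    }


texts = ["think rich look poor", "look rich think poor", "think rich"]
print(cooccurrence_edges(texts, min_edge_frequency=3))
```

Raising the threshold shrinks the edge set quickly, which is why a higher `min_edge_frequency` keeps large corpora plottable.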