Models from the master's dissertation "Author profiling from texts using artificial neural networks", EACH-USP 2019


rafaelsandroni/author-profiling-models


Author Profiling (AP) is the computational task of recognizing characteristics of text authors based on their linguistic patterns. Computational models allow us to infer social characteristics from a text even when the author did not consciously choose to place indicators of those characteristics in it. The AP task matters for many practical applications, such as forensic analysis, criminal investigation, and marketing. Traditional AP approaches often rely on linguistic knowledge, which requires prior expertise and manual effort to extract features. Recently, artificial neural networks have shown satisfactory results on natural language processing (NLP) problems; for author profiling, however, they have shown varying levels of success. This dissertation aims to organize, define, and explore various author characterization tasks from the text corpora considered, covering three languages (i.e., Portuguese, English, and Spanish) and five textual domains (e.g., social networks, questionnaires, and SMS). Six models based on neural networks and word embeddings are proposed, and their performance is compared with baseline systems.

Master's dissertation

Download the latest version of the master's dissertation

Implemented models

Here you can find the implemented models, each containing both a data pipeline and a machine learning pipeline.

  • lr_tfidf: logistic regression + tfidf, /src/models/baseline1

  • cnn_tfidf: 1D conv net + tfidf, /src/models/baseline2

  • cnn_wv: multichannel 1D conv net + word vectors, /src/models/baseline3

  • cnn_wv, Kim implementation: multichannel 1D conv net + word vectors, /src/models/baseline4

  • lstm_wv: LSTM + word vectors, /baseline5

  • lstm_attention_wv: LSTM self attention mechanism + word vectors, /src/models/baseline6

  • gru_wv: GRU + word vectors, /src/models/baseline7

  • cnn_char: multichannel 1D conv net + char vectors, /src/models/baseline9

  • lstm_attention_char: LSTM self attention mechanism + char vectors, /src/models/baseline9
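
Several of the models above take TF-IDF features as input. As a rough illustration of what that weighting computes, here is a minimal pure-Python sketch. It assumes whitespace tokenization and the plain, unsmoothed `tf * log(N / df)` variant; the function name `tfidf` is hypothetical, and the repository's own pipeline may use a library implementation with different smoothing or normalization.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumptions (not taken from the repository): lowercase whitespace
    tokenization and the unsmoothed tf * log(N / df) weighting.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency (relative count) times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["the cat sat", "the dog barked"]
vectors = tfidf(docs)
```

With this variant, a term that occurs in every document (here, "the") receives weight zero, while terms specific to one document receive positive weights. A classifier such as logistic regression would then be trained on these vectors.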

Corpus

These text datasets support six author profiling tasks: gender, age, education level, religiosity, IT background, and political position, in three languages: Portuguese, English, and Spanish.

The dissertation structures and defines the datasets for the author profiling tasks, including the class distributions and the problem definitions.

  • b5-post
  • BRMoral
  • BlogSet-BR
  • Nus-SMS
  • The Blog Authorship
  • PAN 2013 (PAN-CLEF)

The datasets are split into stratified training and test subsets.
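
A stratified split keeps the proportion of each class roughly the same in the training and test subsets. The sketch below shows the idea in pure Python. The function name `stratified_split`, the 20% test fraction, and the fixed seed are assumptions for illustration only; the split ratio actually used for these datasets is not stated here.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) that preserve label proportions.

    test_fraction and seed are illustrative defaults, not values taken
    from the dissertation.
    """
    # Group example indices by their class label.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_label.values():
        # Shuffle within each class, then cut off the test share.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


labels = ["f"] * 60 + ["m"] * 40
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

In this toy example the 60/40 class ratio of the full label list carries over to both subsets, which matters for imbalanced profiling tasks where a purely random split could leave a minority class underrepresented in the test set.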

You can request access to the structured datasets from the author.

Utility and evaluation functions

Utility functions built to support the implementations, pre-built models, reports, etc.

/src/functions/

  • utils: helper functions
  • plot: plotting functions (using matplotlib) and metric calculations
  • word vectors: embedding algorithms, training, and loading of pre-trained models
  • etc

Reference

@MASTERSDISSERTATION{sandroni-dias,
  title        = "Author profiling from texts using artificial neural networks",
  author       = "Rafael Felipe Sandroni Dias",
  year         = "2019",
  type         = "Master's Dissertation",
  school       = "University of São Paulo",
  address      = "São Paulo, SP, Brazil",
}
