chakki's Aspect-Based Sentiment Analysis dataset

chakki-works/chABSA-dataset

We developed an Aspect-Based Sentiment Analysis dataset named the chABSA dataset.

annotation.png

annotation2.png

The annotation target is the "overview of business results" section of each company's report, specifically the OverviewOfBusinessResultsTextBlock part of Japanese annual reports. Japanese annual reports are published on EDINET, and the format definitions are available from the Financial Services Agency (the definition is called the "taxonomy", タクソノミ).

The Entity and Attribute pairs are as follows.

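|          | general | sales | profit | amount | price | cost |
|----------|---------|-------|--------|--------|-------|------|
| market   | ✔️      |       |        |        |       |      |
| company  | ✔️      | ✔️    | ✔️     | ✔️     | ✔️    | ✔️   |
| business | ✔️      | ✔️    | ✔️     | ✔️     | ✔️    | ✔️   |
| product  | ✔️      | ✔️    | ✔️     | ✔️     | ✔️    | ✔️   |
| NULL     | ✔️      | ✔️    | ✔️     | ✔️     | ✔️    | ✔️   |
| OOD      | ✔️      |       |        |        |       |      |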
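
The table can be turned into a small lookup for checking labels. The following is a minimal sketch, not part of the dataset tooling: it assumes that the single check marks for market and OOD belong to the general column, and that labels are written as Entity#Attribute (as in "NULL#general"). The names ENTITY_ATTRIBUTES, valid_labels, and is_valid_label are hypothetical.

```python
# Attribute columns of the table, in order.
ATTRIBUTES = ["general", "sales", "profit", "amount", "price", "cost"]

# Assumption: the lone check marks for "market" and "OOD" sit in the
# "general" column; every other entity allows all six attributes.
ENTITY_ATTRIBUTES = {
    "market": ["general"],
    "company": ATTRIBUTES,
    "business": ATTRIBUTES,
    "product": ATTRIBUTES,
    "NULL": ATTRIBUTES,
    "OOD": ["general"],
}


def valid_labels() -> set:
    """Build every allowed label in the form Entity#Attribute."""
    return {
        f"{entity}#{attribute}"
        for entity, attributes in ENTITY_ATTRIBUTES.items()
        for attribute in attributes
    }


def is_valid_label(label: str) -> bool:
    """Check a label such as "NULL#general" against the table."""
    return label in valid_labels()


print(is_valid_label("NULL#general"))
print(is_valid_label("market#sales"))
```

If the guideline defines the market or OOD rows differently, only the ENTITY_ATTRIBUTES mapping needs to change.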

For the detailed definitions, please refer to the annotation guideline.

Download the data

Download Link

230 of the 2,260 companies are annotated (about 10% of all companies).
The annotated companies were selected from each category. For details, please see here.

Annotation Target

Paper

Jupyter Notebooks

You can try these on Kaggle Kernel!

Data organization

Annotation Format

The annotation tool is available here.

Annotation results are provided as JSON files.

(under construction)

    {
      "header": {
        "document_id": "E00008",
        "document_name": "ホクト株式会社",
        "doc_text": "有価証券報告書",
        "edi_id": "E00008",
        "security_code": "13790",
        "category33": "水産・農林業",
        "category17": "食品",
        "scale": "6"
      },
      "sentences": [
        {
          "sentence_id": 0,
          "sentence": "当連結会計年度におけるわが国経済は、政府の経済政策や日銀の金融緩和策により、企業業績、雇用・所得環境は改善し...",
          "opinions": [
            {
              "target": "わが国経済",
              "category": "NULL#general",
              "polarity": "neutral",
              "from": 11,
              "to": 16
            },
            {
              "target": "企業業績",
              "category": "NULL#general",
              "polarity": "positive",
              "from": 38,
              "to": 42
            },
            ...
          ]
        },
        {
          "sentence_id": 1,
          "sentence": "当社グループを取り巻く環境は、実質賃金が伸び悩むなか、消費者の皆様の...",
          "opinions": [
            {
              "target": "実質賃金",
              "category": "NULL#general",
              "polarity": "negative",
              "from": 15,
              "to": 19
            },
            ...
          ]
        },
        ...
      ]
    }

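
As a rough illustration of how such a file might be consumed, the sketch below parses a trimmed copy of the example record with Python's standard json module and tallies the polarity labels. The SAMPLE string is shortened from the example (the elided parts are left out and the header keeps only two fields), and polarity_counts is a hypothetical helper, not something shipped with the dataset.

```python
import json
from collections import Counter

# Trimmed copy of the example record; sentence texts stay truncated
# with "..." exactly as in the example.
SAMPLE = """
{
  "header": {"document_id": "E00008", "document_name": "ホクト株式会社"},
  "sentences": [
    {
      "sentence_id": 0,
      "sentence": "当連結会計年度におけるわが国経済は、政府の経済政策や日銀の金融緩和策により、企業業績、雇用・所得環境は改善し...",
      "opinions": [
        {"target": "わが国経済", "category": "NULL#general",
         "polarity": "neutral", "from": 11, "to": 16},
        {"target": "企業業績", "category": "NULL#general",
         "polarity": "positive", "from": 38, "to": 42}
      ]
    },
    {
      "sentence_id": 1,
      "sentence": "当社グループを取り巻く環境は、実質賃金が伸び悩むなか、消費者の皆様の...",
      "opinions": [
        {"target": "実質賃金", "category": "NULL#general",
         "polarity": "negative", "from": 15, "to": 19}
      ]
    }
  ]
}
"""


def polarity_counts(document: dict) -> Counter:
    """Count polarity labels over all opinions in one document."""
    return Counter(
        opinion["polarity"]
        for sentence in document["sentences"]
        for opinion in sentence["opinions"]
    )


document = json.loads(SAMPLE)
counts = polarity_counts(document)
print(document["header"]["document_name"], dict(counts))
```

For a file on disk, json.load on a file opened with encoding="utf-8" would replace the json.loads call.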
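| Parameter | Type       | Description                                          |
|-----------|------------|------------------------------------------------------|
| header    | obj        | Header information of the annotated document         |
| sentences | array[obj] | Annotation results for each sentence in the document |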

header

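| Parameter     | Type | Description                             |
|---------------|------|-----------------------------------------|
| document_id   | str  | Unique document id (equal to edi_id)    |
| document_name | str  | Document name (= company name)          |
| doc_text      | str  | Document type name                      |
| edi_id        | str  | EDINET code of the company              |
| security_code | str  | Securities code of the company          |
| category33    | str  | 33-sector classification of the company |
| category17    | str  | 17-sector classification of the company |
| scale         | str  | Size classification of the company      |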

sentences

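| Parameter   | Type       | Description                                          |
|-------------|------------|------------------------------------------------------|
| sentence_id | int        | Sentence id assigned to each sentence in the document |
| sentence    | str        | Sentence targeted for annotation                     |
| opinions    | array[obj] | Array of annotations                                 |
| target      | str        | Entity that the polarity refers to                   |
| category    | str        | Entity#Attribute label                               |
| polarity    | str        | Polarity label                                       |
| from        | int        | Start position of target                             |
| to          | int        | End position of target                               |

The target, category, polarity, from, and to fields belong to each object in the opinions array, as in the JSON example above.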
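
In the example record, from and to behave like character offsets into sentence with an exclusive end: characters 11 to 16 of the first sentence spell わが国経済. The sketch below checks that reading for one sentence and splits a category label into its two parts. It is an illustration under that assumption; mismatched_spans and split_category are hypothetical helper names.

```python
def mismatched_spans(sentence: dict) -> list:
    """Return the opinions whose [from, to) slice differs from target."""
    text = sentence["sentence"]
    return [
        opinion
        for opinion in sentence["opinions"]
        # Assumption: "to" is exclusive, as the example offsets suggest.
        if text[opinion["from"]:opinion["to"]] != opinion["target"]
    ]


def split_category(category: str) -> tuple:
    """Split an Entity#Attribute label into (entity, attribute)."""
    entity, attribute = category.split("#", 1)
    return entity, attribute


# Sentence 1 of the example record, still truncated with "...".
sample_sentence = {
    "sentence_id": 1,
    "sentence": "当社グループを取り巻く環境は、実質賃金が伸び悩むなか、消費者の皆様の...",
    "opinions": [
        {"target": "実質賃金", "category": "NULL#general",
         "polarity": "negative", "from": 15, "to": 19},
    ],
}

print(mismatched_spans(sample_sentence))
print(split_category("NULL#general"))
```

An empty list from mismatched_spans means every target matches its slice of the sentence.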

License

Creative Commons Attribution 4.0 License.
