Preferences for Zero Shot Classification Result Display #40

Answered by lalitpagaria
cdpierse asked this question in Q&A

Hi everyone,

I'm currently working on an implementation of an explainer for zero-shot classification tasks, as previously discussed in #19.

I find myself at an interesting crossroads with regard to one particular design decision relating to the classification and how to display both the word attributions and the visualization.

For those not familiar with the trick employed by Hugging Face to achieve zero-shot classification: it works by exploiting the "entailment" label of NLI models.

So if we have the sentence:

  • Today apple released the new Macbook showing off a range of new features found in the proprietary silicon chip computer.

And want to classify it with one of the labels:

  • ["sport", "technology", "current affairs"]

This explainer will work much like the zero-shot pipeline in the transformers package - it will test all three labels as hypotheses against the original text and measure which scores highest for entailment.

The hypothesis texts might look something like:

  • [CLS] Today apple released the new Macbook showing off a range of new features found in the proprietary silicon chip computer. [SEP] this text is about sport [SEP]
  • [CLS] Today apple released the new Macbook showing off a range of new features found in the proprietary silicon chip computer. [SEP] this text is about technology [SEP]
  • [CLS] Today apple released the new Macbook showing off a range of new features found in the proprietary silicon chip computer. [SEP] this text is about current affairs [SEP]

In this case technology would score highest.
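
For anyone who wants to see the trick end to end, here is a minimal sketch using the transformers zero-shot pipeline, which builds exactly these premise/hypothesis pairs internally. The model name and hypothesis template below are just illustrative choices, not something this explainer depends on:

```python
# Minimal sketch of the "entailment trick" via the transformers zero-shot pipeline.
# Model choice and hypothesis template are assumptions for illustration only.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",  # any NLI model trained for entailment works
)

sequence = (
    "Today apple released the new Macbook showing off a range of new "
    "features found in the proprietary silicon chip computer."
)
labels = ["sport", "technology", "current affairs"]

# Each label is slotted into the hypothesis template, paired with the premise,
# and scored for entailment; the highest-scoring label wins.
result = classifier(
    sequence,
    candidate_labels=labels,
    hypothesis_template="this text is about {}.",
)
print(result["labels"][0])  # expected: "technology"
```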

This brings me to the issue with this new explainer, which stems from the "entailment trick": there are two ways I can represent the classification:

  1. How the model and explainer actually see the text, which will include the hypothesis text.

[Screenshot 2021-05-14 at 18 37 08]

  2. An edited version that includes only the text to be classified; in this case I would also edit the attribution scores to only include the tokens being displayed/returned.

[Screenshot 2021-05-14 at 18 40 32]

Of course I could make both of these options available via an argument of some sort to the method call, but it still leaves me with the decision of which should be the default.
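
To make option 2 a bit more concrete, here is a rough sketch of how the trimming could work, assuming the explainer produces a list of (token, attribution) pairs over the full premise + hypothesis sequence; the helper name and the scores below are made up purely for illustration and are not part of any existing API:

```python
# Hypothetical sketch of option (2): keep only the tokens and scores that
# belong to the original text, i.e. everything before the first [SEP].
from typing import List, Tuple


def trim_to_premise(
    word_attributions: List[Tuple[str, float]],
    sep_token: str = "[SEP]",
) -> List[Tuple[str, float]]:
    """Drop the hypothesis portion so only the classified text is displayed/returned."""
    trimmed = []
    for token, score in word_attributions:
        if token == sep_token:
            break
        trimmed.append((token, score))
    return trimmed


# Example with made-up attribution scores:
full = [
    ("[CLS]", 0.0), ("Today", 0.12), ("apple", 0.31), ("[SEP]", 0.0),
    ("this", 0.02), ("text", 0.01), ("is", 0.0), ("about", 0.03),
    ("technology", 0.45), ("[SEP]", 0.0),
]
print(trim_to_premise(full))  # everything up to (not including) the first [SEP]
```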

I have my own preference but I'd love to hear some thoughts or suggestions on what seems the most natural choice to make here.

Thanks.


Replies: 1 comment 1 reply

lalitpagaria

Thank you Charles for the detailed explanation and for looking into this.
To me [2] looks good, but again it is a personal choice :)

Not related to this task, but I just want to share another repo which shows visualisation in different ways. You might find it interesting: https://github.com/sergioburdisso/pyss3

1 reply
@cdpierse (Maintainer, Author) May 21, 2021

This is my preference too, so I think I will go with this, with the option to pass a parameter that allows the entire sequence to be included. Thanks @lalitpagaria.

Answer selected by cdpierse
Category: Q&A
Labels: None yet
2 participants: @cdpierse, @lalitpagaria
