Enhancement: snapshot testing for lint rule failures #6499

bradzacher announced in RFCs

Resurrecting #56
At Canva they use a (home-grown) package, eslint-snapshot-test, to snapshot the lint rule errors.

I found it's a really great way to work because it allows you to visualise the error ranges alongside the code instead of abstractly referencing lines/columns.

However the ergonomics of the package are off, IMO, because (a) it uses its own custom class to do testing which doesn't have any of the features of ESLint's RuleTester, and (b) it prefers you organise rule tests as if they were bog-standard tests (i.e. within describe/it), which sucks because there's so much boilerplate to repeat.

I propose we do two things:

(1) fork the base ESLint rule tester properly. Right now we are subclassing it so that we can add our customisations and it makes the code pretty complicated and convoluted. As I established in #6456 - I think we're getting to the scale with a lot of things where it's just plain better for us to fork instead of extending.

(2) add a new option to the RuleTester constructor like snapshot?: boolean = false which will automatically compute a snapshot for each invalid test.

For example:

```ts
const tester = new RuleTester({ snapshot: true });
tester.run('no-methods', rule, {
  valid: [ ... ],
  invalid: [
    {
      description: 'Some meaningful label',
      code: `
class Foo {
  myMethodName() {
    // ...
  }
}
      `,
      errors: [{ messageId: 'noMethod' }],
    },
  ],
});
```

Would create a snapshot file like:

```js
exports[`no-methods invalid Some meaningful label`] = `
class Foo {
  myMethodName() {
  ~~~~~~~~~~~~~~~~
    // ...
~~~~~~~~~~
  }
~~~ myMethodName is declared as a method, we prefer always using arrow functions because reasons
}
`;
```

Which is much, much more readable than the current alternative test:

```ts
invalid: [
  {
    description: 'Some meaningful label',
    code: `
class Foo {
  myMethodName() {
    // ...
  }
}
      `,
    errors: [
      {
        messageId: 'noMethod',
        data: { name: 'myMethodName' },
        line: 3,
        endLine: 5,
        column: 3,
        endColumn: 4,
      },
    ],
  },
],
```

Pros:

  • easier to understand the message users will actually see and to validate that it composes as expected
  • easier to validate that the rule actually reports where you want it to
  • easier for contributors to understand what the rule should do
  • changes to report locations require "zero" maintenance - just snapshot updates

Cons:

  • changes to message format require snapshot updates
  • requires a separate snapshot file, which adds some mental overhead mapping a specific test object to a specific snapshot during review.
    • to make this easier I propose we enforce that when snapshots are on, invalid tests must include the description attribute to help with identification, and that each description is unique.

This system could also be extended to snapshot fixers.
For example you could envision a separate snapshot emitted for each fixer which is just the raw source output. We could do something similar to what we do with our AST snapshots and emit a line-by-line diff of the code and output to make it clearer what the changes were.

In this case we'd emit 3..n snapshots per invalid case - one showing the error report location, one showing the fix diff (or clearly stating that no fix was performed), and two for each suggestion fixer.
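To make the fixer-snapshot idea a little more concrete, here is a minimal sketch of how a line-by-line diff for the fix snapshot could be rendered. `renderFixDiff` is a hypothetical helper invented purely for illustration (it is not part of the proposal or the existing RuleTester), and a real implementation would presumably use a proper diff algorithm rather than this naive positional comparison:

```ts
// Hypothetical sketch: render a fixer snapshot as a line-by-line diff of the
// original `code` vs the fixed `output`. Lines only in the original get "-",
// lines only in the fixed output get "+", unchanged lines get "  ".
function renderFixDiff(code: string, output: string | null): string {
  if (output == null || output === code) {
    return 'No fix was performed.';
  }

  const before = code.split('\n');
  const after = output.split('\n');
  const result: string[] = [];
  const max = Math.max(before.length, after.length);

  // Naive positional comparison; a real implementation would use a proper
  // diff algorithm so inserted/removed lines don't shift everything below.
  for (let i = 0; i < max; i += 1) {
    const a = before[i];
    const b = after[i];
    if (a === b) {
      result.push(`  ${a}`);
    } else {
      if (a !== undefined) result.push(`- ${a}`);
      if (b !== undefined) result.push(`+ ${b}`);
    }
  }

  return result.join('\n');
}
```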


Problem to solve - sane organisation of snapshots.

We might have to think through how this should be structured though because emitting 3 snapshots per invalid test could be REALLY difficult to understand and mentally match up - especially if there are a lot of test cases. TBH just emitting 1 per test could be hard enough given the scale of invalid tests for some rules.

As mentioned before - enforcing the use of description could be pretty maintenance-heavy and cumbersome as well. Imagine trying to come up with 100 unique descriptions for tests that are slight syntactic variations. It could easily devolve into users just doing "Test Case 1", "Test Case 2", ... - which isn't really great (even though it would uniquely identify each test case, it would turn into a mess as you reorder the tests).

Even though our AST fixture system emits 6 snapshots per fixture, (IMO) it is really easy to understand things because each snapshot is a separate file on disk and is clearly named.

Spitballing possible solutions:
(1) just emit one monolithic snapshot file and trust that users will get used to it and "figure it out"
Pros:

  • no work for us to do
  • can leverage standard jest tooling to manage snapshots
  • would probably work really well for incremental changes that add a few new test cases because the PR diff will be small and clearly show the new/changed cases

Cons:

  • huge files are painful for the user to navigate.
  • it will suck for brand new rules that introduce dozens or hundreds of test cases.

(2) create a unique, filename-safe name per test and emit one snapshot file for each test case.
Pros:

  • shrink the size of each snapshot file and make it easy to marry things up because the user just needs to look for a specific file on disk, rather than a set of snapshots in a monolithic file.

Cons:

  • creating a unique name that's easy to marry up with the original test would be difficult.
    • Perhaps this can be a requirement for the test case config instead of a description? I.e. push that burden onto the user?
  • we would have to build logic to automatically manage these files ourselves, as jest/test frameworks wouldn't do it out-of-the-box
    • Creating is easy (eg we can just leverage jest-specific-snapshot like we currently do - see the sketch after this list)
    • Finding cases that no longer exist is harder and requires us to maintain a mapping of files and track which ones were/weren't tested
    • Would need to figure out a way to not falsely report skipped tests as not emitted.
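For context on the "creating is easy" point, a minimal sketch of one-snapshot-file-per-case using jest-specific-snapshot might look like the following. The `snapshotPathFor` helper and the directory layout are invented for illustration; only `toMatchSpecificSnapshot` is the library's actual matcher:

```ts
import 'jest-specific-snapshot'; // adds expect(...).toMatchSpecificSnapshot(path)

// Hypothetical helper: derive a filename-safe snapshot path from the rule name
// and the test case's (required, unique) description.
function snapshotPathFor(ruleName: string, description: string): string {
  const safe = description.toLowerCase().replace(/[^a-z0-9]+/g, '-');
  return `./__rule_snapshots__/${ruleName}/${safe}.shot`;
}

test('no-methods - Some meaningful label', () => {
  // In a real tester this would be the rendered error report for the case.
  const rendered = 'class Foo { ... }';
  expect(rendered).toMatchSpecificSnapshot(
    snapshotPathFor('no-methods', 'Some meaningful label'),
  );
});
```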

(3) use the description to create folders on disk and emit individual snapshots.
Pros:

  • This would be much the same as our AST fixture system, so we know what we're doing to some degree

Cons:

  • Same problems as (2) though - would need a unique folder name per test and would ALSO need a unique file name per suggestion fixer.

(4) break compatibility entirely and instead use a similar structure to our AST fixture system - one file per fixture.
Pros:

  • This would save us doing anything uniquely (aside from suggestion fixer filenames) as we'd place the naming burden directly on the user.
  • Really clear system because the snapshots can live in a subfolder alongside the fixture

Cons:

  • Might be difficult for the user to manage though - lots of files that they need to create and manage.
  • Makes "grepping" for cases pretty hard as you need to grep across multiple files.
  • We'd need to create a system for adding config to the test cases
    • Perhaps could just go the "dumb" route here and use eslint config comments in the fixture file like:
      /* eslint @plugin/no-methods: ['error', { config: value, ... }] */
      class Foo { ... }
    • Could also do a "fourslash" style system to add config directly in the fixture?
    • Could also just accept a separate config.ts file which exports a config object, and provide some utils to easily type the object (see the sketch below).
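As a rough sketch of that last config.ts idea, something like the following could sit alongside a fixture. The `FixtureConfig` type and the property names are purely illustrative, not an existing API:

```ts
// config.ts - hypothetical per-fixture configuration module.
// The FixtureConfig type and these property names are illustrative only.
interface FixtureConfig {
  /** Rule options, i.e. everything after 'error' in an ESLint config entry. */
  options?: readonly unknown[];
  /** Extra parser options the fixture needs (e.g. for type-aware rules). */
  parserOptions?: Record<string, unknown>;
}

const config: FixtureConfig = {
  options: [{ config: 'value' }],
  parserOptions: { project: './tsconfig.json' },
};

export default config;
```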

(5) do a similar system to eslint-snapshot-test in which the user decides how to name tests and how to properly handle the snapshot side of things.

For example they could use inline snapshots if they wanted:

```ts
const tester = new SnapshotRuleTester();
tester.run('rule-name', rule, {
  valid: [ ... ],
  invalid: [
    tester.invalid('foos-the-bar', (test) => {
      const result = test({
        code: ' ... ',
        options: [ ... ],
        errors: [{ messageId: 'foo' }],
      });
      expect(result).toMatchInlineSnapshot();
    }),
  ],
});
```

Pros:

  • flexible
  • allows the user to use any test framework
  • saves us from the complexity of managing things
  • opens the door to custom test logic if the user wants more power

Cons:

  • de-standardises the ecosystem
  • adds a lot of boilerplate
  • adds a learning curve for users to figure out how to do things
  • would require creating lint rules to help set some best practices.

Replies: 5 comments 8 replies


ESLint may support snapshot testing in the future: eslint/eslint#14936

FYI: I built a snapshot tester for eslint-plugin-unicorn.

https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/test/utils/snapshot-rule-tester.mjs
https://github.com/sindresorhus/eslint-plugin-unicorn/blob/main/test/snapshots/no-lonely-if.mjs.md

1 reply
@bradzacher (Maintainer, Author), Apr 28, 2023

"may support" - the last comment was over a year ago!

Last I heard from the team they wanted to try and make the tooling system-independent - so snapshot testing mightn't fit into that vision (this was the response I got when I suggested adding dependency constraints to tests).

Those snapshots look pretty neat! I kind of want one per test to make it easier to find specific snapshots but IDK how that'll work exactly.
I do quite like the markdown approach ava uses because it lets you render code blocks and get syntax highlighting.

There's still a bit to nut out but either way - we've officially forked RuleTester in v6 and I'm committed to building this to improve maintainability of tests (manual line/col assertions suck ass for maintenance) and also to help provide better visibility into what rules are doing (right now there's zero visibility into the actual messages generated or the actual look of a report with respect to the code).

bradzacher (Maintainer, Author), May 5, 2023

We have to build something test-framework agnostic (the entire RuleTester implementation is designed to be framework-agnostic).
This does increase complexity, but it also means we can shed the shackles of how jest does things and think of something better.

In the past I've leveraged jest-specific-snapshot to do things, which just wrote a separate snapshot for every single case. However this ofc used jest's snapshot format, which is like exports['TEST NAME'] = `snapshot contents`;
This format means the framework implementation is super simple (because you don't need a parser - just require the snapshot file) but it means that the review-time devx is kinda shit (no syntax highlighting, sometimes things get backslash escaped).

Considering these snapshots are 110% meant to be human-readable, I think we need to go in a different direction and do a custom snapshot format so that we can have syntax highlighting and a nicer experience.

I kind of like how AVA does things - a markdown file with the snapshot contents inside code fences. However, to avoid parsing markdown, AVA also dumps out a separate binary blob which is the test metadata - so it compares against the blob and writes the markdown separately for humans.

Do we think that two files being written is okay to trade-off for increased human readability?
If not - do we have any ideas about how we can have good human readability and simpler implementation?

Ava examples - #6499 (comment)

3 replies
@JoshuaKGoldberg

> binary blob ... two files

🤔 My gut instinct is -1 on having two files. I haven't worked with Ava snapshots before, so maybe I'm just being a Luddite... but having multiple files checked in feels off to me. If possible it'd be nice to keep it leaner with just one.

> to avoid parsing markdown

How bad is this, though? We already do Markdown parsing work in our website rule docs generators. The code in https://github.com/typescript-eslint/typescript-eslint/blob/4bf2d7360eaf74c9ef87b196ff4c459b8f50800b/packages/website/plugins/generated-rule-docs.ts isn't so bad.

My personal current favorite approach would be:

@bradzacher (Maintainer, Author), Jul 3, 2023

The binary approach makes for a dead simple implementation because essentially we're just doing expect(newResults).toEqual(JSON.parse(loadFile())) (this isn't an understatement). This is a big win for us maintenance-wise.

The other benefit of the binary approach is that we can store data for the test in it - for example the line/column information for the error message. This is good because it means we can test on exact values and report messages like "Expected the report to start on line number 1, but received a report starting on line number 2".
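Purely for illustration (none of these names reflect the actual stored format), the per-error metadata such a blob could carry, and the kind of targeted assertion message it enables, might look like this:

```ts
// Hypothetical shape of the per-error metadata a binary/JSON snapshot blob
// could store alongside the human-readable markdown. These names only
// illustrate the idea of comparing exact values and reporting targeted
// mismatch messages; they are not the real implementation.
interface StoredErrorSnapshot {
  messageId: string;
  message: string;
  line: number;
  column: number;
  endLine: number;
  endColumn: number;
}

function assertSameStart(
  expected: StoredErrorSnapshot,
  received: StoredErrorSnapshot,
): void {
  if (expected.line !== received.line) {
    throw new Error(
      `Expected the report to start on line number ${expected.line}, ` +
        `but received a report starting on line number ${received.line}.`,
    );
  }
}
```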

With the markdown approach we have to hand-craft a deserialiser (MD -> JSON) which is the mirrored inverse of the serialiser (JSON -> MD). There's a maintenance burden here because we need to ensure that each change we make to one is also made in reverse in the other. This is also complex because serialising to MD is easy (we can do it very simply via template strings) - but deserialising requires a parser+AST for accuracy. This means the two algorithms will be asymmetric, which increases the burden.
There are complexities because markdown doesn't group a heading's content within the heading - it's a very flat AST. This means the deserialiser has to manually track the "array start" and "array end" cases as it walks the AST - which adds some more complexity to the deserialiser.

Finally we also cannot store random data within the markdown contents. For example if we dump this code fence into MD as the human-readable representation of the error:
[screenshot: a code frame rendering the reported range and message]
Then in order to extract the information we need to also build a parser to deserialise the code frame to extract {line: 6, endLine: 6, column: 2, endColumn: 38, message: "All foo signatures should be adjacent."}. This is certainly doable, but has complexities. There's also certain data that we can't extract from this - for example we cannot extract the messageId, nor can we extract the data placeholder values.
Maybe there's a way we can hack around this to encode hidden information in the markdown eg by using comments or something - but that seems super hacky!
There is the thought that maybe we just don't do that and we instead just diff the code frames - but this can create some really complicated output. For example here's one diff output with just a single character change on the error line:
[screenshot: a large, hard-to-read diff produced by a single-character change on the error line]

@JoshuaKGoldberg

Hmm, got it. That does sound like a lot of work. Agreed on the points about code complexity - at least for the 1.0 I'd rather focus efforts elsewhere. Maybe a good followup would be to build a separate standalone package for this kind of thing (I do love adding to our list of Tidelift-funded packages 😂)...

For the diff differences, I honestly don't mind. A big visual diff can be quite helpful when trying to understand what's wrong. That last screenshot makes me happy!

bradzacher (Maintainer, Author), May 5, 2023

The other consideration is how many snapshots to create for each test. There are ofc two options - one snapshot per test case, or one snapshot for the entire test file.

I can see pros and cons to both - increased readability for the former due to less scrolling / shorter file traded off against having many more files on disk.

I'm kind of leaning towards having just one file right now (same as ava and jest do by default) - but I'm open to doing the other way (it's how I've done all of our other snapshot implementations).

What do we think?

1 reply
@JoshuaKGoldberg

I'm definitely +1 on having one snapshot file per test file. Some of these tests can have not just dozens but >100 cases. Lots of potential file noise if we go with multiple snapshot files per test file.


In angular-eslint I have managed to solve some of these problems in a different way, and there are additional wins in terms of docs. I think it could be expanded even further with some more custom tooling around the rule tester.

Each rule has a spec.ts and cases.ts:

[screenshot: a rule folder containing spec.ts and cases.ts]

spec.ts is just responsible for wiring up the valid and invalid cases

[screenshot: spec.ts passing the valid and invalid cases to the RuleTester]

cases.ts, as you might guess, contains the valid and invalid cases as two exported arrays:

[screenshot: cases.ts exporting the valid and invalid arrays]

Importantly, for the invalid cases I have a utility function which allows me to write the case with the squigglies already included in the source, and the squigglies are what will be used as the source of truth for the expected error ranges:

[screenshot: an invalid case written with ~~~ annotations marking the expected error range]
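A rough sketch of what such an annotated-source utility could look like; the helper name, its signature, and the marker handling below are hypothetical and deliberately simplified (single error, single-line range), not the actual angular-eslint API:

```ts
// Hypothetical helper that turns an annotated source string into a normal
// RuleTester invalid case. The '~' line marks the expected error range on the
// line directly above it; the helper strips that annotation line and computes
// line/column numbers from it.
function invalidCaseFromAnnotatedSource(annotated: {
  description: string;
  annotatedSource: string;
  messageId: string;
}) {
  const lines = annotated.annotatedSource.split('\n');
  const markerIndex = lines.findIndex(line => line.trimStart().startsWith('~'));
  const marker = lines[markerIndex];
  const column = marker.indexOf('~') + 1;

  return {
    description: annotated.description,
    // The source with the marker line removed is what actually gets linted.
    code: lines.filter((_, i) => i !== markerIndex).join('\n'),
    errors: [
      {
        messageId: annotated.messageId,
        // The annotated line sits directly above the marker line, so the
        // marker's 0-based index equals that line's 1-based line number.
        line: markerIndex,
        column,
        endColumn: column + marker.trim().length,
      },
    ],
  };
}

// Usage: the squiggles are the source of truth for the expected range.
const testCase = invalidCaseFromAnnotatedSource({
  description: 'reports a class method',
  annotatedSource: `
class Foo {
  myMethodName() {}
  ~~~~~~~~~~~~
}
`,
  messageId: 'noMethod',
});
```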

The additional really powerful thing this facilitates is that I generate my docs from the unit tests. So every single covered case is automatically documented and guaranteed to be up to date and verified so long as the unit tests are green:

[screenshots: generated rule documentation built from the test cases]

You can still use only on the test cases if you want to focus on one. The only nice-to-haves missing from this setup are being able to use IDE-powered test tooling, like the jest extension adding a run test button inline, etc.

3 replies
@JamesHenry

(In case it wasn't clear, the separate files are needed because the docs are generated by actually evaluating the contents of the cases.ts file. If everything were in one file then the RuleTester would end up being executed too).

@bradzacher (Maintainer, Author), May 6, 2023

this is cool, for sure!
the thing I wanted to avoid was making people rewrite their tests though.
ideally this was meant to be a "progressive enhancement" sort of thing that people can adopt with a one-line config change.

perhaps this is something we should build in later as a separate option if people want?

@bradzacher (Maintainer, Author), May 6, 2023

The eslint-plugin-eslint-plugin rules also won't work on the utility function. But that can be worked around ofc.

We could always add a new method like runWithAnnotatedErrors or just work a union type into the existing run method.

The thing you miss out on from this is that you need to explicitly assert the fixer outputs (auto and suggestion), whereas snapshots let you do that implicitly for free.

bradzacher (Maintainer, Author), May 6, 2023

I have a POC ready which creates one markdown snapshot and one binary snapshot for each test file:
https://github.com/typescript-eslint/typescript-eslint/tree/7a70253/packages/eslint-plugin/tests/rule-snapshots

It looks pretty nice and is pretty easy to understand, I think.
Some of the markdown files are pretty huge though, which does make things more difficult than I'd like.

As I said here, I'd like to make this a progressive enhancement - but maybe it'd be better to just create a brand new way of doing tests?

Category: RFCs
Participants: @bradzacher, @fisker, @JamesHenry, @JoshuaKGoldberg
This discussion was converted from issue #6498 on February 20, 2023 01:40.

