Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Add Error format support, and JSON output option#11396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
JelleZijlstra merged 48 commits intopython:masterfromtusharsadhwani:output-json
May 10, 2024

Conversation

@tusharsadhwani
Copy link
Contributor

@tusharsadhwanitusharsadhwani commentedOct 27, 2021
edited
Loading

Description

Resolves#10816

The changes this PR makes are relatively small.
It currently:

  • Adds an--output option to mypy CLI
  • Adds aErrorFormatter abstract base class, which can be subclassed to create new output formats
  • Adds aMypyError class that represents the external format of a mypy error.
  • Adds a check for--output being'json', in which case theJSONFormatter is used to produce the reported output.

Demo:

$mypy mytest.pymytest.py:2: error: Incompatible types in assignment (expression has type "str", variable has type "int")mytest.py:3: error: Name "z" is not definedFound 2 errors in 1 file (checked 1 source file)$mypy mytest.py --output=json{"file": "mytest.py", "line": 2, "column": 4, "severity": "error", "message": "Incompatible types in assignment (expression has type \"str\", variable has type \"int\")", "code": "assignment"}{"file": "mytest.py", "line": 3, "column": 4, "severity": "error", "message": "Name \"z\" is not defined", "code": "name-defined"}

A few notes regarding the changes:

  • I chose to re-use the intermediateErrorTuples created during error reporting, instead of using the more generalErrorInfo class, because a lot of machinery already exists in mypy for sorting and removing duplicate error reports, which producesErrorTuples at the end. The error sorting and duplicate removal logic could perhaps be separated out from the rest of the code, to be able to useErrorInfo objects more freely.
  • ErrorFormatter doesn't really need to be an abstract class, but I think it would be better this way. If there's a different method that would be preferred, I'd be happy to know.
  • The--output CLI option is, most probably, not added in the correct place. Any help in how to do it properly would be appreciated, the mypy option parsing code seems very complex.
  • The ability to add custom output formats can be simply added by subclassing theErrorFormatter class inside a mypy plugin, and adding aname field to the formatters. The mypy runtime can then check through the__subclasses__ of the formatter and determine if such a formatter is present.
    The "checking for thename field" part of this code might be appropriate to add within this PR itself, instead of hard-codingJSONFormatter. Does that sound like a good idea?

damienallen, 2bndy5, lstolcman, agoncharov-reef, thetic, Pavkazzz, N-Schaef, GabDug, endrebak, and Mr-Sunglasses reacted with thumbs up emojims-jpq, subham-deepsource, not-my-profile, iskyd, endrebak, Mr-Sunglasses, and saurbhc reacted with rocket emoji
returnjson.dumps({
'file':file,
'line':line,
'column':column,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

line,column seems short-sighed.

MaybestartLine,startColumn, which would leave room to later addendLine,endColumn. This information is useful for IDEs to know how what span to highlight.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yeah, good point.
But mypy currently only does line and column, and it might be very long before spans are added in. It could be argued that the change tostartLine andstartColumn can be made when the feature exists.

Copy link
Contributor

@intgrintgrOct 28, 2021
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Changing the field names later will break tools though. AndstartLine,startColumn would already be accurate right now, because mypy currently points out the start of the span.

tusharsadhwani, agoncharov-reef, and antonagestam reacted with thumbs up emoji
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'll leave this convention for the separate--output=sarif format.

@intgr
Copy link
Contributor

intgr commentedOct 28, 2021
edited
Loading

I'd like to make the case again for the existing JSON-based standard:Static Analysis Results Interchange Format (SARIF)

While that does not preclude support for a custom JSON format as well, perhaps if mypy were to support SARIF, there would be no need for a custom format?

Pros of SARIF:

  • There is no need to reinvent the wheel.

  • Can already integrate with existing tools likeGitHub code scanning, Visual Studio, VS Code

  • Every IDE shouldn't have to implement its own mypy-specific integration. This is exemplified by the situation in IntelliJ IDEA/PyCharm: there aretwo 3rd party mypy-specific plugins, both of which have major problems that aren't getting fixed. [1]

    Clearly the momentum isn't there to provide good mypy-specific IDE integrations. Adding mypy support for SARIF, and letting IDEs take care of parsing SARIF and implementing a good UI on top, seems like a far more sustainable option.

Cons:

  • SARIF is clearly infected with "design by committee", the data structures are verbose and deeply nested. However, advanced features are optional, a minimal implementation is straightforward, see example below.
  • No streaming output; all results would have to be collected before the JSON output can be serialized.

[1] The plugin ratings are 3.4 and 2.9 out of 5 😖https://plugins.jetbrains.com/search?search=mypy

Example minimal mypy SARIF output
{"version":"2.1.0","$schema":"https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0.json","runs": [    {"tool": {"driver": {"name":"mypy","version":"0.910"        }      },"results": [        {"ruleId":"assignment","level":"error","message": {"text":"Incompatible types in assignment (expression has type\"str\", variable has type\"int\")"          },"locations": [            {"physicalLocation": {"artifactLocation": {"uri":"mytest.py"                },"region": {"startLine":2,"startColumn":4                }              }            }          ]        },        {"ruleId":"name-defined","level":"error","message": {"text":"Name\"z\" is not defined"          },"locations": [            {"physicalLocation": {"artifactLocation": {"uri":"mytest.py"                },"region": {"startLine":3,"startColumn":4                }              }            }          ]        }      ]    }  ]}
antonagestam, dgutson, garbelini, VIKTORVAV99, and penguinolog reacted with thumbs up emojinvuillam, tusharsadhwani, Timmmm, thetic, dgutson, and garbelini reacted with heart emoji

@nvuillam
Copy link

+1 for SARIF format, let's use a common format instead of a mypy custom one ! :)

@tusharsadhwani
Copy link
ContributorAuthor

I'm willing to turn it into SARIF (basing it on the json snippet provided by@intgr), if that's what I need to get this into mypy 😄

@ethanhs@TH3CHARLie what do you think?

@intgr
Copy link
Contributor

After sitting on it a little, I feel that there's room for more than one machine-readable format. The simpler json-line-based format is probably better for simple tools that only care about mypy. I'm willing to implement the SARIF part myself.

But I shouldn't be the one to decide that, I'm just an occasional lurker here. No mypy developers have chipped in yet.

@nvuillam
Copy link

I'm just a tourist here, but i'm currently activating SARIF output for all linters ofMegaLinter, and having native SARIF output is a great benefit for linters ^^ ( + the Github CodeQL that natively understands SARIF format ^^ )

Some other python linters already have SARIF output, likebandit , maybe there is some code to reuse to manage the format ?

@tusharsadhwani
Copy link
ContributorAuthor

tusharsadhwani commentedDec 8, 2021
edited
Loading

@nvuillam this PR introduces an ErrorFormatter class. By the time this PR is finalized, even if it doesn't use SARIF you will be able to define your own ErrorFormatter class in a plugin probably, and tell it to output the SARIF format, it'll be really easy.

Pylint has the same setup.

@nvuillam
Copy link

@tusharsadhwani thank but... I don't have the bandwidth to implement SARIF output on all linters that do not manage it yet 😅

@intgr
Copy link
Contributor

I think one property that machine-readable formats should have is: if one error causes multiple lines of output, then that should appear as one result item rather than multiple.

So for example withmypy --show-error-context

_local/multi_error.py: note: In function "foo":_local/multi_error.py:5:9: error: No overload variant of "get" of "Mapping" matches argument type "str"  [call-overload]_local/multi_error.py:5:9: note: Possible overload variants:_local/multi_error.py:5:9: note:     def get(self, key: Any) -> Any_local/multi_error.py:5:9: note:     def [_T] get(self, key: Any, default: Union[Any, _T]) -> Union[Any, _T]Found 1 error in 1 file (checked 1 source file)

should maybe be output as

{"file":"_local/multi_error.py","line":5,"column":8,"severity":"error","context":"In function\"foo\":","message":"No overload variant of\"get\" of\"Mapping\" matches argument type\"str\"","hint":"Possible overload variants:\n    def get(self, key: Any) -> Any\n    def [_T] get(self, key: Any, default: Union[Any, _T]) -> Union[Any, _T]","code":"call-overload"}

But again, maybe that shouldn't be a blocker for this PR, getting machine-readable output can be useful even when findings aren't accurately grouped.

The negative line/column numbers right now seem awkward though:

{"file":"_local/multi_error.py","line":-1,"column":-1,"severity":"note","message":"In function\"foo\":","code":null}

@tusharsadhwani
Copy link
ContributorAuthor

That's definitely a bug.

The mypy code that generates this output was ridiculously coupled. I'll take a look at if this can be fixed.

@TomMD
Copy link

It has been many months as the project has hoped for a best solution over an existing solution. Can we merge this and accept future improvement using--output serif if someone comes along and implements it?

nvuillam, Timmmm, and sabiroid reacted with heart emoji

@tusharsadhwani
Copy link
ContributorAuthor

@intgr I did it properly this time. Hints should be fixed now.

@sehyun-hwang
Copy link

--output json option gets ignored after some usage withmypy.api.run_dmypy . Pasing json option to dmypy CLI worked fine, but invoking run_dmypy function in a Python script returns plain string output after some usage. Can someone help me troubleshoot this?

@tusharsadhwani
Copy link
ContributorAuthor

@sehyun-hwang sure thing. If you could provide a reproducible example I can look into it right away.

@sehyun-hwang
Copy link

@tusharsadhwani Thank you! I tried linting jobs in a container, and was unable to reproduce it. The problem occurs only when my IDE invokes dmypy, so I'm suspecting this has to do with concurrent execution. Do you see a chance where--output json option gets ignored when multiple clients are connected, or clients are threaded or running in a child process?

@sehyun-hwang
Copy link

I found that the first file linted output a json format, but from the second one it goes back to the string format. Can you try to reproduce this?

centos@www /m/tax-automation (99-basic-types) [1]> dmypy check -- batch_api.py{"file": "batch_api.py", "line": 5, "column": 0, "message": "Cannot find implementation or library stub for module named \"LAMBDA_CONFIG\"", "hint": "See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports", "code": "import"}{"file": "batch_api.py", "line": 6, "column": 0, "message": "Cannot find implementation or library stub for module named \"common.utils\"", "hint": "", "code": "import"}centos@www /m/tax-automation (99-basic-types) [1]> dmypy check -- foo.pyfoo.py:2: error: Name "bar" is not definedfoo.py:5: error: Argument 1 to "foo" has incompatible type "str"; expected "int"```

@tusharsadhwani
Copy link
ContributorAuthor

@sehyun-hwang This seems like a dmypy bug. It seems that dmypy ignores cli flags or config file flags in certain cases.

Here's a file structure:

$tree..├── a.py├── b.py└── mypy.ini

Andmypy.ini contains the following:

[mypy]strict = True

a.py andb.py both contain:

deffoo():pass

This waymypy a.py won't show errors butmypy --strict a.py will.
When alternating betweendmypy check a.py anddmypy check b.py, at some pointdmypy stops showing strict output. For the same reason it might be forgetting the--output flag being set as well.

@sehyun-hwang
Copy link

#11396 (comment)

@intgr Could you take a look at the comment by@tusharsadhwani ?

@intgr
Copy link
Contributor

intgr commentedFeb 23, 2022
edited
Loading

I'm afraid I can't help much with this. Just to clarify, I think my name shows up in the "reviewers" list only because I left a code comment about this PR earlier. I'm not a maintainer or reviewer in an official capacity and I don't have much knowledge of mypy's internals.

I'm just a user interested in consuming structured output from mypy (and potentially adding SARIF support later on). That's why I shared my opinions in this comment thread.

theY4Kman reacted with heart emoji

@2bndy5
Copy link

Yeah, the scope of this project has nothing to do with integrating various git servers' CI runners. That would be a rabbit hole of tech debt.

# No error tuple found for this hint. Ignoring it
continue

iferror.hint=="":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Could we make hint into a list of strings instead?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It's often that a single hint is wrapped into multiple lines. For internal representaiton we can keep it as a list of strings but for the user I think it makes most sense to display it as a single string.

error_location= (file_path,line,column)
error=latest_error_at_location.get(error_location)
iferrorisNone:
# No error tuple found for this hint. Ignoring it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Why not instead generate a MypyError with some field that lets us indicate that this is a note?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Addressed.

self.force_union_syntax=False

# Sets custom output format
self.output=""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Would None be a better default?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yup.

foo('42')

def bar() -> None: ...
bar('42')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Add a test case withreveal_type(); let's make sure that shows up in the output.

tusharsadhwani reacted with thumbs up emoji
@intgr
Copy link
Contributor

intgr commentedMay 10, 2024
edited
Loading

I'm not sure we'd accept support for other output formats in mypy itself, since we'd take on additional maintenance load. If this PR is merged, such formats could be supported in external tools that consume mypy's JSON output.

If mypy will only support 1 machine-parsable output format, then please choose a standard or de facto recognized format. Don't create yet another tool-specific mypy output format that everybody then has to write another set of adapters for.

For example, tools already exist to convert SARIF into GitLab "CodeQuality" reportshttps://gitlab.com/ahogen/sarif-to-codequality.

dgutson, thetic, and vladiliescu reacted with thumbs up emoji2bndy5 reacted with thumbs down emoji

@github-actions
Copy link
Contributor

According tomypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@tusharsadhwani
Copy link
ContributorAuthor

FWIW I prefer current output as it is leaner and it's trivial for any tool to translate it to SARIF (pyright rejected the idea of adding SARIF forthe same reason).

But I will add a SARIF formatter in the same PR if maintainers agree to it.

[out]
{"file": "main", "line": 12, "column": 0, "message": "No overload variant of \"foo\" matches argument type \"str\"", "hint": "Possible overload variants:\n def foo() -> None\n def foo(x: int) -> None", "code": "call-overload"}
{"file": "main", "line": 15, "column": 0, "message": "Too many arguments for \"bar\"", "hint": null, "code": "call-arg"}
{"file": "main", "line": 12, "column": 12, "message": "Revealed type is \"Overload(def (), def (x: builtins.int))\"", "hint": null, "code": "misc", "is_note": true}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Could we change this to"severity": "note", mirroring mypy's Python code (and severity error for the errors)? That will also make it easier to extend to other kinds of severity in the future.

tusharsadhwani reacted with thumbs up emoji
@dgutson
Copy link

FWIW I prefer current output as it is leaner and it's trivial for any tool to translate it to SARIF (pyright rejected the idea of adding SARIF forthe same reason).

which I just commented.

But I will add a SARIF formatter in the same PR if maintainers agree to it.

Otherwise it will require yet another converter-to-sarif tool. There is plenty of support already in IDEs, linters aggregators, and other analysis tools. If you think that SARIF could be leaner, then I do propose you to participate in its committee (I'm not being cinic, I mean it, you may have a good idea), make your proposal, and discuss it. That would be the right approach so the whole static analysis community gets benefited.

@JelleZijlstra
Copy link
Member

If I merge this PR, I take on the responsibility (at least in my mind) for triaging and reviewing future fixes to this area. I'm OK with doing that for this simple JSON format; I'm not OK with having to maintain compliance with the external SARIF spec. If another maintainer wants to add SARIF support, I'd welcome it, since clearly it will help some users, but I'm not interested in taking ownership of it.

tusharsadhwani, 2bndy5, 8tm, hauntsaninja, and vladiliescu reacted with thumbs up emojihauntsaninja reacted with heart emoji

@github-actions
Copy link
Contributor

According tomypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@JelleZijlstra
Copy link
Member

I think this is close to ready.

We'll also want to document the JSON format in the mypy documentation, but that can wait for a separate PR.

2bndy5, tusharsadhwani, and 8tm reacted with hooray emoji

@tusharsadhwani
Copy link
ContributorAuthor

@dgutson Thanks for bringing this PR to my attention again :)

Otherwise it will require yet another converter-to-sarif tool

I just made and verified a scipt to run mypy with SARIF output, dropping it here incase it is useful to someone:https://gist.github.com/tusharsadhwani/5fd7a3c32d73f10fcb86e8deac4b60cf

vladiliescu reacted with thumbs up emoji

@dgutson
Copy link

@dgutson Thanks for bringing this PR to my attention again :)

Otherwise it will require yet another converter-to-sarif tool

I just made and verified a scipt to run mypy with SARIF output, dropping it here incase it is useful to someone:https://gist.github.com/tusharsadhwani/5fd7a3c32d73f10fcb86e8deac4b60cf

could you please add it to the source tree? Otherwise it will be lost in a PR comment :)
Moreover, adding it as part of this PR would make sense for me.

@tusharsadhwani
Copy link
ContributorAuthor

I think Jelle has made his stance clear. I'd rather not have this feature delayed potentially a few more months over that.

Although I don't mind SARIF support, I'm not particularly looking for it, and neither are the maintainers super stoked about it yet, it seems.

At this moment I think it'd be best to make a different issue for SARIF support. Feel free to use the gist as a basis for the PR against that issue.

2bndy5, stdedos, and 8tm reacted with thumbs up emoji

@JelleZijlstraJelleZijlstra merged commit35fbd2a intopython:masterMay 10, 2024
@tusharsadhwanitusharsadhwani deleted the output-json branchMay 10, 2024 23:45
@patrick91
Copy link
Contributor

patrick91 commentedMay 27, 2024
edited
Loading

@JelleZijlstra sorry for always being that person, but is this released?

Was just working on some automated tests for Strawberry and having JSON output would be make it much easier (we use that for pyright as well)

tusharsadhwani reacted with heart emoji

@AlexWaygood
Copy link
Member

AlexWaygood commentedMay 27, 2024
edited
Loading

@JelleZijlstra sorry for always being that person, but is this released?

not yet released, but you can track the progress of the next release in#17285. It should include this PR.

patrick91 and tusharsadhwani reacted with heart emoji

@JelleZijlstra
Copy link
Member

General tip: you can click on the commit page to see whether a commit has been included in a release. In this case, the commit (35fbd2a) shows no releases. A random earlier commit (304997b) shows that the commit is included in a number of releases:
Screenshot 2024-05-27 at 4 34 33 PM

AlexWaygood, patrick91, tusharsadhwani, ColeRutledge, and edgarrmondragon reacted with thumbs up emojipatrick91 and ColeRutledge reacted with heart emoji

hauntsaninja added a commit that referenced this pull requestSep 27, 2024
Documents the new option added by#11396---------Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
@claudeviool
Copy link

claudeviool commentedFeb 5, 2025
edited
Loading

Nice, would it be possible to have mypy write this to a file (and otherwise retain human-readable output)?

Or, have this JSON output format work like the other 'reports' that MyPy provides?

@tusharsadhwani
Copy link
ContributorAuthor

@x0f0 what's the use case?

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@ilevkivskyiilevkivskyiilevkivskyi left review comments

@hauntsaninjahauntsaninjahauntsaninja left review comments

@JelleZijlstraJelleZijlstraJelleZijlstra approved these changes

@JukkaLJukkaLAwaiting requested review from JukkaL

+1 more reviewer

@intgrintgrintgr left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

Support more output formats

20 participants

@tusharsadhwani@intgr@nvuillam@TomMD@sehyun-hwang@stdedos@sobolevn@ilevkivskyi@hauntsaninja@AngellusMortis@sabiroid@8tm@dgutson@JelleZijlstra@mschoettle@2bndy5@patrick91@AlexWaygood@claudeviool@tushar-deepsource

[8]ページ先頭

©2009-2025 Movatter.jp