Request: change hist bins default to 'auto' #16403

Closed as not planned
Labels
API: changes · status: closed as inactive (Issues closed by the "Stale" Github Action. Please comment on any you think should still be open.) · status: inactive (Marked by the "Stale" Github Action) · topic: hist
@amueller

Description


This is revisiting #4487, in which @jakevdp suggested changing the default of `bins` to 'auto'.
Since automatic determination is now supported in matplotlib via numpy, I think it would be great to make it the default.

The main reason for wanting the change is that many people use this for data analysis, and the behavior of `bins=10` is pretty terrible in many cases (see Jake's example), yet many people still use the defaults.
Good defaults matter. I'd love to keep educating people, but no amount of educating will prevent people from using the defaults (we found this to be true in sklearn when mining GitHub).
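
To make the difference concrete, here is a minimal sketch comparing the current default with 'auto' on a synthetic sample. The data is made up for illustration (it is not Jake's example), and it only uses `numpy.histogram_bin_edges`, which is where the bin selection happens:

```python
import numpy as np

# Synthetic sample, assumed purely for illustration.
rng = np.random.default_rng(0)
data = rng.normal(size=10_000)

# Current default: 10 equal-width bins, regardless of sample size.
default_edges = np.histogram_bin_edges(data)

# 'auto' lets numpy choose the bin width from the data.
auto_edges = np.histogram_bin_edges(data, bins="auto")

n_default = len(default_edges) - 1
n_auto = len(auto_edges) - 1
print(f"default bins: {n_default}, auto bins: {n_auto}")
```

With 10,000 points the default still yields 10 bins, while 'auto' picks a noticeably finer binning; passing `bins="auto"` to `plt.hist` should give the same edges, since matplotlib hands the bin computation to numpy.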

Many people use this from pandas and the actual implementation is in numpy, and @jklymak makes the case that matplotlib ideally delegates as much to numpy as possible. I am very sympathetic to this position.

My main claim is that somewhere the default should change.

Currently my position is that matplotlib is the best place for that. I don't think having pandas change the default would be as good, since it would lead to inconsistencies between pandas and matplotlib. I would be happy with numpy changing the default, but the use cases of numpy are not necessarily related to visualization or even data analysis at all, so it's less clear to me that 'auto' is a good default there.

Also, from my perspective (and yours might be different), changing the default in numpy is more likely to break people's code and might require code changes, so the case for changing there needs to be really strong, and I think it's weaker than for matplotlib.
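
As a rough illustration of the kind of breakage I have in mind (a hypothetical scenario, not something taken from a real code base): code that relies on `np.histogram` always returning 10 counts can stack results from different arrays, whereas with 'auto' the number of bins depends on the data.

```python
import numpy as np

rng = np.random.default_rng(0)
small = rng.normal(size=50)      # synthetic data, assumed for illustration
large = rng.normal(size=50_000)

# Today's default (bins=10): both results have the same length,
# so downstream code can stack them into one array.
counts_small, _ = np.histogram(small)
counts_large, _ = np.histogram(large)
stacked = np.vstack([counts_small, counts_large])

# With bins='auto' the length depends on the sample,
# so the same stacking would no longer work in general.
auto_small, _ = np.histogram(small, bins="auto")
auto_large, _ = np.histogram(large, bins="auto")
print(stacked.shape, len(auto_small), len(auto_large))
```

Plotting code rarely depends on the number of bins in this way, which is part of why I think the change is less risky in matplotlib.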

If you have good reasons to suggest changing the defaults in numpy, I'm happy for us all to figure this out together (data science user + numpy + matplotlib). But right now, the default behavior leads to people making bad inferences.
