Scatter plots are very slow when using multiple colors #9053

Open
Labels
Performance; keep (Items to be ignored by the “Stale” Github Action)
@jzwinck


This program plots 3 million dots with random colors:

    import matplotlib.pyplot as plt
    import numpy as np

    N = 3000000
    x = np.random.random(N)
    y = np.random.random(N)
    c = np.random.random((N, 3))  # RGB
    plt.scatter(x, y, s=1, c=c, marker='.')
    plt.show()

The initial display is very slow. Even more problematic: zooming is very slow. If you set c to None, it will use a single color for all points and it will be fast, with zooming taking about 1 second, versus about 20 seconds with multiple colors.
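For anyone who wants to reproduce the gap without clicking around in a window, here is a rough timing sketch. It is not part of the original report: the helper name `time_scatter_draw` and the reduced point count are assumptions made for illustration, and it times a single off-screen render with the Agg backend, which will not match interactive zoom times exactly.

```python
import time

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the timing runs without a display
import matplotlib.pyplot as plt
import numpy as np


def time_scatter_draw(n, colors):
    """Return the seconds taken to render one scatter plot of n random points."""
    rng = np.random.default_rng(0)
    x = rng.random(n)
    y = rng.random(n)
    fig, ax = plt.subplots()
    ax.scatter(x, y, s=1, c=colors, marker=".")
    start = time.perf_counter()
    fig.canvas.draw()  # force a full render of the figure
    elapsed = time.perf_counter() - start
    plt.close(fig)
    return elapsed


n = 100_000  # hypothetical size, far smaller than the 3 million in the report
rgb = np.random.default_rng(1).random((n, 3))  # one RGB triple per point

single = time_scatter_draw(n, None)  # c=None: one color for every point
multi = time_scatter_draw(n, rgb)    # per-point colors
print(f"single color: {single:.3f} s, per-point colors: {multi:.3f} s")
```

Raising `n` toward the 3 million used above should make any difference easier to see, at the cost of a longer run.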

If you zoom until only a few points are visible, the single-color plot will respond instantly, but the multi-color one will still take 20 seconds. It's as if all 3 million colors are being slowly remapped every time--even for points which can't be seen.
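One possible workaround while this is open is to cull the data yourself, so that only points inside the current view are handed to scatter(). The sketch below is an assumption about how that could look, not something Matplotlib does for you; `visible_subset` is a hypothetical helper, and it assumes the limits are given in increasing order.

```python
import numpy as np


def visible_subset(x, y, c, xlim, ylim):
    """Return only the points (and their colors) that fall inside the view limits."""
    mask = (x >= xlim[0]) & (x <= xlim[1]) & (y >= ylim[0]) & (y <= ylim[1])
    return x[mask], y[mask], c[mask]


x = np.array([0.1, 0.5, 0.9, 0.45])
y = np.array([0.2, 0.5, 0.8, 0.55])
c = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])

# Pretend the user has zoomed in to the square [0.4, 0.6] x [0.4, 0.6].
vx, vy, vc = visible_subset(x, y, c, xlim=(0.4, 0.6), ylim=(0.4, 0.6))
print(len(vx), "of", len(x), "points are inside the view")
```

In an interactive session this could be driven from the Axes `xlim_changed` and `ylim_changed` callbacks, re-drawing the scatter from the full arrays each time so that zooming back out restores the culled points.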

I would expect multi-color scatter plots to be only marginally slower than single-color ones. A 10x slowdown or worse makes me want to disable colors, but then I can't visualize my data properly.

#2156 (four years ago) was aimed at scatter plot performance but seems to have neglected the multi-color case, which as @ChrisBeaumont pointed out is a main use case for scatter(): #2156 (comment)

Unfortunately, the biggest speedup in this PR (the blitting) essentially replicates what plot can do already. The compelling functionality of scatter is, IMO, the ability to map color and/or size onto data. I can envision two "medium"-hanging fruit optimizations, that might push this kind of functionality into the 10^5-6 points range [...]

My real data has more points but only 12 distinct colors, so I'd be happy with a speedup even if it only applies when there are, say, up to 50 distinct colors. I also use ColorMap in my real application, but again I only take a few distinct choices from the map (whereas in the example above, every point has a unique color).
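For the few-distinct-colors case, one workaround that might help is to split the points by color and draw each group as a single-color artist, since the single-color path is the fast one. This is only a sketch under that assumption: `plot_by_distinct_color` is a hypothetical helper, it uses plot() rather than scatter(), and whether it is actually faster on a given setup needs to be measured.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def plot_by_distinct_color(ax, x, y, c, markersize=1):
    """Draw one single-color line of '.' markers per distinct RGB row in c."""
    # np.unique(..., axis=0) needs NumPy 1.13 or later.
    colors, inverse = np.unique(c, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # flatten defensively in case the inverse is not 1-D
    lines = []
    for i, color in enumerate(colors):
        sel = inverse == i  # boolean mask of the points that share this color
        (line,) = ax.plot(x[sel], y[sel], ".", color=tuple(color),
                          markersize=markersize)
        lines.append(line)
    return lines


rng = np.random.default_rng(0)
n = 10_000                              # hypothetical size for a quick run
palette = rng.random((12, 3))           # 12 distinct colors, as in the report
c = palette[rng.integers(0, 12, n)]     # each point picks one palette entry
x = rng.random(n)
y = rng.random(n)

fig, ax = plt.subplots()
lines = plot_by_distinct_color(ax, x, y, c)
print(len(lines), "artists drawn for", n, "points")
```

The trade-offs are that every group shares one marker size, and that points are drawn color by color, so overlapping points can stack differently than they would in a single scatter() call.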

I'm using Matplotlib 2.0.2, NumPy 1.12.1, and Python 3.5.3 on 64-bit Linux with 128 GB of RAM.

