- Notifications
You must be signed in to change notification settings - Fork4
Rust high performace utilities for YUV format handling and conversion.
License
BSD-3-Clause, Apache-2.0 licenses found
Licenses found
awxkee/yuvutils-rs
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Fast and simple YUV approximation conversion in pure Rust. At most the same as libyuv does.Performance will be equal to libyuv or slightly higher on platforms where SIMD is implemented. Otherwise, equal or slower.
Mostly implemented AVX-512, AVX2, SSE, NEON, WASM. Waiting for SVE to be available innightly
.
x86 targets with SSE and AVX uses runtime dispatch to detect available cpu features.
Supports:
- YCbCr ( aka YUV )
- YCgCo
- YUY2
- Identity ( GBR )
- Sharp YUV
All the methods support RGB, BGR, BGRA and RGBA
If need scaling support thenpic-scale is available.
Runtime dispatch is used for use if availablesse4.1
,avx2
,avx512bw
,avx512vbmi
,avx512vnni
,avxvnni
,rdm
for NEON.
For thesse4.1
,avx2
,rdm
there is no action needed, however for AVX-512 activated featurenightly_avx512
alongnightly
rust channel is required.
Always consider to compile withnightly_avx512
when compiling for x86, it adds noticeable gain on supported cpus.
Wasmsimd128
as target feature should be enabled for implemented SIMD wasm paths support,
Some paths have multi-threading support. However, YUV conversion usually makes more sense in single-threaded mode, thus use this feature with care.
cargo add yuvutils-rs
letmut planar_image =YuvPlanarImageMut::<u8>::alloc(widthasu32, heightasu32,YuvChromaSubsampling::Yuv420);rgb_to_yuv422(&mut planar_image,&src_bytes, rgba_strideasu32,YuvRange::Limited,YuvStandardMatrix::Bt601,).unwrap();
yuv420_to_rgb(&yuv_planar_image,&mut rgba, rgba_strideasu32,YuvRange::Limited,YuvStandardMatrix::Bt601,).unwrap();
The following planar formats are used for HDR content, with either10-bit or12-bit color depth:
- I010, I012, P010, P012, P016
Half width, half height - I210, I212, P210, P212, P216
Half width, full height - I410, I412, P410, P412
Full width, full height
- I: Represents the color space (as defined above), and these formats use3 planes: Y (luminance), U (chrominance), and V (chrominance).
- ICgC: Represents the color space (as defined above), and these formats use3 planes: Y (luminance), Cg (chrominance), and Co (chrominance).
- P: Abiplanar format (similar toNV12), but with16-bit precision, where the valid bits are stored in the high bits. This format has:
- AY plane
- AUV plane (with U and V interleaved)
- 0: Represents4:2:0 chroma subsampling.
- 2: Represents4:2:2 chroma subsampling.
- 4: Represents4:4:4 chroma subsampling.
- 10,12,14,16: Indicates thebits per channel.
Tests performed on the image 1997x1331
YUV 8 bit-depth conversion
aarch64
tested on Mac Pro M3.
AVX2 tests performed onm4.large
with Ubuntu 24.04 droplet.AVX2 Win test performed on Windows 11 Intel Core i9-14900HX.
cargo bench --bench yuv8 --manifest-path ./app/Cargo.toml --features fast_mode,professional_mode
AVX-512 tests performed onAWS c5.large
with Ubuntu 24.04 instance with command
cargo +nightly bench --bench yuv8 --manifest-path ./app/Cargo.toml --features nightly_avx512,fast_mode,professional_mode
Conversion | time(NEON) | Time(AVX2 Win) | Time(AVX2) | Time(AVX-512) |
---|---|---|---|---|
utils RGB->YUV 4:2:0 | 261.66µs | 354.25µs | 973.17µs | 722.87µs |
libyuv RGB->YUV 4:2:0 | 306.80µs | 3.62ms | 1.10ms | 972.52µs |
utils RGBA->YUV 4:2:0 | 266.66µs | 367.47µs | 781.54µs | 785.93µs |
libyuv RGBA->YUV 4:2:0 | 429.36µs | 2.70ms | 846.69µs | 707.33µs |
utils RGBA->YUV 4:2:2 | 362.72µs | 464.82µs | 1.06ms | 964.87µs |
libyuv RGBA->YUV 4:2:2 | 616.29µs | 4.3003ms | 1.09ms | 947.97µs |
utils RGBA->YUV 4:4:4 | 378.49µs | 534.87µs | 1.09ms | 974.29µs |
Conversion | time(NEON) | Time(AVX2 Win) | Time(AVX2) | Time(AVX-512) |
---|---|---|---|---|
utils YUV NV12->RGBA | 399.37µs | 359.11µs | 917.52µs | 736.21µs |
utils YUV NV12->RGB | 312.38µs | 339.53µs | 817.71µs | 654.37µs |
libyuv YUV NV12->RGB | 621.11µs | 5.46ms | 979.96µs | 800.85µs |
utils YUV 4:2:0->RGB | 346.13µs | 375.34µs | 923.83µs | 614.19µs |
libyuv YUV 4:2:0->RGB | 747.18µs | 5.8231ms | 1.42ms | 1.16ms |
utils YUV 4:2:0->RGBA | 396.58µs | 380.13µs | 942.92µs | 733.88µs |
libyuv YUV 4:2:0->RGBA | 710.88µs | 557.44µs | 1.03ms | 870.64µs |
utils YUV 4:2:2->RGBA | 449.25µs | 410.71µs | 1.10ms | 824.69µs |
libyuv YUV 4:2:2->RGBA | 734.25µs | 579.24µs | 1.04ms | 851.24µs |
utils YUV 4:4:4->RGBA | 432.32µs | 366.85µs | 1.10ms | 819.17µs |
libyuv YUV 4:4:4->RGBA | 596.55µs | 525.90µs | 963.66µs | 803.62µs |
utils YUV 4:0:0->RGBA | 220.67µs | 254.83µs | 651.88µs | 600.43µs |
libyuv YUV 4:0:0->RGBA | 522.71µs | 1.8204ms | 635.61µs | 525.56µs |
YUV 16 bit-depth conversion
cargo bench --bench yuv16 --manifest-path ./app/Cargo.toml
AVX-512 tests performed onAWS c5.large
with Ubuntu 24.04 instance with command
cargo +nightly bench --bench yuv16 --manifest-path ./app/Cargo.toml --features nightly_avx512
10-bit encoding is not implemented inlibyuv
time(NEON) | Time(AVX2 Win) | Time(AVX2) | Time(AVX-512) | |
---|---|---|---|---|
utils RGB10->YUV10 4:2:0 | 502.65µs | 745.02µs | 1.88ms | 1.21ms |
libyuv RGB10->YUV10 4:2:0 | x | x | x | x |
utils RGBA10->YUV10 4:2:0 | 605.45µs | 807.89µs | 2.03ms | 1.57ms |
libyuv RGBA10->YUV10 4:2:0 | x | x | x | x |
utils RGBA10->YUV10 4:2:2 | 769.08µs | 1.04ms | 2.51ms | 1.83ms |
libyuv RGBA10->YUV10 4:2:2 | x | x | x | x |
utils RGBA10->YUV10 4:4:4 | 600.56µs | 1.12ms | 2.82ms | 2.03ms |
time(NEON) | Time(AVX2 Win) | Time(AVX2) | Time(AVX-512) | |
---|---|---|---|---|
utils YUV10 4:2:0->RGB10 | 596.78µs | 662.38µs | 1.76ms | 1.23ms |
libyuv YUV10 4:2:0->RGB10 | x | x | x | x |
utils YUV10 4:2:0->RGBA10 | 727.52µs | 670.15µs | 1.92ms | 1.59ms |
utils YUV10 4:2:0->RGBA8 | 765.23µs | 492.63µs | 1.36ms | 900.10µs |
libyuv YUV10 4:2:0->RGBA8 | 1.54ms | 6.8641ms | 1.16ms | 966.94µs |
utils YUV10 4:2:2->RGBA10 | 746.51µs | 787.41µs | 2.02ms | 1.61ms |
utils YUV10 4:2:2->RGBA8 | 799.25µs | 438.14µs | 1.23ms | 972.19µs |
libyuv YUV10 4:2:2->RGBA8 | 1.55ms | 6.8374ms | 1.21ms | 991.23µs |
utils YUV10 4:4:4->RGBA10 | 953.73µs | 848.02µs | 2.26ms | 1.8639ms |
cargo bench --bench geometry --manifest-path ./app/Cargo.toml
time(NEON) | Time(AVX2 Win) | Time(AVX2) | |
---|---|---|---|
utils Rotate 90 RGBA8 | 560.74µs | 655.43µs | 2.50ms |
libyuv Rotate 90 RGBA8 | 1.84ms | 1.01ms | 2.86ms |
utils Rotate 90 Plane8 | 215.43µs | 387.44µs | 1.05ms |
libyuv Rotate 90 Plane8 | 291.15µs | 416.52µs | 1.06ms |
utils Rotate 180 RGBA8 | 326.44µs | 506.35µs | 657.93µs |
libyuv Rotate 180 RGBA8 | 378.70µs | 512.83µs | 669.12µs |
utils Rotate 180 Plane8 | 61.02µs | 127.26µs | 227.66µs |
libyuv Rotate 180 Plane8 | 94.11µs | 418.83µs | 233.66µs |
utils Rotate 270 RGBA8 | 564.78µs | 652.83µs | 2.57ms |
libyuv Rotate 270 RGBA8 | 1.99ms | 1.03ms | 3.06ms |
This project is licensed under either of
at your option.
About
Rust high performace utilities for YUV format handling and conversion.