Commit Graph

14 Commits

Author SHA1 Message Date
Rémi Denis-Courmont
19baf4e009 swscale/rgb2rgb: R-V V deinterleaveBytes 2023-10-03 22:53:20 +03:00
Rémi Denis-Courmont
ede3215115 swscale/rgb2rgb: fix extra iteration in R-V V interleave
There was an additional iteration doing nothing for each line,
due to checking the selected vector length instead of the available
vector length.
2023-10-03 22:53:20 +03:00
Rémi Denis-Courmont
d14130aea3 swscale/rgb2rgb: unroll R-V V interleave_bytes 2023-10-03 20:48:47 +03:00
Rémi Denis-Courmont
6269c4a440 swscale/rgb2rgb: unroll RISC-V V uyvytoyuv422 2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont
e50f8e861b swscale/rgb2rgb: avoid S-regs in RISC-V V uyvytoyuv422
We can make do with callee-clobbered registers only now.
As an added bonus, this makes the code XLEN-independent.
2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont
be37a2e364 swscale/rgb2rgb: rework RISC-V V uyvytoyuv422
This avoids using relatively slow register strides.
2023-10-03 20:48:39 +03:00
Rémi Denis-Courmont
1a4bd76ea5 swscale/rgb2rgb: remove R-V V shuffle_bytes_3012
This is slower than the Zbb version on real hardware due to register
strides. Proper support for vector byte-swap requires the Zvbb
extension, but it's much too early for me to worry about it.
2023-10-02 22:28:38 +03:00
Rémi Denis-Courmont
c2b38619c0 swscale/rgb2rgb2: rework RISC-V V shuffle_bytes_{1230,3012}
This avoids strided loads.

Before:
shuffle_bytes_1230_rvv_i32: 308.7
shuffle_bytes_3012_rvv_i32: 308.7

After:
shuffle_bytes_1230_rvv_i32: 46.7
shuffle_bytes_3012_rvv_i32: 46.7
2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont
15982554e6 swscale/rgb2rgb2: rework RISC-V V shuffle_bytes_{0321,2103}
This avoids strided loads.

Before:
shuffle_bytes_0321_rvv_i32: 307.7
shuffle_bytes_2103_rvv_i32: 308.7

After:
shuffle_bytes_0321_rvv_i32: 59.7
shuffle_bytes_2103_rvv_i32: 61.5
2023-07-21 22:18:02 +03:00
Rémi Denis-Courmont
d3948e4db5 swscale: inline ff_shuffle_bytes_3210_rvv
No functional changes.
2023-07-21 22:18:02 +03:00
Khem Raj
a7b3c0203f libswscale/riscv: fix syntax of vsetvli
Add missing operand which clang complains about but GCC assumes it to be
'm1' if not specified.

Works around build failure with Clang:
| src/libswscale/riscv/rgb2rgb_rvv.S:88:25: error: operand must be e[8|16|32|64|128|256|512|1024],m[1|2|4|8|f2|f4|f8],[ta|tu],[ma|mu]
|         vsetvli t4, t3, e8, ta, ma
|                         ^

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2023-07-13 22:01:24 +03:00
Rémi Denis-Courmont
a1bfb5290e sws/rgb2rgb: RISC-V 64-bit V packed YUYV/UYVY to planar 4:2:2
This is currently 64-bit only because the stack spilling code would not
assemble on RV32I (and it would corrupt s0 and s1 on RV128I, in theory).

This could be added later in the unlikely that someone wants it.
2022-09-30 07:25:44 +02:00
Rémi Denis-Courmont
9181835a24 sws/rgb2rgb: RISC-V V interleaveBytes 2022-09-30 07:24:09 +02:00
Rémi Denis-Courmont
66a03f4053 sws/rgb2rgb: RISC-V V shuffle_bytes_xxxx functions 2022-09-30 07:24:09 +02:00