FFmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2024-10-20 07:05:07 +00:00

Rémi Denis-Courmont cd7b352c53 lavc/sbrdsp: R-V V autocorrelate	2023-11-12 14:03:09 +02:00

With 5 accumulator vectors and 6 inputs, this can only use LMUL=2. Also the number of vector loop iterations is small, just 5 on 128-bit vector hardware.

The vector loop is somewhat unusual in that it processes data in descending memory order, in order to save on vector slides: in descending order, we can extract elements to carry over to the next iteration from the bottom of the vectors directly. With ascending order (see the Opus postfilter function), there is no way to get the top elements directly. On the downside, this requires the use of separate shift and sub (the would-be SH3SUB instruction does not exist), with a small pipeline stall on the vector load address.

The edge cases are done in scalar as this saves on loads and remains significantly faster than C.

autocorrelate_c: 669.2
autocorrelate_rvv_f32: 421.0
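
The descending-order idea in the commit message can be illustrated with a scalar sketch in C. This is not the code from sbrdsp_rvv.S or the C reference in FFmpeg; the type `acorr_sums` and the functions `acorr_ascending` and `acorr_descending` are hypothetical names, and the sign convention (conjugate of `x[i]` times `x[i + lag]`) is an assumption. The sketch keeps five running sums for lags 0, 1 and 2 of a complex sequence. When walking the data from high to low addresses, the two elements needed as `x[i + 1]` and `x[i + 2]` are simply the ones loaded in the previous two steps, so each step needs only one new load.

```c
#include <stddef.h>

/* Hypothetical layout: five running sums (lag 0 is real-only). */
typedef struct {
    float re0;      /* sum of |x[i]|^2 */
    float re1, im1; /* sum of conj(x[i]) * x[i + 1] */
    float re2, im2; /* sum of conj(x[i]) * x[i + 2] */
} acorr_sums;

/* Ascending reference: every step reads x[i], x[i + 1] and x[i + 2]. */
static acorr_sums acorr_ascending(const float (*x)[2], size_t n)
{
    acorr_sums s = { 0 };

    for (size_t i = 0; i + 2 < n; i++) {
        s.re0 += x[i][0] * x[i][0]     + x[i][1] * x[i][1];
        s.re1 += x[i][0] * x[i + 1][0] + x[i][1] * x[i + 1][1];
        s.im1 += x[i][0] * x[i + 1][1] - x[i][1] * x[i + 1][0];
        s.re2 += x[i][0] * x[i + 2][0] + x[i][1] * x[i + 2][1];
        s.im2 += x[i][0] * x[i + 2][1] - x[i][1] * x[i + 2][0];
    }
    return s;
}

/* Descending variant: x[i + 1] and x[i + 2] are carried over from the
 * previous iterations, so only x[i] is loaded in each step. */
static acorr_sums acorr_descending(const float (*x)[2], size_t n)
{
    acorr_sums s = { 0 };

    if (n < 3)
        return s;

    float r1 = x[n - 2][0], i1 = x[n - 2][1]; /* plays the role of x[i + 1] */
    float r2 = x[n - 1][0], i2 = x[n - 1][1]; /* plays the role of x[i + 2] */

    /* i runs from n - 3 down to 0 */
    for (size_t i = n - 2; i-- > 0; ) {
        float r0 = x[i][0], i0 = x[i][1];     /* the only new load */

        s.re0 += r0 * r0 + i0 * i0;
        s.re1 += r0 * r1 + i0 * i1;
        s.im1 += r0 * i1 - i0 * r1;
        s.re2 += r0 * r2 + i0 * i2;
        s.im2 += r0 * i2 - i0 * r2;

        /* carry the lower elements over to the next (lower) step */
        r2 = r1; i2 = i1;
        r1 = r0; i1 = i0;
    }
    return s;
}
```

Both functions compute the same sums up to floating-point rounding, since only the order of accumulation differs. In a vector version the carried values would be whole groups of elements rather than single scalars, which is where taking them from the bottom of the previous vectors, as the commit message describes, avoids extra slides.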
aacpsdsp_init.c
aacpsdsp_rvv.S
ac3dsp_init.c
ac3dsp_rvb.S
alacdsp_init.c
alacdsp_rvv.S
audiodsp_init.c
audiodsp_rvf.S
audiodsp_rvv.S
bswapdsp_init.c
bswapdsp_rvb.S
bswapdsp_rvv.S
exrdsp_init.c
exrdsp_rvv.S
fmtconvert_init.c
fmtconvert_rvv.S
g722dsp_init.c
g722dsp_rvv.S
h264_chroma_init_riscv.c
h264_mc_chroma.S
huffyuvdsp_init.c
huffyuvdsp_rvv.S
idctdsp_init.c
idctdsp_rvv.S
jpeg2000dsp_init.c
jpeg2000dsp_rvv.S
Makefile
opusdsp_init.c
opusdsp_rvv.S
pixblockdsp_init.c
pixblockdsp_rvi.S
pixblockdsp_rvv.S
sbrdsp_init.c	lavc/sbrdsp: R-V V autocorrelate	2023-11-12 14:03:09 +02:00
sbrdsp_rvv.S	lavc/sbrdsp: R-V V autocorrelate	2023-11-12 14:03:09 +02:00
utvideodsp_init.c
utvideodsp_rvv.S
vorbisdsp_init.c
vorbisdsp_rvv.S