Commit Graph

116768 Commits

Author SHA1 Message Date
Lynne
aea4d4b423
hwcontext_vulkan: rewrite upload/download
This commit was long overdue. The old transfer dubiously tried to
merge as much code as possible, and had very little in the way
of optimizations, apart from basic host-mapping.

The new code uses buffer pools for any temporary buffers, and
handles falling back to buffer-based uploads if host-mapping fails.

Roundtrip performance difference:
ffmpeg -init_hw_device "vulkan=vk:0,debug=0,disable_multiplane=1" -f lavfi \
-i color=red:s=3840x2160 -vf hwupload,hwdownload,format=yuv420p -f null -

7900XTX:
Before: 224fps
After: 502fps

Ada, with proprietary drivers:
Before: 29fps
After: 54fps

Alder Lake:
Before: 85fps
After: 108fps

With the host-mapping codepath disabled:
Before: 32fps
After: 51fps
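
For a sense of scale, the before/after figures above can be turned into speedup ratios. This is only a minimal sketch: the labels are copied from this message, and the helper name `speedup` is made up for illustration.

```python
# Roundtrip frame rates quoted in the commit message: (before, after) in fps.
results = {
    "7900XTX": (224, 502),
    "Ada, proprietary drivers": (29, 54),
    "Alder Lake": (85, 108),
    "host-mapping disabled": (32, 51),
}


def speedup(before: float, after: float) -> float:
    """Return how many times faster the new code is than the old code."""
    return after / before


for name, (before, after) in results.items():
    print(f"{name}: {speedup(before, after):.2f}x")
```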
2024-08-11 05:13:11 +02:00
Lynne
81c5d4ea0e
hwcontext_vulkan: remove unused struct 2024-08-11 05:13:10 +02:00
Lynne
6757cdb535
vulkan_video: remove NIH pooled buffer implementation
The code predates ff_vk_get_pooled_buffer().
2024-08-11 05:13:10 +02:00
Lynne
a30b7c0158
hwcontext_vulkan: initialize optical flow queues if available
Lets us implement FPS conversion.
2024-08-11 05:13:10 +02:00
Lynne
8790a30882
hwcontext_vulkan: rewrite queue picking system for the new API
This allows us to support different video ops on different queues,
as well as any other arbitrary queues we need.
2024-08-11 05:13:09 +02:00
Lynne
bedfabc437
vulkan: use the new queue family mechanism 2024-08-11 05:13:09 +02:00
Lynne
13489c8a21
hwcontext_vulkan: add a new mechanism to expose used queue families
The issue with the old mechanism is that we had to introduce new
API each time we needed a new queue family, and all the queue families
were functionally fixed to a given purpose.

Nvidia's GPUs are able to handle video encoding and compute on the
same queue, which results in a speedup when pre-processing is required.

Also, this enables us to expose optical flow queues for frame interpolation.
2024-08-11 05:13:03 +02:00
Osamu Watanabe
d88a988d3d
avcodec/jpeg2000dec: Fix HT decoding
Fixes incorrect handling of MAGB_P value in Ccap15.
Fixes bugs in HT block decoding.

Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>
2024-08-10 09:22:51 -07:00
Osamu Watanabe
48b14732d8
avcodec/jpeg2000dec: Add support for placeholder passes
See Rec. ITU-T T.814 | ISO/IEC 15444-15, Annex B.

Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>
2024-08-10 09:22:44 -07:00
Osamu Watanabe
fe1b196499
avcodec/jpeg2000dec: Add support for CAP and CPF markers
Signed-off-by: Pierre-Anthony Lemieux <pal@palemieux.com>
2024-08-10 09:20:15 -07:00
Michael Niedermayer
1b8d95da3a
tools/target_dec_fuzzer: Use av_buffer_allocz() to avoid missing slices to have unpredictable content
This matches production code which also zeros these buffers

Fixes: use of uninitialized values
Fixes: 70885/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VP6F_fuzzer-4610946029387776 (and likely others)

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-10 15:04:46 +02:00
Fei Wang
cda5f5c5ed lavc/qsv: Use vendor id to create device
Starting with Lunar Lake, the new "xe" kernel driver is supported
instead of "i915".

"xe" kernel driver:
https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/xe

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-08-09 13:40:26 +08:00
Fei Wang
eab4a9e9f8 lavu/hwcontext_qsv: Use vendor id to create device
Starting with Lunar Lake, the new "xe" kernel driver is supported
instead of "i915".

"xe" kernel driver:
https://github.com/torvalds/linux/tree/master/drivers/gpu/drm/xe

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-08-09 13:40:26 +08:00
Fei Wang
dbd74ba3c8 lavu/hwcontext_vaapi: Add option to allow to specify vendor id when init hw device
The vendor id helps select the desired device when the kernel driver is
unknown or unsupported, since a vendor may use different kernel drivers
on different platforms.
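
The idea of matching on vendor id rather than on the kernel driver name could be sketched roughly as follows. The device records and the `pick_device` helper are hypothetical and are not the FFmpeg API; 0x8086 and 0x1002 are the PCI vendor ids of Intel and AMD.

```python
from typing import Optional

INTEL_VENDOR_ID = 0x8086  # Intel's PCI vendor id

# Hypothetical device list; paths and driver names are for illustration only.
devices = [
    {"path": "/dev/dri/renderD128", "driver": "amdgpu", "vendor_id": 0x1002},
    {"path": "/dev/dri/renderD129", "driver": "xe", "vendor_id": 0x8086},
]


def pick_device(candidates: list, vendor_id: int) -> Optional[dict]:
    """Return the first device with a matching vendor id, whatever its driver."""
    for dev in candidates:
        if dev["vendor_id"] == vendor_id:
            return dev
    return None


chosen = pick_device(devices, INTEL_VENDOR_ID)
print(chosen["path"] if chosen else "no matching device")
```

Matching on the vendor id keeps working when the driver name changes (for example from "i915" to "xe"), which is the situation the message describes.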

Signed-off-by: Fei Wang <fei.w.wang@intel.com>
2024-08-09 13:40:24 +08:00
Michael Niedermayer
c390234da2
avformat/wtvdec: Check length of read mpeg2_descriptor
Fixes: Use of uninitialized value
Fixes: 70900/clusterfuzz-testcase-minimized-ffmpeg_dem_WTV_fuzzer-6286909377150976

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 19:10:05 +02:00
Michael Niedermayer
c95ea03104
avformat/wtvdec: clear sectors
The code can leave uninitialized holes in the array.
Fixes: use of uninitialized values
Fixes: 70883/clusterfuzz-testcase-minimized-ffmpeg_dem_WTV_fuzzer-6698694567591936

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Peter Ross <pross@xvid.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 18:24:31 +02:00
Kacper Michajłow
b534e402d8
avformat/mov: ensure required number of bytes is read
Fixes: use-of-uninitialized-value

Found by OSS-Fuzz.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 18:23:39 +02:00
Michael Niedermayer
d2a25dc2bf
add tools/target_swr_fuzzer
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-08 15:26:52 +02:00
James Almer
94165d1b79 avformat/iamf: use aligned intreadwrite macros where possible
Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-07 00:16:21 -03:00
James Almer
49a6e448d7 avformat/movenc: use stream indexes when generating track ids
In some scenarios nb_tracks isn't the same as nb_streams, so a given id may end
up being used for two separate streams.

e.g. when muxing an IAMF track followed by a video track, if the IAMF track
consists of several streams, the video track would end up having an id of 2,
which may also be used by one of the IAMF streams.
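
The collision can be shown with a simplified model. The message does not show how movenc derives ids, so both schemes below are assumptions made for illustration, and all names are hypothetical.

```python
# Simplified model (not the movenc code): one IAMF track built from three
# streams, followed by one video track built from a single stream.
tracks = [
    {"name": "iamf", "stream_indexes": [0, 1, 2]},
    {"name": "video", "stream_indexes": [3]},
]


def ids_from_track_position(track_list):
    """Assumed old scheme: id = position of the track in the list + 1."""
    return {t["name"]: i + 1 for i, t in enumerate(track_list)}


def ids_from_stream_index(track_list):
    """Assumed new scheme: id = index of the track's first stream + 1."""
    return {t["name"]: t["stream_indexes"][0] + 1 for t in track_list}


# Ids that the individual IAMF streams may occupy in this model.
iamf_stream_ids = {i + 1 for i in tracks[0]["stream_indexes"]}

old_video_id = ids_from_track_position(tracks)["video"]
new_video_id = ids_from_stream_index(tracks)["video"]
print(old_video_id in iamf_stream_ids)  # collision under the old scheme
print(new_video_id in iamf_stream_ids)  # no collision under the new scheme
```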

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-07 00:16:21 -03:00
James Almer
210740b4ed avutil/frame: use the maximum compile time supported alignment for strides
This puts lavu frame buffer allocator helpers in sync with lavc's decoder frame
buffer allocator's STRIDE_ALIGN define.

Remove the comment about av_cpu_max_align() while at it as using it is not
ideal when CPU flags can be changed mid process.
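
As a rough illustration of what aligning a stride means, the sketch below rounds a row size up to a power-of-two alignment. The alignment of 64 bytes and the row size are assumed example values; the real compile-time maximum depends on the build.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


# Assumed example alignment; the actual value is fixed at compile time and
# depends on the target and the enabled SIMD extensions.
ASSUMED_ALIGN = 64

width_bytes = 1918  # hypothetical row size in bytes
stride = align_up(width_bytes, ASSUMED_ALIGN)
print(stride)
```

Because the alignment is a compile-time constant, the result does not change if CPU flags are altered while the process is running.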

Should fix ticket #11116.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-07 00:16:21 -03:00
Kacper Michajłow
792a9979eb
avformat/rtpproto: free ip filters on open error
Found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:19 +02:00
Kacper Michajłow
8485f7a378
avformat/srtpproto: pass options to nested protocol
This fixes passing options dict.

Fixes some timeouts found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:19 +02:00
Kacper Michajłow
9876158ee2
avcodec/wmavoice: use av_clipd for double values
Fixes Clang warning.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:18 +02:00
Kacper Michajłow
1165c14444
avcodec/vp9mvs: fix misaligned access when clearing VP9mv
Fixes runtime error: member access within misaligned address
<addr> for type 'av_alias64', which requires 8 byte alignment.

VP9mv is aligned to 4 bytes, so instead of doing one 8-byte clear,
do two 4-byte clears.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-07 00:59:18 +02:00
Andreas Rheinhardt
bfcee368e2 avcodec/cbs_sei: Always zero-initialize SEI payload
Fixes: Use-of-uninitialized value
Fixes: clusterfuzz-testcase-minimized-ffmpeg_BSF_H264_METADATA_fuzzer-5458626041413632

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-08-06 20:25:23 +02:00
Kacper Michajłow
5dfc0cc841
avcodec/parser: ensure input padding is zeroed
Fixes use of uninitialized value, reported by MSAN.
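
The general pattern of zero-filled padding after the payload can be sketched as below. This is not the parser code; `pad_input` is a hypothetical helper, and the padding size of 64 is assumed to mirror AV_INPUT_BUFFER_PADDING_SIZE.

```python
PADDING_SIZE = 64  # assumed to mirror AV_INPUT_BUFFER_PADDING_SIZE


def pad_input(data: bytes, padding: int = PADDING_SIZE) -> bytearray:
    """Copy data into a new buffer followed by `padding` zero bytes."""
    buf = bytearray(len(data) + padding)  # bytearray(n) starts zero-filled
    buf[:len(data)] = data
    return buf


buf = pad_input(b"\x01\x02")
print(len(buf))
```

A reader that runs a few bytes past the payload then sees zeros instead of leftover memory contents.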

Found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>

Fixes: 70852/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5179190066872320
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:46 +02:00
Kacper Michajłow
2b5f000d3f
avformat/jpegxl_anim_dec: ensure input padding is zeroed
Fixes use of uninitialized value, reported by MSAN.

Found by OSS-Fuzz.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>

Fixes: 70837/clusterfuzz-testcase-minimized-ffmpeg_dem_JPEGXL_ANIM_fuzzer-5089407768526848
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:46 +02:00
Michael Niedermayer
3978e81809
avformat/img2dec: Clear padding data after EOF
Fixes: use-of-uninitialized-value
Fixes: 70852/clusterfuzz-testcase-minimized-ffmpeg_IO_DEMUXER_fuzzer-5179190066872320

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: Kacper Michajlow <kasper93@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:46 +02:00
Michael Niedermayer
79a1cf30d1
avformat/wavdec: Check if there are 16 bytes before testing them
Fixes: use-of-uninitialized-value
Fixes: 70839/clusterfuzz-testcase-minimized-ffmpeg_dem_W64_fuzzer-5212907590189056

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-08-05 23:17:45 +02:00
Rémi Denis-Courmont
e0f9f4d491 lavu/cpu: deprecate RISC-V F, D and zba CPU flags 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
d1326b6347 lavu/riscv: drop probing for zba CPU capability 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
210877c5fd sws/riscv: depend on RVB and simplify accordingly 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
f30e5bf1f5 lavfi/riscv: depend on RVB and simplify accordingly 2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
616fdeaea3 lavc/riscv: depend on RVB and simplify accordingly
There is no known (real) hardware with V and without the complete B
extension. B was indeed required in the RISC-V application profile from
2022, earlier than V. There should not be any relevant hardware in the
future either.

In practice, different R-V Vector optimisations in FFmpeg already depend on
every constituent of the B extension anyhow, so it would not work well.
2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
cb31f17ca8 lavu/riscv: depend on RVB and simplify accordingly 2024-08-05 21:16:26 +03:00
Nathan E. Egge
ba88e8174a lavu: Set default FF_TIMER_UNITS to "ns"
Signed-off-by: Nathan E. Egge <unlord@xiph.org>
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
2024-08-05 21:16:26 +03:00
Rémi Denis-Courmont
4edfc11a28 lavc/h264dsp: R-V V idct4_add8 (all depths)
These are really just wrappers for idct4_add16intra functions, which are in
turn mostly wrappers for idct4_add and idct4_dc_add functions.

For benchmarks refer to the later two sets.
2024-08-05 21:16:26 +03:00
James Almer
eb3cc508d8 fate/mov: add an IAMF+video muxing test
Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-04 12:09:40 -03:00
James Almer
5b87869c09 avformat/mov: fix track handling when mixing IAMF and video tracks
Fixes crashes when muxing the two together.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-04 12:09:40 -03:00
Timo Rothenpieler
9a2171318d avcodec/nvenc: fix signedness of timing fields 2024-08-03 20:04:31 +02:00
James Almer
4a56b5f3d8 avcodec/cbs_h265: don't attempt to read 0 length elements in sei_3d_reference_displays_info
Fixes: 70458/clusterfuzz-testcase-minimized-ffmpeg_BSF_TRACE_HEADERS_fuzzer-5259339779080192
Fixes: Assertion width > 0 && width <= 32 failed at libavcodec/cbs.c:608

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: James Almer <jamrial@gmail.com>
2024-08-03 11:59:14 -03:00
Rémi Denis-Courmont
de7f999481 lavc/videodsp: work-around LLVM-as
For some reason, it can't handle the normal syntax for an address operand
without an offset, so add a dummy zero offset.
2024-08-02 21:24:01 +03:00
Rémi Denis-Courmont
677f28b310 lavc/h264dsp: stick R-V V weight to 16-bit precision
T-Head C908 (ns):
h264_weight2_8_c:        1607.8
h264_weight2_8_rvv_i32:   515.0 (before)
h264_weight2_8_rvv_i32:   348.5 (after)
h264_weight4_8_c:        2255.8
h264_weight4_8_rvv_i32:  1015.0 (before)
h264_weight4_8_rvv_i32:   691.0 (after)
h264_weight8_8_c:        3857.5
h264_weight8_8_rvv_i32:  2218.8 (before)
h264_weight8_8_rvv_i32:  1561.3 (after)
h264_weight16_8_c:       7431.5
h264_weight16_8_rvv_i32: 2737.3 (before)
h264_weight16_8_rvv_i32: 1848.3 (after)

SpacemiT X60 (ns):
h264_weight2_8_c:        1624.1
h264_weight2_8_rvv_i32:   352.6 (before)
h264_weight2_8_rvv_i32:   259.3 (after)
h264_weight4_8_c:        2259.3
h264_weight4_8_rvv_i32:   685.8 (before)
h264_weight4_8_rvv_i32:   530.3 (after)
h264_weight8_8_c:        4103.3
h264_weight8_8_rvv_i32:  1581.8 (before)
h264_weight8_8_rvv_i32:  1238.6 (after)
h264_weight16_8_c:       7624.3
h264_weight16_8_rvv_i32: 2738.1 (before)
h264_weight16_8_rvv_i32: 1853.3 (after)
2024-08-02 21:24:01 +03:00
Rémi Denis-Courmont
afd45c7ff7 lavc/h264dsp: stick R-V V biweight to 16-bit
T-Head C908 (ns):
h264_biweight2_8_c:        2414.5
h264_biweight2_8_rvv_i32:   701.8 (before)
h264_biweight2_8_rvv_i32:   468.5 (after)
h264_biweight4_8_c:        4655.3
h264_biweight4_8_rvv_i32:  1377.5 (before)
h264_biweight4_8_rvv_i32:   931.8 (after)
h264_biweight8_8_c:        9701.5
h264_biweight8_8_rvv_i32:  2896.0 (before)
h264_biweight8_8_rvv_i32:  2070.5 (after)
h264_biweight16_8_c:      18025.0
h264_biweight16_8_rvv_i32: 3460.8 (before)
h264_biweight16_8_rvv_i32: 1978.0 (after)

SpacemiT X60 (ns):
h264_biweight2_8_c:        2415.5
h264_biweight2_8_rvv_i32:   478.2 (before)
h264_biweight2_8_rvv_i32:   362.8 (after)
h264_biweight4_8_c:        4655.3
h264_biweight4_8_rvv_i32:   946.7 (before)
h264_biweight4_8_rvv_i32:   727.3 (after)
h264_biweight8_8_c:        9061.8
h264_biweight8_8_rvv_i32:  2071.7 (before)
h264_biweight8_8_rvv_i32:  1685.8 (after)
h264_biweight16_8_c:      18020.5
h264_biweight16_8_rvv_i32: 3457.2 (before)
h264_biweight16_8_rvv_i32: 1935.8 (after)
2024-08-02 21:24:01 +03:00
Zhao Zhili
670ff6c7ce avcodec/nvenc: rework on DTS generation
Before this patch, DTS generation only worked when the time base was
equal to 1/fps. With a time base such as 1/1000:

./ffmpeg -i foo.mp4 -an -c:v h264_nvenc -enc_time_base 1/1000 bar.mp4

pts 0    dts -3
pts 160  dts 37
pts 80   dts 77
pts 40   dts 117 <-- invalid
pts 120  dts 157
pts 320  dts 197
pts 240  dts 237
pts 200  dts 277 <-- invalid
pts 280  dts 317 <-- invalid

The generated DTS can be larger than the PTS, because the old code only
reorders the input PTS values and subtracts the frame delay as a frame
count, which does not take the time base into account. It should subtract
the duration of the frame delay instead.

Commit 9a245bd tried to fix the issue, but its implementation was
incomplete because it only used time_base.num. It was then reverted by
ac7c265b33.

After this patch:

pts 0    dts -120
pts 160  dts -80
pts 80   dts -40
pts 40   dts 0
pts 120  dts 40
pts 320  dts 80
pts 240  dts 120
pts 200  dts 160
pts 280  dts 200

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2024-08-02 17:57:19 +02:00
Roman Arzumanyan
bcea693f75 avcodec/cuviddec: more accurately guess probed sw pixel format
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2024-08-02 17:38:46 +02:00
Gnattu OC
d50f9701b6 avutil/hwcontext_videotoolbox: Correctly set trc
The color trc key was assigned a color primaries value, which caused
the resulting colorspace to always be SDR.

Fixes #10884.

Signed-off-by: Gnattu OC <gnattuoc@me.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-08-02 10:24:09 +08:00
Rémi Denis-Courmont
1b2a925e94 lavc/riscv: drop probing for F & D extensions
F and D extensions are included in all RISC-V application profiles ever
made (so starting from RV64GC a.k.a. RVA20). Realistically they need to be
selected at compilation time.

Currently, there are no consumers for these two flags. If there is ever a
need to reintroduce F- or D-specific optimisations, we can always use
__riscv_f or __riscv_d compiler predefined macros respectively.
2024-08-01 22:56:50 +03:00
Rémi Denis-Courmont
2f083fd581 lavc/audiodsp: drop R-V F vector_clipf
This is now firmly slower than C.

SiFive-U74 (cycles):
audiodsp.vector_clipf_c:   31.2
audiodsp.vector_clipf_rvf: 39.5
2024-08-01 19:29:40 +03:00