Commit Graph

114423 Commits

Author SHA1 Message Date
Martin Storsjö
6d384298ec aarch64: hevc: Implement a neon version of put_hevc_epel_h*_8
AWS Graviton 3:
put_hevc_epel_h4_8_c: 64.7
put_hevc_epel_h4_8_neon: 25.0
put_hevc_epel_h4_8_i8mm: 21.2
put_hevc_epel_h6_8_c: 130.0
put_hevc_epel_h6_8_neon: 40.7
put_hevc_epel_h6_8_i8mm: 36.5
put_hevc_epel_h8_8_c: 209.0
put_hevc_epel_h8_8_neon: 45.2
put_hevc_epel_h8_8_i8mm: 41.2
put_hevc_epel_h12_8_c: 465.5
put_hevc_epel_h12_8_neon: 104.5
put_hevc_epel_h12_8_i8mm: 86.5
put_hevc_epel_h16_8_c: 830.7
put_hevc_epel_h16_8_neon: 134.2
put_hevc_epel_h16_8_i8mm: 114.0
put_hevc_epel_h24_8_c: 1844.7
put_hevc_epel_h24_8_neon: 282.2
put_hevc_epel_h24_8_i8mm: 277.2
put_hevc_epel_h32_8_c: 3227.5
put_hevc_epel_h32_8_neon: 501.5
put_hevc_epel_h32_8_i8mm: 396.0
put_hevc_epel_h48_8_c: 7229.2
put_hevc_epel_h48_8_neon: 1120.2
put_hevc_epel_h48_8_i8mm: 901.2
put_hevc_epel_h64_8_c: 12869.0
put_hevc_epel_h64_8_neon: 1999.2
put_hevc_epel_h64_8_i8mm: 1610.5

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-03-26 08:58:29 +02:00
Martin Storsjö
8f03c30a17 aarch64: hevc: Use ld1r instead of ldr+dup in hevc_qpel_uni_w_h
Signed-off-by: Martin Storsjö <martin@martin.st>
2024-03-26 08:58:20 +02:00
Martin Storsjö
717cc82d28 aarch64: hevc: Specialize put_hevc_\type\()_h*_8_neon for horizontal looping
For widths of 32 pixels and more, loop first horizontally,
then vertically.

Previously, this function would process a 16 pixel wide slice
of the block, looping vertically. After processing the whole
height, it would backtrack and process the next 16 pixel wide
slice.

When doing 8tap filtering horizontally, the function must load
7 more pixels (in practice, 8) following the actual inputs, and
this was done for each slice.

By iterating first horizontally throughout each line, then
vertically, we access data in a more cache friendly order, and
we don't need to reload data unnecessarily.

Keep the original order in put_hevc_\type\()_h12_8_neon; the
only suboptimal case there is for width=24. But specializing
an optimal variant for that would require more code, which
might not be worth it.

For the h16 case, this implementation would give a slowdown,
as it now loads the first 8 pixels separately from the rest, but
for larger widths, it is a gain. Therefore, keep the h16 case
as it was (but remove the outer loop), and create a new specialized
version for horizontal looping with 16 pixels at a time.

Before:                  Cortex A53      A72      A73  Graviton 3
put_hevc_qpel_h16_8_neon:     710.5    667.7    692.5   211.0
put_hevc_qpel_h32_8_neon:    2791.5   2643.5   2732.0   883.5
put_hevc_qpel_h64_8_neon:   10954.0  10657.0  10874.2  3241.5
After:
put_hevc_qpel_h16_8_neon:     697.5    663.5    705.7   212.5
put_hevc_qpel_h32_8_neon:    2767.2   2684.5   2791.2   920.5
put_hevc_qpel_h64_8_neon:   10559.2  10471.5  10932.2  3051.7

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-03-26 08:58:11 +02:00
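The reload argument in 717cc82d28 above can be checked with a small counting model. This is a simplified sketch, not code from the patch: it only counts input pixels fetched for an 8tap horizontal filter, charging the 8 extra pixels once per 16 pixel wide slice in the old order and once per row in the new order. The function names and the constants' names are hypothetical.

```c
#include <stddef.h>

/* Simplified model (an assumption, not taken from the patch): number of
 * input pixels loaded when filtering a width x height block. */
enum { SLICE_W = 16, EXTRA = 8 }; /* 7 extra pixels needed, 8 loaded in practice */

/* Old order: process one 16 pixel wide slice over the full height, then
 * backtrack; every slice reloads the extra pixels on every row. */
static size_t loads_slice_order(size_t width, size_t height)
{
    size_t slices = width / SLICE_W; /* assumes width is a multiple of 16 */
    return slices * (SLICE_W + EXTRA) * height;
}

/* New order: walk each row across the full width, so the extra pixels
 * are loaded only once per row. */
static size_t loads_row_order(size_t width, size_t height)
{
    return (width + EXTRA) * height;
}
```

Under this model a 64x64 block needs 6144 pixel loads in slice order and 4608 in row order, while for a width of 16 both counts are equal, so the model predicts no saving there.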
Martin Storsjö
e3a54cabde aarch64: hevc: Merge consecutive stores in put_hevc_\type\()_h16_8_neon
This gets rid of a couple instructions, but the actual performance
is almost identical on Cortex A72/A73. On Cortex A53, it is a
handful of cycles faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-03-26 08:58:01 +02:00
Martin Storsjö
78db8405c0 aarch64: hevc: Don't iterate with sp in ff_hevc_put_hevc_qpel_uni_w_hv32/64_8_neon_i8mm
Many of the routines within hevcdsp_epel_neon and hevcdsp_qpel_neon
store temporary buffers on the stack. When consuming it,
many of these functions use the stack pointer as incremental pointer
for reading the data (instead of storing it in another register),
which is rather unusual.

Technically, this is fine as long as the pointer remains properly
aligned.

However in the case of ff_hevc_put_hevc_qpel_uni_w_hv64_8_neon_i8mm,
after incrementing sp when reading data (within each 16 pixel
wide stripe) it would then reset the stack pointer back to a lower
value, for reading the next 16 pixel wide stripe, expecting the
data to remain untouched.

This can't be assumed; data on the stack below the stack pointer
can be clobbered (e.g. by a signal handler). Some OS ABIs
allow for a little margin that won't be touched, aka a red zone,
but not all do. The ones that do, guarantee 16 or 128 bytes, not
9 KB.

Convert this function to use a separate pointer register to
iterate through the data, retaining the stack pointer to point
at the bottom of the data we require to remain untouched.

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-03-26 08:57:55 +02:00
Martin Storsjö
e66858fbab aarch64: hevc: Reorder a misplaced function init line
Group the epel and qpel functions together.

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-03-26 08:57:50 +02:00
Andreas Rheinhardt
ced5c5fdb8 fftools/ffmpeg_mux_init: Fix double-free on error
MATCH_PER_STREAM_OPT iterates over all options of a given
OptionDef and tests whether they apply to the current stream;
if so, they are set to ost->apad, otherwise, the code errors
out. If no error happens, ost->apad is av_strdup'ed in order
to take ownership of this pointer.

But this means that setting it originally was premature,
as it leads to double-frees when an error happens later on.
This can simply be reproduced with
ffmpeg -filter_complex anullsrc  -apad bar -apad:n baz -f null -
This is a regression since 83ace80bfd.

Fix this by using a temporary variable instead of directly
setting ost->apad. Also only strdup the string if it actually
is != NULL.

Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:48:35 +01:00
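The ownership pattern described in ced5c5fdb8 above can be sketched in isolation. The following is a minimal illustration under stated assumptions, not the actual ffmpeg_mux_init code: the Stream type, the function names, and the lookup_failed flag are hypothetical stand-ins. The borrowed string is held in a temporary, and the owned field is written only after the error path has been passed and only if the string is non-NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an output stream that owns its apad string. */
typedef struct Stream {
    char *apad; /* owned by the stream, released in stream_free() */
} Stream;

/* 'borrowed' belongs to the option table; 'lookup_failed' models the
 * option-matching step erroring out. Returns 0 on success, -1 on error. */
static int stream_set_apad(Stream *st, const char *borrowed, int lookup_failed)
{
    const char *tmp = borrowed; /* temporary: st->apad is not touched yet */

    if (lookup_failed)
        return -1; /* st->apad never pointed at borrowed memory */

    if (tmp) { /* duplicate only if a value was actually given */
        size_t len = strlen(tmp) + 1;
        char *copy = malloc(len);
        if (!copy)
            return -1;
        memcpy(copy, tmp, len);
        free(st->apad); /* drop any previously owned value */
        st->apad = copy;
    }
    return 0;
}

static void stream_free(Stream *st)
{
    free(st->apad);
    st->apad = NULL;
}
```

With this shape, cleanup code can always free the field safely, because it only ever holds NULL or memory the stream itself allocated.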
Andreas Rheinhardt
4a4dcde339 avformat/internal: Move FF_FMT_INIT_CLEANUP to demux.h
and rename it to FF_INFMT_INIT_CLEANUP. This flag is demuxer-only,
so this is the more appropriate place for it.
This does not preclude adding internal flags common to both
demuxer and muxer in the future.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
27af88fb7f avformat/vqf: Return 0 on success in read_packet
Demuxers are not supposed to return the size of the packet read.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
29aa499fc9 avformat/cdg: Don't store avio_size() return value in int
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
cee70b9f1b avformat/lafdec: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
aa8c7dc3d8 avformat/argo_cvg: Avoid relocations for ArgoCVGOverride
The average length of the strings used here does not differ much
from the length of the longest string; therefore it makes sense
to use an array big enough for the longest string and not
a pointer to a string. This also moves this array into .rodata
(from .data.rel.ro).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
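The trade-off in aa8c7dc3d8 above is between a table whose entries point at strings and one whose entries contain them. The sketch below illustrates the two layouts; the type names, field names, and table contents are hypothetical and not taken from argo_cvg.c.

```c
#include <string.h>

/* Layout with a pointer: in position-independent code each entry's
 * pointer needs a relocation, so the table cannot live in .rodata. */
typedef struct OverrideByPointer {
    const char *name;
    int         value;
} OverrideByPointer;

/* Layout with an inline array sized for the longest string: no pointers,
 * hence no relocations. Wasted padding is small when the strings have
 * similar lengths. */
typedef struct OverrideByArray {
    char name[16];
    int  value;
} OverrideByArray;

/* Illustrative data only. */
static const OverrideByArray overrides[] = {
    { "first.bin",  1 },
    { "second.bin", 2 },
};

/* Returns the value for 'name', or -1 if there is no matching entry. */
static int find_override(const char *name)
{
    for (size_t i = 0; i < sizeof(overrides) / sizeof(overrides[0]); i++)
        if (!strcmp(overrides[i].name, name))
            return overrides[i].value;
    return -1;
}
```

Lookup code is identical for both layouts, since an array member decays to a pointer when passed to strcmp().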
Andreas Rheinhardt
69b85a69bd avformat/wady: Combine skips
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
cdff5a2c0c avformat/avr: Combine skips
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
56ba83ff2d avformat/fsb: Don't set data_offset manually
It is already set generically to the same value that it is set to here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
88f803cf64 avformat/wvedec: Inline constant
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
8768188581 avformat/g722: Inline constants
Forgotten in 5f0e161dd6.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
b93ed5c28e avformat/fitsdec: Don't use AVBPrint for temporary storage
Most of the data in the temporary storage ends up being
returned to the user as AVPacket.data, so it makes sense
to avoid using the AVBPrint for temporary storage altogether
(in particular in light of the fact that the blocks read here
are too big for the small-string optimization anyway) and
read the data directly into AVPacket.data. This also avoids
another memcpy() from a stack buffer to the AVBPrint in ts_image()
(that could always have been avoided with av_bprint_get_buffer()).

These changes also allow using av_append_packet(), which
greatly simplifies the code; furthermore, one can avoid cleanup
code on error as the packet is already unreferenced generically
on error.

There are two user-visible changes from this patch:
1. Truncated packets are now marked as corrupt.
2. AVPacket.pos is set (it corresponds to the discarded header
line, 80 bytes before the position corresponding to the
actual packet data).

Furthermore, this patch also removes code that triggered
a -Wtautological-constant-out-of-range-compare warning
from Clang (namely a comparison of an unsigned and INT64_MAX
in an assert).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
5144455c20 avformat/hls: Don't access FFInputFormat.raw_codec_id
It is an implementation detail of other input formats whether
they use raw_codec_id or not. The HLS demuxer should not rely
on this.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
8d8b5947c3 configure: Make hls demuxer select AAC, AC3 and EAC3 demuxers
The code relies on their presence and would presumably crash
when retrieving in_fmt->name if an encrypted stream with a codec id
without demuxer were encountered.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:36:43 +01:00
Andreas Rheinhardt
a990e6fa01 avformat/mux: Remove check for AVFMT_ALLOW_FLUSH
Due to the bump it is now certain that all devices
that support flushing have the proper internal flag set.
(Notice that the check for LIBAVFORMAT_VERSION was wrong.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:32:52 +01:00
Andreas Rheinhardt
e95dd6f53e avformat/file: Combine all CONFIG_ANDROID_CONTENT_PROTOCOL blocks
Besides improving readability this also ensures that
a developer who has the android content protocol enabled
and works on the other parts of the file will not
forget to add necessary inclusions just because of
(indirect) inclusions from the files included only
when said protocol is enabled.

Reviewed-by: Matthieu Bouron <matthieu.bouron@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:31:58 +01:00
Andreas Rheinhardt
ebe8326409 avformat/file: Constify android content protocol
(The discrepancy between the definition and the declaration
in protocols.c is actually UB.)

Reviewed-by: Matthieu Bouron <matthieu.bouron@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:31:40 +01:00
Andreas Rheinhardt
a6189ba896 avcodec/mpegutils: Simplify indenting
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:30:45 +01:00
Andreas Rheinhardt
5eda98f382 avcodec/mpegutils: Avoid allocations when using AVBPrint
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-26 06:30:45 +01:00
James Almer
0963ef4996 fftools/ffmpeg_filter: remove prototype for non existent function
Signed-off-by: James Almer <jamrial@gmail.com>
2024-03-25 23:23:27 -03:00
James Almer
767e7d3d2b fftools/ffmpeg_filter: remove unused struct from InputFilterPriv
It's already in InputFilterOptions.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-03-25 23:23:27 -03:00
James Almer
abcdd3aed7 avformat/mov: don't use cur_item_id as array index
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-03-25 23:20:51 -03:00
Michael Niedermayer
dd733b2be4 avformat/concatdec: clip outpoint - inpoint overflow in get_best_effort_duration()
An alternative would be to limit all time/duration fields to below 64bit

Fixes: signed integer overflow: -93000000 - 9223372036839000000 cannot be represented in type 'long long'
Fixes: 64546/clusterfuzz-testcase-minimized-ffmpeg_dem_CONCAT_fuzzer-5110813828186112

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 01:19:17 +01:00
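Clipping a 64-bit difference, as dd733b2be4 above describes, can be sketched as a saturating subtraction. This is a standalone illustration with a hypothetical function name; the message does not say which helper the patch itself uses.

```c
#include <stdint.h>

/* Computes a - b, clamped to the int64_t range instead of overflowing.
 * The bounds are tested before subtracting, so no undefined behaviour
 * occurs on any input. */
static int64_t sat_sub64(int64_t a, int64_t b)
{
    if (b > 0 && a < INT64_MIN + b)
        return INT64_MIN; /* true result would be below INT64_MIN */
    if (b < 0 && a > INT64_MAX + b)
        return INT64_MAX; /* true result would be above INT64_MAX */
    return a - b;
}
```

With the operands from the sanitizer report, -93000000 and 9223372036839000000, the function returns INT64_MIN rather than triggering the overflow.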
Michael Niedermayer
b54c9a9c8f avcodec/osq: avoid several signed integer overflows
Fixes: signed integer overflow: 178459578 + 2009763270 cannot be represented in type 'int'
Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-5013423686287360

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 01:19:17 +01:00
Michael Niedermayer
e83e8d443b avformat/jacosubdec: clarify code
add comments, rename variables and indent things differently

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 01:19:16 +01:00
Jun Zhao
5ebcca4e08 lavf/movenc: small cleanup for style
Small cleanup for style, indent, switch case labels.
BTW, the preferred way to ease multiple indentation levels in a
switch statement is to align the switch and its subordinate
case labels in the same column

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2024-03-26 07:52:53 +08:00
Michael Niedermayer
b792e4d4c7 avformat/cafdec: Check that data chunk end fits within 64bit
Fixes: signed integer overflow: 64 + 9223372036854775803 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6536881135550464
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6536881135550464

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 00:08:25 +01:00
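Where b792e4d4c7 above checks rather than clips, the usual shape is to compare against the remaining headroom before adding. The sketch below is an assumption about that general technique, not the cafdec code; the function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Stores pos + size in *end and returns 0 if the sum fits in int64_t.
 * Returns -1 without performing the addition otherwise (negative inputs
 * are rejected as well in this sketch). */
static int chunk_end(int64_t pos, int64_t size, int64_t *end)
{
    if (pos < 0 || size < 0 || size > INT64_MAX - pos)
        return -1;
    *end = pos + size;
    return 0;
}
```

For the reported case, 64 + 9223372036854775803, the check fails and the caller can reject the file instead of computing an overflowed end position.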
Michael Niedermayer
b8e754525c avformat/iff: Saturate avio_tell() + 12
Fixes: signed integer overflow: 9223372036854775796 + 12 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_IFF_fuzzer-4898373660704768

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 00:08:25 +01:00
Michael Niedermayer
50d8e4f273 avformat/dxa: Adjust order of operations around block align
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_DXA_fuzzer-5730576523198464
Fixes: signed integer overflow: 2147483566 + 82 cannot be represented in type 'int'

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 00:08:25 +01:00
Michael Niedermayer
d973fcbcc2 avformat/cafdec: dont seek beyond 64bit
Fixes: signed integer overflow: 64 + 9223372036854775807 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6418242730328064
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6418242730328064

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 00:08:25 +01:00
Michael Niedermayer
d384af5226 avformat/avidec: support huge durations
Fixes: signed integer overflow: 109817402400 * 301990077 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_AVI_fuzzer-6706191715139584
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_AVI_fuzzer-6706191715139584

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-26 00:08:25 +01:00
Mark Thompson
4743c9e87a lavc/get_buffer: Add a warning on failed allocation from a fixed pool
For hardware cases where we are forced to have a fixed pool of frames
allocated up-front (such as array textures on decoder output), suggest
a possible workaround to the user if an allocation fails because the
pool is exhausted.
2024-03-25 20:44:30 +00:00
Michael Niedermayer
c0f4abe2aa avformat/id3v2: read_uslt() check for the amount read
Fixes: timeout
Fixes: 66783/clusterfuzz-testcase-minimized-ffmpeg_dem_GENH_fuzzer-5356884892647424

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-25 21:41:26 +01:00
Michael Niedermayer
b7cdaff7e2 tools/target_dec_fuzzer: Adjust RKA threshold up further
Fixes: timeout
Fixes: 66636/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_RKA_fuzzer-5030913165557760

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-25 21:41:26 +01:00
Michael Niedermayer
70b26b693e avcodec/vmixdec: Check shift before use
Fixes: shift exponent 32 is too large for 32-bit type 'unsigned int'
Fixes: 65909/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VMIX_fuzzer-519459745831321

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-25 21:41:26 +01:00
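The error fixed by 70b26b693e above arises because shifting a 32-bit value by 32 or more is undefined in C. A minimal sketch of validating the shift first follows; it is a generic illustration with a hypothetical function name, not the vmixdec code.

```c
#include <stdint.h>

/* Stores value << shift in *out and returns 0 if the shift count is valid
 * for a 32-bit type; returns -1 and leaves *out untouched otherwise. */
static int shift_left_checked(uint32_t value, unsigned shift, uint32_t *out)
{
    if (shift >= 32)
        return -1; /* a shift exponent of 32 or more is undefined */
    *out = value << shift;
    return 0;
}
```

A decoder would typically apply such a check once, when the shift is read from the bitstream, and return an invalid-data error before the value is ever used.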
Michael Niedermayer
3c43299e9e avformat/mov: Check sample_count and auxiliary_info_default_size to be 0
This combination causes 0 size arrays to be allocated and to leak later

Fixes: memleak
Fixes: 64342/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-4520993686945792

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-25 21:41:25 +01:00
Michael Niedermayer
6f9e90ab0b avformat/wady: Check >0 samplerate and channels 1 || 2.
The WADY decoder only supports mono and stereo

This fixes a probetest failure

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-03-25 21:41:25 +01:00
Marton Balint
8c936e9b43 avutil/timestamp: change precision of av_ts_make_time_string()
By calling the av_ts_make_time_string2() from the function we can fix the
precision issue.

Signed-off-by: Marton Balint <cus@passwd.hu>
2024-03-25 21:30:51 +01:00
Marton Balint
5df901ffa5 avutil/timestamp: introduce av_ts_make_time_string2 for better precision
av_ts_make_time_string() used "%.6g" format, but this format was losing
precision even when the timestamp to be printed was not that large. For example
for 3 hours (10800 seconds), only 1 decimal digit was printed, which made this
format inaccurate when it was used in e.g. the silencedetect filter. Other
detection filters printing timestamps had similar issues. Also time base
parameter of the function was *AVRational instead of AVRational.

Resolve these problems by introducing a new function, av_ts_make_time_string2().

We change the used format to "%.*f", use a precision of 6, except when printing
values near 0, in which case we calculate the precision dynamically to aim for
a similar precision in normal form as with %.6g.  No longer using scientific
representation can make parsing the timestamp easier for the users, we can
safely do this because the theoretical maximum of INT64_MAX*INT32_MAX still
fits into the string buffer in normal form.

We somewhat imitate %g by trimming ending zeroes and the potential decimal
point characters. In order not to trim "inf" as well, we assume that the
decimal point string does not contain the letter "f". Note that depending on
printf %f implementation, we might trim "infinity" to "inf".

Thanks to Allan Cady for bringing up this issue.

Signed-off-by: Marton Balint <cus@passwd.hu>
2024-03-25 21:30:51 +01:00
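The precision loss described in 5df901ffa5 above, and the trimming that imitates %g, can be reproduced with plain snprintf(). The sketch below is a simplification and not the av_ts_make_time_string2() implementation: it uses a fixed precision of 6, omits the dynamic precision for values near 0, and assumes the decimal point is "." (the C locale). The function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Formats 'sec' with "%.6f", then trims trailing zeroes and a dangling
 * decimal point. Strings without a '.', such as "inf" or "nan", are left
 * as they are. */
static void format_seconds(char *buf, size_t size, double sec)
{
    int n = snprintf(buf, size, "%.6f", sec);

    if (n < 0 || (size_t)n >= size)
        return; /* error or truncation: leave the buffer as written */

    if (strchr(buf, '.')) {
        char *end = buf + n - 1;
        while (*end == '0')  /* stops at the '.' at the latest */
            *end-- = '\0';
        if (*end == '.')
            *end = '\0';
    }
}
```

For the value 10800.123456, "%.6g" yields "10800.1", whereas this routine yields "10800.123456"; a whole number such as 2.0 is printed as "2".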
Henrik Gramner
7c003b63c8 avcodec/x86/h264_idct: Fix incorrect xmm spilling on win64
Broken in afa471d0ef. It just happened
to work before due to x86inc.asm previously performing XMM spills in
INIT_MMX mode which was more of a bug than an intentional feature.
2024-03-25 21:17:47 +01:00
Andreas Rheinhardt
316239096b avformat/mov_chan: Use anonymous enum
Fixes many -Wenum-conversion warnings with Clang caused by
e6c2c87037.
See e.g.
https://fate.ffmpeg.org/log.cgi?time=20240325033545&slot=aarch64-linux-clang-10&log=compile

Reviewed-by: Henrik Gramner <henrik@gramner.com>
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-25 17:40:19 +01:00
Andreas Rheinhardt
a8255aa357 fate/source: Fix FATE reference file
Forgotten in ecdc94b97f.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-25 15:18:37 +01:00
James Almer
65a04cae6f avutil/channel_layout: don't clear the opaque pointer on type conversion
Otherwise it would not be lossless.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-03-25 09:38:15 -03:00
Andreas Rheinhardt
1f1b773859 avutil/hwcontext_qsv: Fix mixed declaration and code
Reviewed-by: Xiang, Haihao <haihao.xiang@intel.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2024-03-25 13:22:41 +01:00