FFmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2024-09-19 21:06:42 +00:00

Author	SHA1	Message	Date
Niklas Haas	22530ad1ce	lavc/h274: transpose IDCT This is mathematically equivalent to what we were doing before, but gives subtly different results due to rounding (rows first vs columns first). Doing it this way makes our film grain database generation match reference implementation and now produces bit-exact outputs in my testing. Rename the transposed variables to be a bit less confusing.	2023-10-03 00:27:14 +02:00
Niklas Haas	48fc414c7c	lavc/h274: fix comment (cosmetic) Either the average, or the sum right-shifted. Not the average right-shifted.	2023-09-28 17:11:23 +02:00
Niklas Haas	616e9d2413	lavc/h274: correct grain DB indices The spec specified indices in the order [x][y], but our code follows the traditional C convention of [y][x]. This was not correctly accounted for when calculating the base index of the grain database access.	2023-09-28 17:11:23 +02:00
Niklas Haas	338a5fcdbe	lavc/h274: fix PRNG definition The spec specifies x^31 + x^3 + 1 as the polynomial, but the diagram in Figure 1-1 omits the +1 offset. The initial implementation was based on the diagram, but this is wrong (produces subtly incorrect results).	2023-09-28 17:11:23 +02:00
Michael Niedermayer	98aec8c1b8	avcodec/h274: Fix signed left shift Fixes: 39463/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-5736517629247488 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-10-09 11:42:16 +02:00
Michael Niedermayer	991b3deea9	avcodec/h274: fix bad left shifts Fixes: left shift of negative value -3 Fixes: 37788/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_H264_fuzzer-6024714540154880 Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2021-09-26 17:21:59 +02:00
Niklas Haas	a543d075cd	avcodec/h274: trim unnecessarily large array We only ever read to idx+3, so 256 values are overkill. Signed-off-by: James Almer <jamrial@gmail.com>	2021-09-12 11:07:40 -03:00
Niklas Haas	52c35d648c	avcodec/h274: don't read from uninitialized array members This bug flew under the radar because, in practice, these values are 0-initialized for the first invocation. But for subsequent invocations (with different h/v values), reading from the uninitialized parts of `out` is undefined behavior. Avoid this by simply adjusting the iteration range of the following loops. Has the added benefit of being a minor speedup. Signed-off-by: James Almer <jamrial@gmail.com>	2021-09-12 11:07:40 -03:00
Lynne	033105a739	h274: remove optimization pragma This results in warnings on compilers which don't support it, objections were raised during the review process about it but went unnoticed, and the speed benefit is highly compiler and version specific, and also not very critical. We generally hand-write assembly to optimize loops like that, rather than use compiler magic, and for 40% best case scenario, it's simply not worth it. Plus, tree vectorization is still problematic with GCC and disabled by default for a good reason, so enabling it locally is sketchy.	2021-08-28 15:13:55 +02:00
Niklas Haas	6bc29a6b57	avcodec/h274: add film grain synthesis routine This could arguably also be a vf, but I decided to put it here since decoders are technically required to apply film grain during the output step, and I would rather want to avoid requiring users insert the correct film grain synthesis filter on their own. The code, while in C, is written in a way that unrolls/vectorizes fairly well under -O3, and is reasonably cache friendly. On my CPU, a single thread pushes about 400 FPS at 1080p. Apart from hand-written assembly, one possible avenue of improvement would be to change the access order to compute the grain row-by-row rather than in 8x8 blocks. This requires some redundant PRNG calls, but would make the algorithm more cache-oblivious. The implementation has been written to the wording of SMPTE RDD 5-2006 as faithfully as I can manage. However, apart from passing a visual inspection, no guarantee of correctness can be made due to the lack of any publicly available reference implementation against which to compare it. Signed-off-by: Niklas Haas <git@haasn.dev> Signed-off-by: James Almer <jamrial@gmail.com>	2021-08-24 09:58:52 -03:00
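
The PRNG fix in 338a5fcdbe turns on whether the feedback of the shift register includes the "+1" term. The following is a minimal sketch of that difference, not the code from the commit. It assumes the polynomial x^31 + x^3 + 1 maps to taps at bits 30 and 2 of a 32-bit state, and that the "+1 offset" can be modelled as inverting the feedback bit. Both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper names; not taken from the FFmpeg source. */

/* One LFSR step using only the taps at bits 30 and 2
 * (the assumed "diagram-only" reading, without the +1 term). */
static uint32_t lfsr_step_diagram(uint32_t state)
{
    uint32_t feedback = ((state >> 30) ^ (state >> 2)) & 1u;
    return (state << 1) | feedback;
}

/* The same step with the feedback bit inverted
 * (the assumed reading of the "+1 offset"). */
static uint32_t lfsr_step_with_offset(uint32_t state)
{
    uint32_t feedback = (1u ^ (state >> 30) ^ (state >> 2)) & 1u;
    return (state << 1) | feedback;
}
```

Under these assumptions the two variants diverge on the very first step for any seed, which fits the commit's description of the results as subtly incorrect.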

10 Commits
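
The two fuzzer fixes (98aec8c1b8 and 991b3deea9) address left shifts of negative values, which are undefined behavior in C. One common way to avoid this is to multiply by a power of two instead. The helper below is a hypothetical illustration of that technique, not the change made in those commits.

```c
#include <stdint.h>

/* Hypothetical helper: scale v by 2^shift without left-shifting a
 * negative value. The caller must keep shift below 31 and ensure the
 * product fits in int32_t. */
static int32_t scale_pow2(int32_t v, unsigned shift)
{
    /* `v << shift` is undefined when v is negative; the multiplication
     * is well defined as long as the result does not overflow. */
    return v * (1 << shift);
}
```

For example, `scale_pow2(-3, 2)` yields -12, whereas `-3 << 2` is the kind of expression the fuzzer flagged.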