Commit Graph

36865 Commits

Author SHA1 Message Date
Michael Niedermayer
f28299da8d avcodec/h264dec: Clear ref_count on slice header processing failure
Fixes using freed memory
Introduced in 7448019890
Fixes: 471/fuzz-1-ffmpeg_VIDEO_AV_CODEC_ID_H264_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-27 00:52:49 +01:00
Paul B Mahol
b4a13d442a avcodecc/ccaption_dec: remove extra word from long codec description
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-25 12:00:02 +01:00
Michael Niedermayer
2080bc3371 avcodec/utils: correct align value for interplay
Fixes out of array access
Fixes: 452/fuzz-1-ffmpeg_VIDEO_AV_CODEC_ID_INTERPLAY_VIDEO_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-25 00:56:37 +01:00
Carl Eugen Hoyos
6d6faa2a2d lavc/svq3: Fail for media key encryption.
Tested-by: ami_stuff

Fixes a part of ticket #6094.
2017-01-24 23:40:13 +01:00
Michael Niedermayer
9e6a242755 avcodec/vp56: Check for the bitstream end, pass error codes on
Fixes timeout
Fixes: 446/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_VP6_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-24 23:01:04 +01:00
Martin Storsjö
9f10cff610 aarch64: Add NEON optimizations for 10 and 12 bit vp9 loop filter
This work is sponsored by, and copyright, Google.

This is similar to the arm version, but due to the larger registers
on aarch64, we can do 8 pixels at a time for all filter sizes.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                             ARM AArch64
vp9_loop_filter_h_4_8_10bpp_neon:          213.2   172.6
vp9_loop_filter_h_8_8_10bpp_neon:          281.2   244.2
vp9_loop_filter_h_16_8_10bpp_neon:         657.0   444.5
vp9_loop_filter_h_16_16_10bpp_neon:       1280.4   877.7
vp9_loop_filter_mix2_h_44_16_10bpp_neon:   397.7   358.0
vp9_loop_filter_mix2_h_48_16_10bpp_neon:   465.7   429.0
vp9_loop_filter_mix2_h_84_16_10bpp_neon:   465.7   428.0
vp9_loop_filter_mix2_h_88_16_10bpp_neon:   533.7   499.0
vp9_loop_filter_mix2_v_44_16_10bpp_neon:   271.5   244.0
vp9_loop_filter_mix2_v_48_16_10bpp_neon:   330.0   305.0
vp9_loop_filter_mix2_v_84_16_10bpp_neon:   329.0   306.0
vp9_loop_filter_mix2_v_88_16_10bpp_neon:   386.0   365.0
vp9_loop_filter_v_4_8_10bpp_neon:          150.0   115.2
vp9_loop_filter_v_8_8_10bpp_neon:          209.0   175.5
vp9_loop_filter_v_16_8_10bpp_neon:         492.7   345.2
vp9_loop_filter_v_16_16_10bpp_neon:        951.0   682.7

This is significantly faster than the ARM version in almost
all cases except for the mix2 functions.

Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 2-3x.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:36:11 +02:00
Martin Storsjö
ceb36b8178 aarch64: Add NEON optimizations for 10 and 12 bit vp9 itxfm
This work is sponsored by, and copyright, Google.

Compared to the arm version, on aarch64 we can keep the full 8x8
transform in registers, and for 16x16 and 32x32, we can process
it in slices of 4 pixels instead of 2.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                                ARM  AArch64
vp9_inv_adst_adst_4x4_sub4_add_10_neon:       111.0    109.7
vp9_inv_adst_adst_8x8_sub8_add_10_neon:       914.0    733.5
vp9_inv_adst_adst_16x16_sub16_add_10_neon:   5184.0   3745.7
vp9_inv_dct_dct_4x4_sub1_add_10_neon:          65.0     65.7
vp9_inv_dct_dct_4x4_sub4_add_10_neon:         100.0     96.7
vp9_inv_dct_dct_8x8_sub1_add_10_neon:         111.0    119.7
vp9_inv_dct_dct_8x8_sub8_add_10_neon:         618.0    494.7
vp9_inv_dct_dct_16x16_sub1_add_10_neon:       295.1    284.6
vp9_inv_dct_dct_16x16_sub2_add_10_neon:      2303.2   1883.9
vp9_inv_dct_dct_16x16_sub8_add_10_neon:      2984.8   2189.3
vp9_inv_dct_dct_16x16_sub16_add_10_neon:     3890.0   2799.4
vp9_inv_dct_dct_32x32_sub1_add_10_neon:      1044.4   1012.7
vp9_inv_dct_dct_32x32_sub2_add_10_neon:     13333.7   9695.1
vp9_inv_dct_dct_32x32_sub16_add_10_neon:    18531.3  12459.8
vp9_inv_dct_dct_32x32_sub32_add_10_neon:    24470.7  16160.2
vp9_inv_wht_wht_4x4_sub4_add_10_neon:          83.0     79.7

The larger transforms are significantly faster than the corresponding
ARM versions.

The speedup vs C code is smaller than in 32 bit mode, probably
because the 64 bit intermediates in the C code can be expressed
more efficiently in aarch64.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:36:08 +02:00
Martin Storsjö
638eceed47 aarch64: Add NEON optimizations for 10 and 12 bit vp9 MC
This work is sponsored by, and copyright, Google.

This has mostly got the same differences to the 8 bit version as
in the arm version. For the horizontal filters, we do 16 pixels
in parallel as well. For the 8 pixel wide vertical filters, we can
accumulate 4 rows before storing, just as in the 8 bit version.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                           ARM   AArch64
vp9_avg4_10bpp_neon:                      35.7      30.7
vp9_avg8_10bpp_neon:                      93.5      84.7
vp9_avg16_10bpp_neon:                    324.4     296.6
vp9_avg32_10bpp_neon:                   1236.5    1148.2
vp9_avg64_10bpp_neon:                   4639.6    4571.1
vp9_avg_8tap_smooth_4h_10bpp_neon:       130.0     128.0
vp9_avg_8tap_smooth_4hv_10bpp_neon:      440.0     440.5
vp9_avg_8tap_smooth_4v_10bpp_neon:       114.0     105.5
vp9_avg_8tap_smooth_8h_10bpp_neon:       327.0     314.0
vp9_avg_8tap_smooth_8hv_10bpp_neon:      918.7     865.4
vp9_avg_8tap_smooth_8v_10bpp_neon:       330.0     300.2
vp9_avg_8tap_smooth_16h_10bpp_neon:     1187.5    1155.5
vp9_avg_8tap_smooth_16hv_10bpp_neon:    2663.1    2591.0
vp9_avg_8tap_smooth_16v_10bpp_neon:     1107.4    1078.3
vp9_avg_8tap_smooth_64h_10bpp_neon:    17754.6   17454.7
vp9_avg_8tap_smooth_64hv_10bpp_neon:   33285.2   33001.5
vp9_avg_8tap_smooth_64v_10bpp_neon:    16066.9   16048.6
vp9_put4_10bpp_neon:                      25.5      21.7
vp9_put8_10bpp_neon:                      56.0      52.0
vp9_put16_10bpp_neon/armv8:              183.0     163.1
vp9_put32_10bpp_neon/armv8:              678.6     563.1
vp9_put64_10bpp_neon/armv8:             2679.9    2195.8
vp9_put_8tap_smooth_4h_10bpp_neon:       120.0     118.0
vp9_put_8tap_smooth_4hv_10bpp_neon:      435.2     435.0
vp9_put_8tap_smooth_4v_10bpp_neon:       107.0      98.2
vp9_put_8tap_smooth_8h_10bpp_neon:       303.0     290.0
vp9_put_8tap_smooth_8hv_10bpp_neon:      893.7     828.7
vp9_put_8tap_smooth_8v_10bpp_neon:       305.5     263.5
vp9_put_8tap_smooth_16h_10bpp_neon:     1089.1    1059.2
vp9_put_8tap_smooth_16hv_10bpp_neon:    2578.8    2452.4
vp9_put_8tap_smooth_16v_10bpp_neon:     1009.5     933.5
vp9_put_8tap_smooth_64h_10bpp_neon:    16223.4   15918.6
vp9_put_8tap_smooth_64hv_10bpp_neon:   32153.0   31016.2
vp9_put_8tap_smooth_64v_10bpp_neon:    14516.5   13748.1

These are generally about as fast as the corresponding ARM
routines on the same CPU (at least on the A53), in most cases
marginally faster.

The speedup vs C code is around 4-9x.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:36:05 +02:00
Martin Storsjö
48ad3fe1be aarch64: vp9dsp: Restructure the bpp checks
This work is sponsored by, and copyright, Google.

This is more in line with how it will be extended for more bitdepths.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:36:02 +02:00
Martin Storsjö
1e5d87eec3 arm: Add NEON optimizations for 10 and 12 bit vp9 loop filter
This work is sponsored by, and copyright, Google.

This is pretty much similar to the 8 bpp version, but in some senses
simpler. All input pixels are 16 bits, and all intermediates also fit
in 16 bits, so there's no lengthening/narrowing in the filter at all.

For the full 16 pixel wide filter, we can only process 4 pixels at a time
(using an implementation very much similar to the one for 8 bpp),
but we can do 8 pixels at a time for the 4 and 8 pixel wide filters with
a different implementation of the core filter.

Examples of relative speedup compared to the C version, from checkasm:
                                   Cortex    A7     A8     A9    A53
vp9_loop_filter_h_4_8_10bpp_neon:          1.83   2.16   1.40   2.09
vp9_loop_filter_h_8_8_10bpp_neon:          1.39   1.67   1.24   1.70
vp9_loop_filter_h_16_8_10bpp_neon:         1.56   1.47   1.10   1.81
vp9_loop_filter_h_16_16_10bpp_neon:        1.94   1.69   1.33   2.24
vp9_loop_filter_mix2_h_44_16_10bpp_neon:   2.01   2.27   1.67   2.39
vp9_loop_filter_mix2_h_48_16_10bpp_neon:   1.84   2.06   1.45   2.19
vp9_loop_filter_mix2_h_84_16_10bpp_neon:   1.89   2.20   1.47   2.29
vp9_loop_filter_mix2_h_88_16_10bpp_neon:   1.69   2.12   1.47   2.08
vp9_loop_filter_mix2_v_44_16_10bpp_neon:   3.16   3.98   2.50   4.05
vp9_loop_filter_mix2_v_48_16_10bpp_neon:   2.84   3.64   2.25   3.77
vp9_loop_filter_mix2_v_84_16_10bpp_neon:   2.65   3.45   2.16   3.54
vp9_loop_filter_mix2_v_88_16_10bpp_neon:   2.55   3.30   2.16   3.55
vp9_loop_filter_v_4_8_10bpp_neon:          2.85   3.97   2.24   3.68
vp9_loop_filter_v_8_8_10bpp_neon:          2.27   3.19   1.96   3.08
vp9_loop_filter_v_16_8_10bpp_neon:         3.42   2.74   2.26   4.40
vp9_loop_filter_v_16_16_10bpp_neon:        2.86   2.44   1.93   3.88

The speedup vs C code measured in checkasm is around 1.1-4x.
These numbers are quite inconclusive though, since the checkasm test
runs multiple filterings on top of each other, so later rounds might
end up with different codepaths (different decisions on which filter
to apply, based on input pixel differences).

Based on START_TIMER/STOP_TIMER wrapping around a few individual
functions, the speedup vs C code is around 2-4x.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:35:59 +02:00
Martin Storsjö
2ed67eba96 arm: Add NEON optimizations for 10 and 12 bit vp9 itxfm
This work is sponsored by, and copyright, Google.

This is structured similarly to the 8 bit version. In the 8 bit
version, the coefficients are 16 bits, and intermediates are 32 bits.

Here, the coefficients are 32 bit. For the 4x4 transforms for 10 bit
content, the intermediates also fit in 32 bits, but for all other
transforms (4x4 for 12 bit content, and 8x8 and larger for both 10
and 12 bit) the intermediates are 64 bit.

For the existing 8 bit case, the 8x8 transform fit all coefficients in
registers; for 10/12 bit, when the coefficients are 32 bit, the 8x8
transform also has to be done in slices of 4 pixels (just as 16x16 and
32x32 for 8 bit).

The slice width also shrinks from 4 elements to 2 elements in parallel
for the 16x16 and 32x32 cases.

The 16 bit coefficients from idct_coeffs and similar tables also need
to be lenghtened to 32 bit in order to be used in multiplication with
vectors with 32 bit elements. This leads to the fixed coefficient
vectors needing more space, leading to more cases where they have to
be reloaded within the transform (in iadst16).

This technically would need testing in checkasm for subpartitions
in increments of 2, but that slows down normal checkasm runs
excessively.

Examples of relative speedup compared to the C version, from checkasm:
                                     Cortex    A7     A8     A9    A53
vp9_inv_adst_adst_4x4_sub4_add_10_neon:      4.83  11.36   5.22   6.77
vp9_inv_adst_adst_8x8_sub8_add_10_neon:      4.12   7.60   4.06   4.84
vp9_inv_adst_adst_16x16_sub16_add_10_neon:   3.93   8.16   4.52   5.35
vp9_inv_dct_dct_4x4_sub1_add_10_neon:        1.36   2.57   1.41   1.61
vp9_inv_dct_dct_4x4_sub4_add_10_neon:        4.24   8.66   5.06   5.81
vp9_inv_dct_dct_8x8_sub1_add_10_neon:        2.63   4.18   1.68   2.87
vp9_inv_dct_dct_8x8_sub4_add_10_neon:        4.52   9.47   4.24   5.39
vp9_inv_dct_dct_8x8_sub8_add_10_neon:        3.45   7.34   3.45   4.30
vp9_inv_dct_dct_16x16_sub1_add_10_neon:      3.56   6.21   2.47   4.32
vp9_inv_dct_dct_16x16_sub2_add_10_neon:      5.68  12.73   5.28   7.07
vp9_inv_dct_dct_16x16_sub8_add_10_neon:      4.42   9.28   4.24   5.45
vp9_inv_dct_dct_16x16_sub16_add_10_neon:     3.41   7.29   3.35   4.19
vp9_inv_dct_dct_32x32_sub1_add_10_neon:      4.52   8.35   3.83   6.40
vp9_inv_dct_dct_32x32_sub2_add_10_neon:      5.86  13.19   6.14   7.04
vp9_inv_dct_dct_32x32_sub16_add_10_neon:     4.29   8.11   4.59   5.06
vp9_inv_dct_dct_32x32_sub32_add_10_neon:     3.31   5.70   3.56   3.84
vp9_inv_wht_wht_4x4_sub4_add_10_neon:        1.89   2.80   1.82   1.97

The speedup compared to the C functions is around 1.3 to 7x for the
full transforms, even higher for the smaller subpartitions.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:35:56 +02:00
Martin Storsjö
a4d4bad75c arm: Add NEON optimizations for 10 and 12 bit vp9 MC
This work is sponsored by, and copyright, Google.

The plain pixel put/copy functions are used from the 8 bit version,
for the double size (e.g. put16 uses ff_vp9_copy32_neon), and a new
copy128 is added.

Compared with the 8 bit version, the filters can no longer use the
trick to accumulate in 16 bit with only saturation at the end, but now
the accumulators need to be 32 bit. This avoids the need to keep track
of which filter index is the largest though, reducing the size of the
executable code for these filters.

For the horizontal filters, we only do 4 or 8 pixels wide in parallel
(while doing two rows at a time), since we don't have enough register
space to filter 16 pixels wide.

For the vertical filters, we still do 4 and 8 pixels in parallel just
as in the 8 bit case, but we need to store the output after every 2
rows instead of after every 4 rows.

Examples of relative speedup compared to the C version, from checkasm:
                               Cortex    A7     A8     A9    A53
vp9_avg4_10bpp_neon:                   2.25   2.44   3.05   2.16
vp9_avg8_10bpp_neon:                   3.66   8.48   3.86   3.50
vp9_avg16_10bpp_neon:                  3.39   8.26   3.37   2.72
vp9_avg32_10bpp_neon:                  4.03  10.20   4.07   3.42
vp9_avg64_10bpp_neon:                  4.15  10.01   4.13   3.70
vp9_avg_8tap_smooth_4h_10bpp_neon:     3.38   6.22   3.41   4.75
vp9_avg_8tap_smooth_4hv_10bpp_neon:    3.89   6.39   4.30   5.32
vp9_avg_8tap_smooth_4v_10bpp_neon:     5.32   9.73   6.34   7.31
vp9_avg_8tap_smooth_8h_10bpp_neon:     4.45   9.40   4.68   6.87
vp9_avg_8tap_smooth_8hv_10bpp_neon:    4.64   8.91   5.44   6.47
vp9_avg_8tap_smooth_8v_10bpp_neon:     6.44  13.42   8.68   8.79
vp9_avg_8tap_smooth_64h_10bpp_neon:    4.66   9.02   4.84   7.71
vp9_avg_8tap_smooth_64hv_10bpp_neon:   4.61   9.14   4.92   7.10
vp9_avg_8tap_smooth_64v_10bpp_neon:    6.90  14.13   9.57  10.41
vp9_put4_10bpp_neon:                   1.33   1.46   2.09   1.33
vp9_put8_10bpp_neon:                   1.57   3.42   1.83   1.84
vp9_put16_10bpp_neon:                  1.55   4.78   2.17   1.89
vp9_put32_10bpp_neon:                  2.06   5.35   2.14   2.30
vp9_put64_10bpp_neon:                  3.00   2.41   1.95   1.66
vp9_put_8tap_smooth_4h_10bpp_neon:     3.19   5.81   3.31   4.63
vp9_put_8tap_smooth_4hv_10bpp_neon:    3.86   6.22   4.32   5.21
vp9_put_8tap_smooth_4v_10bpp_neon:     5.40   9.77   6.08   7.21
vp9_put_8tap_smooth_8h_10bpp_neon:     4.22   8.41   4.46   6.63
vp9_put_8tap_smooth_8hv_10bpp_neon:    4.56   8.51   5.39   6.25
vp9_put_8tap_smooth_8v_10bpp_neon:     6.60  12.43   8.17   8.89
vp9_put_8tap_smooth_64h_10bpp_neon:    4.41   8.59   4.54   7.49
vp9_put_8tap_smooth_64hv_10bpp_neon:   4.43   8.58   5.34   6.63
vp9_put_8tap_smooth_64v_10bpp_neon:    7.26  13.92   9.27  10.92

For the larger 8tap filters, the speedup vs C code is around 4-14x.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:35:50 +02:00
Martin Storsjö
cda9a3e80b arm: vp9dsp: Restructure the bpp checks
This work is sponsored by, and copyright, Google.

This is more in line with how it will be extended for more bitdepths.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-01-24 22:35:44 +02:00
Michael Niedermayer
755933cb5c avcodec/mjpegdec: Check remaining bitstream in ljpeg_decode_yuv_scan()
Fixes timeout
Fixes: 445/fuzz-3-ffmpeg_VIDEO_AV_CODEC_ID_MJPEG_fuzzer
Fixes: 456/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_JPEGLS_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-24 17:50:03 +01:00
Clément Bœsch
7448019890 Merge commit '38efff92f1ef81f3de20ff0460ec7b70c253d714'
* commit '38efff92f1ef81f3de20ff0460ec7b70c253d714':
  FATE: add a test for H.264 with two fields per packet
  h264: fix decoding multiple fields per packet with slice threads

This merge includes two commits because the FATE test was useful in
order to make proper testing.

The merge gets rid of the now unused:
- SLICE_SINGLETHREAD and SLICE_SKIPED macros
- max_contexts
- "again" label in decode_nal_units()

This commit also includes the fix from d3e4d406b.

Thanks to wm4 and Michael Niedermayer for their testing.

Merged-by: Clément Bœsch <u@pkh.me>
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
2017-01-24 16:13:03 +01:00
Michael Niedermayer
25f4f08ba5 avcodec/h264dec: Fix regression with "make fate-h264-attachment-631 THREADS=8"
This treats the case of no slices like no frames which it basically is.

The field is added to the context as other nal related fields are also there
and passing the has_slices field per *arguments is ugly and not consistent

Found-by: ubitux
Approved-by: ubitux

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-24 12:13:59 +01:00
Pavel Koshevoy
9ea2998586 avcodec/cuvid: fail early if GPU can't handle video resolution
CUVID on GeForce GT 730 and GeForce GTX 1060 does not report any error when
decoding 8K h264 packets. However, it does return an error during
cuvidCreateDecoder call if the indicated video resolution is not
supported.

Given that stream resolution is typically known as a result of probing
it is better to use this information during avcodec_open2 call to fail
immediately, rather than proceeding to decode and never receiving any
frames from the decoder nor receiving any indication of decode failure.

Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2017-01-23 17:49:35 +01:00
Michael Niedermayer
e371f031b9 avcodec/pngdec: Fix off by 1 size in decode_zbuf()
Fixes out of array access
Fixes: 444/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_PNG_fuzzer

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-23 01:43:35 +01:00
Michael Niedermayer
a0341b4d74 avcodec/error_resilience: update indention after last commit
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-22 21:43:06 +01:00
Michael Niedermayer
d9d9fd9446 avcodec/error_resilience: Optimize motion recovery code by using blcok lists
This makes the code 7 times faster with the testcase from libfuzzer
and should reduce the amount of timeouts we hit in automated fuzzing.
(for example 438/fuzz-2-ffmpeg_VIDEO_AV_CODEC_ID_RV40_fuzzer)

The code is also faster with more realistic input though the difference
is small here as that is far from the worst cases the fuzzers pick out

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/targets/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-22 21:39:43 +01:00
Jonathan Campbell
d5d474aea5 avcodec/ac3dec: add consistent noise generation option.
use av_lfg_init_from_data() to seed AC-3 dithering from the AC-3 frame
data to make it consistent given the same AC-3 frame, if option is set.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-22 02:29:16 +01:00
Mark Thompson
d40a1ae7ec vaapi_mpeg4: Restore changes overwritten by merge
From 2aa8e33d7d.
2017-01-22 00:07:47 +00:00
Paul B Mahol
d60f090dd1 avcodec/fraps: add support for PAL8
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-21 18:08:08 +01:00
Michael Niedermayer
cde007dcd3 avcodec: Add FF_CODEC_CAP_SKIP_FRAME_FILL_PARAM to most h263 based codecs
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-21 02:30:38 +01:00
Matthieu Bouron
cf3affabb4 lavc/h264dec: re-indent after previous commit 2017-01-20 17:29:09 +01:00
Matthieu Bouron
639e262971 lavc/h264dec: make sure a slice is decoded before finishing setup
Fixes regression in fate-h264-attachment-631 with THREADS=8 introduced
by bdbbb8f11e.
2017-01-20 17:28:40 +01:00
Paul B Mahol
18cfcc6458 avcodec/wmaprodec: add xma_flush for seeking in XMA2
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-20 13:58:41 +01:00
Paul B Mahol
5d2609929d avcodec: add XMA2 parser
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-20 13:58:41 +01:00
Paul B Mahol
96fe4432f5 avcodec/wmaprodec: unbreak XMA mono decoding
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-20 13:58:36 +01:00
bnnm
cab0f3abc5 avcodec/atrac3: allow 6 channels (non-joint stereo)
Raises max channels to 6 (for non joint-stereo only),
there is no difference decoding 1 or N discrete channels.
Fixes trac issue #5840

Signed-off-by: bnnm <bananaman255@gmail.com>
2017-01-20 12:53:57 +01:00
Daniil Cherednik
9a619bef54 dcaenc: Use Huffman codes for Bit Allocation Index
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-01-20 10:03:46 +00:00
Timo Rothenpieler
6b0a3ee6f8 avcodec/nvenc: add logging for more error cases 2017-01-20 10:29:36 +01:00
Timo Rothenpieler
5403d90f32 avcodec/nvenc: make gpu indices independend of supported capabilities 2017-01-20 10:29:36 +01:00
Paul B Mahol
8a1759ad46 avcodec/exr: export writer info into frame metadata
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 21:06:34 +01:00
Paul B Mahol
546e29d1f5 avcodec/exr: make it aware of 2 additional compressions
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 21:06:34 +01:00
Aleksandr Slobodeniuk
545511f57a avcodec/avcodec: fix lil typo in comment
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-19 20:21:36 +01:00
Michael Niedermayer
1df3d636d4 avcodec/speedhq: Fix warning about "initialization from incompatible pointer type"
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-19 19:49:55 +01:00
Paul B Mahol
45f4bf94af avcodec/wmaprodec: check number of channels for XMA streams
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 15:42:47 +01:00
Paul B Mahol
0fe50e56e9 avcodec/pixlet: use av_clip_uintp2_c explicitly
Found-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 13:32:21 +01:00
Paul B Mahol
a340987e37 avcodec/pixlet: use av_clip_uintp2()
Found-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 13:19:10 +01:00
Paul B Mahol
be46eb7101 avcodec/pixlet: clip chroma before shifting
Fixes artifacts.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 12:49:41 +01:00
Paul B Mahol
1daa08bd96 avcodec/wmapro: redone stream selection for XMA1/2
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 12:33:14 +01:00
Clément Bœsch
e5ac554ba7 lavc/h264: simplify find_unused_picture() 2017-01-19 10:34:10 +01:00
Paul B Mahol
6c43f33ac2 avcodec/wmaprodec: >2 channel support for XMA
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 00:37:26 +01:00
Clément Bœsch
c3050fcbdc lavc/h264dec: remove flush goto in decode callback 2017-01-18 18:06:21 +01:00
Mark Thompson
2a4a8653b6 lavc: Remove old vaapi decode infrastructure
Deprecates struct vaapi_context and the installed header vaapi.h,
to be removed at the next version bump.

(cherry picked from commit 851960f6f8)
2017-01-17 23:06:46 +00:00
Mark Thompson
defbb8bc26 vaapi_vp9: Convert to use the new VAAPI hwaccel code 2017-01-17 23:06:46 +00:00
Anton Khirnov
adb54e59c1 vaapi_hevc: Convert to use the new VAAPI hwaccel code
(cherry picked from commit ea8b730d8e)
Signed-off-by: Mark Thompson <sw@jkqxz.net>
2017-01-17 23:06:46 +00:00
Mark Thompson
fd1a6a0106 vaapi_mpeg4: Convert to use the new VAAPI hwaccel code
(cherry picked from commit ccd0316f7c)
2017-01-17 23:06:46 +00:00
Mark Thompson
32b3812b60 vaapi_vc1: Convert to use the new VAAPI hwaccel code
(cherry picked from commit 520fb77285)
2017-01-17 23:06:46 +00:00