Commit Graph

36830 Commits

Author SHA1 Message Date
Paul B Mahol
546e29d1f5 avcodec/exr: make it aware of 2 additional compressions
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 21:06:34 +01:00
Aleksandr Slobodeniuk
545511f57a avcodec/avcodec: fix lil typo in comment
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-19 20:21:36 +01:00
Michael Niedermayer
1df3d636d4 avcodec/speedhq: Fix warning about "initialization from incompatible pointer type"
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-19 19:49:55 +01:00
Paul B Mahol
45f4bf94af avcodec/wmaprodec: check number of channels for XMA streams
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 15:42:47 +01:00
Paul B Mahol
0fe50e56e9 avcodec/pixlet: use av_clip_uintp2_c explicitly
Found-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 13:32:21 +01:00
Paul B Mahol
a340987e37 avcodec/pixlet: use av_clip_uintp2()
Found-by: Clément Bœsch <u@pkh.me>
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 13:19:10 +01:00
Paul B Mahol
be46eb7101 avcodec/pixlet: clip chroma before shifting
Fixes artifacts.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 12:49:41 +01:00
Paul B Mahol
1daa08bd96 avcodec/wmapro: redone stream selection for XMA1/2
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 12:33:14 +01:00
Clément Bœsch
e5ac554ba7 lavc/h264: simplify find_unused_picture() 2017-01-19 10:34:10 +01:00
Paul B Mahol
6c43f33ac2 avcodec/wmaprodec: >2 channel support for XMA
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-19 00:37:26 +01:00
Clément Bœsch
c3050fcbdc lavc/h264dec: remove flush goto in decode callback 2017-01-18 18:06:21 +01:00
Mark Thompson
2a4a8653b6 lavc: Remove old vaapi decode infrastructure
Deprecates struct vaapi_context and the installed header vaapi.h,
to be removed at the next version bump.

(cherry picked from commit 851960f6f8)
2017-01-17 23:06:46 +00:00
Mark Thompson
defbb8bc26 vaapi_vp9: Convert to use the new VAAPI hwaccel code 2017-01-17 23:06:46 +00:00
Anton Khirnov
adb54e59c1 vaapi_hevc: Convert to use the new VAAPI hwaccel code
(cherry picked from commit ea8b730d8e)
Signed-off-by: Mark Thompson <sw@jkqxz.net>
2017-01-17 23:06:46 +00:00
Mark Thompson
fd1a6a0106 vaapi_mpeg4: Convert to use the new VAAPI hwaccel code
(cherry picked from commit ccd0316f7c)
2017-01-17 23:06:46 +00:00
Mark Thompson
32b3812b60 vaapi_vc1: Convert to use the new VAAPI hwaccel code
(cherry picked from commit 520fb77285)
2017-01-17 23:06:46 +00:00
Mark Thompson
71acbea112 vaapi_mpeg2: Convert to use the new VAAPI hwaccel code
(cherry picked from commit 102e13c353)
2017-01-17 23:06:45 +00:00
Mark Thompson
c8b26d5954 vaapi_h264: Convert to use the new VAAPI hwaccel code
(cherry picked from commit 2fe93244ab)
2017-01-17 23:06:45 +00:00
Mark Thompson
79307ae563 lavc: Rewrite VAAPI decode infrastructure
Moves much of the setup logic for VAAPI decoding into lavc; the user
now need only provide the hw_frames_ctx.

(cherry picked from commit 123ccd07c5)
(cherry picked from commit 5e879b54a3)
(cherry picked from commit 0aec37e625)
(cherry picked from commit cfa4eb4fba)
2017-01-17 23:06:45 +00:00
Mark Thompson
d07d01bcce vaapi_vc1: Remove redundant version check
The lowest supported VAAPI version is 0.34 (checked at configure
time), so this test is no longer needed.

(cherry picked from commit 5a667322f5)
2017-01-17 23:06:45 +00:00
Mark Thompson
845c2c140b vaapi_vc1: Constify pointers
(cherry picked from commit 01d6f84f49)
2017-01-17 23:06:45 +00:00
Mark Thompson
6bc2808c41 vaapi_mpeg2: Constify pointers
(cherry picked from commit ee9061293e)
2017-01-17 23:06:45 +00:00
Mark Thompson
d0897da924 vaapi_h264: Constify pointers
(cherry picked from commit 03adfe9130)
2017-01-17 23:06:45 +00:00
Matthieu Bouron
bdbbb8f11e Merge commit 'f450cc7bc595155bacdb9f5d2414a076ccf81b4a'
* commit 'f450cc7bc595155bacdb9f5d2414a076ccf81b4a':
  h264: eliminate decode_postinit()

Also includes fixes from 1f7b4f9abc and e344e65109.

Original patch replace H264Context.next_output_pic (H264Picture *) by
H264Context.output_frame (AVFrame *). This change is discarded as it
is incompatible with the frame reconstruction and motion vectors
display code which needs the extra information from the H264Picture.

Merged-by: Clément Bœsch <u@pkh.me>
Merged-by: Matthieu Bouron <matthieu.bouron@gmail.com>
2017-01-17 14:38:48 +01:00
Clément Bœsch
9561de4183 lavc/h264dec: reconstruct and debug flush frames as well 2017-01-16 10:43:41 +01:00
Clément Bœsch
bd520e8569 lavc/h264_slice: drop redundant current_slice reset
It is done unconditionally in ff_h264_field_end()
2017-01-16 10:43:41 +01:00
Clément Bœsch
a91c265f39 lavc/pthread_frame: protect read state access in setup finish function 2017-01-16 10:43:41 +01:00
Paul B Mahol
40cf943714 avcodec: add SIPR parser
Fixes #2056.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-16 10:24:01 +01:00
Steve Lhomme
8fb4865901 dxva2: allow an empty array of ID3D11VideoDecoderOutputView
We can pick the correct slice index directly from the ID3D11VideoDecoderOutputView
casted from data[3].

Also added myself as maintainer for DXVA2 and D3D11VA.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-16 02:54:04 +01:00
Steve Lhomme
153b36fc62 dxva2: get the slice number directly from the surface in D3D11VA
No need to loop through the known surfaces, we'll use the requested surface
anyway.

The loop is only done for DXVA2.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-16 02:54:04 +01:00
Steve Lhomme
77742c75c5 dxva2: use a single macro to test if the DXVA context is valid
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-16 02:54:04 +01:00
Daniil Cherednik
c2500d62c6 dcaenc: Implementation of Huffman codes for DCA encoder
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-01-15 18:17:12 +00:00
Daniil Cherednik
a6191d098a dcaenc: Reverse data layout to prevent data copies during Huffman encoding introduction
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
2017-01-15 18:16:31 +00:00
Martin Storsjö
0ba0187535 aarch64: vp9mc: Fix a comment to refer to a register with the right name
This is cherrypicked from libav commit
85ad5ea72c.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:43 +01:00
Martin Storsjö
02cfb9a16e aarch64: vp9dsp: Fix vertical alignment in the init file
This is cherrypicked from libav commit
65074791e8.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:40 +01:00
Martin Storsjö
656d910981 arm: vp9mc: Fix vertical alignment of operands
This is cherrypicked from libav commit
c536e5e869.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:37 +01:00
Martin Storsjö
8b11a89c06 aarch64: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

vp9_inv_dct_dct_16x16_sub16_add_neon:   1373.2
vp9_inv_dct_dct_32x32_sub32_add_neon:   8089.0

By skipping individual 8x16 or 8x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     235.3
vp9_inv_dct_dct_16x16_sub2_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub8_add_neon:    1036.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   1372.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   1372.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     555.1
vp9_inv_dct_dct_32x32_sub2_add_neon:    5190.2
vp9_inv_dct_dct_32x32_sub4_add_neon:    5180.0
vp9_inv_dct_dct_32x32_sub8_add_neon:    5183.1
vp9_inv_dct_dct_32x32_sub12_add_neon:   6161.5
vp9_inv_dct_dct_32x32_sub16_add_neon:   6155.5
vp9_inv_dct_dct_32x32_sub20_add_neon:   7136.3
vp9_inv_dct_dct_32x32_sub24_add_neon:   7128.4
vp9_inv_dct_dct_32x32_sub28_add_neon:   8098.9
vp9_inv_dct_dct_32x32_sub32_add_neon:   8098.8

I.e. in general a very minor overhead for the full subpartition case due
to the additional cmps, but a significant speedup for the cases when we
only need to process a small part of the actual input data.

This is cherrypicked from libav commits
cad42fadcd and
a0c443a398.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:32 +01:00
Martin Storsjö
388f6e6715 arm: vp9itxfm: Skip empty slices in the first pass of idct_idct 16x16 and 32x32
This work is sponsored by, and copyright, Google.

Previously all subpartitions except the eob=1 (DC) case ran with
the same runtime:

                                     Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub16_add_neon:   3188.1   2435.4   2499.0   1969.0
vp9_inv_dct_dct_32x32_sub32_add_neon:  18531.7  16582.3  14207.6  12000.3

By skipping individual 4x16 or 4x32 pixel slices in the first pass,
we reduce the runtime of these functions like this:

vp9_inv_dct_dct_16x16_sub1_add_neon:     274.6    189.5    211.7    235.8
vp9_inv_dct_dct_16x16_sub2_add_neon:    2064.0   1534.8   1719.4   1248.7
vp9_inv_dct_dct_16x16_sub4_add_neon:    2135.0   1477.2   1736.3   1249.5
vp9_inv_dct_dct_16x16_sub8_add_neon:    2446.7   1828.7   1993.6   1494.7
vp9_inv_dct_dct_16x16_sub12_add_neon:   2832.4   2118.3   2266.5   1735.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.7   2475.3   2523.5   1983.1
vp9_inv_dct_dct_32x32_sub1_add_neon:     756.2    456.7    862.0    553.9
vp9_inv_dct_dct_32x32_sub2_add_neon:   10682.2   8190.4   8539.2   6762.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10813.5   8014.9   8518.3   6762.8
vp9_inv_dct_dct_32x32_sub8_add_neon:   11859.6   9313.0   9347.4   7514.5
vp9_inv_dct_dct_32x32_sub12_add_neon:  12946.6  10752.4  10192.2   8280.2
vp9_inv_dct_dct_32x32_sub16_add_neon:  14074.6  11946.5  11001.4   9008.6
vp9_inv_dct_dct_32x32_sub20_add_neon:  15269.9  13662.7  11816.1   9762.6
vp9_inv_dct_dct_32x32_sub24_add_neon:  16327.9  14940.1  12626.7  10516.0
vp9_inv_dct_dct_32x32_sub28_add_neon:  17462.7  15776.1  13446.2  11264.7
vp9_inv_dct_dct_32x32_sub32_add_neon:  18575.5  17157.0  14249.3  12015.1

I.e. in general a very minor overhead for the full subpartition case due
to the additional loads and cmps, but a significant speedup for the cases
when we only need to process a small part of the actual input data.

In common VP9 content in a few inspected clips, 70-90% of the non-dc-only
16x16 and 32x32 IDCTs only have nonzero coefficients in the upper left
8x8 or 16x16 subpartitions respectively.

This is cherrypicked from libav commit
9c8bc74c2b.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:30 +01:00
Martin Storsjö
ecd343aa1f arm: vp9itxfm: Only reload the idct coeffs for the iadst_idct combination
This avoids reloading them if they haven't been clobbered, if the
first pass also was idct.

This is similar to what was done in the aarch64 version.

This is cherrypicked from libav commit
3c87039a40.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:27 +01:00
Martin Storsjö
37cb224e3e aarch64: vp9itxfm: Don't repeatedly set x9 when nothing overwrites it
This is cherrypicked from libav commit
2f99117f6f.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:25 +01:00
Martin Storsjö
f69dd26df5 arm: vp9itxfm: Rename a macro parameter to fit better
Since the same parameter is used for both input and output,
the name inout is more fitting.

This matches the naming used below in the dmbutterfly macro.

This is cherrypicked from libav commit
79566ec8c7.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:21 +01:00
Martin Storsjö
4a5874ea8d arm/aarch64: vp9itxfm: Fix indentation of macro arguments
This is cherrypicked from libav commit
721bc37522.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:19 +01:00
Martin Storsjö
a95e7de41d aarch64: vp9itxfm: Use w3 instead of x3 for the int eob parameter
The clobbering tests in checkasm are only invoked when testing
correctness, so this bug didn't show up when benchmarking the
dc-only version.

This is cherrypicked from libav commit
4d960a1185.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:16 +01:00
Janne Grunau
a71cd8439f arm: vp9itxfm: Simplify the stack alignment code
This is one instruction less for thumb, and only have got
1/2 arm/thumb specific instructions.

This is cherrypicked from libav commit
e5b0fc170f.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:12 +01:00
Janne Grunau
cb220eeef9 aarch64: vp9: loop filter: replace 'orr; cbn?z' with 'adds; b.{eq,ne};
The latter is 1 cycle faster on a cortex-53 and since the operands are
bytewise (or larger) bitmask (impossible to overflow to zero) both are
equivalent.

This is cherrypicked from libav commit
e7ae8f7a71.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:10 +01:00
Janne Grunau
62ea07d797 aarch64: vp9: use alternative returns in the core loop filter function
Since aarch64 has enough free general purpose registers use them to
branch to the appropiate storage code. 1-2 cycles faster for the
functions using loop_filter 8/16, ... on a cortex-a53. Mixed results
(up to 2 cycles faster/slower) on a cortex-a57.

This is cherrypicked from libav commit
d7595de0b2.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 21:13:06 +01:00
Paul B Mahol
743052ec5b avcodec/cinepakenc: remove CVID from long description
Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-14 16:56:47 +01:00
Martin Vignali
31e722e9da libavcodec/psd : add test for channel depth/channel count in bitmap mode
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-01-14 04:52:43 +01:00
Paul B Mahol
2eaee6e79b avcodec/qdrw: skip long comment for now
Fixes part of #5918.

Signed-off-by: Paul B Mahol <onemda@gmail.com>
2017-01-13 21:19:17 +01:00
Steinar H. Gunderson
d68d7198be speedhq: Align blocks variable properly.
Seemingly ff_clear_block_sse assumed that the block array is aligned,
so make sure it is.

Fixes ticket #6079

Signed-off-by: James Almer <jamrial@gmail.com>
2017-01-13 16:47:53 -03:00