Commit Graph

21401 Commits

Author SHA1 Message Date
Martin Storsjö
0331c3f5e8 arm: vp9itxfm: Make the larger core transforms standalone functions
This work is sponsored by, and copyright, Google.

This reduces the code size of libavcodec/arm/vp9itxfm_neon.o from
15324 to 12388 bytes.

This gives a small slowdown of a couple tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to add more optimized versions of these transforms.

Before:                              Cortex A7       A8       A9      A53
vp9_inv_dct_dct_16x16_sub4_add_neon:    2063.4   1516.0   1719.5   1245.1
vp9_inv_dct_dct_16x16_sub16_add_neon:   3279.3   2454.5   2525.2   1982.3
vp9_inv_dct_dct_32x32_sub4_add_neon:   10750.0   7955.4   8525.6   6754.2
vp9_inv_dct_dct_32x32_sub32_add_neon:  18574.0  17108.4  14216.7  12010.2

After:
vp9_inv_dct_dct_16x16_sub4_add_neon:    2060.8   1608.5   1735.7   1262.0
vp9_inv_dct_dct_16x16_sub16_add_neon:   3211.2   2443.5   2546.1   1999.5
vp9_inv_dct_dct_32x32_sub4_add_neon:   10682.0   8043.8   8581.3   6810.1
vp9_inv_dct_dct_32x32_sub32_add_neon:  18522.4  17277.4  14286.7  12087.9

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-09 12:31:40 +02:00
Martin Storsjö
57ec83e424 omx: Use the EOS flag to handle flushing at the end
This avoids having to count the number of frames sent to the codec
and the number of output packets received; instead just wait until
the encoder returns a buffer with the EOS flag set.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-08 11:50:57 +02:00
Diego Biurrun
a25dac976a Use bitstream_init8() where appropriate 2017-02-07 18:27:21 +01:00
Alexandra Hájková
f7ec7f546f wma: Convert to the new bitstream reader 2017-02-06 15:13:34 +01:00
Martin Storsjö
58d87e0f49 aarch64: vp9itxfm: Restructure the idct32 store macros
This avoids concatenation, which can't be used if the whole macro
is wrapped within another macro.

This is also arguably more readable.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-05 13:05:32 +02:00
Martin Storsjö
3bc5b28d5a arm: vp9itxfm: Avoid .irp when it doesn't save any lines
This makes it more readable.

Signed-off-by: Martin Storsjö <martin@martin.st>
2017-02-05 12:59:19 +02:00
Diego Biurrun
7abdd026df asm: Consistently uppercase SECTION markers 2017-02-03 11:37:53 +01:00
Alexandra Hájková
c29da01ac9 svq3: Convert to the new bitstream reader 2017-02-02 17:06:17 +01:00
wm4
577326d430 lavc: deprecate refcounted_frames field
No deprecation guards, because the old decode API (for which this field
is needed) doesn't have any either.

This field should be removed together with the old decode calls.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-02-01 10:47:46 +01:00
Anton Khirnov
fd9212f2ed Mark some arrays that never change as const. 2017-02-01 10:42:59 +01:00
Alexandra Hájková
ab2539bd37 ffv1: Convert to the new bitstream reader 2017-01-31 17:54:11 +01:00
Alexandra Hájková
2d72219554 h261dec: Convert to the new bitstream reader 2017-01-31 17:54:11 +01:00
Alexandra Hájková
2b94ed12de shorten: Convert to the new bitstream reader 2017-01-31 17:54:11 +01:00
Alexandra Hájková
5a6da49dd0 ralf: Convert to the new bitstream reader 2017-01-31 17:54:11 +01:00
Alexandra Hájková
d85b37a955 loco: Convert to the new bitstream reader 2017-01-31 17:54:10 +01:00
Alexandra Hájková
0f94de8a09 fic: Convert to the new bitstream reader 2017-01-31 17:54:10 +01:00
Alexandra Hájková
6b1f559f9a dirac: Convert to the new bitstream reader 2017-01-31 17:54:10 +01:00
Alexandra Hájková
ffc00df0a6 cavs: Convert to the new bitstream reader 2017-01-31 17:54:10 +01:00
Alexandra Hájková
0c89ff82e9 aic: Convert to the new bitstream reader 2017-01-31 17:54:10 +01:00
Diego Biurrun
d4c2103bd3 golomb: Convert to the new bitstream reader 2017-01-31 17:46:19 +01:00
Andreas Cadhalpun
612cc07128 pgssubdec: reset rle_data_len/rle_remaining_len on allocation error
The code relies on their validity and otherwise can try to access a NULL
object->rle pointer, causing segmentation faults.

Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2017-01-31 09:35:54 +01:00
Mark Thompson
ca62236a89 vaapi_encode: Add VP8 support 2017-01-30 23:03:46 +00:00
Mark Thompson
ff35aa8ca4 vaapi_encode: Pass framerate parameters to driver
Only do this when building for a recent VAAPI version - initial
driver implementations were confused about the interpretation of the
framerate field, but hopefully this will be consistent everywhere
once 0.40.0 is released.
2017-01-30 22:52:54 +00:00
Mark Thompson
eddfb57210 vaapi_h264: Enable VBR mode
Default to using VBR when a target bitrate is set, unless the max rate
is also set and matches the target.  Changes to the Intel driver mean
that min_qp is also respected in this case, so set a codec default to
unset the value rather than using the current default inherited from
the MPEG-4 part 2 encoder.
2017-01-30 22:52:54 +00:00
Mark Thompson
f033ba470f vaapi_encode: Support VBR mode
This includes a backward-compatibility hack to choose CBR anyway on
old drivers which have no CBR support, so that existing programs will
continue to work their options now map to VBR.
2017-01-30 22:52:54 +00:00
Mark Thompson
ca6ae3b77a vaapi_encode: Add MPEG-2 support 2017-01-29 13:28:31 +00:00
Alexandra Hájková
381a4e31a6 tak: Convert to the new bitstream reader 2017-01-25 11:06:58 +01:00
Diego Biurrun
2e0e150144 magicyuv: Convert to the new bitstream reader 2017-01-25 10:38:43 +01:00
Diego Biurrun
b061f298f7 truemotion2rt: Convert to the new bitstream reader 2017-01-25 09:55:36 +01:00
Alexandra Hájková
e7f24c9ffc wavpack: Convert to the new bitstream reader 2017-01-25 09:55:35 +01:00
Alexandra Hájková
6668bc80b5 mpc: Convert to the new bitstream reader 2017-01-25 09:55:33 +01:00
Alexandra Hájková
fd8de7f2d8 dxtory: Convert to the new bitstream reader 2017-01-20 10:18:32 +01:00
Alexandra Hájková
4d49a4c550 apedec: Convert to the new bitstream reader 2017-01-20 10:18:32 +01:00
Anton Khirnov
b4a911c189 mpegvideoenc: make a table const 2017-01-19 09:52:21 +01:00
Anton Khirnov
296eff4d9d zmbvenc: get rid of a global table 2017-01-19 09:52:10 +01:00
Derek Buitenhuis
00b775dda2 hevc: Mark as having threadsafe init
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-01-19 09:51:15 +01:00
Alexandra Hájková
54dcd22885 als: Convert to the new bitstream reader 2017-01-17 09:52:11 +01:00
Luca Barbato
fb59f87ce7 nvenc: Explicitly push the cuda context on encoding
Make sure that NVENC does not misbehave if other cuda usages happen
in the application.
2017-01-17 07:37:12 +01:00
Alexandra Hájková
4795e4f61f alac: Convert to the new bitstream reader 2017-01-13 10:27:03 +01:00
Luca Barbato
f8f7ad758d qsv: Set the correct range for la_depth
Setting an invalid range for it makes the encoder behave inconsistently.
2017-01-13 08:42:10 +01:00
Anton Khirnov
1202b71269 theora: export cropping information instead of handling it internally 2017-01-12 16:29:17 +01:00
Anton Khirnov
c3e84820d6 h264dec: export cropping information instead of handling it internally 2017-01-12 16:29:12 +01:00
Anton Khirnov
4fded0480f h264dec: be more explicit in handling container cropping
The current condition can trigger in cases where it shouldn't, with
unexpected results.
Make sure that:
- container cropping is really based on the original dimensions from the
  caller
- those dimenions are discarded on size change

The code is still quite hacky and eventually should be deprecated and
removed, with the decision about which cropping is used delegated to the
caller.
2017-01-12 16:28:05 +01:00
Anton Khirnov
a02ae1c683 hevcdec: export cropping information instead of handling it internally 2017-01-12 16:27:56 +01:00
Anton Khirnov
019ab88a95 lavc: add an option for exporting cropping information to the caller
Also, add generic code for handling cropping, so the decoders can export
just the cropping size and not bother with the rest.
2017-01-12 16:24:15 +01:00
Anton Khirnov
b68e353136 qsvdec: do not sync PIX_FMT_QSV surfaces
Introducing enforced sync points in arbitrary places is bad for
performance. Since the vast majority of receiving code (QSV VPP or
encoders, retrieving frames through hwcontext) will do the syncing, this
change should not be visible to most callers. But bumping micro just in
case.

This is also consistent with what VAAPI hwaccel does.
2017-01-12 16:21:39 +01:00
Steve Lhomme
ac3c3ee678 dxva2: allow an empty array of ID3D11VideoDecoderOutputView
We can pick the correct slice index directly from the ID3D11VideoDecoderOutputView
casted from data[3].

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-01-12 16:19:13 +01:00
Steve Lhomme
f67235a28c dxva2: get the slice number directly from the surface in D3D11VA
No need to loop through the known surfaces, we'll use the requested surface
anyway.

The loop is only done for DXVA2.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2017-01-12 16:09:41 +01:00
Mark Thompson
89725a8512 vaapi_h264: Scale log2_max_pic_order_cnt_lsb with max_b_frames
Before this change, it was possible to overflow pic_order_cnt_lsb and
generate a stream with invalid POC numbering.  This makes sure that
the field is large enough that a single IDR B* P sequence uses fewer
than half the available POC lsb values.
2017-01-11 23:03:58 +00:00
Mark Thompson
a3c3a5eac2 vaapi_encode: Support forcing IDR frames via AVFrame.pict_type 2017-01-11 23:03:58 +00:00