FFmpeg/libavcodec/x86
Christophe Gisquet 3e892b2bcd x86: hevc_mc: split differently calls
For some unusual widths, a function for a narrower width is called 2 or
3 times. Instead, perform 2 calls to functions for different widths to
split the workload.

The 8+16 and 4+8 widths, for 8 bits and for more than 8 bits
respectively, can't be processed that way without modifications: some
calls use unaligned buffers, and adding branches to handle this yielded
no micro-benchmark benefit.
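
The idea can be sketched in plain C. This is only an illustration under stated assumptions, not the code from hevcdsp_init.c: put_pixels_w, put_pixels12_repeat, put_pixels12_mixed and kernel_calls are hypothetical names, the kernel is a scalar copy standing in for the real SIMD routines, and the 3x4 versus 8+4 decomposition of width 12 is assumed for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts kernel invocations, purely for illustration. */
int kernel_calls;

/* Hypothetical stand-in for a fixed-width SIMD kernel: copies `width`
 * 8-bit pixels from each of `height` rows. */
void put_pixels_w(uint8_t *dst, ptrdiff_t dst_stride,
                  const uint8_t *src, ptrdiff_t src_stride,
                  int height, int width)
{
    assert(height > 0 && width > 0);
    kernel_calls++;
    for (int y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, src + y * src_stride, (size_t)width);
}

/* Assumed old scheme: repeat the width-4 kernel three times. */
void put_pixels12_repeat(uint8_t *dst, ptrdiff_t dst_stride,
                         const uint8_t *src, ptrdiff_t src_stride,
                         int height)
{
    for (int x = 0; x < 12; x += 4)
        put_pixels_w(dst + x, dst_stride, src + x, src_stride, height, 4);
}

/* Assumed new scheme: one width-8 call, then one width-4 call that
 * starts 8 pixels into the row. */
void put_pixels12_mixed(uint8_t *dst, ptrdiff_t dst_stride,
                        const uint8_t *src, ptrdiff_t src_stride,
                        int height)
{
    put_pixels_w(dst, dst_stride, src, src_stride, height, 8);
    put_pixels_w(dst + 8, dst_stride, src + 8, src_stride, height, 4);
}
```

Both wrappers write the same 12 pixels per row, but the mixed one enters the kernel twice instead of three times. The second call starts at a byte offset of 8 for 8-bit pixels, so a wider second kernel placed there would no longer sit on a 16-byte boundary, which appears to be the unaligned-buffer problem mentioned above.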

For block_w == 12 (around 1% of the pixels of the sequence):
Before:
12758 decicycles in epel_uni, 4093 runs, 3 skips
19389 decicycles in qpel_uni, 8187 runs, 5 skips
22699 decicycles in epel_bi, 32743 runs, 25 skips
34736 decicycles in qpel_bi, 32733 runs, 35 skips

After:
11929 decicycles in epel_uni, 4096 runs, 0 skips
18131 decicycles in qpel_uni, 8184 runs, 8 skips
20065 decicycles in epel_bi, 32750 runs, 18 skips
31458 decicycles in qpel_bi, 32753 runs, 15 skips
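
To read these figures as relative gains, the reduction per function can be computed from the numbers above (a decicycle is a tenth of a CPU cycle). The small helper below is a sketch with hypothetical names (struct timing, reduction_pct, print_reductions); it gives roughly 6.5% for epel_uni and qpel_uni, 11.6% for epel_bi and 9.4% for qpel_bi.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the benchmark: decicycles before and after the change. */
struct timing {
    const char *name;
    int before;
    int after;
};

/* Figures copied from the Before/After listings above. */
const struct timing timings[] = {
    { "epel_uni", 12758, 11929 },
    { "qpel_uni", 19389, 18131 },
    { "epel_bi",  22699, 20065 },
    { "qpel_bi",  34736, 31458 },
};

/* Relative reduction in percent: 100 * (before - after) / before. */
double reduction_pct(int before, int after)
{
    assert(before > 0);
    return 100.0 * (before - after) / before;
}

/* Print one line per function with its relative reduction. */
void print_reductions(void)
{
    for (size_t i = 0; i < sizeof timings / sizeof timings[0]; i++)
        printf("%-8s %5.1f%%\n", timings[i].name,
               reduction_pct(timings[i].before, timings[i].after));
}
```

Calling print_reductions() from a main function prints the four percentages. The run counts differ slightly between the two listings, so the comparison is indicative rather than exact.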

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-24 12:05:33 +02:00
ac3dsp_init.c
ac3dsp.asm
audiodsp_init.c x86/audiodsp: move asm code out of dsputil 2014-06-22 19:53:09 +02:00
audiodsp.asm x86/audiodsp: move asm code out of dsputil 2014-06-22 19:53:09 +02:00
blockdsp_init.c Merge commit '12f129e545e5a5844b6ad7f3eb6a438015cad8bc' 2014-07-05 19:50:05 +02:00
blockdsp.asm Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
bswapdsp_init.c Merge commit 'c67b449bebbe0b35c73b203683e77a0a649bc765' 2014-06-23 13:31:26 +02:00
bswapdsp.asm x86/dsputil: move put_signed_pixels_clamped out of bswapdsp.asm 2014-06-23 22:11:18 +02:00
cabac.h
cavsdsp.c Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac' 2014-07-25 13:05:08 +02:00
constants.c x86: sbrdsp/fft: reuse ps_neg constant 2014-08-06 19:25:08 +02:00
constants.h x86: sbrdsp/fft: reuse ps_neg constant 2014-08-06 19:25:08 +02:00
dcadsp_init.c
dcadsp.asm
dct32.asm x86/dct32: don't build ff_dct32_float_sse on x86_64 2014-06-09 00:51:43 +02:00
dct_init.c avcodec/x86/dct_init: fix build failure with clang && disable-optimizations 2014-06-09 19:32:41 +02:00
dct-test.c Merge commit 'a786c8259dafeca9744252230b5d78f67810770c' 2014-08-01 16:21:52 +02:00
deinterlace.asm
dirac_dwt.c Merge commit 'c23ce454b3e33634a188d6facfd2b7182af5af93' 2014-07-17 22:07:52 +02:00
dirac_dwt.h
diracdsp_mmx.c Merge commit 'c23ce454b3e33634a188d6facfd2b7182af5af93' 2014-07-17 22:07:52 +02:00
diracdsp_mmx.h
diracdsp_yasm.asm x86: diracdsp: reuse constants 2014-08-06 19:25:02 +02:00
dnxhdenc_init.c Merge commit '9e0b29911f1f167381a7dbdfca68bf417b8c767b' 2014-07-18 22:33:24 +02:00
dnxhdenc.asm
dwt_yasm.asm x86: dwt: better share constants 2014-08-06 19:24:57 +02:00
fdct.c Merge commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314' 2014-07-19 13:45:59 +02:00
fdct.h Merge commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314' 2014-07-19 13:45:59 +02:00
fdctdsp_init.c Merge commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314' 2014-07-19 13:45:59 +02:00
fft_init.c
fft.asm x86: sbrdsp/fft: reuse ps_neg constant 2014-08-06 19:25:08 +02:00
fft.h
flac_dsp_gpl.asm lavc/flacenc: partially unroll loop in flac_enc_lpc_16 2014-08-13 03:09:26 +02:00
flacdsp_init.c lavc/flacenc: add sse4 version of the 16-bit lpc encoder 2014-08-13 01:14:47 +02:00
flacdsp.asm
fmtconvert_init.c
fmtconvert.asm
fpel.asm
fpel.h x86: hpeldsp: propagate changes across codecs 2014-05-26 15:37:04 +02:00
h263_loopfilter.asm
h263dsp_init.c
h264_chromamc_10bit.asm
h264_chromamc.asm
h264_deblock_10bit.asm Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
h264_deblock.asm Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
h264_i386.h
h264_idct_10bit.asm
h264_idct.asm
h264_intrapred_10bit.asm x86: vpx/h264/hevc/mpeg2: share constants 2014-08-06 18:36:31 +02:00
h264_intrapred_init.c Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
h264_intrapred.asm Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
h264_qpel_8bit.asm
h264_qpel_10bit.asm avcodec/x86/h264_qpel_10bit: locally define pb_0 2014-06-24 02:13:43 +02:00
h264_qpel.c Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac' 2014-07-25 13:05:08 +02:00
h264_weight_10bit.asm
h264_weight.asm
h264chroma_init.c
h264dsp_init.c Merge commit '5ab03e41e553452118113d0c224fa32b325e45e5' 2014-06-26 02:58:59 +02:00
hevc_deblock.asm x86: vpx/h264/hevc/mpeg2: share constants 2014-08-06 18:36:31 +02:00
hevc_idct.asm x86/hevc_idct: add a colon to labels 2014-07-28 21:43:32 +02:00
hevc_mc.asm x86: hevc_mc: correct unneeded use of SSE4 code 2014-08-24 11:43:33 +02:00
hevc_res_add.asm x86/hevc_res_add: refactor ff_hevc_transform_add{16,32}_8 2014-08-21 15:01:33 -03:00
hevcdsp_init.c x86: hevc_mc: split differently calls 2014-08-24 12:05:33 +02:00
hevcdsp.h hevcdsp: remove more instances of compile-time-fixed parameters 2014-08-22 15:22:42 +02:00
hpeldsp_init.c x86: hpeldsp: propagate changes across codecs 2014-05-26 15:37:04 +02:00
hpeldsp_rnd_template.c
hpeldsp.asm x86: vpx/h264/hevc/mpeg2: share constants 2014-08-06 18:36:31 +02:00
hpeldsp.h x86: hpeldsp: propagate changes across codecs 2014-05-26 15:37:04 +02:00
huffyuvdsp_init.c Merge commit '009331303a6462d07cbe94aef9c446f1a1695519' 2014-07-05 19:11:26 +02:00
huffyuvdsp.asm x86util: add and use RSHIFT/LSHIFT macros 2014-06-15 13:19:27 +02:00
huffyuvencdsp_mmx.c Merge commit '512f3ffe9b4bb86767c2b1176554407c75fe1a5c' 2014-05-28 00:03:59 +02:00
idct_mmx_xvid.c Merge commit 'e3fcb14347466095839c2a3c47ebecff02da891e' 2014-07-01 15:22:11 +02:00
idct_sse2_xvid.c Merge commit 'd35b94fbabd8beb5d566c0b5d01688aff62c3b36' 2014-08-09 12:11:13 +02:00
idct_xvid.h
idctdsp_init.c Merge commit '84d173d3de97c753234ab0c0b50551d51413d663' 2014-08-08 22:17:04 +02:00
idctdsp_mmx.c Merge commit 'e3fcb14347466095839c2a3c47ebecff02da891e' 2014-07-01 15:22:11 +02:00
idctdsp.asm x86: rename dsputil.asm to idctdsp.asm 2014-07-02 01:08:04 +02:00
idctdsp.h Merge commit 'e3fcb14347466095839c2a3c47ebecff02da891e' 2014-07-01 15:22:11 +02:00
imdct36.asm
inline_asm.h x86: better share ff_pw_2 2014-08-06 19:24:49 +02:00
lossless_audiodsp_init.c apedsp: move to llauddsp 2014-06-05 20:31:59 +02:00
lossless_audiodsp.asm apedsp: move to llauddsp 2014-06-05 20:31:59 +02:00
lossless_videodsp_init.c rename add_hfyu_left_prediction_int16 to add_hfyu_left_pred_int16 2014-05-29 19:50:44 +02:00
lossless_videodsp.asm avcodec/x86/lossless_videodsp: Fix size of values read for left/left_top 2014-06-19 05:53:41 +02:00
lpc.c
Makefile x86: hevc: adding transform_add 2014-08-20 01:28:56 +02:00
mathops.h
me_cmp_init.c Merge commit '2d60444331fca1910510038dd3817bea885c2367' 2014-07-17 23:27:40 +02:00
me_cmp.asm Merge commit '2d60444331fca1910510038dd3817bea885c2367' 2014-07-17 23:27:40 +02:00
mlpdsp.c
mpegaudiodsp.c
mpegvideo.c Merge commit '835f798c7d20bca89eb4f3593846251ad0d84e4b' 2014-08-15 20:11:56 +02:00
mpegvideodsp.c Merge commit 'fab9df63a3156ffe1f9490aafaea41e03ef60ddf' 2014-06-23 21:10:10 +02:00
mpegvideoenc_qns_template.c Merge commit '8d686ca59db14900ad5c12b547fb8a7afc8b0b94' 2014-07-07 15:08:55 +02:00
mpegvideoenc_template.c Merge commit '85cabb8d002f2cd100ced5cc17d87bfc9460d314' 2014-07-19 13:45:59 +02:00
mpegvideoenc.c mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes 2014-08-15 01:26:33 -07:00
mpegvideoencdsp_init.c Merge commit '3c650efb81aaa3b395ba4606ee68a47ee4efb57b' 2014-07-07 16:17:27 +02:00
mpegvideoencdsp.asm Merge commit 'c166148409fe8f0dbccef2fe684286a40ba1e37d' 2014-07-07 15:36:58 +02:00
pixblockdsp_init.c avcodec: Change get_pixels() to ptrdiff_t linesize 2014-08-06 15:50:54 +02:00
pixblockdsp.asm avcodec: Change get_pixels() to ptrdiff_t linesize 2014-08-06 15:50:54 +02:00
pngdsp_init.c
pngdsp.asm
proresdsp_init.c Merge commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae' 2014-07-18 22:01:17 +02:00
proresdsp.asm
qpel.asm
qpeldsp_init.c Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac' 2014-07-25 13:05:08 +02:00
qpeldsp.asm Merge commit '368f50359eb328b0b9d67451f56fda20b3255f9a' 2014-05-30 02:43:34 +02:00
rnd_template.c
rv34dsp_init.c
rv34dsp.asm
rv40dsp_init.c Merge commit '7fb993d338d88f2f62e0a358b6c9f3eb9a3a08ac' 2014-07-25 13:05:08 +02:00
rv40dsp.asm Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
sbrdsp_init.c
sbrdsp.asm x86: sbrdsp/fft: reuse ps_neg constant 2014-08-06 19:25:08 +02:00
simple_idct.c Merge commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0' 2014-07-19 13:56:29 +02:00
simple_idct.h Merge commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0' 2014-07-19 13:56:29 +02:00
snowdsp.c Merge commit 'c23ce454b3e33634a188d6facfd2b7182af5af93' 2014-07-17 22:07:52 +02:00
svq1enc_init.c x86/svq1enc: port ssd_int8_vs_int16 to yasm 2014-07-05 21:43:40 +02:00
svq1enc.asm x86/svq1enc: use unaligned mov on SSE2 2014-07-06 20:27:57 +02:00
ttadsp_init.c
ttadsp.asm x86/ttadsp: remove an unnecessary mova 2014-08-12 12:29:05 +02:00
v210-init.c
v210.asm
vc1dsp_init.c
vc1dsp_mmx.c
vc1dsp.asm
vc1dsp.h
videodsp_init.c Revert "x86/videodsp: add emulated_edge_mc_mmxext" 2014-06-28 05:39:07 +02:00
videodsp.asm Revert "x86/videodsp: add emulated_edge_mc_mmxext" 2014-06-28 05:39:07 +02:00
vorbisdsp_init.c
vorbisdsp.asm
vp3dsp_init.c
vp3dsp.asm
vp6dsp_init.c
vp6dsp.asm
vp8dsp_init.c Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
vp8dsp_loopfilter.asm Merge commit '79793f833784121d574454af4871866576c0749d' 2014-07-01 15:43:40 +02:00
vp8dsp.asm x86: vpx/h264/hevc/mpeg2: share constants 2014-08-06 18:36:31 +02:00
vp9dsp_init.c x86/vp9: inital AVX2 intra_pred 2014-06-08 02:37:20 +02:00
vp9intrapred.asm vp9/x86: fix bug in intra_pred_hd_32x32. 2014-08-12 13:11:21 +02:00
vp9itxfm.asm x86: vpx/h264/hevc/mpeg2: share constants 2014-08-06 18:36:31 +02:00
vp9lpf.asm x86: vpx/h264/hevc/mpeg2: share constants 2014-08-06 18:36:31 +02:00
vp9mc.asm x86: vpx/h264/hevc/mpeg2: share constants 2014-08-06 18:36:31 +02:00
vp56_arith.h
w64xmmtest.c
xvididct_init.c Merge commit '84d173d3de97c753234ab0c0b50551d51413d663' 2014-08-08 22:17:04 +02:00