FFmpeg/libavcodec/x86
Ronald S. Bultje 6341838f3c
Use word-writing instead of dword-writing (with two cached but otherwise
unchanged bytes) in the horizontal simple loopfilter. This makes the filter
quite a bit faster in itself (~30 cycles less on Core1), probably mostly
because we don't need a complex 4x4 transpose, but only a simple byte
interleave. Also allows using pextrw on SSE4, which speeds up even more
(e.g. 25% faster on Core i7).

Originally committed as revision 24638 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-31 23:13:15 +00:00
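
The write-back change described in the commit message above can be pictured with a scalar C model. This is only a sketch, not the code in vp8dsp.asm: the helper name `store_p0q0_words`, its parameters, and the assumption that the two modified pixels are the pair straddling the edge (called p0 and q0 here) are all illustrative. Each row gets one 2-byte store of the interleaved pair, so the neighbouring bytes never need to be cached and rewritten.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not FFmpeg code: scalar model of a word-sized
 * write-back across a vertical edge. `edge` points at the first pixel to
 * the right of the edge (assumed to be q0) in row 0, so p0 is edge[-1].
 * Only p0 and q0 are written; edge[-2] and edge[1] are left untouched. */
static void store_p0q0_words(uint8_t *edge, ptrdiff_t stride,
                             const uint8_t *p0, const uint8_t *q0, int rows)
{
    assert(edge != NULL && p0 != NULL && q0 != NULL);
    for (int y = 0; y < rows; y++) {
        /* Byte interleave: pair row y's p0 and q0 in memory order. */
        uint8_t pair[2] = { p0[y], q0[y] };
        /* One 2-byte (word) store per row instead of a 4-byte (dword) one. */
        memcpy(edge + y * stride - 1, pair, sizeof(pair));
    }
}
```

In a SIMD version the interleave would presumably be a byte unpack of the p0 and q0 registers, with each 16-bit lane then extracted and stored per row; the commit message names pextrw for that step on SSE4.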
cavsdsp_mmx.c
cpuid.c
dct32_sse.c
deinterlace.asm Convert deinterlacing MMX code to YASM 2010-07-31 14:50:51 +00:00
dnxhd_mmx.c
dsputil_h264_template_mmx.c
dsputil_h264_template_ssse3.c
dsputil_mmx_avg_template.c
dsputil_mmx_qns_template.c
dsputil_mmx_rnd_template.c
dsputil_mmx.c
dsputil_mmx.h Convert deinterlacing MMX code to YASM 2010-07-31 14:50:51 +00:00
dsputil_yasm.asm
dsputilenc_mmx.c
fdct_mmx.c
fft_3dn2.c
fft_3dn.c
fft_mmx.asm
fft_sse.c
fft.c
fft.h
h264_deblock_sse2.asm
h264_i386.h
h264_idct_sse2.asm
h264_intrapred.asm
h264dsp_mmx.c
idct_mmx_xvid.c
idct_mmx.c
idct_sse2_xvid.c
idct_xvid.h
lpc_mmx.c
Makefile Convert deinterlacing MMX code to YASM 2010-07-31 14:50:51 +00:00
mathops.h
mlpdsp.c
motion_est_mmx.c
mpegaudiodec_mmx.c
mpegvideo_mmx_template.c
mpegvideo_mmx.c
rv40dsp_mmx.c
simple_idct_mmx.c
snowdsp_mmx.c
vc1dsp_mmx.c
vc1dsp_yasm.asm
vp3dsp_mmx.c
vp3dsp_mmx.h
vp3dsp_sse2.c
vp3dsp_sse2.h
vp6dsp_mmx.c
vp6dsp_mmx.h
vp6dsp_sse2.c
vp6dsp_sse2.h
vp8dsp-init.c Use word-writing instead of dword-writing (with two cached but otherwise 2010-07-31 23:13:15 +00:00
vp8dsp.asm Use word-writing instead of dword-writing (with two cached but otherwise 2010-07-31 23:13:15 +00:00
vp56_arith.h
x86inc.asm
x86util.asm