Commit Graph

19 Commits

Author SHA1 Message Date
Michael Niedermayer
fb3c33f3cd Merge commit '4cb6964244fd6c099383d8b7e99731e72cc844b9'
* commit '4cb6964244fd6c099383d8b7e99731e72cc844b9':
  dcadec: simplify decoding of VQ high frequencies

Conflicts:
	configure
	libavcodec/dcadec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-28 21:41:19 +01:00
Christophe Gisquet
4cb6964244 dcadec: simplify decoding of VQ high frequencies
The vector dequantization has a test in a loop preventing effective SIMD
implementation. By moving it out of the loop, this loop can be DSPized.

Therefore, modify the current DSP implementation. In particular, the
DSP implementation no longer has to handle null loop sizes.

The decode_hf implementations have following timings:

For x86 Arrandale:
        C  SSE SSE2 SSE4
win32: 260 162  119  104
win64: 242 N/A   89   72

The arm NEON optimizations follow in a later patch as external asm. The
now unused check for the y modifier in arm inline asm is removed from
configure.
2014-02-28 13:03:22 +01:00
Michael Niedermayer
df98b36aa6 Merge commit '5c1c6e82261b856214499b9fef3a08bf3ff6e0ae'
* commit '5c1c6e82261b856214499b9fef3a08bf3ff6e0ae':
  dca: include dcadsp.h in {arm,x86}/dca.h for checkheaders

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 17:25:31 +01:00
Janne Grunau
5c1c6e8226 dca: include dcadsp.h in {arm,x86}/dca.h for checkheaders 2014-02-08 13:38:36 +01:00
Christophe Gisquet
481a46a462 dcadsp: add int8x8_fmul_int32 to DSP context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-08 00:55:42 +01:00
Christophe Gisquet
2bd44cb705 dcadsp: add int8x8_fmul_int32 to dsp context
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.

Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.

On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-07 22:51:59 +01:00
Thilo Borgmann
d814a839ac Reinstate proper FFmpeg license for all files. 2013-08-30 15:47:38 +00:00
Christophe Gisquet
b6293e2798 fmtconvert: Explicitly use int32_t instead of int
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-07-17 11:02:47 +03:00
Christophe Gisquet
f49564c607 fmtconvert: int32_t input to int32_to_float_fmul_scalar
It was previously declared as int.
Does not change fate results for x86.

Conflicts:

	libavcodec/ppc/fmtconvert_altivec.c

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-05-18 18:01:16 +02:00
Reimar Döffinger
8067f55edf Fix compilation on ARM with android gcc 4.7
With the current code it fails due to running out
of registers.
So code the store offsets manually into the assembler
instead.
Passes "make fate-dts".

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2013-04-15 09:04:07 +02:00
Michael Niedermayer
9b1a0c2ee8 Merge commit '76b19a3984359b3be44d4f7e4e69b7b86729a622'
* commit '76b19a3984359b3be44d4f7e4e69b7b86729a622':
  Fix a number of incorrect intmath.h #includes.
  avconv: remove an unused variable

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-26 12:56:55 +01:00
Diego Biurrun
76b19a3984 Fix a number of incorrect intmath.h #includes. 2013-02-26 00:51:34 +01:00
Michael Niedermayer
89c8eaa321 Merge commit '637606de2d2e0af0a9fa2f23f943765d7d7c5cd5'
* commit '637606de2d2e0af0a9fa2f23f943765d7d7c5cd5':
  configure: arm: make _inline arch ext symbols depend on inline_asm
  arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation

Conflicts:
	configure
	libavcodec/arm/dca.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-12-08 14:19:55 +01:00
Mans Rullgard
a7831d509f arm: use HAVE*_INLINE/EXTERNAL macros for conditional compilation
These macros reflect the actual capabilities required here.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-12-07 16:54:03 +00:00
Michael Niedermayer
a593f5bd9d arm/dca: Fix compilation of decode_blockcodes() with --enable-thumb
Fix Suggested-by: Reimar
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-09-16 19:06:11 +02:00
Mans Rullgard
f7de52354f ARM: dca: disable optimised decode_blockcodes() for old gcc
Old gcc versions have trouble compiling this function, and
no simple, targeted test is possible.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-15 01:02:58 +00:00
Mans Rullgard
00a856e3f9 dca: ARMv6 optimised decode_blockcode()
This is a hand-tuned version of the code with impossible parts of
the FASTDIV function ommitted.

2-5% faster overall on Cortex-A8.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-11-25 13:19:53 +00:00
Mans Rullgard
6308729e68 ARM: check for inline asm 'y' operand modifier support
The inline asm added in bf5d46d uses the 'y' modifier which
is only supported from gcc 4.5.  This check allows building
with older compilers.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-10-03 08:56:24 +01:00
Mans Rullgard
bf5d46d8e6 dca: NEON optimised high freq VQ decoding
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-09-30 19:01:23 +01:00