Commit Graph

7 Commits

Author SHA1 Message Date
Mirjana Vulin
8d2eb5fe58 mips: optimization for float aac decoder (sbr module)
Signed-off-by: Mirjana Vulin <mvulin@mips.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-21 22:43:08 +01:00
Janne Grunau
f101eab1be x86: call most of the x86 dsp init functions under if (ARCH_X86)
Rename the called dsp init functions to *_init_x86.
2012-10-08 11:54:05 +02:00
Christophe GISQUET
dabf8dd34a SBR DSP: unroll sum_square
The length is even, so some unrolling can be performed. Timings are for x86:
- 32bits: 102c -> 82c
- 64bits:  82c -> 69c

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-03-07 10:29:52 -08:00
Christophe GISQUET
34454c761f SBR DSP x86: implement SSE sbr_sum_square_sse
The 32bits targets have been compiled with -mfpmath=sse for proper reference.
sbr_sum_square C  /32bits: 82c (unrolled)/102c
               C  /64bits: 69c (unrolled)/82c
               SSE/32bits: 42c
               SSE/64bits: 31c

Use of SSE4.1 dpps to perform the final sum is slower.
Not unrolling to perform 8 operations in a loop yields 10 more cycles.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:50:06 -08:00
Christophe GISQUET
2e74a5abc2 SBR DSP: use intptr_t for the ixh parameter.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:48:40 -08:00
Mans Rullgard
be822d77b6 aacsbr: ARM NEON optimised sbrdsp functions
Overall speedup of HE-AAC decoding 2.3x on Cortex-A8, 1.2x on A9.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-28 14:56:18 +00:00
Mans Rullgard
aac46e088d aacsbr: move some simdable loops to function pointers
This prepares for assembly optimisations by moving the most
time-consuming loops to functions called through pointers
in a new context.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-01-28 14:56:18 +00:00