Commit Graph

7 Commits

Author SHA1 Message Date
Janne Grunau
9e12002f11 rv34: add NEON rv34_idct_add
Overall almost 4% faster, idct_add down from 350 to 85 cycles, idct_dc_add
down from 83 to 30 cycles.

squash: rv34 idct rearrange partial register loads
2012-01-16 19:26:41 +01:00
Christophe GISQUET
9ba9c34024 rv34: 1-pass inter MB reconstruction
Implement 1-pass inverse transform and reconstruction for inter blocks.
2012-01-16 19:26:41 +01:00
Mans Rullgard
81dc6a2a3c ARM: rv34: fix asm syntax in dc transform functions
Signed-off-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2012-01-12 22:11:13 +01:00
Janne Grunau
e1e369049e rv34: NEON optimised dc only inverse transform
30-50% faster than the C implementation, 0.5% overall speedup on
bourne.rmvb.
2012-01-12 18:33:55 +01:00
Christophe GISQUET
98f24ecd6c rv34: joint coefficient decoding and dequantization
Perform dequantization while decoding coefficients instead of performing it
on the entire coefficients buffer.

Since quantized coefficients are very sparse, this usually causes a small
speedup. Speedup of around 1% on Panda board compared to the removed here
neon code. Global speedup is probably around 3%.

Signed-off-by: Kostya Shishkov <kostya.shishkov@gmail.com>
2012-01-04 10:30:01 +01:00
Mans Rullgard
4722a03c75 rv34: NEON optimised 4x4 dequant
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-13 12:06:21 +00:00
Janne Grunau
42d32cf53c rv34: NEON optimised inverse transform functions
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-12-06 13:48:24 +00:00