Commit Graph

95 Commits

Author SHA1 Message Date
Jason Garrett-Glaser
1d37e908cf VP8: init one less near_mv
This one didn't actually need to be initialized.
(cherry picked from commit 891b1f15a7)
2011-02-18 19:52:41 +01:00
Jason Garrett-Glaser
8e624c1cee VP8: split out declarations to new header
(cherry picked from commit bcf4568f18)
2011-02-18 19:52:41 +01:00
Jason Garrett-Glaser
625e9309da VP8: faster MV clipping
(cherry picked from commit 7634771e70)
2011-02-18 19:52:40 +01:00
Reinhard Tartler
7ffe76e540 Merge libavcore into libavutil
Done to keep ABI compatible. Otherwise this is just silly
2011-02-16 23:00:30 +01:00
Mans Rullgard
4ae3ee4ae9 VP8: ARM optimised decode_block_coeffs_internal
Approximately 5% faster on Cortex-A8.

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit a7878c9f73)
2011-02-13 00:52:51 +01:00
Jason Garrett-Glaser
ccba4f4c25 VP8: optimized mv prediction and decoding
Merge find_near_mvs and mv bitstream decoding: don't do prediction steps
until absolutely necessary.
(cherry picked from commit f3d09d44b7)
2011-02-11 02:54:10 +01:00
Jason Garrett-Glaser
a1b0a3c8bd VP8: idct_mb optimizations
Currently uses AV_RL32 instead of AV_RL32A, as the latter doesn't exist yet.
(cherry picked from commit 62457f9052)
2011-02-09 03:33:55 +01:00
Jason Garrett-Glaser
e9266a2be0 VP8: slightly faster loopfilter sharpness logic
(cherry picked from commit 8a2c99b486)
2011-02-06 20:31:44 +01:00
Jason Garrett-Glaser
9efa368f19 VP8: faster deblock strength calculation
Convert hev_thresh logic to a LUT, simplify mbedge_lim calculation.
(cherry picked from commit 79dec1541b)
2011-02-06 20:31:44 +01:00
Jason Garrett-Glaser
c54ac7a8f2 VP8: faster filter_level clip
(cherry picked from commit a1b227bb53)
2011-02-06 20:31:43 +01:00
Jason Garrett-Glaser
8cde1b7997 VP8: simplify lf_delta mb mode logic
(cherry picked from commit dd18c9a050)
2011-02-06 20:31:43 +01:00
Jason Garrett-Glaser
5ad4335c22 VP8: merge chroma MC calls
Adds some duplicated code, but avoids duplicate edge checks and similar.
~0.5% faster overall on Parkjoy test sample.
(cherry picked from commit 64233e702a)
2011-02-02 03:40:49 +01:00
Jason Garrett-Glaser
a4257d74e0 Slightly simplify VP8 inter_predict
Merge an if and a switch.
(cherry picked from commit 73be29b0c4)
2011-01-31 18:25:45 +01:00
Ronald S. Bultje
d23e3e5fea Move ff_emulated_edge_mc() into DSPContext.
(cherry picked from commit 2e27959879)
2011-01-30 03:41:01 +01:00
Ronald S. Bultje
6642a17935 Fix VP8 aliasing problems.
Replace * (uint32_t *) buf accesses with AV_WN32A/AV_COPY32.
(cherry picked from commit 9d4bdcb714)
2011-01-30 03:40:59 +01:00
Diego Elio Pettenò
e7e2df27f8 Add ff_ prefix to data symbols of encoders, decoders, hwaccel, parsers, bsf.
None of these symbols should be accessed directly, so declare them as
hidden.

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit d36beb3f69)
2011-01-28 03:15:34 +01:00
Ronald S. Bultje
ac3c04e4d6 Don't do edge emulation unless the edge pixels will be used in MC.
Do not emulate larger edges than we will actually use for this round of
MC. Decoding goes from avg+SE 29.972+/-0.023sec to 29.856+/-0.023, i.e.
0.12sec or ~0.4% faster.
(cherry picked from commit 44002d8323)
2011-01-26 03:43:31 +01:00
Ronald S. Bultje
7148da489e Fix valgrind invalid read on top MB rows with CODEC_FLAG_EMU_EDGE set.
Originally committed as revision 26168 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-30 14:33:21 +00:00
Ronald S. Bultje
ee555de7dd Support CODEC_FLAG_EMU_EDGE in VP8 decoder.
Originally committed as revision 26117 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-28 17:37:19 +00:00
Stefano Sabatini
e16f217ceb Use new imgutils.h API names, fix deprecation warnings.
Originally committed as revision 25058 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-07 19:15:29 +00:00
Jason Garrett-Glaser
2b476e02e1 Remove some stray +s in VP8
Originally committed as revision 24791 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-13 02:02:07 +00:00
Pascal Massimino
aa93c52c21 remove b4_stride/mb_stride.
correct mb_xy to use mb_width.
tighten allocations.
reduce the amount of zeroing.

Originally committed as revision 24760 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-11 08:27:38 +00:00
Pascal Massimino
ccf13f9e20 fix over-allocation. confused b4_stride with mb_width.
Originally committed as revision 24758 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-11 05:24:19 +00:00
Stefano Sabatini
6ce9b4310c Remove use of the deprecated function avcodec_check_dimensions(), use
av_check_image_size() instead.

Originally committed as revision 24711 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-06 09:37:04 +00:00
Jason Garrett-Glaser
7e13022a4d VP8: fix bug in prefetch
Motion vectors in VP8 are qpel, not fullpel.

Originally committed as revision 24707 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-05 20:03:54 +00:00
Jason Garrett-Glaser
905ef0d064 VP5/6/8: eliminate CABAC dependency
Create a custom table for VP5/6/8's renorm to avoid depending on H.264's.
Saves one instruction in the arithmetic decoder as well.

Originally committed as revision 24701 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 23:04:05 +00:00
Jason Garrett-Glaser
1e73967950 VP8: partially inline decode_block_coeffs
Avoids a function call in the case of empty DCT blocks (most of the time).

Originally committed as revision 24691 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 02:23:25 +00:00
Jason Garrett-Glaser
ffbf0794f9 Fix 100L in r24689
Accidentally committed some timing code.

Originally committed as revision 24690 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 01:40:58 +00:00
Jason Garrett-Glaser
afb54a85c3 VP8: simplify decode_block_coeffs to avoid having to track nonzero coeffs
Slightly faster.

Originally committed as revision 24689 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-04 01:38:08 +00:00
Jason Garrett-Glaser
b0d5879513 VP8: slightly faster DCT coefficient probability update
Originally committed as revision 24687 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 23:21:47 +00:00
Jason Garrett-Glaser
476be414a4 VP8: make another RAC call branchy
1-2 clocks faster.

Originally committed as revision 24683 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 11:34:24 +00:00
Jason Garrett-Glaser
0908f1b945 VP8: unroll partition type decoding tree
~34% faster partition type decoding.

Originally committed as revision 24681 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 11:10:58 +00:00
Jason Garrett-Glaser
c5dec7f137 VP8: unroll splitmv decoding tree
Much faster splitmv mode decoding.

Originally committed as revision 24680 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 10:37:14 +00:00
Jason Garrett-Glaser
23117d69c1 VP8: unroll MB mode decoding tree
~50% faster MB mode decoding, plus eliminate a costly switch.

Originally committed as revision 24679 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 10:24:28 +00:00
Jason Garrett-Glaser
370b622a45 VP8: eliminate a dereference in coefficient decoding
Originally committed as revision 24671 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 22:48:38 +00:00
Jason Garrett-Glaser
f311208cf1 VP8: much faster DC transform handling
A lot of the time the DC block is empty: don't do the WHT in this case.
A lot of the rest of the time, there's only one coefficient: make a special
DC-only transform for that case.
When the block is empty, don't incorrectly mark luma DCT blocks as having DC
coefficients.

Originally committed as revision 24670 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 20:57:03 +00:00
Jason Garrett-Glaser
827d43bb9d VP8: move zeroing of luma DC block into the WHT
Lets us do the zeroing in asm instead of C.
Also makes it consistent with the way the regular iDCT code does it.

Originally committed as revision 24668 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 20:18:09 +00:00
Pascal Massimino
d2840fa49c only store intra prediction modes on the boundary for keyframes, not as a plane.
inter-frame behaviour unchanged.

Originally committed as revision 24664 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 09:44:53 +00:00
Jason Garrett-Glaser
10bf2eebbe VP8: simplify token_prob handling
~1.5% faster decode_block_coeffs

Originally committed as revision 24659 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 05:20:38 +00:00
Pascal Massimino
c22b4468a6 prevent access to vp8_coeff_band[16]
Originally committed as revision 24656 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-01 23:20:06 +00:00
Pascal Massimino
a8ab0cccf7 b0rk3d FATE + black helicopters hissing -> rolling back to r24556 and sleeping
Originally committed as revision 24559 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-27 23:09:13 +00:00
Pascal Massimino
62d1f7864e perform the clipping on luma_dc_qmul[1] and chroma_qmul[0] earlier
Originally committed as revision 24558 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-27 22:23:50 +00:00
Pascal Massimino
e7e81959d6 save some copies by moving some fields out of proba[2]
Originally committed as revision 24557 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-27 22:21:49 +00:00
Jason Garrett-Glaser
fca05ea8a0 VP8: add missing free
Fixes a tiny memory leak.

Originally committed as revision 24504 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 07:10:30 +00:00
Carl Eugen Hoyos
28e241de5d Fix r24445: Instead of needlessly initialising a variable, silence the warning.
Originally committed as revision 24498 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-25 14:49:45 +00:00
David Conrad
ca18a478e3 VP8: Inline traversing vp8_small_mvtree
Much faster read_mv_component, slightly faster overall

Originally committed as revision 24470 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:25 +00:00
David Conrad
7697cdcf95 VP8: Use vp56_rac_get_prob_branchy when the bit is only used by an if()
Originally committed as revision 24469 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:20 +00:00
David Conrad
fe1b5d974a Decode DCT tokens by branching to a different code path for each branch
on the huffman tree, instead of traversing the tree in a while loop.

Based on the similar optimization in libvpx's detokenize.c

10% faster at normal bitrates, and 30% faster for high-bitrate intra-only

Originally committed as revision 24468 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:17 +00:00
Jason Garrett-Glaser
13a1304bb3 Add myself to VP8 copyright and maintainers.
Also add Ronald to maintainers.

Originally committed as revision 24464 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:42:35 +00:00
Jason Garrett-Glaser
414ac27d8f VP8: always_inline some things to force gcc to do the right thing
Mostly seems to help in the MC code, which gets a hundred cycles faster.

Originally committed as revision 24463 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:36:21 +00:00