363 Commits

Author SHA1 Message Date
James Almer
621e2625e0 swscale/x86/output: add missing AVX2 support preprocessor wrappers
Fixes compilation with old yasm

Signed-off-by: James Almer <jamrial@gmail.com>
2020-08-20 15:14:56 -03:00
James Almer
ba3e771a42 x86/yuv2rgb: fix crashes when storing data on unaligned buffers
Regression since fc6a5883d6 on SSSE3 enabled
CPUs.

Fixes ticket #8747

Signed-off-by: James Almer <jamrial@gmail.com>
2020-07-14 14:06:04 -03:00
Nelson Gomez
bc01337db4 swscale/x86/output: add AVX2 version of yuv2nv12cX
256 bits is just wide enough to fit all the operands needed to vectorize
the software implementation, but AVX2 is needed for a couple of
instructions like cross-lane permutation.

Output is bit-for-bit identical to C.

Signed-off-by: Nelson Gomez <nelson.gomez@microsoft.com>
2020-06-14 16:34:07 +01:00
Ruiling Song
4700f7d6fc swscale/swscale: remove useless code
Signed-off-by: Ruiling Song <ruiling.song@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-04-03 00:58:07 +02:00
Ting Fu
828f7db5d9 libswscale/x86/yuv2rgb: Fix Segmentation Fault when load unaligned data
Fixes ticket #8532

Signed-off-by: Ting Fu <ting.fu@intel.com>
2020-02-26 11:10:46 +01:00
Ting Fu
fc6a5883d6 libswscale/x86/yuv2rgb: add ssse3 version
Tested using this command:
/ffmpeg -pix_fmt yuv420p -s 1920*1080 -i ArashRawYuv420.yuv \
-vcodec rawvideo -s 1920*1080 -pix_fmt rgb24 -f null /dev/null

The fps increased from 389 to 640 on an Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz

Signed-off-by: Ting Fu <ting.fu@intel.com>
2020-02-10 15:08:33 +01:00
Ting Fu
e934194b6a libswscale/x86/yuv2rgb: Change inline assembly into nasm code
The original inline assembly and the NASM code reach the same fps when run from the command line.
The NASM code has almost no impact on performance.

Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2020-02-05 17:41:59 +01:00
Andreas Rheinhardt
736c7c20e7 swscale/x86/swscale: Fix undefined left shifts of negative numbers
This affected many FATE-tests: The number of failing tests went down
from 663 to 344. (Both numbers exclude tests that failed because of
unaligned accesses in code that is inside #if HAVE_FAST_UNALIGNED.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2019-09-28 17:24:32 +02:00
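The class of bug named in the commit above can be illustrated with a small sketch. In C, left-shifting a negative signed value is undefined behaviour, while multiplying by the equivalent power of two is well defined as long as the result fits. The helper name `scale_pow2` is hypothetical and is not taken from the commit; the sketch assumes `bits` is below 31 and that the product does not overflow.

```c
#include <stdint.h>

/* Hypothetical helper (not from the FFmpeg source): scale a possibly
 * negative value by 2^bits.
 * Writing `v << bits` would be undefined behaviour for negative v;
 * the multiplication below is defined whenever the result fits in
 * int32_t (assumption: bits < 31 and no overflow). */
static int32_t scale_pow2(int32_t v, unsigned bits)
{
    return v * (int32_t)(1 << bits);
}
```

For example, `scale_pow2(-3, 4)` evaluates to -48 without relying on a shift of a negative operand.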
Philip Langdale
cd48318035 swscale: Add support for NV24 and NV42
The implementation is pretty straightforward. Most of the existing
NV12 codepaths work regardless of subsampling and are re-used as is.
Where necessary I wrote the slightly different NV24 versions.

Finally, the one thing that confused me for a long time was the
asm specific x86 path that did an explicit exclusion check for NV12.
I replaced that with a semi-planar check and also updated the
equivalent PPC code, which Lauri kindly checked.
2019-05-12 07:51:02 -07:00
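As a rough illustration of the semi-planar layout mentioned in the commit above: NV24 and NV42 store chroma in one interleaved plane at full resolution, in U,V and V,U order respectively. The sketch below packs one row of separate U and V samples into such a plane. The function name and the `swap_uv` parameter are hypothetical and do not come from libswscale.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libswscale function): interleave one row
 * of U and V samples into a semi-planar chroma row.
 * swap_uv == 0 gives NV24-style order (U, V, U, V, ...),
 * swap_uv != 0 gives NV42-style order (V, U, V, U, ...).
 * No subsampling is applied: one chroma pair is written per input sample. */
static void interleave_chroma_row(uint8_t *dst,
                                  const uint8_t *u, const uint8_t *v,
                                  size_t width, int swap_uv)
{
    for (size_t i = 0; i < width; i++) {
        dst[2 * i]     = swap_uv ? v[i] : u[i];
        dst[2 * i + 1] = swap_uv ? u[i] : v[i];
    }
}
```

The destination row must hold `2 * width` bytes, since chroma is not subsampled in these formats.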
Martin Vignali
658bbc0060 swscale/x86/rgb2rgb.asm : add Ivo Van Poorten name to the top of the file
suggested by Carl Eugen Hoyos
2018-10-18 21:43:19 +02:00
Martin Vignali
296609f859 swscale/x86/rgb2rgb : port shuffle 2103 mmxext to external asm and remove inline asm version
2018-10-13 14:12:41 +02:00
Martin Vignali
04afdbb560 swscale/x86/rgb2rgb : remove mmx version for shuffle2103
2018-10-13 14:12:36 +02:00
Sergey Lavrushkin
582bc5a348 libswscale: Adds conversions from/to float gray format.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2018-08-14 18:22:39 +02:00
Martin Vignali
07a566e7d6 swscale/swscale_unscaled : add X86_64 (SSE2 and AVX) for uyvyto422
and checkasm test
2018-04-22 19:15:32 +02:00
Martin Vignali
1ba5ca2d72 swscale/rgb : add X86 SIMD (SSSE3), for shuffle_bytes_1230, shuffle_bytes_3012, shuffle_bytes_3210
2018-03-24 20:22:08 +01:00
Martin Vignali
923a324174 swscale/rgb : add X86 SIMD (SSSE3) for shuffle_bytes_2103 and shuffle_bytes_0321
2018-03-24 20:21:58 +01:00
Thomas Köppe
43171a2a73 Fix missing used attribute for inline assembly variables
Variables used in inline assembly need to be marked with __attribute__((used)).
Static constants already were, via the define of DECLARE_ASM_CONST.
But DECLARE_ALIGNED does not add this attribute, and some of the variables
defined with it are const only used in inline assembly, and therefore
appeared dead. This change adds a macro DECLARE_ASM_ALIGNED that marks
variables as used.

This change makes FFmpeg work with Clang's ThinLTO.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2017-11-13 03:58:34 +01:00
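A minimal sketch of the idea described in the commit above, assuming a GCC- or Clang-compatible compiler. The macro name `MY_ASM_ALIGNED` and the constant `my_mask` are hypothetical; the real DECLARE_ASM_ALIGNED definition in FFmpeg may differ.

```c
#include <stdint.h>

/* Hypothetical macro modelled on the commit description: declare a
 * variable with a given alignment AND mark it as used, so that the
 * optimiser or LTO does not discard it when the only references are
 * inside inline assembly (which the compiler cannot see into). */
#define MY_ASM_ALIGNED(n, t, v) t __attribute__((used, aligned(n))) v

/* Example constant; the value is illustrative only. */
MY_ASM_ALIGNED(8, static const uint64_t, my_mask) = 0x00ff00ff00ff00ffULL;
```

Without the `used` attribute, a constant like this can look dead to the compiler and be removed, which then leaves the inline assembly referring to a symbol that no longer exists.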
James Almer
2904db9045 Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2'
* commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
  x86util: Port all macros to cpuflags

See d5f8a642f6

Merged-by: James Almer <jamrial@gmail.com>
2017-10-21 12:15:57 -03:00
Diego Biurrun
fd502f4f5f build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.

(Cherry-picked from libav commit 39e208f4d4)

Signed-off-by: James Almer <jamrial@gmail.com>
2017-06-21 17:00:29 -03:00
Diego Biurrun
994c4bc107 x86util: Port all macros to cpuflags
Also do some small cosmetic changes: Drop pointless _MMX suffix from ABSD2
macro name, drop pointless check for MMX support, we always assume MMX is
available in our SIMD code, fix spelling.
2017-03-14 17:23:32 +01:00
Diego Biurrun
39e208f4d4 build: Generalize yasm/nasm-related variable names
None of them are specific to the YASM assembler.
2017-03-01 10:18:15 +01:00
Andreas Cadhalpun
319438e2f2 swscale: save ebx register when it is not available
Configure checks if the ebx register can be used for asm and it has to
be saved if and only if this is not the case.
Without this the build fails when configuring with --toolchain=hardened
--disable-pic on i386 using gcc 4.8:
error: PIC register clobbered by '%ebx' in 'asm'

In that case gcc 4.8 reserves the ebx register for the GOT needed for
PIE, so it can't be used in asm directly.

Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
2016-12-21 01:26:10 +01:00
Michael Niedermayer
d736b52a04 swscale: Drop is9_OR_10BPS() use, its name is not correct
Found-by: Luca Barbato
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-11-10 00:33:12 +01:00
Michael Niedermayer
f59750641a swscale: x86: Add some forgotten 12-bit planar YUV cases
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-10-12 17:39:30 +02:00
Michael Niedermayer
328ea6a9a5 swscale: Add input support for 12-bit formats
Implemented for AV_PIX_FMT_GBRP12.

Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-10-12 17:39:30 +02:00
Luca Barbato
2b5b1e1e9b swscale: Rename is9_OR_10 to match what it does
It is used to select functions that work with 9-15 bits.
2016-09-27 18:48:30 +02:00
Luca Barbato
e87a501e7d swscale: Update bitdepth range check
Make sure the scaling functions for 9-15 bits are actually used for
all bit depths from 9 to 15.
2016-09-27 17:17:54 +02:00
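The range check that the two commits above describe can be sketched as a simple predicate. The function name is hypothetical, and the real swscale check operates on pixel format descriptors rather than a bare integer.

```c
/* Hypothetical predicate (not the actual swscale macro): returns 1 when
 * a component bit depth falls in the 9..15 range handled by the shared
 * scaling functions, and 0 otherwise (for example 8-bit or 16-bit). */
static int is_9_to_15_bps(int depth)
{
    return depth >= 9 && depth <= 15;
}
```

Under this check a 12-bit format is routed to the 9-15 bit functions, while 8-bit and 16-bit formats are not.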
Timo Rothenpieler
99882d05a6 swscale: add support for P010LE/BE output
2016-08-31 13:19:46 +02:00
Diego Biurrun
facdfe4080 swscale: Add proper ff_ prefix to init functions
They are internal symbols that should not be exported.

based on a patch by Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-07-16 17:38:37 +02:00
Matthieu Bouron
9eb3da2f99 asm: FF_-prefix internal macros used in inline assembly
See merge commit '39d6d3618d48625decaff7d9bdbb45b44ef2a805'.
2016-06-27 17:21:18 +02:00
Hendrik Leppkes
c142dc203e Merge commit 'dc40a70c5755bccfb1a1349639943e1f408bea50'
* commit 'dc40a70c5755bccfb1a1349639943e1f408bea50':
  Drop unnecessary libavutil/x86/asm.h #includes

Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
2016-06-26 15:53:00 +02:00
Clément Bœsch
8ef57a0d61 Merge commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb'
* commit '41ed7ab45fc693f7d7fc35664c0233f4c32d69bb':
  cosmetics: Fix spelling mistakes

Merged-by: Clément Bœsch <u@pkh.me>
2016-06-21 21:55:34 +02:00
Diego Biurrun
dc40a70c57 Drop unnecessary libavutil/x86/asm.h #includes
2016-05-28 19:18:26 +02:00
Diego Biurrun
1e9c5bf4c1 asm: FF_-prefix internal macros used in inline assembly
These macros conflict with system macros on Solaris, producing
truckloads of warnings about macro redefinition.
2016-05-28 19:18:26 +02:00
Vittorio Giovara
41ed7ab45f cosmetics: Fix spelling mistakes
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2016-05-04 18:16:21 +02:00
Diego Biurrun
0f40c90984 Drop pointless assert.h #includes
2016-05-03 15:45:10 +02:00
Pedro Arthur
6de58b4903 swscale: cleanup unused code
Removed previous swscale code under '#ifndef NEW_FILTER'
and removed unused fields of SwsContext
2016-03-31 16:36:16 -03:00
Michael Niedermayer
f6492a2ea8 swscale/x86/output: Fix yuv2planeX_16* with unaligned destination
Reviewed-by: BBB
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-02-17 04:47:51 +01:00
Michael Niedermayer
d07f6e5f1c swscale/x86/output: Move code into yuv2planeX_mainloop
Reviewed-by: BBB
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-02-17 04:47:34 +01:00
Derek Buitenhuis
21f9468402 avutil: Rename FF_CEIL_COMPAT to AV_CEIL_COMPAT
Libav, for some reason, merged this as a public API function. This will
aid in future merges.

A define is left for backwards compat, just in case some person
used it, since it is in a public header.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2016-01-27 16:36:46 +00:00
Michael Niedermayer
c8a9aaab26 swscale/x86/rgb2rgb_template: Fix planar2x() for short width
Fixes: 451b3e0cf956c0bd2f27ed753ac24050/asan_heap-oob_2873c01_3231_7ed10a9464d15f0d57277f5917c566a8.AVI

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2016-01-17 12:34:34 +01:00
Clément Bœsch
e8bc642202 lavu: add AV_CEIL_RSHIFT and use it in various places
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
2016-01-11 15:32:56 -05:00
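A right shift that rounds up, as the macro name in the commit above suggests, is commonly needed when computing the size of a subsampled chroma plane. The sketch below shows one way to write it. The function name is hypothetical, the actual AV_CEIL_RSHIFT definition may be implemented differently, and the sketch assumes a non-negative `a` and a small shift `b`.

```c
/* Hypothetical helper (not the FFmpeg macro itself): compute
 * ceil(a / 2^b) for non-negative a by adding (2^b - 1) before
 * shifting, so any remainder rounds the result up. */
static int ceil_rshift(int a, int b)
{
    return (a + (1 << b) - 1) >> b;
}
```

With a luma width of 1919 and a horizontal chroma shift of 1, a plain `1919 >> 1` gives 959 and would drop the last chroma column, whereas `ceil_rshift(1919, 1)` gives 960.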
Michael Niedermayer
a066ff89bc swscale/x86/rgb2rgb_template: Fallback to mmx in interleaveBytes() if the alignment is insufficient for SSE*
As a side effect, this also fixes the non-aligned case.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-15 03:06:33 +01:00
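The kind of check that selects between an aligned SSE path and a fallback can be sketched as follows. The function name is hypothetical and the sketch is not the code from the commit; it only shows a common way to test several pointers for 16-byte alignment at once.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if all three pointers are 16-byte
 * aligned, 0 otherwise. OR-ing the addresses together means a low bit
 * set in any of them survives into the combined value, so a single
 * mask test covers all pointers. */
static int all_aligned16(const void *src1, const void *src2, const void *dst)
{
    uintptr_t bits = (uintptr_t)src1 | (uintptr_t)src2 | (uintptr_t)dst;
    return (bits & 15) == 0;
}
```

A caller would take the aligned-load SSE path only when this returns 1, and otherwise fall back to a routine that tolerates unaligned buffers.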
Michael Niedermayer
80bfce35cc swscale/x86/rgb2rgb_template: Do not crash on misaligned stride
Fixes Ticket5013

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-12-15 02:32:23 +01:00
Hendrik Leppkes
5d8e836d0e Replace all remaining occurrences of step/depth_minus1 and offset_plus1
2015-09-08 17:10:48 +02:00
Pedro Arthur
62d176de12 swscale: refactor vertical scaler
2015-08-19 10:43:52 -03:00
Pedro Arthur
ed80dec621 swscale: fixed compiler warnings
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-08-18 22:56:50 +02:00
Pedro Arthur
e0a3173a94 swscale: refactor horizontal scaling
+ split color conversion from scaling
- disabled gamma correction, until it's refactored too

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-08-18 01:33:32 +02:00
Kevin Coyle
1262711388 YUV->BGR32 MMX support
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2015-07-04 00:03:45 +02:00
James Almer
e22edbfd41 swscale/x86/rgb2rgb_template: fix signedness of v in shuffle_bytes_2103_{mmx,mmxext}
Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2015-06-23 13:28:09 -03:00