FFmpeg

mirror of https://git.ffmpeg.org/ffmpeg.git synced 2024-10-19 13:03:26 +00:00

Author	SHA1	Message	Date
Paul B Mahol	921eb21b1d	avfilter/x86/vf_360: add most of >8 depth asm	2019-09-16 10:21:16 +02:00
James Almer	4857688732	x86/vf_v360: use a faster horizontal add in remap4_8bit_line_avx2 Signed-off-by: James Almer <jamrial@gmail.com>	2019-09-06 12:11:46 -03:00
James Almer	2200cf1aca	x86/vf_v360: make remap{1,2}_8bit_line_avx2 work on x86_32 Signed-off-by: James Almer <jamrial@gmail.com>	2019-09-06 11:11:45 -03:00
Paul B Mahol	058bbf48c6	avfilter/vf_v360: x86 SIMD for interpolations	2019-09-06 14:10:37 +02:00
Ruiling Song	98e419cbf5	avfilter/vf_convolution: add x86 SIMD for filter_3x3() Tested using a simple command (apply edge enhance): ./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \ -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128" \ -an -vframes 1000 -f null /dev/null The fps increase from 151 to 270 on my local machine. Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-08-07 14:31:28 +08:00
James Almer	b8f1542dcb	avfilter/vf_gblur: add missing preprocessor check Fixes compilation on x86_32 Signed-off-by: James Almer <jamrial@gmail.com>	2019-06-12 10:54:59 -03:00
Ruiling Song	83f9da7768	avfilter/vf_gblur: add x86 SIMD optimizations The horizontal pass gets ~2x performance with the patch under single thread. Tested overall performance using the command (avx2 enabled): ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null For single thread, the fps improves from 43 to 60, about 40%. For multi-thread, the fps improves from 110 to 130, about 20%. Signed-off-by: Ruiling Song <ruiling.song@intel.com>	2019-06-12 08:53:11 +08:00
Paul B Mahol	dcae5ba322	avfilter: add anlmdn filter x86 SIMD optimizations	2019-01-10 21:49:47 +01:00
James Almer	ef67af31ff	x86/af_afir: use three operand form for some instructions Fixes compilation with old yasm versions. Signed-off-by: James Almer <jamrial@gmail.com>	2019-01-03 23:36:19 -03:00
James Almer	5402c1886b	x86/af_afir: add ff_fcmul_add_avx() fcmul_add_c: 1228.8 fcmul_add_sse3: 334.3 fcmul_add_avx: 186.3 Tested on a Core i5 4460 @ 3.2GHz Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2019-01-03 10:12:19 -03:00
James Almer	82043dfd2e	avfilter/af_afir: split off fcmul_add into a DSP context Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2019-01-03 10:12:18 -03:00
James Almer	9b5bd665e1	x86/af_afir: fix processing the last element ff_fcmul_add_sse3() is now identical to the C version. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2019-01-03 10:12:18 -03:00
James Almer	3913d6f734	x86/scene_sad: fix link errors when HAVE_X86ASM is not defined Reviewed-by: Haihao Xiang <haihao.xiang@intel.com> Signed-off-by: James Almer <jamrial@gmail.com>	2018-11-21 22:26:07 -03:00
Paul B Mahol	c98a32e4ad	avfilter/vf_blend: add 10bit support	2018-11-15 14:44:24 +01:00
Philip Langdale	1096614c42	avfilter/vf_bwdif: Use common yadif frame management logic After adding field type management to the common yadif logic, we can remove the duplicate copy of that logic from bwdif.	2018-11-14 17:41:01 -08:00
Marton Balint	6c2a7a8e9a	avfilter/vf_framerate: factorize SAD functions which compute SAD for a whole frame Also add SIMD which works on lines because it is faster than calculating it on 8x8 blocks using pixelutils. Signed-off-by: Marton Balint <cus@passwd.hu>	2018-11-11 20:30:50 +01:00
Paul B Mahol	0f0d468fbc	avfilter/vf_overlay: exclude nv12/nv21 formats from x86 asm check They are yet to be supported. Signed-off-by: Paul B Mahol <onemda@gmail.com>	2018-05-03 09:22:28 +02:00
Paul B Mahol	6d7c63588c	avfilter/vf_overlay: add x86 SIMD Specifically for yuv444, yuv422, yuv420 format when main stream has no alpha, and alpha is straight. Signed-off-by: Paul B Mahol <onemda@gmail.com>	2018-05-02 23:58:21 +02:00
Vasile Toncu	9c01cdb94e	avfilter/vf_interlace: remove duplicate code with same functionality	2018-04-23 23:48:30 +02:00
Martin Vignali	f3df42e81d	avfilter/x86/vf_blend : add SIMD for 16 bit version of grainextract grainmerge average extremity negation	2018-04-05 21:46:16 +02:00
Martin Vignali	8eb0bb1108	avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8bit and 16 bit version	2018-04-05 21:46:11 +02:00
Martin Vignali	53a03b5c8c	avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64)	2018-02-24 21:44:19 +01:00
Martin Vignali	6c6c9d14a8	avfilter/x86/vf_blend : indent	2018-02-24 21:44:16 +01:00
Martin Vignali	7590d58b61	avfilter/x86/vf_blend : reorganize init in order to add 16 bit version	2018-02-24 21:44:13 +01:00
Martin Vignali	3a230ce5fa	avfilter/x86/vf_blend : add AVX2 version for each func except divide and optimize average, grainextract, multiply, screen, grain merge	2018-01-28 20:21:32 +01:00
Marton Balint	4d95c6d5d7	avfilter/vf_framerate: add SIMD functions for frame blending Blend function speedups on x86_64 Core i5 4460: ffmpeg -f lavfi -i allyuv -vf framerate=60:threads=1 -f null none C: 447548411 decicycles in Blend, 2048 runs, 0 skips SSSE3: 130020087 decicycles in Blend, 2048 runs, 0 skips AVX2: 128508221 decicycles in Blend, 2048 runs, 0 skips ffmpeg -f lavfi -i allyuv -vf format=yuv420p12,framerate=60:threads=1 -f null none C: 228932745 decicycles in Blend, 2048 runs, 0 skips SSE4: 123357781 decicycles in Blend, 2048 runs, 0 skips AVX2: 121215353 decicycles in Blend, 2048 runs, 0 skips Signed-off-by: Marton Balint <cus@passwd.hu>	2018-01-28 18:50:52 +01:00
Martin Vignali	b94cd55155	avfilter/x86/vf_interlace : add AVX2 version	2018-01-11 21:03:19 +01:00
James Almer	8e0e4384b0	Revert "avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16" This reverts commits `1a5865b6dc` and `8fb1d63d91`. They made fate interlace tests fail when AVX2 was used. Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-19 19:04:25 -03:00
Martin Vignali	3df6e61dad	avfilter/x86/vf_hflip : indent based on patch by Paul B Mahol	2017-12-19 21:10:12 +01:00
Martin Vignali	f181648176	avfilter/x86/vf_hflip : add avx2 version for hflip_byte and hflip_short	2017-12-19 21:10:09 +01:00
Martin Vignali	a4a4179e83	avfilter/x86/vf_hflip : merge hflip byte and hflip short to one macro	2017-12-19 21:10:05 +01:00
Martin Vignali	8fb1d63d91	avfilter/vf_tinterlace : add AVX2 func for lowpass_line 8 and 16	2017-12-19 20:59:59 +01:00
Martin Vignali	1a5865b6dc	avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16	2017-12-19 20:59:54 +01:00
Martin Vignali	d31770d9a6	avfilter/vf_interlace : move func init in ff_interlace_init and add depth arg for ff_interlace_init_x86	2017-12-19 20:59:47 +01:00
Martin Vignali	3c6dc27035	avfilter/x86/vf_interlace : fix crash when using unaligned data in low_pass complex related to ticket 6491	2017-12-15 11:28:29 +01:00
Martin Vignali	49dced9fd0	avfilter/x86/vf_interlace : avoid crash when data are unaligned ticket 6491	2017-12-15 11:28:25 +01:00
Martin Vignali	869efbf971	avfilter/x86/vf_threshold : add threshold16 SIMD (SSE4 and AVX2)	2017-12-09 14:47:09 +01:00
James Almer	f2aa0ce5a0	x86/vf_hflip: use xor to zero initialize registers Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-07 19:34:12 -03:00
James Almer	dc33fe1d00	x86/vf_hflip: don't load the width argument twice Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-07 19:34:12 -03:00
James Almer	cc2ba526d4	x86/vf_threshold: make threshold8 functions work on x86_32 Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-04 15:46:09 -03:00
Paul B Mahol	5ff0d2acae	avfilter/x86/vf_hflip.asm: fix building on x32 Signed-off-by: Paul B Mahol <onemda@gmail.com>	2017-12-04 15:08:43 +01:00
Paul B Mahol	86fda8be3f	avfilter: add hflip x86 SIMD Signed-off-by: Paul B Mahol <onemda@gmail.com>	2017-12-04 09:58:25 +01:00
James Almer	b73304f79e	x86/vf_threshold: use the PBLENDVB macro Fixes building with yasm Tested-by: stevenliu Signed-off-by: James Almer <jamrial@gmail.com>	2017-12-04 02:22:30 -03:00
Martin Vignali	6e3e696591	avfilter/x86/vf_threshold : cosmetic indent	2017-12-03 19:17:28 +01:00
Martin Vignali	9719d57b34	avfilter/x86/vf_threshold : add avx2 version for threshold 8	2017-12-03 19:17:23 +01:00
Martin Vignali	51345cb1d5	avfilter/x86/vf_threshold : make macro for threshold8 in order to add avx2 version	2017-12-03 19:17:19 +01:00
Paul B Mahol	bbfcb1b7c8	avfilter/vf_threshold: add x86 SIMD Signed-off-by: Paul B Mahol <onemda@gmail.com>	2017-12-02 14:58:56 +01:00
James Almer	2904db9045	Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2' * commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2': x86util: Port all macros to cpuflags See `d5f8a642f6` Merged-by: James Almer <jamrial@gmail.com>	2017-10-21 12:15:57 -03:00
Thomas Mundt	40bfaa190c	avfilter/interlace: add support for 10 and 12 bit Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Thomas Mundt <tmundt75@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	2017-09-23 16:19:58 -03:00
Thomas Mundt	a7f6bfdc18	avfilter/interlace: prevent over-sharpening with the complex low-pass filter The complex vertical low-pass filter slightly over-sharpens the picture. This becomes visible when several transcodings are cascaded and the error potentises, e.g. some generations of HD->SD SD->HD. To prevent this behaviour the destination pixel must not exceed the source pixel when the average of the pixels above and below is less than the source pixel. And the other way around. Tested and approved in a visual transcoding cascade test by video professionals. SSIM/PSNR test with the first generation of an HD->SD file as a reference against the 6th generation(3 x SD->HD HD->SD): Results without the patch: SSIM Y:0.956508 (13.615881) U:0.991601 (20.757750) V:0.993004 (21.551382) All:0.974405 (15.918463) PSNR y:31.838009 u:48.424280 v:48.962711 average:34.759466 min:31.699297 max:40.857847 Results with the patch: SSIM Y:0.970051 (15.236232) U:0.991883 (20.905857) V:0.993174 (21.658049) All:0.981290 (17.279202) PSNR y:34.412108 u:48.504454 v:48.969496 average:37.264644 min:34.310637 max:42.373392 Signed-off-by: Thomas Mundt <tmundt75@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2017-09-15 22:40:21 +02:00


269 Commits