Commit Graph

6315 Commits

Author SHA1 Message Date
James Almer
70c6b904be x86/intreadwrite: add missing casts to pointer arguments
Should make strict compilers happy.

Also, make AV_COPY128 use integer operations while at it. Removing the
inclusion of immintrin.h ensures a lot less intrinsic related headers are
included as well, which fixes a clash of defines with some Clang versions.

Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-11 18:24:26 -03:00
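As an illustration of what "integer operations" can mean for a 16-byte copy (70c6b904be), here is a minimal sketch. The helper name `copy128` is hypothetical and this is not FFmpeg's actual `AV_COPY128` macro; it only shows that two 64-bit integer moves are enough and that no intrinsics header is needed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not FFmpeg's macro: copy 16 bytes as two 64-bit
 * integers. memcpy() keeps the accesses free of aliasing and alignment
 * problems; compilers normally lower each call to a single load or store. */
static void copy128(void *dst, const void *src)
{
    uint64_t lo, hi;

    memcpy(&lo, src, 8);
    memcpy(&hi, (const uint8_t *)src + 8, 8);
    memcpy(dst, &lo, 8);
    memcpy((uint8_t *)dst + 8, &hi, 8);
}
```

Only `<stdint.h>` and `<string.h>` are pulled in, which is the kind of header reduction the commit message describes.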
Zhao Zhili
906b883e7b avutil/executor: Fix stack overflow due to recursive call
av_executor_execute runs the task directly when threads are disabled.
The task can schedule a new task by calling av_executor_execute, which
forms an implicit recursive call. This patch removes the recursive
call.
2024-07-11 20:26:23 +08:00
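One way to avoid this kind of implicit recursion (906b883e7b) is to queue the task and let a single outer loop drain the queue. The sketch below is an assumption about the general technique, not the actual libavutil executor; `Executor`, `executor_execute`, `countdown` and the fixed-size ring buffer are all hypothetical.

```c
#include <stddef.h>

#define MAX_PENDING 16

typedef struct Executor Executor;
typedef void (*TaskFn)(Executor *e, int arg);

struct Executor {
    TaskFn fn[MAX_PENDING];
    int    arg[MAX_PENDING];
    size_t head, tail;        /* ring-buffer read and write positions */
    int    draining;          /* non-zero while the loop below is active */
    int    depth, max_depth;  /* instrumentation for this sketch only */
};

/* Returns 0 on success, -1 if the pending queue is full. */
static int executor_execute(Executor *e, TaskFn fn, int arg)
{
    if (e->tail - e->head >= MAX_PENDING)
        return -1;
    e->fn[e->tail % MAX_PENDING]  = fn;
    e->arg[e->tail % MAX_PENDING] = arg;
    e->tail++;

    if (e->draining)
        return 0;  /* called from inside a task: the outer loop will run it */

    e->draining = 1;
    while (e->head != e->tail) {
        size_t i = e->head++ % MAX_PENDING;

        if (++e->depth > e->max_depth)
            e->max_depth = e->depth;
        e->fn[i](e, e->arg[i]);
        e->depth--;
    }
    e->draining = 0;
    return 0;
}

/* A task that reschedules itself n more times. */
static void countdown(Executor *e, int n)
{
    if (n > 0)
        executor_execute(e, countdown, n - 1);
}
```

With this structure, a task that reschedules itself a thousand times still runs at a call depth of one, so stack usage stays constant.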
Zhao Zhili
54f9469fa1 avutil/executor: Fix missing check before using mutex
2024-07-11 20:24:11 +08:00
James Almer
1a86a7a48d x86/intreadwrite: fix include of config.h
Should fix make checkheaders.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-10 13:52:52 -03:00
James Almer
15056dd650 x86/intreadwrite.h: add missing preprocessor checks
Removed by accident in the previous commits. This makes the code only run when
compiled with GCC and Clang like before. Support for other compilers like msvc
can be added later.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-10 13:49:21 -03:00
James Almer
bd1bcb07e0 x86/intreadwrite: use intrinsics instead of inline asm for AV_COPY128
This has the benefit of removing any SSE -> AVX penalty that may happen when
the compiler emits VEX encoded instructions.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-10 13:25:44 -03:00
James Almer
4a04cca69a x86/intreadwrite: use intrinsics instead of inline asm for AV_ZERO128
When called inside a loop, the inline asm version results in one pxor
unnecessarily emitted per iteration, as the contents of the __asm__() block are
opaque to the compiler's instruction scheduler.
This is not the case with intrinsics, where pxor will be emitted once with any
half decent compiler.

This also has the benefit of removing any SSE -> AVX penalty that may happen
when the compiler emits VEX encoded instructions.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-07-10 13:25:44 -03:00
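The intrinsics approach described for AV_ZERO128 (4a04cca69a) can be sketched as follows. The names `zero128` and `zero_blocks` are hypothetical, and the unaligned store is a choice made here so the sketch is safe for any pointer; the real macro may differ. A `memset` fallback keeps the example buildable on non-x86 targets.

```c
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper (not FFmpeg's AV_ZERO128 macro): clear 16 bytes. */
static void zero128(void *dst)
{
#if defined(__SSE2__)
    /* The zero vector is visible to the compiler, so inside a loop it can
     * be materialised once and the register reused for every store. */
    _mm_storeu_si128((__m128i *)dst, _mm_setzero_si128());
#else
    memset(dst, 0, 16);  /* portable fallback for non-x86 builds */
#endif
}

/* Clear n consecutive 16-byte blocks. */
static void zero_blocks(unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        zero128(buf + 16 * i);
}
```

With an opaque `__asm__()` block in place of the intrinsic, the compiler could not hoist the register clear out of the loop in `zero_blocks`.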
Michael Niedermayer
e9e8bea2e7 avutil/wchar_filename: Correct sizeof
Fixes: CID1591930 Wrong sizeof argument

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-10 18:10:10 +02:00
Michael Niedermayer
628ba061c8 avutil/hwcontext_d3d11va: correct sizeof IDirect3DSurface9
Fixes: CID1591944 Wrong sizeof argument

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-10 18:10:09 +02:00
Michael Niedermayer
cf22f944d5 avutil/hwcontext_d3d11va: Free AVD3D11FrameDescriptor on error
Fixes: CID1598558 Resource leak

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-10 18:10:09 +02:00
Michael Niedermayer
698ed0d5a5 avutil/hwcontext_d3d11va: correct sizeof AVD3D11FrameDescriptor
Fixes: CID1591909 Wrong sizeof argument

Sponsored-by: Sovereign Tech Fund
Reviewed-by: Steve Lhomme <robux4@ycbcr.xyz>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-07-10 18:10:09 +02:00
Zhao Zhili
85706f5136 avutil/hwcontext_videotoolbox: Fix version check
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-09 21:39:09 +08:00
Marvin Scholz
cd9ceaef22 avutil/hwcontext_videotoolbox: Set CVBuffer CGColorSpace
In addition to the other properties, try to obtain the right
CGColorSpace and set it as well, else it could lead to a CVBuffer
tagged as BT.2020 but with a CGColorSpace indicating BT.709.

Therefore it is essential for consistency to set a colorspace
according to the other values, or if none can be obtained (for example
because the other values are all unspecified) unset it as well.

Fix #10884

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-05 19:13:43 +08:00
Marvin Scholz
b4f9fcc63c avutil/hwcontext_videotoolbox: Update documentation
The documentation was not clear at all what specifically the
function does, so it was left unspecified if it will unset or
not touch attachments it could not map from the AVFrame.

The documentation of the return value was wrong as well.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-05 19:13:43 +08:00
Marvin Scholz
1fa7554bd6 avutil/hwcontext_videotoolbox: Unset undefined values
When mapping AVFrame properties to the CVBuffer attachments, it is
necessary to properly delete undefined attachments, else we can
leave incorrect values in there guessed from VideoToolbox for
example, leading to inconsistent results where the AVFrame and
CVBuffer differ in metadata.

Ref #10884

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-07-05 19:13:43 +08:00
Tong Wu
d822146f4f avutil/hwcontext_d3d12va: add Flags for resource creation
A Flags field is added to support different kinds of resource creation.

Signed-off-by: Tong Wu <tong1.wu@intel.com>
2024-07-02 14:15:12 +02:00
Marton Balint
0d5e3f5a40 avutil/timestamp: avoid possible FPE when 0 is passed to av_ts_make_time_string2()
Signed-off-by: Marton Balint <cus@passwd.hu>
2024-06-30 09:11:44 +02:00
Rémi Denis-Courmont
d5e603ddc0 lavu/lls: remove useless VSETVL
This changes neither VL nor VTYPE, so it can safely be removed.
2024-06-29 21:03:44 +03:00
James Almer
8af0919cc6 avutil/stereo3d: add a Stereo3D view to signal that the view is unspecified
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-28 13:16:57 -03:00
James Almer
1c8b32e19f avutil/stereo3d: add a Stereo3D type to signal that the packing is unspecified
Given that a video stream/frame may have only one view or both views coded with
the packing information being unavailable, this commit adds a new type value
AV_STEREO3D_UNSPEC for this purpose.
The most common case for this is container level signaling of Stereo3D video
where the specifics are defined at the bitstream level.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-28 13:16:57 -03:00
Zhao Zhili
baf3123c1c avutil/executor: Allowing thread_count be zero
Before the patch, disabling thread support at configure/build time
was the only way to force zero threads in the executor. However,
it's common practice for libavcodec to run on the caller's thread when
the user sets the thread count to one. And in a WASM environment, whether
threads are supported needs to be detected at runtime. So the executor
should support zero threads at runtime.

A single-thread executor can be useful, e.g., to handle a network
protocol. So we can't treat a thread_count of one as zero threads, which
would disable a valid use case.

Other libraries take -threads 0 to mean auto. The executor, as a
low-level utility, doesn't do CPU detection. So take thread_count zero
to mean zero threads, literally.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-27 20:54:42 +08:00
J. Dekker
e61fed8280 avutil/riscv/cpu: fix __riscv_v_min_vlen typo
Signed-off-by: J. Dekker <jdek@itanimul.li>
2024-06-26 12:50:02 +02:00
Brad Smith
41190da9e1 aarch64: Add OpenBSD runtime detection of dotprod and i8mm using sysctl
Signed-off-by: Brad Smith <brad@comstyle.com>
2024-06-26 02:06:53 -04:00
James Almer
e6baf4f384 avutil/stereo3d: add a new allocator function that returns a size
av_stereo3d_alloc() is not useful in scenarios where you need to know the
runtime size of AVStereo3D.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-25 00:01:05 -03:00
Lynne
dae12ddb2e lavu/stereo3d: change the horizontal FOV field to a rational
This avoids hardcoding any implementation-specific limitations as
part of the API, and allows for future expandability.

This also allows API users to more conveniently convert the
values into floats without hardcoding specific conversion constants.

The API was committed a few days ago, so changing this field now
is within the realms of acceptable.
2024-06-24 23:53:25 +02:00
Cosmin Stejerean
cc587e69c6 avutil/dovi_meta: add fields for ext_mapping_idc
Co-authored-by: Niklas Haas <git@haasn.dev>
Signed-off-by: Niklas Haas <git@haasn.dev>
2024-06-22 15:48:23 +02:00
James Almer
c3606cad9c avutil/stereo3d: set a sane default value for AVRational fields
Prevent potential divisions by 0 when using them immediately after allocation.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-20 17:03:55 -03:00
James Almer
1044c09eca avutil/mastering_display_metadata: set a sane default value for AVRational fields
Prevent potential divisions by 0 when using them immediately after allocation.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-20 17:02:50 -03:00
James Almer
7f1b590480 avutil/ambient_viewing_environment: set a sane default value for AVRational fields
Prevent potential divisions by 0 when using them immediately after allocation.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-20 17:02:50 -03:00
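The three "sane default value" commits above (c3606cad9c, 1044c09eca, 7f1b590480) share one idea: a zero-filled rational has a denominator of 0, so it is initialised to 0/1 instead. A minimal sketch of that idea follows; `Rational`, `ViewInfo`, its field names and `scale_by` are hypothetical stand-ins, not the libavutil structures.

```c
#include <stdint.h>
#include <string.h>

typedef struct Rational { int num, den; } Rational;

/* Hypothetical metadata struct with rational fields. */
typedef struct ViewInfo {
    Rational baseline;
    Rational horizontal_fov;
} ViewInfo;

/* Zero-fill the struct, then give every rational the value 0/1. */
static void view_info_init(ViewInfo *v)
{
    memset(v, 0, sizeof(*v));
    v->baseline       = (Rational){ 0, 1 };
    v->horizontal_fov = (Rational){ 0, 1 };
}

/* Scale an integer by a rational. Integer division by a zero denominator
 * is undefined behaviour and typically traps. */
static int64_t scale_by(int64_t value, Rational r)
{
    return value * r.num / r.den;
}
```

After `view_info_init()`, calling `scale_by()` on a field that was never filled in yields 0 rather than dividing by zero.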
Derek Buitenhuis
cf2436a0b4 avutil/stereo3d: Fill out stereo info provided by Vision Pro files
Based on what is in the files themselves, and what the API provides
to users.

URLs:
  * https://developer.apple.com/documentation/videotoolbox/kvtcompressionpropertykey_heroeye
  * https://developer.apple.com/documentation/videotoolbox/kvtcompressionpropertykey_stereocamerabaseline
  * https://developer.apple.com/documentation/videotoolbox/kvtcompressionpropertykey_horizontaldisparityadjustment
  * https://developer.apple.com/documentation/coremedia/kcmformatdescriptionextension_horizontalfieldofview

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2024-06-18 14:47:40 +01:00
Derek Buitenhuis
57bfba35d6 avutil/spherical: Add more spherical types
These originate from the Apple Vision Pro, and are documented here:

    https://developer.apple.com/documentation/coremedia/cmprojectiontype

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2024-06-18 14:47:40 +01:00
Zhao Zhili
e598a323dc avutil/macos_kperf: Fix assert which makes kperf failed to run
On M1, kpc_get_counter_count(KPC_MASK) returns 8 in my test. The
exact value doesn't matter in our case, as long as we have a
sufficiently large array.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-18 15:24:12 +08:00
Zhao Zhili
ec1daa39e0 avutil/timer: Fix missing header for mach_absolute_time
mach/mach_time.h was included only when CONFIG_MACOS_KPERF wasn't
defined.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-18 15:24:02 +08:00
Martin Storsjö
ab8f7030bc aarch64: Use cntvct_el0 as timer register on Android and macOS
The default timer register pmccntr_el0 usually requires enabling
access with e.g. a kernel module (while it is accessible by
default on Windows). On Linux, the default for checkasm benchmarks
is to use perf (if suitable headers are available) though.

On macOS, using cntvct_el0 gives measurements with the same
magnitude as mach_absolute_time (which is used currently), but
possibly with a little less overhead/noise.

Signed-off-by: Martin Storsjö <martin@martin.st>
2024-06-17 14:00:34 +03:00
Rémi Denis-Courmont
4819aeebf4 avr32: remove explicit support
The vendor has long since switched to Arm, with the last product
reaching their official end-of-life over 11 years ago. Linux support for
the ISA was dropped 7 years ago. More importantly, this architecture was
never supported by upstream GCC, and the vendor fork is stuck at version
4.2, which FFmpeg no longer supports (as per C11 requirement).

Presumably, this is still the case given the lack of vendor support.
Indeed all of the code being removed here consisted of inline assembler
scalar optimisations. A sane C compiler should be able to perform those
automatically nowadays (with the sole exception of fast CLZ detection),
but this is moot as this architecture is evidently dead.
2024-06-14 21:28:10 +03:00
Tomas Härdin
be2cabce32 lavu/intmath.h: Fix UB in ff_ctz_c() and ff_ctzll_c()
Found by value analysis
2024-06-14 14:28:25 +02:00
Tomas Härdin
3b9e457647 lavu/common.h: Fix UB in av_clip_uintp2_c()
Found by value analysis
2024-06-14 14:28:25 +02:00
Tomas Härdin
60ab40be70 lavu/common.h: Fix UB in av_clip_intp2_c()
Found by value analysis
2024-06-14 14:28:25 +02:00
Tomas Härdin
818a487849 lavu/common.h: Fix UB in av_clipl_int32_c()
Found by value analysis
2024-06-14 14:28:24 +02:00
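The four UB fixes above do not say which operation was undefined. A common source of undefined behaviour in clipping helpers of this kind is shifting a signed `1` into or past the sign bit, so the sketch below assumes that case and builds the mask with an unsigned shift. `clip_uintp2` is a hypothetical name and this is not the patched libavutil code.

```c
/* Hypothetical sketch: clip a signed int into the unsigned range
 * [0, 2^p - 1] for p in 0..31.
 * (1 << 31) would overflow a 32-bit signed int, which is undefined
 * behaviour; (1U << 31) is well defined. */
static unsigned clip_uintp2(int a, int p)
{
    const unsigned max = (1U << p) - 1;

    if (a < 0)
        return 0;
    if ((unsigned)a > max)
        return max;
    return (unsigned)a;
}
```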
James Almer
4b57ea8fc7 avutil/common: assert that bit position in av_zero_extend is valid
Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-13 20:36:09 -03:00
James Almer
39c90d6466 avutil: rename av_mod_uintp2 to av_zero_extend
It's more descriptive of what it does.

Signed-off-by: James Almer <jamrial@gmail.com>
2024-06-13 20:35:57 -03:00
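For readers unfamiliar with the operation behind the rename (39c90d6466) and the added assertion (4b57ea8fc7): zero extension keeps the low `p` bits of a value and clears the rest. The sketch below uses the hypothetical name `zero_extend` and a plain `assert()`; the valid range for `p` is an assumption based on a 32-bit `unsigned`.

```c
#include <assert.h>

/* Hypothetical sketch: keep the low p bits of a, clear the others.
 * p must be below the width of unsigned, otherwise the shift is undefined. */
static unsigned zero_extend(unsigned a, unsigned p)
{
    assert(p <= 31);
    return a & ((1U << p) - 1);
}
```

For example, `zero_extend(0x1234, 8)` keeps only the low byte, `0x34`.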
Rémi Denis-Courmont
c5f69719bc lavu/bswap: remove some inline assembler
C code or compiler built-ins are preferable over inline assembler for
byte-swaps as it allows for better optimisations (e.g. instruction
scheduling) which would otherwise be impossible.

As with f64c2e710f for x86 and Arm,
this removes the inline assembler on GCC (and Clang) since we now
require recent enough compiler versions. This indeed seems to work on
AArch64, SuperH and, if Zbb is enabled, RISC-V. (AVR32 was not tested
since it has no known working compilers at this time.)
2024-06-13 21:16:16 +03:00
Rémi Denis-Courmont
0231097d1b lavu/x86: remove GCC 4.4- stuff
Since the C11 support is required, those GCC versions can no longer be
supported anyhow. (Clang pretends to be GCC 4.4, but it looks like the
code was intended for old GCC specifically.)
2024-06-13 21:16:16 +03:00
Rémi Denis-Courmont
424ac84839 lavu/arm: remove GCC 4.6- stuff
Since the C11 support is required, those GCC versions can no longer be
supported anyhow. (Clang pretends to be GCC 4.4, but the removed code
does not seem to have been intended for Clang.)
2024-06-13 21:16:16 +03:00
Haihao Xiang
a4630d479a lavu/hwcontext_vulkan: Support write on drm frame
Otherwise nothing is written into the destination when a write mapping
is requested.

For example, when a vulkan frame mapped from a drm frame (which is wrapped as
a vaapi frame in the example) is used as the output of the scale_vulkan
filter, the result is always a green screen without this patch.

ffmpeg -init_hw_device vaapi=va -init_hw_device vulkan=vulkan@va
-filter_hw_device vulkan -f lavfi -i testsrc=size=352x288,format=nv12
-vf
"hwupload,scale_vulkan,hwmap=derive_device=vaapi:reverse=1,format=vaapi,hwdownload,format=nv12"
-f nut - | ffplay -

Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
2024-06-12 01:53:18 +02:00
Rémi Denis-Courmont
f6d0a41c8c lavu/riscv: use Zbb CLZ/CTZ/CLZW/CTZW at run-time
          Zbb static    Zbb dynamic   I baseline
clz       0.668032642   1.336072283   19.552376803
clzl      0.668092643   1.336181786   26.110855571
ctz       1.336208533   3.340209702   26.054869008
ctzl      1.336247784   3.340362457   26.055266290
(seconds for 1 billion iterations on a SiFive-U74 core)
2024-06-11 20:12:37 +03:00
Rémi Denis-Courmont
98db140910 lavu/riscv: use Zbb CPOP/CPOPW at run-time
          Zbb static    Zbb dynamic   I baseline
popcount  1.336129286   3.469067758   20.146362909
popcountl 1.336322291   3.340292968   20.224829821
(seconds for 1 billion iterations on a SiFive-U74 core)
2024-06-11 20:12:37 +03:00
Rémi Denis-Courmont
324899b748 lavu/riscv: use Zbb REV8 at run-time
This adds runtime support to use Zbb REV8 for 32- and 64-bit byte-wise
swaps. The result is about five times slower than if targeting Zbb
statically, but still a lot faster than the default bespoke C code or a
call to GCC run-time functions.

For 16-bit swap, this is however unsurprisingly a lot worse, and so this
sticks to the baseline. In fact, even using REV8 statically does not
seem to be beneficial in that case.

         Zbb static    Zbb dynamic   I baseline
bswap16:  0.668184765   3.340764069   0.668029012
bswap32:  0.668174014   3.340763319   9.353855435
bswap64:  0.668221765   3.340496313  14.698672283
(seconds for 1 billion iterations on a SiFive-U74 core)
2024-06-11 20:12:37 +03:00
Rémi Denis-Courmont
378d1b06c3 riscv: probe for Zbb extension at load time
Due to hysterical raisins, most RISC-V Linux distributions target a
RV64GC baseline excluding the Bit-manipulation ISA extensions, most
notably:
- Zba: address generation extension and
- Zbb: basic bit manipulation extension.
Most CPUs that would make sense to run FFmpeg on support Zba and Zbb
(including the current FATE runner), so it makes sense to optimise for
them. In fact a large chunk of existing assembler optimisations relies
on Zba and/or Zbb.

Since we cannot patch shared library code, the next best thing is to
carry a flag initialised at load-time and check it as needed.
This results in 3 instructions overhead on isolated use, e.g.:
1:  AUIPC rd, %pcrel_hi(ff_rv_zbb_supported)
    LBU   rd, %pcrel_lo(1b)(rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

The C compiler will typically load the flag ahead of time to reduce
latency, and can also keep it around if Zbb is used multiple times in a
single optimisation scope. For this to work, the flag symbol must be
hidden; otherwise the optimisation degrades with a GOT look-up to
support interposition:
1:  AUIPC rd, GOT_OFFSET_HI
    LD    rd, GOT_OFFSET_LO(rd)
    LBU   rd, (rd)
    BEQZ  rd, non_Zbb_fallback_code
    // Zbb code here

This patch adds code to provision the flag in libraries using bit
manipulation functions from libavutil: byte-swap, bit-weight and
counting leading or trailing zeroes.
2024-06-11 20:12:37 +03:00
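The flag-and-branch pattern from 378d1b06c3 can be sketched in portable C. Everything here is hypothetical: `fast_bswap_supported` stands in for `ff_rv_zbb_supported`, `probe_cpu` stands in for the load-time probe, and `fast_calls` exists only so the chosen path can be observed. Declaring the flag `static` confines it to one translation unit, which plays a similar role to the hidden visibility mentioned in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the flag initialised at load time. */
static bool fast_bswap_supported;
static unsigned fast_calls;  /* instrumentation for this sketch only */

/* Hypothetical probe; a real one would query the CPU or the OS. */
static void probe_cpu(bool detected)
{
    fast_bswap_supported = detected;
}

/* Baseline path: plain shifts and masks. */
static uint32_t bswap32_generic(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Stand-in for the extension-specific path. */
static uint32_t bswap32_fast(uint32_t x)
{
    fast_calls++;
#if defined(__GNUC__)
    return __builtin_bswap32(x);
#else
    return bswap32_generic(x);
#endif
}

/* One load and one branch select the implementation on each call. */
static uint32_t bswap32(uint32_t x)
{
    return fast_bswap_supported ? bswap32_fast(x) : bswap32_generic(x);
}
```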
Zhao Zhili
33e4cc963d avutil/timer: Add clock_gettime as a fallback of AV_READ_TIME
Reviewed-by: Rémi Denis-Courmont <remi@remlab.net>
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-11 01:11:36 +08:00
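A fallback timer built on `clock_gettime` (33e4cc963d) might look like the sketch below. The function name `read_time_ns`, the choice of `CLOCK_MONOTONIC` and the nanosecond unit are assumptions made for illustration; the commit message does not state them. The code requires a POSIX system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime() in strict C modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical fallback timer: monotonic time in nanoseconds,
 * or 0 if the clock cannot be read. */
static uint64_t read_time_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
```

Benchmark code would call `read_time_ns()` before and after the measured region and subtract the two values.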