[FFmpeg-devel] [PATCH] seek: Fix crashes in ff_seek_frame_binary if built with latest Clang 14
Andreas Rheinhardt
andreas.rheinhardt at outlook.com
Sat Nov 6 22:38:52 EET 2021
Andreas Rheinhardt:
> Martin Storsjö:
>> On Sat, 6 Nov 2021, Andreas Rheinhardt wrote:
>>
>>> Martin Storsjö:
>>>> Passing an uninitialized variable as argument to a function is
>>>> undefined behaviour (UB). The compiler can assume that UB does not
>>>> happen.
>>>>
>>>> Hence, the compiler can assume that the variables are never
>>>> uninitialized when passed as argument, which means that the codepaths
>>>> that initializes them must be taken.
>>>>
>>>> In ff_seek_frame_binary, this means that the compiler can assume
>>>> that the codepaths that initialize pos_min and pos_max are taken,
>>>> which means that the conditions "if (sti->index_entries)" and
>>>> "if (index >= 0)" can be optimized out.
>>>>
>>>> Current Clang git versions (upcoming Clang 14) enabled an optimization
>>>> that does this, which broke the current version of this function
>>>> (which intentionally left the variables uninitialized, but silencing
>>>> warnings about being uninitialized). See [1] for discussion on
>>>> the matter.
>>>>
>>>> [1] https://reviews.llvm.org/D105169#3069555
>>>>
>>>> Signed-off-by: Martin Storsjö <martin at martin.st>
>>>> ---
>>>> libavformat/seek.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/libavformat/seek.c b/libavformat/seek.c
>>>> index 40169736df..405ca316b3 100644
>>>> --- a/libavformat/seek.c
>>>> +++ b/libavformat/seek.c
>>>> @@ -283,7 +283,7 @@ int ff_seek_frame_binary(AVFormatContext *s, int
>>>> stream_index,
>>>> int64_t target_ts, int flags)
>>>> {
>>>> const AVInputFormat *const avif = s->iformat;
>>>> - int64_t av_uninit(pos_min), av_uninit(pos_max), pos, pos_limit;
>>>> + int64_t pos_min = 0, pos_max = 0, pos, pos_limit;
>>>> int64_t ts_min, ts_max, ts;
>>>> int index;
>>>> int64_t ret;
>>>>
>>>
>>> I already wanted to write that the compiler is wrong, but it seems it is
>>> not, as C11 differs from C99 in this respect (C11 6.3.2.1 2):
>>>
>>> "If the lvalue designates an
>>> object of automatic storage duration that could have been declared with
>>> the register storage class (never had its address taken), and that
>>> object is uninitialized (not declared
>>> with an initializer and no assignment to it has been performed prior to
>>> use), the behavior
>>> is undefined."
>>>
>>> For GCC and Clang av_uninit(x) is defined as x=x. And that is a problem:
>>> In case this macro is used to declare an automatic variable that is
>>> could be declared with the register storage class the
>>> pseudo-initialization is UB according to the above. So I think we will
>>> have to modify the macro to make it safe. Or just stop using it.
>>> (How could such a hack ever end up in a public header?)
>>
>> FWIW, most of the issue here comes from the fact that the uninitialized
>> value is passed as a parameter - in that respect, av_uninit() isn't any
>> better or worse than just leaving the variable plain uninitialized. (Not
>> that one really should reason around where UB is ok...)
>>
>
> It is UB even when it is not used as a function argument at all. While
> there seems to be no current compiler that uses this against us, there
> is no guarantee that it will stay this way. After all, this patch is
> about an unpleasant surprise and there might be more of that in the future.
>
>> Also, I haven't tried to read the standard wrt that, but I would expect
>> that passing an uninitialized value as parameter was UB even before C11?
>>
>
> The whole clause which I cited does not exist in old standards. C99's
> non-normative annex J.2 contains the following item in the list of
> undefined behaviour: "The value of an object with automatic storage
> duration is used while it is
> indeterminate (6.2.4, 6.7.8, 6.8)." Yet I have not found a normative
> statement to back this up. It is explicitly mentioned that using a trap
> representation is UB. If the type in question has a trap representation,
> then the compiler may of course always presume that every indeterminate
> object is a trap representation; but if it is known that the type in
> question does not permit trap representations (like the uintX_t types),
> then I don't see why this should be UB.
Addition: Using indeterminate values was definitely UB in C90 by the
very definition of UB:
"undefined behavior: Behavior, upon use of a non-portable or erroneous
program construct, of erroneous data, or of indeterminately valued
objects, for which the standard imposes no requirements".
C99 and C11 define UB as "behavior, upon use of a nonportable or
erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements".
So it was always intended that using indeterminate values is UB. This
seems to be just an oversight on part of the authors of C99.
- Andreas
More information about the ffmpeg-devel
mailing list