[FFmpeg-devel] [PATCH] Support for UTF8 filenames on Windows

Ramiro Polla ramiro.polla
Thu Jul 9 08:32:53 CEST 2009

On Fri, Jun 26, 2009 at 1:10 PM, Karl Blomster <thefluff at uppcon.com> wrote:
> Ramiro Polla wrote:
>>>>> MAX_PATH is defined to 260 in WinDef.h, and that is actually the maximum
>>>>> allowed path length in the Win32 API unless you want to jump through some
>>>>> hoops. Paths of up to 32,767 characters (approximately) are allowed, but
>>>>> only if they are absolute and start with the magical \\?\ prefix. I guess I
>>>>> could do some detection of relative paths and add said magical prefix
>>>>> manually if so desired, but the static allocation seems safe enough, and the
>>>>> 260 character limit is indeed what a vast majority of Windows programs
>>>>> use.
>>>> Indeed, FFmpeg fails with long names. But if you truncate the long
>>>> name, it might turn into a valid name (like Mans said).
>>> Right, so if strlen(filename) > MAX_PATH, the function should fail? Or
>>> should I try the long paths workaround? (It will be a minor pain to
>>> implement, because detecting relative paths on Windows is pretty
>>> annoying.)
>> IMHO we shouldn't try fiddling around with the paths much, but just
>> pass them on to _open or _wopen. Nor should we check against MAX_PATH.
> Well, sure, I could do dynamic allocation instead, but I don't know what
> happens when you pass strings longer than MAX_PATH to _wopen; MSDN doesn't
> say. I don't really see the point though because whatever happens, it won't
> be what the user expects.
>>>>> Updated patch with less tabs (and a rather embarrassing typo fix)
>>>>> attached.
>>>>> Regards,
>>>>> Karl Blomster
>>>>> Index: libavformat/os_support.c
>>>>> ===================================================================
>>>>> --- libavformat/os_support.c    (revision 19266)
>>>>> +++ libavformat/os_support.c    (working copy)
>>>>> @@ -30,6 +30,23 @@
>>>>>  #include <sys/time.h>
>>>>>  #include "os_support.h"
>>>>> +#ifdef HAVE_WIN_UTF8_PATHS
>>>>> +#define WIN32_LEAN_AND_MEAN
>>>>> +#include <windows.h>
>>>>> +#endif
>>>>> +
>>>>> +#ifdef HAVE_WIN_UTF8_PATHS
>>>>> +int winutf8_open(const char *filename, int oflag, int pmode)
>>>>> +{
>>>>> +    wchar_t wfilename[MAX_PATH * 2];
>>>> Isn't sizeof(wchar_t) == 2?
>>> Yes (at least on Win32), but characters outside the basic multilingual plane
>>> require two UTF-16 code units to express. Of course this is a bit esoteric
>>> because the likelihood of such characters being used in filenames is very
>>> low, but in theory it could happen and it's not like allocating 520 extra
>>> bytes in a temporary buffer is going to kill anyone, so...
>>>> I think you could also use wchar_t wfilename[strlen(filename) + 1]
>>>> instead of malloc if we are going to try and pass paths larger than
>>>> MAX_PATH.
>>> The "proper" way would, I think, be to use
>>> MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, filename, -1, NULL, 0)
>>> first, because that returns the exact number of wide characters required to
>>> store the string.
>> I'd say use this approach to malloc the string then.
>> But I'm still not really happy about having to choose at compile-time.
>> Is there no way the user could specify it at run-time?
> You could add an enable_win_utf8 parameter to av_open_input_file I guess but
> that would be a really ugly thing to have in the API and I doubt it'd be
> OK'd. This patch only changes the API, not the commandline interfaces and
> whatnot, so the only users of it would be people who use the ffmpeg API, and
> those people presumably compile ffmpeg themselves anyway and would know if
> they want UTF-8 support or not.
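
On the buffer size question quoted above: on Windows the count comes from
MultiByteToWideChar with a NULL output buffer. As a portable illustration of
what that count is, here is a rough sketch. utf16_units_needed is a made-up
helper, not part of the patch, and it assumes the input is well-formed UTF-8.
A UTF-8 sequence of 1 to 3 bytes maps to one UTF-16 code unit and a 4-byte
sequence maps to a surrogate pair, so the result can never exceed
strlen(filename).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the patch): number of UTF-16 code units needed
 * to hold a UTF-8 string, terminator excluded.
 * Assumption: the input is well-formed UTF-8; no validation is done. */
size_t utf16_units_needed(const char *s)
{
    size_t units = 0;

    assert(s != NULL);
    for (const unsigned char *p = (const unsigned char *)s; *p; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                   /* continuation byte, already counted */
        units += (*p >= 0xF0) ? 2 : 1;  /* 4-byte lead byte => surrogate pair */
    }
    return units;
}
```

So strlen(filename) + 1 elements of wchar_t is always a sufficient size,
though not necessarily the exact one.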

I'm thinking of maybe adding a field to URLContext to specify
win32_utf8, or adding URL_WIN32_UTF8 to flags. Does anyone have
other/better suggestions to let the user choose between the system
codepage or utf8 files on Windows at runtime?
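
To make the question concrete, here is a rough sketch of the flag variant.
URL_WIN32_UTF8 is only the name floated above; its value and the function
name sketch_open are made up for illustration, and none of this exists in
libavformat. The non-Windows branch just calls open() so the sketch compiles
elsewhere.

```c
#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <io.h>
#include <stdlib.h>
#endif
#include <fcntl.h>

#define URL_WIN32_UTF8 0x1000   /* hypothetical flag value */

/* Hypothetical wrapper: chooses at run time between the system codepage
 * (_open) and UTF-8 converted to UTF-16 (_wopen). Returns a file
 * descriptor, or -1 on failure. */
int sketch_open(const char *filename, int url_flags, int oflag, int pmode)
{
#ifdef _WIN32
    if (url_flags & URL_WIN32_UTF8) {
        wchar_t *wfilename;
        int fd;
        /* First call: ask for the required size, terminator included. */
        int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                    filename, -1, NULL, 0);
        if (n <= 0)
            return -1;          /* invalid UTF-8 or other conversion error */
        wfilename = malloc(n * sizeof(*wfilename));
        if (!wfilename)
            return -1;
        /* Second call: do the actual conversion into the buffer. */
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            filename, -1, wfilename, n);
        fd = _wopen(wfilename, oflag, pmode);
        free(wfilename);
        return fd;
    }
    return _open(filename, oflag, pmode);
#else
    (void)url_flags;            /* the flag only matters on Windows */
    return open(filename, oflag, pmode);
#endif
}
```

With something like this the path is passed through untouched and there is
no check against MAX_PATH; whatever _open or _wopen does with a long name is
what the caller gets.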

Ramiro Polla
