[FFmpeg-trac] #2502(FFprobe:open): ffprobe Produces Invalid JSON
FFmpeg
trac at avcodec.org
Fri Nov 8 17:54:16 CET 2013
#2502: ffprobe Produces Invalid JSON
-------------------------------------+-----------------------------------
Reporter: dnicolson | Owner:
Type: defect | Status: open
Priority: normal | Component: FFprobe
Version: unspecified | Resolution:
Keywords: utf8 | Blocked By:
Blocking: | Reproduced by developer: 1
Analyzed by developer: 1 |
-------------------------------------+-----------------------------------
Changes (by saste):
* analyzed: 0 => 1
* keywords: => utf8
* status: new => open
* reproduced: 0 => 1
Comment:
Replying to [comment:14 dnicolson]:
> I have made a reduced case and attached a file (test-pattern.avi), as
requested.
>
> I created an AVI file with ffmpeg using the following command:
>
> ffmpeg -i test-pattern-orig.avi -metadata title="æ" -metadata artist="`echo -e \"\xe6\"`" -vcodec copy -acodec copy test-pattern.avi
> (backticks need to be added around the monospaced text).
>
> This creates the file test-pattern.avi with the title as a UTF-8 encoded
lowercase AE and the artist as an ISO-8859-1 encoded lowercase AE. VLC
displays metadata in ISO-8859-1, so the artist is displayed correctly as
"æ" but the title is displayed as "Ã¦".
AE in ISO8859-1 = 0xE6
AE in UTF-8 = 0xC3A6
As a consequence, AE encoded in UTF-8 will render in ISO8859-1 as two
distinct characters, and ISO8859-1 AE will not correspond to a valid UTF-8
sequence.
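
To make the byte-level difference concrete, here is a minimal Python sketch
(not part of FFmpeg; the variable names are only illustrative) that encodes
U+00E6 both ways and then decodes each result with the other encoding:

```python
# U+00E6 (LATIN SMALL LETTER AE) in the two encodings discussed here.
ae = "\u00e6"

latin1_bytes = ae.encode("iso-8859-1")  # one byte
utf8_bytes = ae.encode("utf-8")         # two bytes

print(latin1_bytes.hex())  # e6
print(utf8_bytes.hex())    # c3a6

# UTF-8 bytes read as ISO-8859-1 come out as two distinct characters.
mojibake = utf8_bytes.decode("iso-8859-1")
print(len(mojibake))  # 2

# A lone 0xE6 byte is not a valid UTF-8 sequence, so strict decoding fails.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
print(latin1_is_valid_utf8)  # False
```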
The question now is which encoding is the reference one. FFmpeg
always assumes UTF-8, so you should provide metadata encoded in UTF-8
format. Note that your command is broken since you're explicitly passing
an invalid UTF-8 sequence to the metadata option (which expects UTF-8
data).
Currently there is no way to specify (or autodetect) the assumed
encoding.
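
Since the metadata option expects UTF-8 data, one way to stay on the safe
side is to recode values from a known source encoding before building the
command line. The sketch below is only an illustration under that assumption:
recode_to_utf8 and metadata_args are hypothetical helper names, the source
encoding must be known by the caller, and nothing here invokes ffmpeg.

```python
def recode_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Recode bytes from a known source encoding to UTF-8 (hypothetical helper)."""
    return raw.decode(source_encoding).encode("utf-8")


def metadata_args(tags: dict, source_encoding: str = "iso-8859-1") -> list:
    """Build '-metadata key=value' arguments as UTF-8 bytes (hypothetical helper)."""
    args = []
    for key, raw in tags.items():
        value = recode_to_utf8(raw, source_encoding)
        args += [b"-metadata", key.encode("utf-8") + b"=" + value]
    return args


# The artist value from the reduced case: a single ISO-8859-1 byte.
args = metadata_args({"artist": b"\xe6"})
print(args)
```

The resulting byte strings could then be placed in an argument list for the
ffmpeg command quoted above; on POSIX systems, Python's subprocess module
accepts bytes arguments.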
> Because ffprobe assumes that all metadata is valid UTF-8, the following
command produces invalid JSON:
>
> ffprobe -v quiet -print_format json -show_format -show_streams test-pattern.avi | python -c 'import json,sys; json.load(sys.stdin)'
>
> A possible solution would be to strip invalid UTF-8 characters, or maybe
provide an alternate switch to replace invalid characters?
Implemented in an experimental patchset, see ticket #1163.
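
For illustration only, the two behaviours suggested in the quoted text
(stripping or replacing invalid bytes) can be sketched in Python as below.
This is not the code from the patchset and says nothing about how ffprobe
does it internally; sanitize_utf8 is a hypothetical helper, and the
"format"/"tags" layout is just a stand-in for ffprobe's JSON output.

```python
import json


def sanitize_utf8(raw: bytes, mode: str = "replace") -> str:
    """Decode bytes as UTF-8, replacing or stripping invalid sequences.

    mode="replace" substitutes U+FFFD for each invalid sequence;
    mode="ignore" drops the invalid bytes entirely.
    """
    if mode not in ("replace", "ignore"):
        raise ValueError("mode must be 'replace' or 'ignore'")
    return raw.decode("utf-8", errors=mode)


title_raw = b"\xc3\xa6"  # valid UTF-8 for U+00E6
artist_raw = b"\xe6"     # ISO-8859-1 byte, invalid as UTF-8

replaced = sanitize_utf8(artist_raw, "replace")  # one U+FFFD character
stripped = sanitize_utf8(artist_raw, "ignore")   # empty string

# With sanitized strings, the serialized document is valid JSON again.
doc = json.dumps(
    {"format": {"tags": {"title": sanitize_utf8(title_raw), "artist": replaced}}},
    ensure_ascii=False,
)
parsed = json.loads(doc)
print(parsed["format"]["tags"])
```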
--
Ticket URL: <https://ffmpeg.org/trac/ffmpeg/ticket/2502#comment:16>
FFmpeg <http://ffmpeg.org>
FFmpeg issue tracker