[FFmpeg-trac] #8466(avformat:closed): av_url_split violates RFC2396, fails to parse URLs
FFmpeg
trac at avcodec.org
Sun Feb 16 19:41:10 EET 2020
#8466: av_url_split violates RFC2396, fails to parse URLs
-------------------------------------+------------------------------------
Reporter: aitte | Owner:
Type: defect | Status: closed
Priority: normal | Component: avformat
Version: unspecified | Resolution: fixed
Keywords: http | Blocked By:
Blocking: | Reproduced by developer: 0
Analyzed by developer: 0 |
-------------------------------------+------------------------------------
Comment (by aitte):
Replying to [comment:8 cus]:
> Replying to [comment:7 aitte]:
> > Note how the code is adding a "/" slash if the first character of the
path is a question mark. Is there any problem if the URL is
"https://example.com#foo"?
> >
> > It looks to me like the hashmark check a few lines earlier (
https://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=987ce96d41d21a120e6109f9bb1e3a9dcddbf30b
) puts a null byte at the hashmark position to "erase the hashmark and
everything after it" already, so I guess there is no problem! Just asking
to be sure. :-)
>
> Yeah, that should be fine as well, empty path (it becomes empty after
removing the hashmark) was converted to / even before this patchset.
Hey, thanks again, I really almost can't believe how much cleaner the new
URL code is. The old one you replaced (ls, ls2 system) was a convoluted
mess. Hehe. This change is truly brilliant:
https://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=554576b6cfe79a91d37e14d3617ca417562085db
And ahh yeah I see what you mean now about the empty path converting to
"/", at this area of the code:
https://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/http.c;h=d0df1b6d585da422aef14eb6128eb319ddd9ac98;hb=a3d8875df1f71e62a36c583cee638ab1d51a71d3#l220
So "http://example.com#tag" would be split via "av_url_split()" into
"http", "example.com" and "#tag" (path1). Then path1 variable gets
truncated to "\0tag", and then it says "if (path1[0] == '\0') { path =
"/"; }" which means it should properly be converting the path to "/" when
there was just a hashmark after the domain name.
And if the path is non-empty but starts with "?", your new code formats it
as "/?" properly. And lastly, if the path is normal and starts with "/"
already then it's used as-is.
Alright this looks perfect.
You've done a fantastic job!
- Cleaned up the buggy, messy av_url_split, wohoo!
- Added support for "example.com?x" and "example.com#x" URLs in the URL
splitter.
- Properly removing hashmark fragments from paths when connecting to
target URLs. That would have caused problems in the old code which kept
them. Great catch!
- Added support for auto-inserting "/" when connecting to target URLs that
didn't begin their path with a slash already.
This looks like a good implementation of RFC3986. Thank you so much for
the great work!
:-)
--
Ticket URL: <https://trac.ffmpeg.org/ticket/8466#comment:9>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker
More information about the FFmpeg-trac
mailing list