[FFmpeg-devel] [PATCH] http: handle URLs with spaces

wm4 nfxjfg at googlemail.com
Sun Feb 2 16:37:35 CET 2014


On Sun, 2 Feb 2014 15:10:34 +0000
Eli Kara <eli at algotec.co.il> wrote:

> > 
> >> FFmpeg *can* urlencode. It can also do so if a certain option is set (just like "cookies" or any other protocol option). However, the question is - do you really want to?
> 
> > Why wouldn't you want to escape things that are clearly not allowed unescaped? I'd expect libavformat's http support to work well, not that I have to preprocess URLs in various protocol specific ways before passing it to libavformat. Sure, I could do that, but what's the point of forcing every single API user to add a bunch of crappy custom code just to make it behave as one would expect?
> 
> > As for escaping in general - you have to be careful not to escape already escaped characters. I'm not sure how that is supposed to be handled - I guess there is no standard way, since you don't know whether the URL you get is escaped or not. Note that this is not the API user's fault - the user could provide escaped or unescaped or semi-broken mixed escaping, and expect it to work. There are websites that encode HTML links that use unescaped URLs.
> 
> You said it yourself - there is no way for FFmpeg to know if a URL is escaped or malformed. The user of the API will have to provide that information anyway. I think adding the option to do
> encoding is a good thing.

The API user is just as clueless, if the URL comes from user input.
Which will be the case most of the time. In other situations, the API
user could trivially escape it on its own.

What I'm advocating here is that http.c should always escape disallowed
characters (like spaces), because making HTTP requests with them
unescaped is always invalid - thus it's 100% safe to escape them.

Also, IMHO, http.c shouldn't mess with URLs that are perfectly valid.
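
To make the idea concrete, here is a minimal sketch of escaping only what is
never valid. This is not the actual http.c code: escape_disallowed() is a
hypothetical helper, and the set of "disallowed" bytes is assumed to be a
conservative subset (space, control characters, DEL and non-ASCII bytes).
Everything else, including '%', '/', '?' and '&', is copied unchanged.

```c
#include <stddef.h>

/* Hypothetical helper, not FFmpeg's actual code: percent-escape only bytes
 * that can never appear raw in an HTTP request (assumed here to be space,
 * control characters, DEL and non-ASCII bytes). All other bytes, including
 * '%', '/', '?' and '&', are copied as-is, so a valid URL is not modified.
 * Returns 0 on success, -1 if dst is too small. */
static int escape_disallowed(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        int bad = c <= 0x20 || c >= 0x7F;
        size_t need = bad ? 3 : 1;

        /* keep room for this byte (or its escape) plus the terminator */
        if (o + need + 1 > dst_size)
            return -1;
        if (bad) {
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 15];
        } else {
            dst[o++] = (char)c;
        }
    }
    if (o + 1 > dst_size)   /* only reachable for an empty src */
        return -1;
    dst[o] = '\0';
    return 0;
}
```

With this, "http://host/a b.mp4" becomes "http://host/a%20b.mp4", while an
already escaped "http://host/a%20b?x=1&y=2" comes out unchanged, because '%'
is never touched.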

An option to do full escaping might be somewhat useful. But even then,
what do you escape? Do you escape even path separators like '/'? Do you
escape '?' because it could be part of the path, instead of starting an
HTTP query string? See, you end up in hell even with the simplest things.
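
A small sketch of why "full" escaping is ambiguous, again with a made-up
helper (escape_everything() is hypothetical, not something in libavformat).
It assumes "full" means escaping every byte outside the RFC 3986 unreserved
set (letters, digits, '-', '.', '_', '~'):

```c
#include <stddef.h>

/* RFC 3986 "unreserved" characters: the only bytes kept as-is here. */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

/* Hypothetical "full escaping": percent-escape every byte that is not
 * unreserved, including '/', '?', '=' and '%'.
 * Returns 0 on success, -1 if dst is too small. */
static int escape_everything(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        size_t need = is_unreserved(c) ? 1 : 3;

        /* keep room for this byte (or its escape) plus the terminator */
        if (o + need + 1 > dst_size)
            return -1;
        if (need == 1) {
            dst[o++] = (char)c;
        } else {
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 15];
        }
    }
    if (o + 1 > dst_size)   /* only reachable for an empty src */
        return -1;
    dst[o] = '\0';
    return 0;
}
```

Applied to "/dir/a b?x=1" this yields "%2Fdir%2Fa%20b%3Fx%3D1", which no
longer has a path hierarchy or a query string, and an already escaped
"a%20b" turns into "a%2520b". So such a function is only correct on a single
unescaped component, and the caller has to know which component it has.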

> >> I'm a relatively new observer to this list but IMHO the pros and cons for encoding are these:
> >> 
> >> Con - URL encoding can be done outside of FFmpeg in many ways and with already existing, proven code.
> > How/where?
> If I'm to provide FFmpeg with a URL and in addition tell it to perform encoding, I may as well have encoded it myself. For example, in XBMC. It already has ways to encode URLs,
> especially since most of them come from add-ons that are python scripts (urllib does an excellent job).
> So it boils down to this - since I have to hint FFmpeg anyway, one might say there is no need to do encoding.

I don't understand. What do XBMC and urllib have to do with this? I'm
talking about direct user input. For example, passing a URL to ffplay or
ffmpeg.

> >> Pro - In much the same way that http.c implements the HTTP protocol, 
> >> headers processing, cookies processing and other features, for completeness purposes it should also have URL encoding capabilities, at least as an option for HTTP.
> >Why an option to send invalid HTTP requests? Are you planning to use libavformat to test webserver bugs?
> 
> No. I was simply saying that much the same way that libavformat's HTTP support is complete, it should be made even better with URL encoding. I was saying it is a good thing to add encoding
> but as an optional step, as stated above (FFmpeg cannot in any way determine if a URL is already encoded or not - except for a few edge cases).

Well yes. Maybe it should attempt heuristics similar to what web browsers
and other tools like wget apparently use when the URL contains disallowed
characters. If anyone has suggestions, I could implement it.

