[FFmpeg-devel] [RFC] Built-in documentation API

Chris Miceli chris at miceli.net.au
Mon Aug 24 07:55:09 EEST 2020

I'm new to the community and eager to help out with this work
as well. Is there a reason why we would build our own system for
this rather than use an existing tool such as Doxygen? Using Doxygen
would mean that the team can point contributors to the Doxygen
documentation, and the FFmpeg developers would not need to spend time
maintaining a documentation system and could concentrate instead on
the code itself.

Another benefit is that if there are fixes you wish to make to the
tool, the entire Doxygen community can benefit as well.

*Chris Miceli*

On Sun, Aug 23, 2020 at 8:39 PM Jim DeLaHunt <list+ffmpeg-dev at jdlh.com> wrote:

> On 2020-08-23 08:21, Nicolas George wrote:
> > Since the idea of documentation built in the libraries seems popular, I
> > have tried to outline an API to access it.…
> >
> > See the attached file […`documentation.c` omitted…].
> >
> > The idea would be to have the build system convert the documentation
> > into a C file with initialization for one or several AVDocNode
> > structures.
> >
> > Note that since all this must be in .rodata, we must get it right on the
> > first try, because of inter-libraries compatibility issues.
> > …The most important question IMHO is which format we adopt for the doc in
> > the library.…
> Text is superficially simple, but in a multicultural world, text is in
> reality very complex.
> All text strings should have a character encoding defined. I suggest
> that all the text fields be specified by the format as UTF-8 encoded. No
> need to offer other options.
> All human-readable strings should have their human language described.
> Either define in the format that the string is written in the English
> language (and decide if you want to require US or UK spelling), or add
> language attributes to each text string identifying the human language
> in which it is written (suggest using BCP 47[1] tags), or add a single
> language attribute for the whole AVDocNode and require that all text
> strings in that node be written in the same human language.
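
To make the per-string language attribute concrete, here is a minimal
sketch. The `DocText` struct and `doc_text_for_lang` are hypothetical
names of mine, not the AVDocNode layout from the omitted
`documentation.c`. It assumes each string carries a BCP 47 tag and is
UTF-8 encoded, and it compares tags case-sensitively, which is a
simplification (BCP 47 tags are case-insensitive).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout for illustration; not the proposed AVDocNode. */
typedef struct DocText {
    const char *lang; /* BCP 47 tag, e.g. "en" or "zh-Hans" */
    const char *text; /* NUL-terminated, assumed to be UTF-8 */
} DocText;

/*
 * Return the entry whose tag equals `want`; failing that, the first entry
 * with the same primary subtag (the part before the first '-').
 * Returns NULL when nothing matches.
 */
static const DocText *doc_text_for_lang(const DocText *t, size_t n,
                                        const char *want)
{
    size_t i;
    size_t plen = strcspn(want, "-"); /* length of primary subtag */

    for (i = 0; i < n; i++)
        if (strcmp(t[i].lang, want) == 0)
            return &t[i];

    for (i = 0; i < n; i++)
        if (strcspn(t[i].lang, "-") == plen &&
            strncmp(t[i].lang, want, plen) == 0)
            return &t[i];

    return NULL;
}
```

With a table like this, a request for "en-GB" could fall back to an
"en" entry, and a binary could ship English and Chinese strings side
by side.
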
> Assuming UTF-8 encoding, is `char *` the right data type?  Does your
> profile of the C language offer something more precisely targeted?
> Something analogous to `std::string` of C++, perhaps?
> Does this format allow documentation in multiple languages at the same
> time? Might you ever want to ship an FFmpeg binary which has
> documentation in, say, both English and Chinese?
> Consider if you want to limit some text fields to a subset of UTF-8. For
> instance, are the strings in the "Name" field limited to the ASCII
> subset of UTF-8?  Are emoji permitted?
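
If the "Name" field were restricted to ASCII, the build step could
enforce it with a check along these lines. This is only a sketch under
that assumption; `doc_name_is_ascii` is a hypothetical helper, and I
have additionally assumed that control characters are rejected too.

```c
/*
 * Return 1 if every byte of the NUL-terminated string is printable ASCII
 * (0x20..0x7E), 0 otherwise. Any multi-byte UTF-8 sequence, including
 * emoji, contains bytes >= 0x80 and is therefore rejected.
 */
static int doc_name_is_ascii(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    for (; *p; p++)
        if (*p < 0x20 || *p > 0x7E)
            return 0;
    return 1;
}
```
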
> What is the line wrapping model of these text objects?  Are line endings
> encoded with '\n' or '\r' or '\r\n' or any?  What effect does '\t' have?
> What about formfeed, or page eject?
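
One possible answer to the line-ending question is to allow only '\n'
in the stored strings and normalize everything else beforehand. A
minimal sketch, with `doc_normalize_newlines` as a hypothetical name;
since the final strings are meant to live in .rodata, this would have
to run in the build step or on a writable copy.

```c
#include <stddef.h>

/*
 * Rewrite "\r\n" and lone "\r" as "\n", in place.
 * The result is never longer than the input. Returns the new length.
 */
static size_t doc_normalize_newlines(char *s)
{
    char *r = s; /* read position */
    char *w = s; /* write position */

    while (*r) {
        if (*r == '\r') {
            *w++ = '\n';
            r++;
            if (*r == '\n') /* swallow the LF of a CRLF pair */
                r++;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return (size_t)(w - s);
}
```
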
> Does this architecture permit markup which defines tables?  How does it
> display such markup?
> This structure only stores marked-up text. Does that mean it is
> impossible to store diagrams and pictures in the documentation? Are you
> comfortable giving up that expressive power?
> Will the overall documentation system be limited to the expressive power
> of this mechanism?  If not, then when you define the document compiler
> which generates this format, you will need to define what gets done with
> parts of the mechanism which this architecture cannot support. Are they
> thrown out? Simplified somehow?
> Does this structure permit markup with font choices?  If the markup
> calls for heading style, or italic, or preformatted style, how will the
> display system invoke the correct fonts?
> Font choices are also part of correctly displaying character style for
> the language. The Unicode standard encodes Traditional Chinese,
> Simplified Chinese, Japanese, and parts of Korean and Vietnamese with
> unified Han codepoints. The text display uses a font choice to get the
> correct character style for the language. Do you want to permit
> documentation to appear in these languages with the correct character
> style?  How will that happen?
> How will this API display text?  Will it emit plain text with no
> markup?  Will it emit the internal markup language used by this data
> structure (eg "FFMTHML") and not attempt to format it?
> One risk of this architecture is that you are faced with a choice of
> making a mechanism which is well-defined but limited (e.g. to English
> and ASCII), or well-defined and terribly complex to define and to
> implement, or simply designed and implemented, but poorly defined
> outside of a core usage pattern. What is the value you are trying to
> unlock with this architecture?  How will you ensure this architecture
> gives a positive return (value) on investment (design and implementation
> and content authoring)?
> [1] https://tools.ietf.org/html/bcp47
