[FFmpeg-devel] MOD support for FFmpeg (My GSoC 2010 task starts tomorrow)

Thu May 27 04:31:59 CEST 2010

On Sun, May 23, 2010 at 11:27:37PM +0200, Sebastian Vater wrote:
> Dear guys!
> 
> Since my main GSoC task starts tomorrow, I thought it would be a good
> idea to start discussing the MOD implementation in FFmpeg now.
> 
> As some of you already know, I want to integrate MOD support by using
> TuComposer's engine. Since it addressed all the issues with MOD
> support over 10 years ago and is also designed as a static/shared
> library, I think its design fits pretty well into FFmpeg (apart from
> code style, but changing this is more a refactoring task than writing
> lots of code).
> 
> But let's start with a small introduction first: What is a MOD file?
> 
> MOD is short for "music module": note data with attached samples.
> Most of you probably know the MIDI format, which is almost the same,
> except that MIDI doesn't contain the sample data itself but instead
> loads it from sample banks (like GUS patches or a soundcard wavetable).
> 
> This has the advantage that MID files are very small (they just contain
> the note data) but also the disadvantage that you can't easily add
> customized samples and can't even ensure that the file sounds the same
> on every device (just compare OPL2/3 MIDI with SB AWE32 MIDI as an
> example of this).
> 
> Professional music makers usually solve that issue by rendering the MIDI
> into a PCM file and mixing their additional speech stuff etc. into the
> final PCM file.
> 
> This is where MOD differs: with MOD you have the freedom to bundle custom
> samples with the note data, so you have a sound file which always sounds
> the same (like MP3, WAV, etc.).
> 
> That's the theory. In practice I realized that most module players don't
> handle these formats very well, so lots of modules sound quite different
> compared to the output of the tracker which created the music file.
> 
> To summarize, a module file can contain:
> 1. Module information (play time, artist, song message, initial speed, etc.)
> 
> 2. Sub-song information (some module formats support more songs in one
> file).
> 
> 3. Position/Order list data (how the tracks/patterns are played and in
> which order).
> 
> 4. Pattern/track data (this contains the actual notes, associated
> instrument/sample and effects). Some trackers organize this by single
> tracks and some by patterns. The difference is that track-based trackers
> allow each channel to run at a different tempo, while pattern-based
> trackers share global values like tempo, so all tracks are played at the
> same speed.
> 
> 5. Instrument data (this does not contain the actual sample data but
> more music-related information such as keyboard <=> sample mapping, NNA
> (New Note Action) settings and volume/pitch/resonance/panning envelopes.
> One instrument can contain several samples or, if it's a MIDI
> instrument, the MIDI instrument and channel number).
> 
> 6. Sample data (this contains the actual sample data as well as
> sample-related structures, i.e. bits per sample, base frequency,
> initial/global volume and panning).
> 
> 7. Synth sound data (some trackers even attach synths to samples, like
> Adlib data, S3M being an example). TuComposer uses a complete synth
> sound assembler, comparable to the instruction set of a regular
> microprocessor, to allow the greatest possible degree of freedom.
> 
> How do these data structures depend on each other?
> 
> One module can contain multiple sub-songs, multiple instruments as well
> as multiple envelopes.
> 
> One sub-song can contain exactly one position/order list table but
> multiple patterns/tracks.
> 
> One order list table can contain multiple elements pointing to
> patterns/tracks with additional information like speed change, transpose.
> 
> Each pattern = track * number of channels. So for a 16 channel module a
> pattern consists of 16 tracks (which can be the same though).
> 
> A sub-song can contain as many tracks/patterns as it likes.
> 
> Each instrument can contain multiple samples and assign multiple envelopes.
> 
> Each sample can be assigned one synth sound, and each synth sound can
> have multiple wavetables and a code engine that defines how to interpret
> and handle the wavetables.
> 
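
The relationships listed above could be modelled roughly as follows. This
is only a sketch: every type and field name here is hypothetical and is
not taken from TuComposer's or FFmpeg's headers. It merely mirrors the
containment rules in the list (module -> sub-songs and instruments,
sub-song -> one order list plus patterns, pattern -> one track per
channel, instrument -> samples, sample -> optional synth sound).
Envelopes are left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* All names are hypothetical, for illustration only. */

typedef struct SketchSynth {
    int nb_wavetables;            /* a synth sound may own several wavetables */
} SketchSynth;

typedef struct SketchSample {
    int bits_per_sample;
    int base_frequency;
    const SketchSynth *synth;     /* at most one synth sound, may be NULL */
} SketchSample;

typedef struct SketchInstrument {
    const SketchSample *samples;  /* one instrument, several samples */
    int nb_samples;
} SketchInstrument;

typedef struct SketchTrack {
    int nb_rows;                  /* note rows of a single channel */
} SketchTrack;

typedef struct SketchPattern {
    const SketchTrack **tracks;   /* one entry per channel, entries may repeat */
    int nb_channels;
} SketchPattern;

typedef struct SketchOrderEntry {
    int pattern_index;            /* which pattern to play at this position */
    int transpose;                /* extra per-position information */
} SketchOrderEntry;

typedef struct SketchSubSong {
    const SketchOrderEntry *orders;   /* exactly one order list per sub-song */
    int nb_orders;
    const SketchPattern *patterns;
    int nb_patterns;
} SketchSubSong;

typedef struct SketchModule {
    const SketchSubSong *songs;
    int nb_songs;
    const SketchInstrument *instruments;
    int nb_instruments;
} SketchModule;

/* Walk the order list and add up the rows that would be played.
 * Assumes all tracks of a pattern have the same length, so the first
 * channel is used; order entries pointing outside the pattern table
 * are skipped. */
static int sketch_song_rows(const SketchSubSong *song)
{
    int total = 0;

    assert(song != NULL);
    for (int i = 0; i < song->nb_orders; i++) {
        int idx = song->orders[i].pattern_index;
        if (idx < 0 || idx >= song->nb_patterns)
            continue;
        const SketchPattern *p = &song->patterns[idx];
        if (p->nb_channels > 0 && p->tracks[0] != NULL)
            total += p->tracks[0]->nb_rows;
    }
    return total;
}
```

Keeping the order list separate from the pattern table is what lets the
same pattern be played at several positions without duplicating it.
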
> Since I finished uploading the UAE stuff to upload.ffmpeg.org today
> (see the AMIGA sub directory), you can download that and try out
> TuComposer in (Win-)UAE.
> 
> This way you can convince yourself that TuComposer delivers good enough
> quality to qualify for FFmpeg. ;-)
> 
> So why am I choosing TuComposer as part of FFmpeg for this?
> 
> I have spent many years debugging MOD/S3M/XM/IT playback so that it
> sounds like the original tracker which invented each file format. This
> can be hell and cost me most of the development time, because lots of
> this stuff is poorly documented, if at all.
> 
> Another reason is that TuComposer is a complete composer/tracker engine,
> not just a playback engine, which matters since FFmpeg is also capable
> of encoding.
> 
> I would be glad to see TuComposer as an official part of FFmpeg (though
> under a different name, like libavcomposer).
> 
> What would be the benefit for FFmpeg?
> 
> It can convert MOD/S3M/XM/IT and TCM modules to each other. You can e.g.
> run:
> ffmpeg -i my_song.s3m my_song.xm
> to convert an S3M file to XM, but you can also do:
> ffmpeg -i my_song.it my_song.ogg
> to render it as an Ogg Vorbis file, and of course:
> ffplay my_song.mod
> 
> to simply play back a module with a nice pattern display (maybe like
> OpenCubicPlayer or ImpulseTracker ;)). Look at TuCView in the UAE stuff
> I uploaded today to see an example of this.
> 
> You can use FFmpeg as a base for a future tracker-like program, which
> adds lots of additional functionality like allowing OGG/MP3/FLAC/APE
> etc. samples (any format that FFmpeg supports), or even make a tracker
> supporting creation of music videos (thanks to FFmpeg's video
> demuxers/decoders).
> 
> Even more, FFmpeg probably will be the first software in the world which
> can handle MOD/S3M/XM/IT as audio part of an AVI/MOV/etc.
> 
> Finally, I get my good old TuComposer mostly platform-independent (this
> just requires changing the few lines which are Amiga-specific, like
> sound output to hardware, where we could simply use libavdevice
> instead).
> 
> The thing is that TuComposer already contains everything we need for
> proper MOD support, so we don't need to rewrite everything, which
> probably would require some 2-3 years before it has enough quality, i.e.
> far too long for this GSoC task.
> 
> Now for the implementation:
> 
> There are actually two methods, I will discuss these in detail here:
> 1. Make a new libavcomposer (beside libavformat, libavfilter, libavcodec):
> 
> I know that some of you already said, that you wouldn't like to see a
> new library in FFmpeg.
> 
> But I don't want libavcomposer to replace libavformat/libavcodec, i.e.
> there will be MOD/S3M/XM/IT demuxers which transfer these file formats
> into a common-shareable libavcomposer structure (almost all in
> libavcomposer will become part of public FFmpeg API like almost all
> functions in TuComposer are public, too).
> 
> In fact, the TuComposer module demuxer would simply be put in
> libavformat/iff.c where the ILBM/8SVX stuff also resides. ;-)
> 
> The decoder however, would not have to be rewritten or added for each
> new module format (there are plenty of module formats), since the
> decoder just uses libavcomposer structure to handle the module.
> 
> If we want S3M to XM conversion or sth. like that and not just PCM
> rendering, we need a common shareable structure between MOD/S3M/XM/IT,
> which is the job libavcomposer will take on. In TuComposer terms,
> libavcomposer would be the base library which contains linked lists of
> attached modules, etc.
> 
> Another advantage is that you can simply turn off MOD support with:
> ../configure --disable-avcomposer
> I think a lot of people using FFmpeg don't really need module support,
> and forcing them to type
> --disable-mod-codec --disable-s3m-codec --disable-xm-codec
> --disable-it-codec --disable-tcm-codec --disable-669-codec
> --disable-mtm-codec --disable-mid-codec [...] would just be a pain to
> type and even to remember (is this damn fucking format now a module
> format or not?).
> 
> For those who have already taken a look at TuComposer's header files: A
> valid mapping could be:
> tucomposer/tucomposer.h => libavcomposer/avcomposer.h
> tucomposer/module.h => libavcomposer/module.h
> tucomposer/song.h => libavcomposer/song.h
> tucomposer/external* => obsolete since FFmpeg already has this, thus can
> be deleted.
> 
> Each header file also has a single C file (as opposed to TuComposer,
> where each function has its own C file residing in a sub-directory).
> So, for example, module.h has module.c which contains all functions from
> Sources/C/tucomposer.library/Modules/*.c and so on.
> 
> Please note that TuComposer is around 300k executable size now on m68k
> Amiga and an x86 build will be twice as large (when I last checked with
> DJGPP), so I think 600k is a good argument for a new sub-library in
> FFmpeg.
> 
> Also, there are over 40k lines of C code already present. Although I
> think we can reduce that to 10k or sth. by using neat macro stuff and
> removing unnecessary parts, I think it's also a good argument for
> managing them in a different branch (which also ensures that module
> development doesn't interfere much with the other stuff in FFmpeg).
> 
> Also libavcomposer will be able to save this structure in TCM format
> which can then be used by the demuxer to transfer it to the decoder (so
> that networking will work, too). The decoder can load the TCM file and
> fill the libavcomposer structure and then do the playback.
> 
> 2. Don't make a new libavcomposer but try to integrate everything with
> libavfilter/libavformat/libavcodec
> 
> Some of you suggested this, but I'm not quite sure how well the design
> I planned and discussed above will really fit into this.
> 
> Adding a huge bunch of structures which probably are rarely used
> compared to most other structures FFmpeg offers sounds a bit
> controversial to me at first.
> 
> Maybe we can do sth. like add an AVComposer pointer to AVPacket or
> AVCodecContext which will simply be NULL if it's a non-MOD demuxer.
> 
> This, however, will probably (because of the public API change) only be
> possible when we do a major version bump (well, the first idea might
> need this, too. But at least it won't break old software compiled for
> older versions of FFmpeg).
> 
> What do you think? Hope I didn't leave anything important out!

I don't think 1. and 2. are mutually exclusive; it surely should be
possible to have a separate lib and at the same time provide "mod"
decoding support through the existing APIs.
And we need to support it through the existing APIs because anything else
would be quite inconvenient for applications.
I am a bit undecided though about separate lib or no separate lib.

About the existing APIs, I do think they are sufficient for basic playback
and transcoding.

For video transcoding (mpeg2 for example) we support reusing the motion
vectors from the source if the user wants.
This works because the decoder returns an AVFrame that contains an array
of motion vectors and the encoder takes an AVFrame as input.

I imagine that something quite similar is possible for mod, to transcode
one mod format to another.
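
As a rough illustration of that idea, assume a frame-like structure that
can carry optional module data next to the PCM samples, the same way
motion vectors ride along with a video frame. An encoder could then pick
its path as sketched here. All names are made up for this sketch; none of
them exist in FFmpeg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared module structure. */
typedef struct SketchModuleData {
    int nb_patterns;
} SketchModuleData;

/* Hypothetical decoded frame: rendered PCM plus optional module data. */
typedef struct SketchAudioFrame {
    const int16_t *pcm;
    int nb_samples;
    const SketchModuleData *module;  /* NULL when the source was not a module */
} SketchAudioFrame;

enum SketchEncodePath {
    SKETCH_ENCODE_NONE,    /* nothing usable for the requested target */
    SKETCH_ENCODE_PCM,     /* encode the rendered samples */
    SKETCH_ENCODE_MODULE   /* reuse the note data, no rendering involved */
};

/* A module encoder can only work from module data; any other encoder
 * falls back to the rendered PCM samples. */
static enum SketchEncodePath
sketch_choose_path(const SketchAudioFrame *frame, int target_is_module)
{
    assert(frame != NULL);
    if (target_is_module)
        return frame->module ? SKETCH_ENCODE_MODULE : SKETCH_ENCODE_NONE;
    return (frame->pcm && frame->nb_samples > 0) ? SKETCH_ENCODE_PCM
                                                 : SKETCH_ENCODE_NONE;
}
```

With this shape, applications that only want playback never look at the
extra pointer, and mod-to-mod transcoding needs no separate entry point.
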

It's also a long-standing feature request to make encode/decode_audio() work
with AVFrame or a similar structure instead of int16_t*.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras