[FFmpeg-soc] [soc] libavsequencer [PATCH 02/08] Sub-song public API header file.

Sun Aug 15 21:07:21 CEST 2010

On date Sunday 2010-08-15 19:35:14 +0200, Vitor Sessak encoded:
> On 08/15/2010 06:41 PM, Sebastian Vater wrote:
> >Vitor Sessak a écrit :
> >>On 08/14/2010 02:50 PM, Sebastian Vater wrote:
> >>
> >>>Up to now, they're distinct entities, sub-volume and sub-panning is only
> >>>considered during slides but never actually taken into account on final
> >>>calculation.
> >>>
> >>>This stuff will surely cause interesting discussions. Or to be short,
> >>>yes, there's a lot of things to improve! But all this please, when the
> >>>basic stuff here works, i.e. module playback as original TuComposer did.
> >>>[...]
> >>>Well, removing this would not actually be a problem, because I never
> >>>used that, in TuComposer all loaders, indeed, have a flag to tell if to
> >>>use 8-bit or 6-bit volumes, but my current implementation always used
> >>>8-bit though.
> >>>
> >>>Maybe this is really a candidate to fully remove.
> >>
> >>Everything that is not used by any code _should_ be fully removed! It
> >>is easy to start simple and improve afterwards than to try to commit a
> >>complicated patch with unused parts.
> >
> >It is used by the player, i.e. the player handles 6-bit volumes when
> >this flag is set. I also remembered somethings which makes it impossible
> >to replace this without getting trouble (synth sound support).
> >The synth sound assembler also uses 6-bit in that case and since this a
> >full-featured assembly language, it's practically impossible to convert
> >TCM files correctly to 8-bit instead. :(
> 
> What about other file formats?
> 
> >In TuComposer the user was able to put a parameter for the demuxers
> >which decides to use original 6-bit or 8-bit volume. The current demuxer
> >in FFmpeg, however doesn't allow this now, but it might encounter a file
> >which already has this flag set.
> >
> >>>
> >>>No, it requires fully understanding of what's being parsed. Again, this
> >>>is hard to describe in words (this problem was detected after
> >>>disassembly nightmares almost 10 years ago, so I barely remember what
> >>>was the exact cause for this flag, now).
> >>>
> >>>Let me explain the problem:
> >>>In XM/IT when you use the global volume/panning commands, they only
> >>>affect the next note being played, but not in S3M, here it affects all
> >>>notes being currently played already!
> >>
> >>Then it looks clear to me to duplicate all the volume formats to have
> >>an "immediate" version and a "next note" version. Do you agree?
> >
> >Which is much more complex and more code than just defining a new flag.
> >
> >>Let me repeat my question again: Why is the stack depth should be
> >>described in the BSS? Why not set it in player.h?
> >
> >player.h is read-only for composers and external users, i.e. anything
> >that will go there isn't user editable anymore and this flag is supposed
> >to be user editable.
> 
> So you need something that is both user-editable and not part of the
> BSS. Again, where should be stuff that is not part of the
> description of a song _and_ is set by the user?
> 
> >The other point was your idea with moving the actual stack data to
> >player.h, that was an excellent idea, because this is not supposed to be
> >writeable to the end-user.
> >>>The BSS should not pass actually data to the player which it can't
> >>>handle,
> >>
> >>No, of course it should! How can the BSS avoid doing it if it is not
> >>supposed to know anything about the player? If the players limitations
> >>should be defined somewhere, it should be in player.h!
> >
> >The player itself can also play more than 256 channels, in fact it is a
> >limitation of both the player and the BSS (the track limitations and
> >effect limitations are accounting for the BSS, too).
> 
> So where the limitation would be if we change the BSS to a int16_t
> (no, I don't think this has any use doing it, but you understand my
> point)?
> 
> What it looks like to me you are doing is:
> 
> 1- You have some BSS parameter foo.
> 2- Sane values for the parameter foo is between 0 and AA
> 3- You look for the integer type whose maximum value is closest to AA
> 4- You use this type for foo.
> 
> I don't think this is the cleaner neither the most future-proof way
> of determining sane values for fields.
> 
> >>>and more than 8-bit channels is not handled correctly, it will
> >>>fallback a lots of features normally available, i.e. exact channel
> >>>control via effects.
> >>>
> >>>The engines itself don't care much about numbers actually there, in
> >>>theory both the module virtual channels and song host channels can
> >>>handle infinite channels.
> >>>
> >>>But the effects are designed to access and control them, also, which is
> >>>done currently by 8-bit values (other 8-bit decide control mechanism).
> >>>But I do expect, as a musician that I can control what I play.
> >>
> >>Again, what the player can or cannot do is completely irrelevant for
> >>the discussion of the BSS. The BSS should be a way to describe a song,
> >>no matter who (or if) it is going to be played!
> >
> >The point is that when we don't have a format which can save / restore
> >the BSS in a 100% state, this can be awful.
> >
> >Imagine you're sitting one week on a new song, using an extended BSS
> >which no format can save and/or load 100% because there is still are
> >still no (de-)muxers for it and you notice that 90% of your work is crap
> >because you can't save it.
> 
> You are using a crap editor to fill up the BSS. Moreover, your point
> is only valid considering everybody will want to write only TCM
> files. Your GUI should not let you use unsupported values, even if
> it fit in the data types.
> 
> >I'm not generally against the idea to extend
> >that, but we should make first sure that it can be 100% exported and
> >imported.
> 
> I'm not talking about extending. I mean that converting a field from
> uint8_t to int should not mean every 16-bit value is now valid!
> 
> >Or consider the following, you see that FFmpeg has a nice avsequencer
> >and start using it writing a GUI, since you see that it supports 16-bit
> >effect commands you write the GUI it way, i.e. the users think they can
> >use also the 16-bit range and when they save it, because they think, hmm
> >16-bit, so I have lots of user effects for my game / demo and then
> >save...BAM!
> 
> GUI developers should not need babysitting. And _if_ we want to
> specify this limits somewhere, it should not be in the BSS
> (limits.h?).
> 
> >Please note that most tracker editors are more like hex editors, i.e.
> >command, instrument and command data are edited directly in hex (and
> >sometimes decimal) values.
> 
> And they don't do any sanity-check?
> 
> >>>Imagine you're driving a car and I tell you, well, you can control 3 of
> >>>your hoops, but the 4th one is beyond access boundary. Would you
> >>>buy/take such one? No, right?
> >>
> >>Imagine you are driving a car. When you are reading your map (that you
> >>brought in some random shop), you find instructions of how bad the 4th
> >>gear in Toyota cars behave, right in the middle of the map! Ok, your
> >>car happens to be a Toyota and the problem happens to be real, but
> >>this does not means such info about belong to the map, it belongs to
> >>the car manual!!!
> >
> >I'll understand the point better now, but there are still some issues to
> >solve before. My point now is: We should first define a new
> >muxer/demuxer format which can im-/export the new extended features,
> 
> I'm not suggesting adding any new features, I suggest not depending
> on the BSS data types to know what is supported or not.
> 
> >before actually implementing them (or at least in parallel).
> >
> >Imagine having a word processor where you can set bold/italic/etc. but
> >it can only save ASCII, therefore loosing all that. It simply doesn't
> >make sense to have features which cannot be exported and re-imported at all.
> 
> And you would say that the problem in this case would be _the
> internal data structures_ used by the word processor?
> 
> >>>That's why I keep the limitations to be consistent, i.e. fitting in each
> >>>other.
> >>
> >>I don't really care much at this point about _what_ the limitations
> >>are, I care now about _where_ they are, and IMHO it is not in the BSS.
> >
> >Limitations to be also defineable by the user should be in the BSS,
> >read-only limitations can, of course be moved into the player.
> 
> Is this _part of the song description_?
> 
> >Please always note (i.e. never forgot) that you also can create an
> >avsequencer empty file and edit it, in that case you can edit during
> >composing time your limitations to ensure you won't have surprises when
> >you save to your format.
> >
> >For example if I create a new empty file and want to save it on
> >composition finish as S3M, I can set the use S3M-like global volume
> >command flag manually and I know that it's fine when I finally save it
> >in S3M.
> >
> >The same applies if I want to do a module for a friend which only has an
> >Amiga 500 and I want to ensure that it will playback fine there with an
> >ordinary MOD player, so I can again set the Amiga limits manually.
> 
> Again, in the BSS? This enforcing this limits is the job of the
> editor, and he should read from somewhere else than the BSS (again,
> the BSS should be _format independent_).
> 
> >>>Yes, and that's why these values are there. Also composers can set this
> >>>manually. Please don't forgot that ALL fields in the BSS are supposed to
> >>>be editable by the user, too.
> >>>
> >>>Please note, removing any of these fields will break compatibility with
> >>>IFF-TCM1 file format since TuComposer stores the values.
> >>
> >>How so? If it is defined somewhere in IFF-TCM1 files, the decoder are
> >>free to ignore this field, no?
> >
> >And therefore lose ability to playback files correctly.
> 
> How so? We are talking about a field that we just agreed that won't
> change playback.
> 
> >Also again here
> >the same accounts, the min/max pairs are also supposed to be
> >user-editable, moving them to player.h will again make this read-only.
> 
> So player_config.h or something else.
> 
> >Imagine, I want to start a new song with avsequencer but want to ensure
> >that it will save correctly let's say in IT. Then I can set manually
> >these values to the IT limits and I'm sure that when saving my work, I
> >get a correct file.
> 
> Again, I think the BSS should do _one_ _single_ _thing_: describing
> the song in a format-independent format. IT limits are useful to be
> set somewhere, but not in the BSS.

I tend to agree with Vitor here.

BSS = Big Sound Struct = Sound Module Descriptor

This describes a sound module, so it should be as generic and abstract
as possible, possibly it shouldn't contain format-specific
information. This is very important if we want to keep a clean design.

As for the compatibility problem, let's consider this scenario.  Anna
is composing a module song. She's using a fancy editor which links
against FFmpeg/libavsequencer, which works through saving a binary
representation of the libavsequencer BSS.

This representation is really generic, so it doesn't depend on the
features of the format which will be used to store the song module.

Now Anna wants to let his friend Bob hear the song module she's
composing. She can save the song using one of the many sound module
formats, S3M, XM/IT, TCM1. When she tries to save to S3M, her program
may cry out that it isn't possible to save to that format, as that
format doesn't allow to store some of the features of the composed
song.

Is that a problem? I don't think so, and indeed when using a word
processor this happens all the times, when saving you can either do
two choices:

* lose the information stored in the binary representation. Which
  information to lose and which to keep is a problem of the program
or
* store the information in a more expressive format

For example if you save a word document to txt file, the program will
discard some of the features of the document when performing the
conversion.