[FFmpeg-devel] tkhd transformation matrix in mov is ignored except for width/height and scaling

Sat Oct 31 17:53:42 CET 2009

I've been trying for months to get ffmpeg to correctly rotate a video when
transcoding a mov that includes a transformation matrix in the tkhd node
(iPhone video shot in portrait, basically).  I could see that various
patches about the matrix were discussed on this list, and I couldn't figure
out why nothing seemed to work even after said patches were applied. The
main patch is here:

http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2008-July/049941.html

and it was corrected by this patch:

http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2009-June/071622.html

In desperation, I finally dug into the source and tried to figure out what I
was doing wrong.  What was almost immediately apparent is that those patches
deal only with width and height of the track as well as scaling the width
and height.  In fact, the values in the matrix are only stored locally in
variables on the stack in the static function that reads the tkhd node. The
width and height are computed from the matrix and stored in a location
accessible from outside the mov_read_tkhd() function, but the rest of the
matrix is completely inaccessible once the function returns.
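
To make that concrete, here is a minimal standalone sketch of keeping the
whole matrix instead of only the derived width and height.  It is not the
libavformat code: TkhdMatrix, tkhd_matrix_parse() and tkhd_matrix_rotation()
are hypothetical names, and it assumes the layout from the QuickTime
documentation (nine big-endian 32-bit values, with a, b, c, d, tx, ty in
16.16 fixed point and u, v, w in 2.30).  It only recognises exact
right-angle rotations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIX16_ONE 0x00010000 /* 1.0 in 16.16 fixed point */

/* Hypothetical container for the nine tkhd matrix entries, row-major:
 *   a  b  u
 *   c  d  v
 *   tx ty w
 * a, b, c, d, tx, ty are 16.16 fixed point; u, v, w are 2.30. */
typedef struct {
    int32_t m[3][3];
} TkhdMatrix;

/* Read one big-endian 32-bit signed value. */
static int32_t read_be32(const uint8_t *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return (int32_t)v;
}

/* Copy the 36 raw matrix bytes into a struct that outlives the parser. */
static void tkhd_matrix_parse(TkhdMatrix *out, const uint8_t *buf)
{
    assert(out != NULL && buf != NULL);
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            out->m[i][j] = read_be32(buf + 4 * (3 * i + j));
}

/* Return the clockwise rotation in degrees (0, 90, 180, 270) if the
 * matrix is an exact right-angle rotation, or -1 for anything else.
 * Only a, b, c, d are inspected; the translation is ignored here. */
static int tkhd_matrix_rotation(const TkhdMatrix *t)
{
    int32_t a = t->m[0][0], b = t->m[0][1];
    int32_t c = t->m[1][0], d = t->m[1][1];

    if (a ==  FIX16_ONE && b == 0 && c == 0 && d ==  FIX16_ONE) return 0;
    if (a == 0 && b ==  FIX16_ONE && c == -FIX16_ONE && d == 0) return 90;
    if (a == -FIX16_ONE && b == 0 && c == 0 && d == -FIX16_ONE) return 180;
    if (a == 0 && b == -FIX16_ONE && c ==  FIX16_ONE && d == 0) return 270;
    return -1;
}
```

With something like this, the parsed struct could be hung off per-track
state so that a muxer or player can see the rotation later; where exactly
it should live in libavformat is the open question.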

I'll admit to being fairly surprised that this is the case, since the iPhone
is an increasingly popular platform for recording video, and the default
orientation for the phone in a user's hand (vertical, portrait mode) will
result in a video that appears rotated counterclockwise by 90 degrees when
played via libavformat.  More importantly, any video transcoded via
libavformat comes out rotated AND no longer has the compensating
transformation matrix values stored in the tkhd, so that even QuickTime and
the iPhone play the video sideways.  This is such a strange state of affairs
so long after the release of the iPhone that I have to wonder if I'm reading
things incorrectly.

Incidentally, a description of the matrix transformation process is here:

http://developer.apple.com/mac/library/documentation/QuickTime/RM/MovieBasics/MTEditing/K-Chapter/11MatrixFunctions.html

I'm an experienced developer, but I know next to nothing about av codecs and
such.  I could attempt to hack in a fix, but I don't even know where to
begin.  Storing the matrix somewhere that is accessible outside
mov_read_tkhd() is simple enough.  Running each pixel through the transform
is also pretty easy.  But I've got no clue where to place that code. Could
someone maybe point me in the correct direction for where to find code that
places each pixel in the outbound stream?
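
For illustration only, this is roughly what "running each pixel through the
transform" looks like for the 90 degree clockwise case, where the matrix
maps (x, y) to (height - 1 - y, x) in pixel indices.  It is a sketch under
simplifying assumptions, not FFmpeg code: rotate_plane_90cw() is a
hypothetical name, the plane is a tightly packed 8-bit buffer with no
stride padding, and each plane of a planar frame would need the same
treatment at its own dimensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate one 8-bit plane 90 degrees clockwise.
 * src is src_w x src_h samples, tightly packed.
 * dst must hold src_w * src_h samples; its width is src_h and its
 * height is src_w.  dst and src must not overlap. */
static void rotate_plane_90cw(uint8_t *dst, const uint8_t *src,
                              size_t src_w, size_t src_h)
{
    assert(dst != NULL && src != NULL && dst != src);

    for (size_t y = 0; y < src_h; y++) {
        for (size_t x = 0; x < src_w; x++) {
            size_t dx = src_h - 1 - y; /* x' = -y + tx, with tx = src_h - 1 */
            size_t dy = x;             /* y' = x */
            dst[dy * src_h + dx] = src[y * src_w + x];
        }
    }
}
```

A 3x2 plane holding 1 2 3 / 4 5 6 comes out as the 2x3 plane
4 1 / 5 2 / 6 3, which is the top row of the source running down the
right-hand edge of the destination.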

My alternative is a lame hack of the code in libavformat/movenc.c, which
writes a default identity matrix into the tkhd.  I could copy a rotation
matrix in there instead based on a command-line parameter, but I am loath
to do that.
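
For reference, a rotation matrix of that kind could be serialized roughly
as below.  This is a standalone sketch, not the movenc.c code:
tkhd_matrix_write_rotation() and write_be32() are hypothetical names, the
36-byte big-endian layout (16.16 fixed point, with 2.30 for the last
column) follows the QuickTime documentation, and width and height are
assumed to be the coded, pre-rotation dimensions in pixels.  The
translation terms are there so the rotated image stays in positive
coordinates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order. */
static void write_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Fill out[0..35] with a tkhd matrix for a clockwise rotation of
 * 0, 90, 180 or 270 degrees.  Returns 0 on success, -1 if the angle
 * is unsupported or a dimension does not fit in 16.16 fixed point. */
static int tkhd_matrix_write_rotation(uint8_t *out, int degrees,
                                      int width, int height)
{
    int32_t a, b, c, d, tx, ty;

    assert(out != NULL);
    if (width < 0 || width > 32767 || height < 0 || height > 32767)
        return -1;

    switch (degrees) {
    case 0:   a =  1; b =  0; c =  0; d =  1; tx = 0;      ty = 0;      break;
    case 90:  a =  0; b =  1; c = -1; d =  0; tx = height; ty = 0;      break;
    case 180: a = -1; b =  0; c =  0; d = -1; tx = width;  ty = height; break;
    case 270: a =  0; b = -1; c =  1; d =  0; tx = 0;      ty = width;  break;
    default:  return -1;
    }

    /* Row-major: a b u / c d v / tx ty w, with u = v = 0 and w = 1.0. */
    const int32_t m[9] = {
        a * 65536,  b * 65536,  0,
        c * 65536,  d * 65536,  0,
        tx * 65536, ty * 65536, 0x40000000
    };
    for (size_t i = 0; i < 9; i++)
        write_be32(out + 4 * i, (uint32_t)m[i]);
    return 0;
}
```

Calling it with 90 degrees for a 640x480 track yields b = 1.0, c = -1.0 and
tx = 480.0, which matches the shape of the matrix in portrait iPhone files
as far as I can tell.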

Thanks in advance for your help.