[Libav-user] Help with unpacking PIX_FMT_YUV4* data types

Ricky Huang rhuang.work at gmail.com
Wed Mar 12 00:01:25 CET 2014


Hello J.,

Thank you for the sample code.  From what I am able to understand, your code is:
  - setting up the destination frame's data pointers and buffer via the ffmpeg.avpicture_fill() call
  - creating a conversion context from the original pixel format to either PIX_FMT_BGRA or PIX_FMT_RGBA
  - calling sws_scale(), which performs the actual conversion (the input and destination sizes are the same; what changes is the PIX_FMT)
  - performing the flip (or not) and outputting the image

What I am wondering about is that I do not see anywhere in your code where the luminance component (the Y channel from my original email) is unpacked.  Are you suggesting I should convert YUV into RGB, then calculate the luminance value from RGB using the formula found here: http://stackoverflow.com/questions/596216/formula-to-determine-brightness-of-rgb-color ?


Thank you again!

On Mar 10, 2014, at 9:46 PM, J Decker <d3ck0r at gmail.com> wrote:

> During Init.....
> 
> 
> 			file->pVideoFrameRGB = ffmpeg.av_frame_alloc();
> 			{
> 				int num_bytes;
> #ifdef WIN32
> #define PIX_FMT  PIX_FMT_BGRA
> #else
> #define PIX_FMT  PIX_FMT_RGBA
> #endif
> 				num_bytes = ffmpeg.avpicture_get_size(PIX_FMT, file->pVideoCodecCtx->width, file->pVideoCodecCtx->height);
> 				file->rgb_buffer = (_8 *)ffmpeg.av_malloc(num_bytes*sizeof(uint8_t));
> 				ffmpeg.avpicture_fill((AVPicture*)file->pVideoFrameRGB, file->rgb_buffer, PIX_FMT
> 					, file->pVideoCodecCtx->width, file->pVideoCodecCtx->height);
> 			}
> 			file->img_convert_ctx = ffmpeg.sws_getContext(file->pVideoCodecCtx->width, file->pVideoCodecCtx->height
> 				, file->pVideoCodecCtx->pix_fmt 
> 				, file->pVideoCodecCtx->width, file->pVideoCodecCtx->height
> 				, PIX_FMT, SWS_BICUBIC, NULL, NULL, NULL);
> 
> 
> 
> // CDATA is _32 (32-bit) color data in my library; the image surface is the raw RGB output
> 
> 						CDATA *surface = GetImageSurface( out_surface );
> 						ffmpeg.sws_scale(file->img_convert_ctx
> 							, (const uint8_t*const*)file->pVideoFrame->data, file->pVideoFrame->linesize
> 							, 0, file->pVideoCodecCtx->height
> 							, file->pVideoFrameRGB->data
> 							, file->pVideoFrameRGB->linesize);
> 
> // depending on target platform, the image needs to be 'flipped' on output from the pVideoFrameRGB
> 						{
> #ifdef _INVERT_IMAGE
> 							int row;
> 							for( row = 0; row < file->pVideoFrame->height; row++ )
> 							{
> 								memcpy( surface + ( file->pVideoFrame->height - row - 1 ) * file->pVideoFrame->width
> 									, file->rgb_buffer + row * sizeof(CDATA) * file->pVideoFrame->width, file->pVideoFrame->width * sizeof( CDATA ) );
> 							}
> #else
> 							memcpy( surface 
> 								, file->rgb_buffer , file->pVideoFrame->height * file->pVideoFrame->width * sizeof( CDATA ) );
> #endif
> 						}
> 
> 
> // Output surface to display
> 
> 
> On Mon, Mar 10, 2014 at 6:11 PM, Ricky Huang <rhuang.work at gmail.com> wrote:
> 
> On Mar 10, 2014, at 4:28 PM, J Decker <d3ck0r at gmail.com> wrote:
> 
>> I would think you would just use 'sws_getContext' and 'sws_scale';
>> they worked well for me... 
> 
> Thank you for the pointer.  May I have a little more info regarding how you used sws_getContext()?  I found its API definition as follows:
> 
> struct SwsContext *sws_getContext(int srcW, int srcH, enum PixelFormat srcFormat, int dstW, int dstH, enum PixelFormat dstFormat, int flags, SwsFilter *srcFilter, SwsFilter *dstFilter, const double *param);
> 
> and it allocates and returns a struct SwsContext*.
> 
> 
> In my case I am not modifying anything in my filter, so why is this function useful to me?  (I am assuming it performs some kind of transformation, because there is a source and a destination.)  Also, in the returned SwsContext struct, I see there are four members:
> 
> int lumXInc
> int chrXInc
> int lumYInc
> int chrYInc
> 
> Perhaps this is where I can get the luminance?  In that case, does it mean I have to perform some kind of transformation before even being able to get the context?
> 
> 
> Thank you in advance for the clarification.
> 
> 
> 
>> On Mon, Mar 10, 2014 at 4:18 PM, Ricky Huang <rhuang.work at gmail.com> wrote:
>> Hello all,
>> 
>> I am writing code that extracts the luminance component of an input video using my own custom filter in libavfilter - specifically, I am extracting it from a "PIX_FMT_YUV420P" video and I am wondering how to go about doing so.  According to the pixfmt.h header: 
>>> 
>>> PIX_FMT_YUV420P,   ///< planar YUV 4:2:0, 12bpp, (1 Cr & Cb sample per 2x2 Y samples)
>> 
>> I am not sure how to interpret that.  Assume I am doing this in my draw_slice() function, which has the definition:
>>> 
>>> static void draw_slice(AVFilterLink *inlink, int y0, int h, int slice_dir)
>> 
>> I know I can get at the input data planes with:
>>> 
>>> AVFilterBufferRef *cur_pic = inlink->cur_buf;
>>> uint8_t *data = cur_pic->data[0];
>> 
>> But there are multiple "planes" in the data.  Do data[0], data[1], and data[2] correspond to the Y, U, and V channels respectively?
>> 
>> Also, once I have pointed my pointer at the correct coordinate, how should I interpret the extracted value (float, int, etc.)?
>> 
>> 
>> Thank you in advance.
>> 
>> _______________________________________________
>> Libav-user mailing list
>> Libav-user at ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/libav-user
>> 
>> 
> 
> 
> 
> 
