[FFmpeg-devel] [PATCH] h264: integrate clear_blocks calls with IDCT.

Michael Niedermayer michaelni at gmx.at
Mon Feb 18 16:55:16 CET 2013


On Sun, Feb 17, 2013 at 07:47:22PM -0800, Ronald S. Bultje wrote:
> Hi,
> 
> On Sun, Feb 17, 2013 at 6:04 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Sun, Feb 17, 2013 at 02:52:54PM -0800, Ronald S. Bultje wrote:
> >> From: "Ronald S. Bultje" <rsbultje at gmail.com>
> >>
> >> The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
> >> to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
> >> (in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
> >> tested (cathedral), i.e. almost 30 cycles per mb faster.
> >>
> >
> >> Arm assembly changes untested.
> >
> > fate-h264 (h264-conformance-ba_mw_d in this case, but it's not the only
> > one)
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > ff_h264_idct_add8_neon () at ffmpeg/libavcodec/arm/h264idct_neon.S:166
> > 166         ldrsh           r8,  [r1]
> 
> diff --git a/libavcodec/arm/h264idct_neon.S b/libavcodec/arm/h264idct_neon.S
> index a2521b7..8d85227 100644
> --- a/libavcodec/arm/h264idct_neon.S
> +++ b/libavcodec/arm/h264idct_neon.S
> @@ -105,6 +105,7 @@ function ff_h264_idct_add16_neon, export=1
>          ldr             r0,  [r5], #4
>          ldrb            r8,  [r6, r8]
>          subs            r8,  r8,  #1
> +        mov             r3,  r1
>          blt             2f
>          ldrsh           lr,  [r1]
>          add             r0,  r0,  r4
> @@ -116,7 +117,7 @@ function ff_h264_idct_add16_neon, export=1
>          adreq           lr,  ff_h264_idct_add_neon    + CONFIG_THUMB
>          blx             lr
>  2:      subs            ip,  ip,  #1
> -        add             r1,  r1,  #32
> +        add             r1,  r3,  #32
>          bne             1b
>          pop             {r4-r8,pc}
>  endfunc
> @@ -136,13 +137,14 @@ function ff_h264_idct_add16intra_neon, export=1
>          add             r0,  r0,  r4
>          cmp             r8,  #0
>          ldrsh           r8,  [r1]
> +        mov             r3,  r1
>          iteet           ne
>          adrne           lr,  ff_h264_idct_add_neon    + CONFIG_THUMB
>          adreq           lr,  ff_h264_idct_dc_add_neon + CONFIG_THUMB
>          cmpeq           r8,  #0
>          blxne           lr
>          subs            ip,  ip,  #1
> -        add             r1,  r1,  #32
> +        add             r1,  r3,  #32
>          bne             1b
>          pop             {r4-r8,pc}
>  endfunc
> 
> ?

Program received signal SIGSEGV, Segmentation fault.
ff_h264_idct_add8_neon () at ffmpeg/libavcodec/arm/h264idct_neon.S:168
168         ldrsh           r8,  [r1]
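
For illustration only (not part of the patch): a minimal C model of what the
diff above does in the add16/add16intra loops. It assumes that the IDCT
callee, now that it also clears the block, leaves the block pointer register
(r1) changed on return. That is an assumption, not something shown in this
thread. idct_add_and_clear and walk_blocks are hypothetical names, and the
arithmetic is a placeholder for the real transform.

```c
#include <assert.h>
#include <stdint.h>

enum { BLOCK_COEFFS = 16 };  /* one 4x4 block of int16_t = 32 bytes */

/* Hypothetical stand-in for an IDCT that also clears its input block.
 * The pointer is passed by reference to model a register the callee
 * clobbers (assumed here: it ends up one block further on). */
static void idct_add_and_clear(uint8_t *dst, int16_t **block_reg)
{
    int16_t *block = *block_reg;
    for (int i = 0; i < BLOCK_COEFFS; i++) {
        dst[i] = (uint8_t)(dst[i] + block[i]);  /* placeholder, not a real IDCT */
        block[i] = 0;                           /* the integrated clear */
    }
    *block_reg = block + BLOCK_COEFFS;          /* models the clobbered r1 */
}

/* Caller loop: keep a private copy of the pointer and step from that copy. */
static int16_t *walk_blocks(uint8_t *dst, int16_t *coeffs,
                            const uint8_t *nnz, int n)
{
    assert(n >= 0);
    int16_t *r1 = coeffs;
    for (int i = 0; i < n; i++) {
        int16_t *r3 = r1;                       /* mov r3, r1      */
        if (nnz[i])                             /* skip empty blocks */
            idct_add_and_clear(dst + i * BLOCK_COEFFS, &r1);
        r1 = r3 + BLOCK_COEFFS;                 /* add r1, r3, #32 */
    }
    return r1;
}
```

Stepping from the saved copy gives the same stride whether or not the callee
ran. Any loop that still steps from r1 itself would drift under the same
assumption.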

BTW, don't you have some Linux system with qemu-arm available?


[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 1
"Used only once"    - "Some unspecified defect prevented a second use"
"In good condition" - "Can be repaired by experienced expert"
"As is" - "You wouldn't want it even if you were paid for it, if you knew ..."