On Sun, Jul 05, 2009 at 12:02:27PM +0000, Jai Menon wrote:
On Sun, Jun 28, 2009 at 12:06 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote:
> On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
> > On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote:
[...]
> > [...]
> >> @@ -806,6 +815,26 @@
> >>
> >>              line += s->picture.linesize[0];
> >>          }
> >> +    } else {
> >> +        for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) {
> >> +            uint16_t *dst;
> >> +            x = tile->comp[0].coord[0][0] - s->image_offset_x;
> >> +            dst = line + x * s->ncomponents * 2;
> >> +            for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) {
> >> +                for (compno = 0; compno < s->ncomponents; compno++) {
> >
> >> +                    *src[compno] = av_rescale(*src[compno], (1 << 16) - 1,
> >> +                                              (1 << s->cbps[compno]) - 1);
> >
> > av_rescale is too slow
>
> So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow, / is slower
it should be "src" << C
<possibly dumb question ahead>
I understand that * and / are slower but how can I achieve the same effect with a single <<?
Well, not the same, but close enough IMHO: src<<C or (src<<C) + (src>>(16-C)) should be close enough. My point was mainly that av_rescale() is too slow to be done per pixel, and anything else is better.
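To make the trade-off concrete, here is a minimal sketch comparing an exact rescale with the two shift-based approximations. The function names are hypothetical and not part of the patch, and the replication variant assumes bit depths between 8 and 16; the exact shift amounts in the formula quoted above depend on what C denotes, so this uses the usual bit-replication form instead.

```c
#include <assert.h>
#include <stdint.h>

/* Exact rescale of a `bits`-wide sample to 16 bits, with rounding.
 * Costs one multiply and one divide per sample. */
static inline uint16_t scale_exact(uint32_t v, int bits)
{
    uint32_t max_in = (1u << bits) - 1;
    return (uint16_t)((v * 65535u + max_in / 2) / max_in);
}

/* Plain left shift: cheapest, but full scale lands below 0xFFFF. */
static inline uint16_t scale_shift(uint32_t v, int bits)
{
    return (uint16_t)(v << (16 - bits));
}

/* Bit replication: refill the vacated low bits with the top bits of v,
 * so zero maps to zero and full scale maps to 0xFFFF.
 * Assumption: 8 <= bits <= 16, so the right shift is non-negative. */
static inline uint16_t scale_replicate(uint32_t v, int bits)
{
    assert(bits >= 8 && bits <= 16);
    return (uint16_t)((v << (16 - bits)) | (v >> (2 * bits - 16)));
}
```

For an 8-bit sample of 255, the plain shift yields 0xFF00 while replication and the exact rescale both yield 0xFFFF. Note also that with integer samples, dividing first, as in (v / max) * 65535, truncates to 0 for every sample below full scale.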
Okay, modified patch attached.
[...]
@@ -806,6 +815,22 @@
             line += s->picture.linesize[0];
         }
+    } else {
+        for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) {
+            uint16_t *dst;
+            x = tile->comp[0].coord[0][0] - s->image_offset_x;
+            dst = line + x * s->ncomponents * 2;
+            for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) {
+                for (compno = 0; compno < s->ncomponents; compno++) {
+                    *src[compno] = *src[compno] << (16 - s->cbps[compno]);
+                    *src[compno] += 1 << 15;
+                    *src[compno] = av_clip(*src[compno], 0, (1 << 16) - 1);
+                    *dst++ = *src[compno]++;
I don't think using *src[compno] as a temporary is a good choice.
You mean *src[compno] should be copied to dst and all operations should be done on dst? The current approach seemed correct because this is part of level shifting. Or did I misunderstand?
    int val = src << ...
    val += ...
    val = av_clip(...)
    *dst++ = val;

It's easy for the compiler to put val in a register; doing it with src is not, because it would have to prove that src is not read afterwards.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Answering whether a program halts or runs forever is
On a turing machine, in general impossible (turings halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.
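The local-temporary pattern suggested above could be sketched as follows. This is a simplified, standalone illustration rather than the patch itself: store_component16 and clip_int are hypothetical names, clip_int stands in for libavutil's av_clip() so the block builds on its own, and the sketch handles a single component plane instead of the interleaved dst layout used in the decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for av_clip(), so this sketch compiles without libavutil. */
static inline int clip_int(int a, int amin, int amax)
{
    return a < amin ? amin : (a > amax ? amax : a);
}

/* Hypothetical helper: convert n signed samples of depth `bits` to
 * unsigned 16-bit output. All arithmetic happens in the local `val`,
 * and the source buffer is only read, never written. */
static void store_component16(uint16_t *dst, const int *src, size_t n, int bits)
{
    assert(bits >= 1 && bits <= 16);
    for (size_t i = 0; i < n; i++) {
        /* Multiply instead of shifting: samples may be negative before
         * the level shift, and left-shifting a negative int is undefined. */
        int val = src[i] * (1 << (16 - bits));
        val += 1 << 15;                          /* level shift */
        val  = clip_int(val, 0, (1 << 16) - 1);  /* clamp to 16 bits */
        dst[i] = (uint16_t)val;
    }
}
```

Because val is a plain local and src is const, the compiler does not need to reason about aliasing between the source and destination buffers to keep the intermediate value in a register.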