[PATCH] jpeg2k : support higher sample precision
Hi, Attached patch fixes decoding of files where component precision > 8. Possible test files are : codestreams_profile1/p1_04.j2k testfiles_jp2/file6.jp2 testfiles_jp2/file7.jp2 Posting here for comments. -- Regards, Jai
On Sun, Jun 21, 2009 at 4:35 PM, Jai Menon<jmenon86@gmail.com> wrote:
Hi,
Attached patch fixes decoding of files where component precision > 8.
Possible test files are :
codestreams_profile1/p1_04.j2k testfiles_jp2/file6.jp2 testfiles_jp2/file7.jp2
Posting here for comments.
Michael, is this okay to apply? -- Regards, Jai
On Sun, Jun 21, 2009 at 4:35 PM, Jai Menon<jmenon86@gmail.com> wrote:
Hi,
Attached patch fixes decoding of files where component precision > 8.
Possible test files are :
codestreams_profile1/p1_04.j2k testfiles_jp2/file6.jp2 testfiles_jp2/file7.jp2
Posting here for comments.
I guess there are no comments, will apply this weekend. -- Regards, Jai
On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote:
Hi,
Attached patch fixes decoding of files where component precision > 8.
Possible test files are :
codestreams_profile1/p1_04.j2k testfiles_jp2/file6.jp2 testfiles_jp2/file7.jp2
Posting here for comments.
-- Regards,
Jai
j2kdec.c | 35 ++++++++++++++++++++++++++++++++--- 1 file changed, 32 insertions(+), 3 deletions(-) 10381c8c4a6486b49b3013bc0683cb6d62f9161e high_precision.patch Index: j2kdec.c =================================================================== --- j2kdec.c (revision 4434) +++ j2kdec.c (working copy) @@ -54,7 +54,7 @@ uint8_t cbps[4]; ///< bits per sample in particular components uint8_t sgnd[4]; ///< if a component is signed uint8_t properties[4]; - + int precision; int ncomponents; int tile_width, tile_height; ///< tile size int numXtiles, numYtiles;
@@ -225,6 +225,8 @@ for (i = 0; i < s->ncomponents; i++){ // Ssiz_i XRsiz_i, YRsiz_i uint8_t x = bytestream_get_byte(&s->buf); s->cbps[i] = (x & 0x7f) + 1; + if (s->cbps[i] > s->precision) + s->precision = s->cbps[i]; s->sgnd[i] = (x & 0x80) == 1; if (bytestream_get_byte(&s->buf) != 1) return -1;
FFMAX [...]
@@ -806,6 +815,26 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) {
+ *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, + (1 << s->cbps[compno]) - 1);
av_rescale is too slow
+ *src[compno] += 1 << 15; + if (*src[compno] < 0) + *src[compno] = 0; + else if (*src[compno] >= (1 << 16)) + *src[compno] = (1 << 16) - 1; + *dst++ = *src[compno]++;
av_clip()
+ } + } + line += s->picture.linesize[0]; + } + } return 0; }
_______________________________________________ FFmpeg-soc mailing list FFmpeg-soc@mplayerhq.hu https://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-soc
-- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Republics decline into democracies and democracies degenerate into despotisms. -- Aristotle
On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote:
Hi,
Attached patch fixes decoding of files where component precision > 8.
Possible test files are :
codestreams_profile1/p1_04.j2k testfiles_jp2/file6.jp2 testfiles_jp2/file7.jp2
Posting here for comments.
-- Regards,
Jai
j2kdec.c | 35 ++++++++++++++++++++++++++++++++--- 1 file changed, 32 insertions(+), 3 deletions(-) 10381c8c4a6486b49b3013bc0683cb6d62f9161e high_precision.patch Index: j2kdec.c =================================================================== --- j2kdec.c (revision 4434) +++ j2kdec.c (working copy) @@ -54,7 +54,7 @@ uint8_t cbps[4]; ///< bits per sample in particular components uint8_t sgnd[4]; ///< if a component is signed uint8_t properties[4]; - + int precision; int ncomponents; int tile_width, tile_height; ///< tile size int numXtiles, numYtiles;
@@ -225,6 +225,8 @@ for (i = 0; i < s->ncomponents; i++){ // Ssiz_i XRsiz_i, YRsiz_i uint8_t x = bytestream_get_byte(&s->buf); s->cbps[i] = (x & 0x7f) + 1; + if (s->cbps[i] > s->precision) + s->precision = s->cbps[i]; s->sgnd[i] = (x & 0x80) == 1; if (bytestream_get_byte(&s->buf) != 1) return -1;
FFMAX
Changed locally.
[...]
@@ -806,6 +815,26 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) {
+ *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, + (1 << s->cbps[compno]) - 1);
av_rescale is too slow
So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
+ *src[compno] += 1 << 15; + if (*src[compno] < 0) + *src[compno] = 0; + else if (*src[compno] >= (1 << 16)) + *src[compno] = (1 << 16) - 1; + *dst++ = *src[compno]++;
av_clip()
Ah yes, fixed. I'll change it in the other case as well. Thanks for the review. -- Regards, Jai
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: [...] [...]
@@ -806,6 +815,26 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) {
+ *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, + (1 << s->cbps[compno]) - 1);
av_rescale is too slow
So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow / s slower "src" << C it should be [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: [...] [...]
@@ -806,6 +815,26 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) {
+ *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, + (1 << s->cbps[compno]) - 1);
av_rescale is too slow
So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow / s slower
"src" << C it should be
<possibly dumb question ahead> I understand that * and / are slower but how can I achieve the same effect with a single <<? -- Regards, Jai
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: [...] [...]
@@ -806,6 +815,26 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) {
+ *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, + (1 << s->cbps[compno]) - 1);
av_rescale is too slow
So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow / s slower
"src" << C it should be
<possibly dumb question ahead>
I understand that * and / are slower but how can I achieve the same effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better it also might be possible to merge the rescaling into some other previous step [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB If you think the mosad wants you dead since a long time then you are either wrong or dead since a long time.
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: [...] [...]
@@ -806,6 +815,26 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) {
+ *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, + (1 << s->cbps[compno]) - 1);
av_rescale is too slow
So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow / s slower
"src" << C it should be
<possibly dumb question ahead>
I understand that * and / are slower but how can I achieve the same effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better
Okay, modified patch attached. -- Regards, Jai
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: [...] [...] > @@ -806,6 +815,26 @@ > > line += s->picture.linesize[0]; > } > + } else { > + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { > + uint16_t *dst; > + x = tile->comp[0].coord[0][0] - s->image_offset_x; > + dst = line + x * s->ncomponents * 2; > + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { > + for (compno = 0; compno < s->ncomponents; compno++) {
> + *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, > + (1 << s->cbps[compno]) - 1);
av_rescale is too slow
So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow / s slower
"src" << C it should be
<possibly dumb question ahead>
I understand that * and / are slower but how can I achieve the same effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better
Okay, modified patch attached.
[...]
@@ -806,6 +815,22 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) { + *src[compno] = *src[compno] << (16 - s->cbps[compno]); + *src[compno] += 1 << 15; + *src[compno] = av_clip(*src[compno], 0, (1 << 16) - 1); + *dst++ = *src[compno]++;
i dont think using *src[compno] as a temporary is a good choice [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Opposition brings concord. Out of discord comes the fairest harmony. -- Heraclitus
On Sun, Jun 28, 2009 at 12:06 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: [...] > [...] >> @@ -806,6 +815,26 @@ >> >> line += s->picture.linesize[0]; >> } >> + } else { >> + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { >> + uint16_t *dst; >> + x = tile->comp[0].coord[0][0] - s->image_offset_x; >> + dst = line + x * s->ncomponents * 2; >> + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { >> + for (compno = 0; compno < s->ncomponents; compno++) { > >> + *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, >> + (1 << s->cbps[compno]) - 1); > > av_rescale is too slow
So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow / s slower
"src" << C it should be
<possibly dumb question ahead>
I understand that * and / are slower but how can I achieve the same effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better
Okay, modified patch attached.
[...]
@@ -806,6 +815,22 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) { + *src[compno] = *src[compno] << (16 - s->cbps[compno]); + *src[compno] += 1 << 15; + *src[compno] = av_clip(*src[compno], 0, (1 << 16) - 1); + *dst++ = *src[compno]++;
i dont think using *src[compno] as a temporary is a good choice
You mean *src[compno] should be copied to dst and all operations should be done on dst? Current approach seemed correct because this a part of level shifting. Or did i misunderstand? -- Regards, Jai
On Sun, Jul 05, 2009 at 12:02:27PM +0000, Jai Menon wrote:
On Sun, Jun 28, 2009 at 12:06 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote: > On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > > On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: [...] > > [...] > >> @@ -806,6 +815,26 @@ > >> > >> line += s->picture.linesize[0]; > >> } > >> + } else { > >> + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { > >> + uint16_t *dst; > >> + x = tile->comp[0].coord[0][0] - s->image_offset_x; > >> + dst = line + x * s->ncomponents * 2; > >> + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { > >> + for (compno = 0; compno < s->ncomponents; compno++) { > > > >> + *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, > >> + (1 << s->cbps[compno]) - 1); > > > > av_rescale is too slow > > So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ?
* is slow / s slower
"src" << C it should be
<possibly dumb question ahead>
I understand that * and / are slower but how can I achieve the same effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better
Okay, modified patch attached.
[...]
@@ -806,6 +815,22 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) { + *src[compno] = *src[compno] << (16 - s->cbps[compno]); + *src[compno] += 1 << 15; + *src[compno] = av_clip(*src[compno], 0, (1 << 16) - 1); + *dst++ = *src[compno]++;
i dont think using *src[compno] as a temporary is a good choice
You mean *src[compno] should be copied to dst and all operations should be done on dst? Current approach seemed correct because this a part of level shifting. Or did i misunderstand?
int val= src << ... val += ... val = av_clip(...) *dst++= val; its easy for the compiler to put val in a register, doing t with src is not because it would have to proof that src is not read after it [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Awnsering whenever a program halts or runs forever is On a turing machine, in general impossible (turings halting problem). On any real computer, always possible as a real computer has a finite number of states N, and will either halt in less than N cycles or never halt.
On Mon, Jul 6, 2009 at 3:45 AM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jul 05, 2009 at 12:02:27PM +0000, Jai Menon wrote:
On Sun, Jun 28, 2009 at 12:06 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote:
On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote: >> On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote: >> > On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: > [...] >> > [...] >> >> @@ -806,6 +815,26 @@ >> >> >> >> line += s->picture.linesize[0]; >> >> } >> >> + } else { >> >> + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { >> >> + uint16_t *dst; >> >> + x = tile->comp[0].coord[0][0] - s->image_offset_x; >> >> + dst = line + x * s->ncomponents * 2; >> >> + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { >> >> + for (compno = 0; compno < s->ncomponents; compno++) { >> > >> >> + *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, >> >> + (1 << s->cbps[compno]) - 1); >> > >> > av_rescale is too slow >> >> So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ? > > * is slow > / s slower > > "src" << C > it should be
<possibly dumb question ahead>
I understand that * and / are slower but how can I achieve the same effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better
Okay, modified patch attached.
[...]
@@ -806,6 +815,22 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) { + *src[compno] = *src[compno] << (16 - s->cbps[compno]); + *src[compno] += 1 << 15; + *src[compno] = av_clip(*src[compno], 0, (1 << 16) - 1); + *dst++ = *src[compno]++;
i dont think using *src[compno] as a temporary is a good choice
You mean *src[compno] should be copied to dst and all operations should be done on dst? Current approach seemed correct because this a part of level shifting. Or did i misunderstand?
int val= src << ... val += ... val = av_clip(...) *dst++= val;
its easy for the compiler to put val in a register, doing t with src is not because it would have to proof that src is not read after it
Ah, thanks for the explanation. Modified patch attached. I guess same change should be made for the other case as well. -- Regards, Jai
On Thu, Jul 9, 2009 at 3:31 AM, Jai Menon<jmenon86@gmail.com> wrote:
On Mon, Jul 6, 2009 at 3:45 AM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jul 05, 2009 at 12:02:27PM +0000, Jai Menon wrote:
On Sun, Jun 28, 2009 at 12:06 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote: > On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > > On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote: > >> On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > >> > On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: > > [...] > >> > [...] > >> >> @@ -806,6 +815,26 @@ > >> >> > >> >> line += s->picture.linesize[0]; > >> >> } > >> >> + } else { > >> >> + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { > >> >> + uint16_t *dst; > >> >> + x = tile->comp[0].coord[0][0] - s->image_offset_x; > >> >> + dst = line + x * s->ncomponents * 2; > >> >> + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { > >> >> + for (compno = 0; compno < s->ncomponents; compno++) { > >> > > >> >> + *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, > >> >> + (1 << s->cbps[compno]) - 1); > >> > > >> > av_rescale is too slow > >> > >> So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ? > > > > * is slow > > / s slower > > > > "src" << C > > it should be > > <possibly dumb question ahead> > > I understand that * and / are slower but how can I achieve the same > effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better
Okay, modified patch attached.
[...]
@@ -806,6 +815,22 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) { + *src[compno] = *src[compno] << (16 - s->cbps[compno]); + *src[compno] += 1 << 15; + *src[compno] = av_clip(*src[compno], 0, (1 << 16) - 1); + *dst++ = *src[compno]++;
i dont think using *src[compno] as a temporary is a good choice
You mean *src[compno] should be copied to dst and all operations should be done on dst? Current approach seemed correct because this a part of level shifting. Or did i misunderstand?
int val= src << ... val += ... val = av_clip(...) *dst++= val;
its easy for the compiler to put val in a register, doing t with src is not because it would have to proof that src is not read after it
Ah, thanks for the explanation. Modified patch attached. I guess same change should be made for the other case as well.
Is this okay to apply Michael? -- Regards, Jai
On Thu, Jul 09, 2009 at 03:31:34AM +0000, Jai Menon wrote:
On Mon, Jul 6, 2009 at 3:45 AM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jul 05, 2009 at 12:02:27PM +0000, Jai Menon wrote:
On Sun, Jun 28, 2009 at 12:06 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote: > On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > > On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote: > >> On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > >> > On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote: > > [...] > >> > [...] > >> >> @@ -806,6 +815,26 @@ > >> >> > >> >> line += s->picture.linesize[0]; > >> >> } > >> >> + } else { > >> >> + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { > >> >> + uint16_t *dst; > >> >> + x = tile->comp[0].coord[0][0] - s->image_offset_x; > >> >> + dst = line + x * s->ncomponents * 2; > >> >> + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { > >> >> + for (compno = 0; compno < s->ncomponents; compno++) { > >> > > >> >> + *src[compno] = av_rescale(*src[compno], (1 << 16) - 1, > >> >> + (1 << s->cbps[compno]) - 1); > >> > > >> > av_rescale is too slow > >> > >> So just (*src[compno]/((1 << s->cbps[compno]) - 1)) * ((1 << 16) - 1) ? > > > > * is slow > > / s slower > > > > "src" << C > > it should be > > <possibly dumb question ahead> > > I understand that * and / are slower but how can I achieve the same > effect with a single <<?
well, not the same but close enough IMHO src<<C or (src<<C) + (src>>(16-C)) should be close enough, my point was mainly that av_rescale() is too slow to be done per pixel and anything else is better
Okay, modified patch attached.
[...]
@@ -806,6 +815,22 @@
line += s->picture.linesize[0]; } + } else { + for (; y < tile->comp[0].coord[1][1] - s->image_offset_y; y++) { + uint16_t *dst; + x = tile->comp[0].coord[0][0] - s->image_offset_x; + dst = line + x * s->ncomponents * 2; + for (; x < tile->comp[0].coord[0][1] - s->image_offset_x; x++) { + for (compno = 0; compno < s->ncomponents; compno++) { + *src[compno] = *src[compno] << (16 - s->cbps[compno]); + *src[compno] += 1 << 15; + *src[compno] = av_clip(*src[compno], 0, (1 << 16) - 1); + *dst++ = *src[compno]++;
i dont think using *src[compno] as a temporary is a good choice
You mean *src[compno] should be copied to dst and all operations should be done on dst? Current approach seemed correct because this a part of level shifting. Or did i misunderstand?
int val= src << ... val += ... val = av_clip(...) *dst++= val;
its easy for the compiler to put val in a register, doing t with src is not because it would have to proof that src is not read after it
Ah, thanks for the explanation. Modified patch attached. I guess same change should be made for the other case as well.
-- Regards,
Jai
j2kdec.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) bf80dbee0e341ad620da597f9bf71f0c62b72b11 high_precision.patch
ok [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB There will always be a question for which you do not know the correct awnser.
On Sun, Jul 12, 2009 at 7:22 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Thu, Jul 09, 2009 at 03:31:34AM +0000, Jai Menon wrote:
On Mon, Jul 6, 2009 at 3:45 AM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sun, Jul 05, 2009 at 12:02:27PM +0000, Jai Menon wrote:
On Sun, Jun 28, 2009 at 12:06 PM, Michael Niedermayer<michaelni@gmx.at> wrote:
On Sat, Jun 27, 2009 at 08:25:43PM +0000, Jai Menon wrote:
On Thu, Jun 25, 2009 at 9:51 PM, Michael Niedermayer<michaelni@gmx.at> wrote: > On Wed, Jun 24, 2009 at 05:59:19PM +0000, Jai Menon wrote: >> On Wed, Jun 24, 2009 at 3:58 PM, Michael Niedermayer<michaelni@gmx.at> wrote: >> > On Wed, Jun 24, 2009 at 01:42:08PM +0000, Jai Menon wrote: >> >> On Wed, Jun 24, 2009 at 1:27 PM, Michael Niedermayer<michaelni@gmx.at> wrote: >> >> > On Sun, Jun 21, 2009 at 04:35:20PM +0000, Jai Menon wrote:
[...]
j2kdec.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) bf80dbee0e341ad620da597f9bf71f0c62b72b11 high_precision.patch
ok
Thanks, applied. -- Regards, Jai
participants (2)
-
Jai Menon -
Michael Niedermayer