Author Topic: Faster fade out code?  (Read 3308 times)

DarkKobold

  • Hero Member
  • *****
  • Posts: 1200
Faster fade out code?
« on: September 20, 2015, 04:30:16 AM »
Here is my fade out code:


   
fade_out()
{
   int i, clr;
   char j;
   for (j = 0; j < 8; j++)
   {
      for (i = 0; i < 64; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
      for (i = 64; i < 128; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
      for (i = 128; i < 192; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
      for (i = 192; i < 256; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
      for (i = 256; i < 320; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
      for (i = 320; i < 384; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
      for (i = 384; i < 448; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
      for (i = 448; i < 512; i++)
      {
         clr = get_color(i);
         if ((clr & 7) > 0) clr = clr - 1;
         if ((clr & 56) > 0) clr = clr - 8;
         if ((clr & 448) > 0) clr = clr - 64;
         set_color(i, clr);
      }
      vsync();
   }
   cls();
   reset_satb();
   satb_update();
   vsync();
}


Unfortunately, even when dividing it up into 64-color blocks, it still causes flicker on real hardware. I'm guessing it's just too slow when written in C, but I'm not good enough with assembly. Any help would be appreciated!
Hey, you.

touko

  • Hero Member
  • *****
  • Posts: 953
Re: Faster fade out code?
« Reply #1 on: September 20, 2015, 06:05:36 AM »
I think the best way to optimise your routine is to do the fade in a buffer holding all the palettes, then transfer the whole buffer after a vsync, in asm, with a TIA block transfer.

like that:

/* A 256 bytes buffer is enough for 8 palettes */
int my_buffer[1024];

#asm

   stz $402
   stz $403

   tia _my_buffer , $404 , 1023

#endasm

My fade routine is close to yours and is very fast, but it is in ASM.


« Last Edit: September 21, 2015, 09:53:25 PM by touko »

TheOldMan

  • Hero Member
  • *****
  • Posts: 958
Re: Faster fade out code?
« Reply #2 on: September 20, 2015, 07:28:32 AM »
int dv;
.
.
.
         clr = get_color(i);
         dv = 0;

         /* low color bits : 0000 0111  */
         if (clr & 0x0007 ) dv ++;

         /* mid color bits : 0011 1000 */
         if (clr & 0x0038 ) dv = dv + 8;  /* iirc, this is faster than += in huc. try both */

         /* hi color bits : 1 1100 0000 */
         if (clr & 0x01c0 ) dv = dv + 64;

         clr = clr - dv;
         set_color(i, clr);
.
.
.
...............................................................................
that's off the top of my head, so double check the hex values :) And everything else....

Assuming it works, the next step is to move that to asm; iirc, an int parameter will come in in the A and X registers, so the color won't have to be loaded; the rest is a pretty straightforward conversion.
(Look up the asm code in the listing to see how it's done. That's how I learned 650x asm :)

For the low sets of bits, you can just check the low byte; for the high set, shift the color right (?) for the check (then you can AND with 0xe0). But that's a bit of asm optimization you may not need (or want to do).
.................................................................................
Just out of curiosity, why are you checking for >0 anyway? clr can't be negative, and an & won't change that. And any nonzero value is true in C, so you don't have to compare the result to anything....
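
For reference, the single-color step being discussed can be written as one small function. This is only a sketch in plain C so it can be checked on a PC; `fade_step` is a made-up name, and the masks assume the layout implied by the original code (three 3-bit channels at bits 0-2, 3-5 and 6-8, i.e. decimal 7, 56 and 448).

```c
/* Hypothetical helper, not part of HuC: one fade-out step for a 9-bit color.
   Assumes three 3-bit channels at bits 0-2, 3-5 and 6-8. */
unsigned int fade_step(unsigned int clr)
{
    unsigned int dv = 0;

    if (clr & 0x0007) dv += 1;   /* low channel,  decimal mask 7   */
    if (clr & 0x0038) dv += 8;   /* mid channel,  decimal mask 56  */
    if (clr & 0x01C0) dv += 64;  /* high channel, decimal mask 448 */

    return clr - dv;             /* every non-zero channel drops by one */
}
```

Since each channel holds at most 7, seven calls take any color down to 0. Whether `+=` or `dv = dv + 8` is faster in HuC is something to measure, as noted above.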

Bonknuts

  • Hero Member
  • *****
  • Posts: 3292
Re: Faster fade out code?
« Reply #3 on: September 20, 2015, 09:23:00 AM »
What touko said. Wait for vblank, read all colors (sprite and BG, if desired) into a buffer in RAM. Do your alterations to those values in RAM, then wait for vsync and upload the changes during vblank. Rinse, repeat. HuC is still going to be slow, but if you do the operations during normal/whole frames and only upload the changes during vblank, it should do the job.
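
A rough sketch of that loop structure, in plain C so it can run on a PC. Everything here is a stand-in: `vce_ram` plays the role of the real palette, `wait_vblank` would be `vsync()` in HuC, and `upload_buffer` would be the TIA transfer. The step count of 7 is an assumption based on each channel being 3 bits wide.

```c
#include <string.h>

#define NUM_COLORS 512
#define FADE_STEPS 7   /* assumed: a 3-bit channel needs at most 7 steps */

unsigned short vce_ram[NUM_COLORS];   /* stand-in for the real palette RAM */
unsigned short fade_buf[NUM_COLORS];  /* work buffer in ordinary RAM */

/* Placeholder: on hardware this would be vsync(). */
void wait_vblank(void) { }

/* Placeholder: on hardware this would be the TIA block transfer. */
void upload_buffer(void) { memcpy(vce_ram, fade_buf, sizeof vce_ram); }

/* Slow part: darken every color in the buffer by one step. */
void darken_buffer(void)
{
    int i;
    for (i = 0; i < NUM_COLORS; i++) {
        unsigned short c = fade_buf[i];
        if (c & 0x0007) c -= 1;
        if (c & 0x0038) c -= 8;
        if (c & 0x01C0) c -= 64;
        fade_buf[i] = c;
    }
}

void fade_out_buffered(void)
{
    int step;
    memcpy(fade_buf, vce_ram, sizeof fade_buf);  /* read the palette once */
    for (step = 0; step < FADE_STEPS; step++) {
        darken_buffer();   /* can take as long as it needs */
        wait_vblank();
        upload_buffer();   /* only this copy has to fit in vblank */
    }
}
```

The point of the split is that the per-color arithmetic never touches the palette directly, so its speed no longer decides whether the screen flickers.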

 Fading out is easy, fading in is a bit more complex.

 On a side note: I had an idea for an RGB to YUV conversion table, for a special type of fading effect. YUV is nicer to work with IMO and gives you a wider range of features. Of course, going from YUV back to RGB will need a different set of tables and take a bit longer.
« Last Edit: September 20, 2015, 09:26:26 AM by Bonknuts »

spenoza

  • Hero Member
  • *****
  • Posts: 2751
Re: Faster fade out code?
« Reply #4 on: September 20, 2015, 11:05:05 AM »
Quote
Fading out is easy, fading in is a bit more complex.

When you fade in you know what your colors are going to be, so couldn't you pre-calculate/prepare the fade-in and then just cycle through the known palettes?
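
One way to read the pre-calculation idea, sketched in plain C: build a small table with one row per brightness level, then upload one row per frame. The names (`fade_table`, `build_fade_table`, `dim_channel`) and the choice of 8 levels are assumptions for illustration; the table here is simply the fade out played backwards.

```c
#define PAL_COLORS  16   /* one palette */
#define FADE_LEVELS 8    /* level 0 is black, level 7 is the final palette */

unsigned short fade_table[FADE_LEVELS][PAL_COLORS];

/* Lower one 3-bit channel by `amount`, stopping at 0. */
unsigned int dim_channel(unsigned int value, unsigned int amount)
{
    return value > amount ? value - amount : 0;
}

/* Fill fade_table from the final colors of one palette. */
void build_fade_table(const unsigned short *final_colors)
{
    unsigned int level, i;
    for (level = 0; level < FADE_LEVELS; level++) {
        unsigned int amount = 7 - level;   /* how far below final this row is */
        for (i = 0; i < PAL_COLORS; i++) {
            unsigned int c   = final_colors[i];
            unsigned int lo  = dim_channel(c & 7, amount);
            unsigned int mid = dim_channel((c >> 3) & 7, amount);
            unsigned int hi  = dim_channel((c >> 6) & 7, amount);
            fade_table[level][i] = (unsigned short)(lo | (mid << 3) | (hi << 6));
        }
    }
}
```

With these sizes the table costs 8 x 16 x 2 = 256 bytes per palette, so it trades memory for doing no arithmetic at all during the fade.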

Bonknuts

  • Hero Member
  • *****
  • Posts: 3292
Re: Faster fade out code?
« Reply #5 on: September 20, 2015, 11:37:47 AM »
Yeah, you need to know the values to reach, rather than just checking for overflow or a floor (a fixed value). It requires a little more logic to test for overflow during the addition. There are quite a few ways to approach a fade in, but they are all more complex, with more operations, than a simple fade out.

 Fixed point deltas are one way. That requires a very large buffer (entries * RGB * delta, or 512*3*2) in RAM, plus an initial distance calculation for each color during setup (which can take some time). It's wasteful of RAM, but every R/G/B channel is faded in equally.
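
A minimal sketch of the fixed point idea, in plain C and for a single palette. The 8.8 format, the 8-step length and all the `fx_` names are assumptions made for the example, not something stated in the post; the accumulator array is the "entries * 3 * 2 bytes" buffer.

```c
#define FX_COLORS 16   /* one palette for the sketch */
#define FX_STEPS  8    /* assumed step count; a power of two keeps the math to shifts */

unsigned short fx_target[FX_COLORS];   /* final colors */
unsigned short fx_acc[FX_COLORS][3];   /* 8.8 fixed point level per channel */
unsigned short fx_out[FX_COLORS];      /* colors to upload this frame */

void fx_fade_start(void)
{
    int i, ch;
    for (i = 0; i < FX_COLORS; i++) {
        for (ch = 0; ch < 3; ch++) fx_acc[i][ch] = 0;
        fx_out[i] = 0;
    }
}

/* Call exactly FX_STEPS times; further calls would overshoot the target. */
void fx_fade_step(void)
{
    int i, ch;
    for (i = 0; i < FX_COLORS; i++) {
        unsigned int color = 0;
        for (ch = 0; ch < 3; ch++) {
            unsigned int t = (fx_target[i] >> (3 * ch)) & 7;
            fx_acc[i][ch] += t << 5;   /* t * 256 / FX_STEPS, with FX_STEPS == 8 */
            color |= (unsigned int)(fx_acc[i][ch] >> 8) << (3 * ch);
        }
        fx_out[i] = (unsigned short)color;
    }
}
```

Because each channel advances by its own fraction per step, bright and dim channels all arrive at their final value on the same step, which is what makes the fade look even.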

 Rate of change delta is another way. It requires a smaller buffer, but the channels don't fade in equally. You basically take the R/G/B values of the destination and copy them as the deltas to subtract from the destination palette block (this is your initial setup). On every call, you subtract 1 from each delta, then subtract the delta from the destination R/G/B value and write the result to the buffer to be uploaded. The R/G/B destination values are never altered, just used as the values to subtract from. Because you subtract one on each call, a delta reaching 0 means that particular R/G/B channel is at full value. As you can see, lower values will reach their max value faster than large values. It looks decent, and it's fast.
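
The rate of change approach could look roughly like this in plain C. The `rc_` names and the single-palette size are made up for the sketch; the deltas are packed in the same 9-bit layout as the colors, so the decrement reuses the fade-out masks.

```c
#define RC_COLORS 16   /* one palette for the sketch */

unsigned short rc_dest[RC_COLORS];    /* destination colors, never modified */
unsigned short rc_delta[RC_COLORS];   /* per-channel distance still to go */
unsigned short rc_out[RC_COLORS];     /* buffer to upload during vblank */

/* Initial setup: the deltas start as a copy of the destination. */
void rc_fade_start(void)
{
    int i;
    for (i = 0; i < RC_COLORS; i++) {
        rc_delta[i] = rc_dest[i];
        rc_out[i] = 0;
    }
}

/* One call per frame. Returns 1 while any channel is still below its target. */
int rc_fade_step(void)
{
    int i, busy = 0;
    for (i = 0; i < RC_COLORS; i++) {
        unsigned short d = rc_delta[i];
        if (d & 0x0007) d -= 1;   /* each non-zero delta channel drops by one */
        if (d & 0x0038) d -= 8;
        if (d & 0x01C0) d -= 64;
        rc_delta[i] = d;
        /* No borrow between channels: a delta channel never exceeds its destination. */
        rc_out[i] = (unsigned short)(rc_dest[i] - d);
        if (d) busy = 1;
    }
    return busy;
}
```

A channel whose destination value is 2 is finished after two calls, while a channel at 7 needs seven, which is the uneven fade described above.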

 

touko

  • Hero Member
  • *****
  • Posts: 953
Re: Faster fade out code?
« Reply #6 on: September 21, 2015, 01:02:56 AM »
@DarkKobold: And maybe you don't need to fade all the palettes.

Gredler

  • Guest
Re: Faster fade out code?
« Reply #7 on: September 21, 2015, 03:08:01 AM »
Quote
@DarkKobold: And maybe you don't need to fade all the palettes.

This is what I was thinking: only apply the fade to specific sprites that can't be animated onto the screen (background and UI elements).

touko

  • Hero Member
  • *****
  • Posts: 953
Re: Faster fade out code?
« Reply #8 on: September 21, 2015, 03:22:05 AM »
Yes, it's useless to do it on all 32 palettes 99% of the time.
In practice it's 4 to 6 palettes max.

DarkKobold

  • Hero Member
  • *****
  • Posts: 1200
Re: Faster fade out code?
« Reply #9 on: September 21, 2015, 03:43:02 AM »
Quote
Yeah, you need to know the values to reach, rather than just checking for overflow or a floor (a fixed value). It requires a little more logic to test for overflow during the addition. There are quite a few ways to approach a fade in, but they are all more complex, with more operations, than a simple fade out.

 Fixed point deltas are one way. That requires a very large buffer (entries * RGB * delta, or 512*3*2) in RAM, plus an initial distance calculation for each color during setup (which can take some time). It's wasteful of RAM, but every R/G/B channel is faded in equally.

 Rate of change delta is another way. It requires a smaller buffer, but the channels don't fade in equally. You basically take the R/G/B values of the destination and copy them as the deltas to subtract from the destination palette block (this is your initial setup). On every call, you subtract 1 from each delta, then subtract the delta from the destination R/G/B value and write the result to the buffer to be uploaded. The R/G/B destination values are never altered, just used as the values to subtract from. Because you subtract one on each call, a delta reaching 0 means that particular R/G/B channel is at full value. As you can see, lower values will reach their max value faster than large values. It looks decent, and it's fast.

 

This is way too complex; you just need to subtract each color from 511, and do it backwards. Excuse the pseudo code.

int palette_holder[512];

for c = 0 to 511
   palette_holder[c] = 511 - get_color(c);

for i = 1 to 7
   for j = 0 to 511
      clr = palette_holder[j];
      dv = 0;

      /* low color bits : 0000 0111  */
      if (clr & 0x0007 ) dv ++;

      /* mid color bits : 0011 1000 */
      if (clr & 0x0038 ) dv = dv + 8;  /* iirc, this is faster than += in huc. try both */

      /* hi color bits : 1 1100 0000 */
      if (clr & 0x01c0 ) dv = dv + 64;

      clr2 = get_color(j);
      clr = clr - (dv XOR 511);
      clr2 = clr2 + dv;
      set_color(j, clr2);
      palette_holder[j] = clr;


Not sure if HuC has XOR though.
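
One possible way to make the "fade out, backwards" idea concrete in plain C, for a single palette. This is my reading of the intent rather than the exact formula in the pseudo code: `511 - color` is treated as a per-channel countdown, the countdown is what gets decremented, and a channel only starts brightening once its countdown has reached zero. The `rv_` names and the constant 0x49 (1 + 8 + 64) are made up for the sketch, and the displayed palette is assumed to start at black.

```c
#define RV_COLORS    16       /* one palette for the sketch */
#define CHANNEL_ONES 0x0049   /* 1 + 8 + 64: one step in every channel */

unsigned short rv_shown[RV_COLORS];   /* stand-in for the palette on screen */
unsigned short rv_hold[RV_COLORS];    /* per-channel countdown, 7 - target */

void rv_fade_start(const unsigned short *target)
{
    int i;
    for (i = 0; i < RV_COLORS; i++) {
        rv_hold[i] = (unsigned short)(0x1FF - target[i]);
        rv_shown[i] = 0;              /* assumed: the fade starts from black */
    }
}

/* Call exactly 7 times; an eighth call would overflow the brightest channels. */
void rv_fade_step(void)
{
    int i;
    for (i = 0; i < RV_COLORS; i++) {
        unsigned int h = rv_hold[i];
        unsigned int dv = 0;
        if (h & 0x0007) dv += 1;
        if (h & 0x0038) dv += 8;
        if (h & 0x01C0) dv += 64;
        rv_hold[i] = (unsigned short)(h - dv);                      /* channels still waiting */
        rv_shown[i] = (unsigned short)(rv_shown[i] + (dv ^ CHANNEL_ONES));  /* channels done waiting */
    }
}
```

In C the XOR operator is `^`. After seven steps every channel has waited `7 - target` steps and then risen `target` steps, so the palette lands exactly on the target colors.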
« Last Edit: September 21, 2015, 03:49:20 AM by DarkKobold »

touko

  • Hero Member
  • *****
  • Posts: 953
Re: Faster fade out code?
« Reply #10 on: September 21, 2015, 07:55:45 AM »
If I remember correctly, XOR is A^B.

DarkKobold

  • Hero Member
  • *****
  • Posts: 1200
Re: Faster fade out code?
« Reply #11 on: September 30, 2016, 06:11:35 PM »
Quote
I think the best way to optimise your routine is to do the fade in a buffer holding all the palettes, then transfer the whole buffer after a vsync, in asm, with a TIA block transfer.

like that:

/* A 256 bytes buffer is enough for 8 palettes */
int my_buffer[1024];

#asm

   stz $402
   stz $403

   tia _my_buffer , $404 , 1023

#endasm

My fade routine is close to yours and is very fast, but it is in ASM.

So, rather than create a buffer for all the palettes at once, I'd like to split it up into chunks of 64 "colors" at a time. The problem is, I have no idea what the stz $402 and stz $403 lines do. They set addresses to zero... but why?
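
As far as I understand the hardware, $402/$403 hold the VCE color table address (low byte and high bit) and $404/$405 are the data port, with the address stepping to the next color after the high data byte is written. On that reading, the two stz lines just point the VCE at color 0 before the transfer, and a 64-color chunk would store the chunk's first color number there instead of zero and transfer 128 bytes. The code below is a PC-side model of that behavior in plain C, not console code; `VceModel`, `vce_write`, `tia` and `upload_chunk` are names invented for the sketch.

```c
#define VCE_COLORS 512

typedef struct {
    unsigned int   index;             /* color table address, set via $402/$403 */
    unsigned short ram[VCE_COLORS];   /* 9-bit color entries */
} VceModel;

/* Model a CPU write to one of the VCE registers (simplified). */
void vce_write(VceModel *v, unsigned int reg, unsigned char value)
{
    switch (reg) {
    case 0x0402:   /* address, low byte */
        v->index = (v->index & 0x100) | value;
        break;
    case 0x0403:   /* address, high bit */
        v->index = (v->index & 0x0FF) | ((value & 1u) << 8);
        break;
    case 0x0404:   /* data, low byte */
        v->ram[v->index] = (unsigned short)((v->ram[v->index] & 0x100) | value);
        break;
    case 0x0405:   /* data, high bit, then step to the next color */
        v->ram[v->index] = (unsigned short)((v->ram[v->index] & 0x0FF) | ((value & 1u) << 8));
        v->index = (v->index + 1) & 0x1FF;
        break;
    default:
        break;
    }
}

/* Model of "tia src, dest, len": the source address increments while the
   destination alternates between dest and dest + 1. */
void tia(VceModel *v, const unsigned char *src, unsigned int dest, unsigned int len)
{
    unsigned int n;
    for (n = 0; n < len; n++)
        vce_write(v, dest + (n & 1), src[n]);
}

/* Upload `count` colors starting at color number `first`. */
void upload_chunk(VceModel *v, const unsigned char *bytes,
                  unsigned int first, unsigned int count)
{
    vce_write(v, 0x0402, (unsigned char)(first & 0xFF));
    vce_write(v, 0x0403, (unsigned char)(first >> 8));
    tia(v, bytes, 0x0404, count * 2);   /* two bytes per color, low byte first */
}
```

By the same arithmetic, a full 512-color upload would be 1024 bytes, and a buffer for one 64-color chunk only needs 128 bytes.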

« Last Edit: October 04, 2016, 08:08:26 AM by DarkKobold »

cabbage

  • Sr. Member
  • ****
  • Posts: 342
Re: Faster fade out code?
« Reply #12 on: September 30, 2016, 09:00:05 PM »
#include "huc.h"
int my_buffer[1024];
main(){
#asm
   stz $402
   stz $403
   tia _my_buffer, $404, 511
#endasm
}
This compiles successfully.
A variable declared in C, e.g. my_buffer, is accessed with a leading underscore in asm, as in _my_buffer.
« Last Edit: September 30, 2016, 09:06:19 PM by cabbage »

touko

  • Hero Member
  • *****
  • Posts: 953
Re: Faster fade out code?
« Reply #13 on: October 01, 2016, 01:05:02 AM »
Quote
This fails to compile:

19000 2D:B710 tia _my_buffer, $404, 511
Undefined symbol in operand field!

int my_buffer[1024] must be a global variable (declared before the main procedure).
Tested and works fine for me, but beware: #asm and #endasm mustn't be at the left edge; you must add 1 or 2 space characters before them.

EDIT: cabbage has answered.

@DarkKobold: You can also copy the contents of your VCE palettes into the buffer the same way:

 tai $404 , _my_buffer , 511

Much faster than in C.
« Last Edit: October 01, 2016, 02:54:20 AM by touko »

DarkKobold

  • Hero Member
  • *****
  • Posts: 1200
Re: Faster fade out code?
« Reply #14 on: October 01, 2016, 05:25:11 AM »
Quote
This fails to compile:

19000 2D:B710 tia _my_buffer, $404, 511
Undefined symbol in operand field!

int my_buffer[1024] must be a global variable (declared before the main procedure).
Tested and works fine for me, but beware: #asm and #endasm mustn't be at the left edge; you must add 1 or 2 space characters before them.

EDIT: cabbage has answered.

@DarkKobold: You can also copy the contents of your VCE palettes into the buffer the same way:

 tai $404 , _my_buffer , 511

Much faster than in C.

Oh, I didn't realize it needed to be a global. That is 2k of RAM dedicated just to the fade out, then...