Ups and Downs in VDC Bitmap Manipulation

Started by nikoniko, April 20, 2007, 08:34 AM

nikoniko

First of all, I'm very happy to say that a friend of mine has loaned me his C128 (and monitor)! I'd asked if he wanted to sell it and he said no, but offered that I could use it for a bit while he's busy with other things. It's nice having a 128 again, even if it's only for a while. Something's wonky with 40 column mode, but I'm mostly interested in 80 columns anyway and that looks great.

Anyway, playing around a bit, I decided to pursue an idea I had about an alternate way of dealing with bitmaps. Normally a bitmap is organized from left to right across the screen, which is fine for many things, but for blitting your typical software sprites it's not ideal. Let's say you have a 16x16 sprite you want to put somewhere. First, you'd calculate and send to the VDC the starting address for where you'll write the upper left corner of the sprite. Then you'd write out two or three bytes of the sprite depending on how many cell columns it straddles. Then you'd have to write a new address to the VDC so that you can do the next line of the sprite. And so on. In the end you'll have updated the address registers 16 times. Multiply that by however many sprites you have moving around and that turns into a lot of overhead every time you update your screen.

Now imagine that instead of bitmap memory being laid out like that, successive bytes would represent the next line down in the same cell column. Sort of like the VIC's bitmap mode, except that instead of going down 8 bytes, then over and up, you can just keep going down all the way to the bottom. So let's look at the case of a 16x16 sprite again. Just as before, you'd calculate and write the starting address to the VDC. You'd output the first byte that represents the upper left corner of the sprite. But then, instead of drawing the sprite horizontally, you'd write the next line that goes underneath that one. Since cells are now stacked vertically, you'd be able to output 16 lines from the sprite before having to write to the address register again to move into a new cell column. If your sprite straddles three columns, you'd only have to write to the address register 3 times instead of 16. That's a big savings if you have a lot to move around the screen.
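To put rough numbers on the comparison above, here is a minimal sketch in Python (used purely for illustration, not something that runs on the 128). The function names are made up for this example, and it assumes the VDC address advances by one after each data byte written, which is what the description above relies on.

```python
def columns_straddled(x: int, width_px: int) -> int:
    """Number of 8-pixel cell columns touched by a sprite at pixel column x."""
    first = x // 8
    last = (x + width_px - 1) // 8
    return last - first + 1


def address_sets_row_major(height_px: int) -> int:
    """Left-to-right bitmap: one address update per sprite line."""
    return height_px


def address_sets_column_major(x: int, width_px: int) -> int:
    """Top-to-bottom layout: one address update per cell column touched."""
    return columns_straddled(x, width_px)


if __name__ == "__main__":
    # A 16x16 sprite at x=3 straddles three cell columns.
    print(address_sets_row_major(16), address_sets_column_major(3, 16))
```

For a 16x16 sprite the left-to-right layout always costs 16 address updates, while the top-to-bottom layout costs 2 or 3 depending on alignment.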

But VDC bitmaps don't work that way, right? Right. But characters do. So rather than setting up a normal bitmap, you'd create a text screen made up of custom 8x32 characters. We can fit 6 rows of these tall characters on a normal screen, so 80 character columns times 6 rows gives us 640x192 resolution. If we organize screen memory so that the top left character is 0, the next character in the same column is 1, followed by 2,3,4,5, then in the next column continuing 6,7,8,9,10,11, and so on, we'll be able to take advantage of the method described above. We'll need 480 bytes of screen memory.

0    6   ...   474
1    7   ...   475
2    8   ...   476
3    9   ...   477
4    10  ...   478
5    11  ...   479
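The numbering in this layout can be sketched as follows. This is an illustrative Python model only; the constant and function names are assumptions made for the example. It computes the character number for each screen cell and the byte offset of a pixel within the character definitions, assuming each 8x32 character occupies 32 consecutive bytes.

```python
COLUMNS = 80       # character cells per row
ROWS = 6           # rows of tall characters
CHAR_HEIGHT = 32   # scan lines (and bytes) per character


def char_code(col: int, row: int) -> int:
    """Character number placed at screen cell (col, row)."""
    return col * ROWS + row


def screen_codes() -> list:
    """Character numbers in screen-memory order (left to right, top to bottom)."""
    return [char_code(col, row) for row in range(ROWS) for col in range(COLUMNS)]


def byte_offset(x: int, y: int) -> int:
    """Offset into the character definitions of the byte holding pixel (x, y)."""
    return (x // 8) * ROWS * CHAR_HEIGHT + y
```

Because the six characters of a column are numbered consecutively, each cell column becomes one contiguous run of 192 bytes, so moving down a line is always just the next byte.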

We'll also need an additional 480 bytes for attribute memory since using more than 256 characters requires attributes. Then we have our 480 8x32 custom character definitions, and those will take up 15360 bytes.
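A rough sketch of how a character number above 255 might be split between screen and attribute memory, written in Python for illustration. It assumes bit 7 of the attribute byte selects the second set of 256 characters and that the low four bits hold the color; both should be checked against VDC documentation, and the names here are hypothetical.

```python
ALT_CHARSET_BIT = 0x80  # assumption: attribute bit 7 selects characters 256-511


def split_code(code: int, color: int = 0x0F) -> tuple:
    """Return (screen_byte, attribute_byte) for a character number 0..511."""
    screen_byte = code & 0xFF
    attribute = color & 0x0F
    if code >= 256:
        attribute |= ALT_CHARSET_BIT
    return screen_byte, attribute


# Size of the character definitions: 480 characters of 32 bytes each.
CHAR_DEF_BYTES = 480 * 32
```

With these assumptions, character 300 would be stored as screen byte 44 with the alternate-set bit turned on in its attribute.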

The C128 I'm working on only has 16K VDC memory, so here's how I'm organizing it:

$0000...$3BFF   Character definitions
$3C00...$3DDF   Screen memory
$3DE0...$3FBF   Attribute memory
$3FC0...$3FFF   64 bytes wasted. :(
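The map above can be double-checked with a few lines of arithmetic. This is only a Python sanity check of the sizes already given; the constant names are made up for the example.

```python
VDC_RAM = 16 * 1024          # 16K VDC memory

CHAR_DEFS_START = 0x0000
CHAR_DEFS_SIZE = 480 * 32    # 480 characters, 32 bytes each

SCREEN_START = CHAR_DEFS_START + CHAR_DEFS_SIZE
SCREEN_SIZE = 80 * 6         # one byte per character cell

ATTR_START = SCREEN_START + SCREEN_SIZE
ATTR_SIZE = 80 * 6           # one attribute byte per cell

FREE_START = ATTR_START + ATTR_SIZE
FREE_BYTES = VDC_RAM - FREE_START

if __name__ == "__main__":
    print(hex(SCREEN_START), hex(ATTR_START), hex(FREE_START), FREE_BYTES)
```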

Screen memory and attributes only need to be written once when setting up, and then the character set definitions are just treated as one large bitmap that can be written to with fewer addressing updates as described above. One never has to think about them as characters or consider which character is being written to, so it's as easy as dealing with a normal bitmap mode. If you have sprites (or windows or whatever you're moving around) that are wide but short, this arrangement could end up being less efficient, but there is still one perk. Let's say you have a 10 cell (or 80 pixel) wide area that needs to shift straight up or down. Doing so is a simple matter of 10 block copy operations, one for each column. With a normal bitmap you'd need as many block copies as the area is tall: for 80 pixels x 20 lines moving up or down, that's one copy per line, or 20 copies as opposed to 10. (Of course, cleaning up the copied-from area is something that will need consideration, too.)
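As a sketch of the vertical shift just described, the following Python lists the block copies needed, one per cell column. The function name and the (source, destination, length) tuple format are assumptions for illustration. It assumes each cell column is a contiguous run of 192 bytes starting at offset 0, and it ignores any per-operation length limit and the question of overlapping source and destination, both of which would need checking on real hardware.

```python
COLUMN_BYTES = 6 * 32  # six 32-line characters stacked per cell column


def column_shift_copies(x_cell: int, width_cells: int,
                        y: int, height: int, dy: int) -> list:
    """Return one (src, dst, length) block copy per cell column.

    The area starts at cell column x_cell and scan line y, is width_cells
    wide and height lines tall, and moves by dy lines (negative means up).
    """
    copies = []
    for cell in range(x_cell, x_cell + width_cells):
        base = cell * COLUMN_BYTES
        copies.append((base + y, base + y + dy, height))
    return copies
```

For the 80 pixel x 20 line example, `column_shift_copies(0, 10, 40, 20, -4)` yields 10 copies of 20 bytes each, whereas a left-to-right bitmap would need 20 copies of 10 bytes.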

Using big characters like these presumes one is content to have very limited color options, which is fine by me. Since the VDC's color restrictions can be somewhat challenging to design with anyway, I'm all for the simple beauty of two-color displays. The Mac had some great black and white software, and on the 128 you don't even need to stop at black and white -- pick any two colors from 16. :)

hydrophilic

That's a very clever way of using the VDC memory.  Efficient for coding and efficient in terms of used VDC memory... I like it!

With 64K you might think there would be more color options, but if I understand your method, it seems one would run out of character definitions.  Also, you can only set the foreground color for VDC characters.  But with your method, you could make sprites 'flash' automatically :P

I don't know why you'd want to reduce the resolution horizontally (maybe so you can use fast mode and still have a screen to look at), but if you used the VDC's double-pixel-width feature, you could reduce your characters to 8x16.

nikoniko

Yeah, the maximum size of the character set does limit the possibilities. However, one nice thing with 64K is that you can take advantage of double- or triple-buffering, switching screens on the fly simply by updating the character set address.

Thanks for reminding me about double-pixel-width. Although I'm not sure I'll ever really use it, I'd love to try it now that I have a real machine to work with. VICE doesn't even attempt to support it, so I've never seen it in action.

By the way, hydro, do you have any feeling of the difference between VDC timing on a real machine and VICE? I read somewhere that VICE completes VDC operations far more quickly than a real C128 will. That's something I should probably look into while I can.

hydrophilic

Sorry I don't know how accurate the VDC timing is.  It seems it never has to wait to access register/RAM of VDC but I haven't tested it enough to say for sure.

But you do know that you don't need to test the 'ready bit' when reading/writing a register on the real thing, unless it's the RAM register (31?).  There's an excellent article on the net about this and also how even when accessing RAM, you don't need to check the 'ready bit' if you read/write 8 or fewer bytes at a time AND you previously set the DRAM refresh to zero.  I haven't tested the RAM access but for other registers it really works.

nikoniko

The RAM access was something I tested this morning, and I was happy to see the article was right. In fact, I could read or write any number of bytes without any problems that I could detect. Still, even if one does wait after every 8 to be safe, that's pretty fast. I don't know how much that generalizes to other VDCs, but I suppose if screen updates are holding back a program's performance one could release a "super speedy" version along with a "safe" one. :) Block copy/fill may be another matter, I haven't checked yet, but I imagine if you don't change the address or do any other reads or writes to RAM while copy/fill is in progress, there won't be any problems. Reading or writing to unrelated registers would probably be fine.
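The "wait after every 8" approach could be structured roughly like this. It is only a Python model of the batching, with a made-up helper name; on the real machine each batch would be followed by a poll of the ready bit, which is not modeled here.

```python
from typing import Iterator


def chunks_of_eight(data: bytes, size: int = 8) -> Iterator[bytes]:
    """Yield the data in runs of at most `size` bytes.

    A "safe" writer would check the ready bit between runs; a "super
    speedy" one would skip the check entirely.
    """
    for start in range(0, len(data), size):
        yield data[start:start + size]
```

A 20-byte transfer would then be sent as runs of 8, 8 and 4 bytes with a wait between them.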