Over the last few days I've been reading a little about Z80 programming on the Commodore 128, and today I wrote a program that sets the border and screen colors to black and prints "HELLO WORLD!" at the top of the VIC screen. Here is the source code, written for Power Assembler (Buddy):
10 REM START PROGRAM WITH BANK0:SYS65488 AFTER ASSEMBLING IT, THE NEXT LINE STARTS THE ASSEMBLER
40 SYS4000
45 .BANK 0 ;ASSEMBLE TO BANK 0
50 .ORG $3000 ;PUT THE PROGRAM AT $3000
60 .MEM ;ASSEMBLE TO MEMORY (NOT TO FILE)
62 LD A,0 ;PUT 0 (BLACK) IN ACCUMULATOR
64 LD BC,$D020 ;PUT $D020 IN REGISTERS B AND C, NOTE HOW THE Z80 CAN USE 2 8-BIT REGS AS 1 16-BIT REG
66 OUT (C),A ;SET BORDER COLOR, OUT MUST BE USED RATHER THAN LD FOR I/O REGS TO AVOID BLEED-THROUGH TO UNDERLYING RAM
68 INC BC ;INCREASE BC REGS TO $D021
70 OUT (C),A ;SET BACKGROUND COLOR
80 LD HL,MESSAGETXT ;LOAD HL REGS WITH SOURCE ADDRESS
90 LD DE,$0400 ;LOAD DE REGS WITH DESTINATION ADDRESS (START OF SCREEN MEMORY)
100 LD BC,MESSAGELEN ;LOAD BC REGS WITH NUMBER OF BYTES TO COPY
110 LDIR ;BLOCK COPY INSTRUCTION
120 ETERNAL'LOOP JP ETERNAL'LOOP ;LET'S STAY IN Z80 MODE
200 MESSAGETXT .SCR "HELLO WORLD!"
210 MESSAGELEN =*-MESSAGETXT
250 * =$FFD2
260 .BYT $BE ;CHANGE BANK 0 TO BANK 2 IN 8502->Z80 ROUTINE TO AVOID CP/M BOOT ROM AT ADR 0 (CONFLICTS WITH SCREEN MEM)
300 * =$FFEE
310 .BYT 195,0,48 ;THE Z80 RESUMES AT $FFEE, LET'S PUT JP $3000 THERE SO THAT OUR CODE IS CALLED
What I want to do now is learn how to program Z80 interrupts on the C128. The problem is that I have no books covering that, and I know very little about interrupt handling on the Z80. Perhaps someone here on the forum has some information.
A tutorial on Z80 assembly:
http://users.hszk.bme.hu/~pg429/z80guide/
Complete Z80 reference in PDF format:
http://www.myquest.nl/z80undocumented/z80-documented-v0.91.pdf
The complete Z80 instruction set:
http://www.ticalc.org/archives/files/fileinfo/128/12883.html
Summary of the Z80 instruction set for quick reference:
http://www.ticalc.org/archives/files/fileinfo/1/109.html
And finally, for those not faint of heart ;-) the Z80 Family CPU User Manual:
http://www.myquest.nl/z80undocumented/z80cpu_um.pdf
Hope this helps.
Thank you for the links, mystikshadows.
You're very welcome. Let me know if these still don't answer your interrupt questions, and we'll see what else can be found. :-)
Christian, also check out the books here (http://www.smspower.org/dev/docs/books/). Included among them is the well-regarded Rodney Zaks book, Programming the Z-80. I'm not sure how in-depth his treatment of interrupts is, but since a number of people consider it the bible of Z-80 programming, it's probably worth a look.
Some Z80 memory configuration information that is hardly documented anywhere:
1. When the Z80 processor is active with the MMU set so there is I/O at $D000-$DFFF and RAM bank 0 or 2 in the rest of the 64 kB of address space, color RAM is at $1000-$13FF and NOT at $D800-$DBFF where it is on the C64 and on the C128 with the 8502 active. $1000-$13FF is a part of the I/O space so just as for $D000-$DFFF, "OUT (C),A" should be used to write a byte and "IN A,(C)" to read a byte.
2. The difference between RAM bank 0 and RAM bank 2 when the Z80 is active is that in RAM bank 0, the Z80 ROM (containing reset and CP/M boot code) is seen at $0000-$0FFF while in RAM bank 2, RAM is seen in the same area.
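To make point 1 concrete, here is a small Python sketch (a model, not C128 code) that computes where the color cell for a given screen position lives from each CPU's point of view. The function name and constants are made up for illustration; the two base addresses are the ones stated above.

```python
# Color RAM base addresses as described in point 1 (40 x 25 text screen).
COLOR_BASE_8502 = 0xD800  # seen by the 8502 (and on the C64)
COLOR_BASE_Z80 = 0x1000   # seen by the Z80 in I/O space, accessed with OUT/IN
COLUMNS = 40
ROWS = 25


def color_ram_address(row, col, z80_active):
    """Return the address of the color cell for a 0-based (row, col)."""
    if not (0 <= row < ROWS and 0 <= col < COLUMNS):
        raise ValueError("cell outside the 40x25 text screen")
    base = COLOR_BASE_Z80 if z80_active else COLOR_BASE_8502
    return base + row * COLUMNS + col


if __name__ == "__main__":
    # Last cell on the screen, as seen by each CPU.
    print(hex(color_ram_address(24, 39, z80_active=True)))
    print(hex(color_ram_address(24, 39, z80_active=False)))
```

On the Z80 side the resulting address would go into BC for "OUT (C),A", since the area is part of the I/O space.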
Now I've got Z80 IRQs working :D so here comes the Power Assembler source code for a small example program I've written. The program sets up CIA #1 timer A to generate an IRQ 60 times per second. At each interrupt, the border color is increased by 1.
50 REM START PROGRAM WITH BANK0:SYS65488 AFTER ASSEMBLING IT, THE NEXT LINE STARTS THE ASSEMBLER
100 SYS4000
150 .BANK 0 ;ASSEMBLE TO BANK 0
200 .ORG $3000 ;PUT THE PROGRAM AT $3000
250 .MEM ;ASSEMBLE TO MEMORY (NOT TO FILE)
300 LD SP,$2000 ;SET STACK POINTER, NOTE THAT THE Z80 HAS A 16-BIT STACK POINTER
320 LD A,$31
350 LD I,A ;SET THAT THE IRQ VECTOR WILL BE TAKEN FROM SOMEWHERE BETWEEN $3100-$31FF IF IM 2 IS USED
400 IM 2 ;SET INTERRUPT MODE 2 (IM 1 SHOULD PROBABLY WORK ON THE C128 AS WELL BUT I HAVEN'T GOT THAT WORKING)
402 ;THE FOLLOWING 6 LINES FILL $3100-$3200 WITH $32, THE IRQ ROUTINE THEN HAS TO BE AT $3232
405 LD A,$32
410 LD BC,$3100
412 - DEC C
415 LD (BC),A
420 JR NZ,-
430 LD ($3200),A
450 LD A,0
500 LD BC,$D01A
550 OUT (C),A ;DISABLE VIC II IRQs
600 LD A,$25 ;LO BYTE OF 1/60 S FOR PAL, USE $95 INSTEAD FOR NTSC
650 LD BC,$DC04
700 OUT (C),A ;SET LO BYTE OF CIA #1 TIMER A
750 LD A,$40 ;HI BYTE OF 1/60 S FOR PAL, USE $42 INSTEAD FOR NTSC
800 INC C
850 OUT (C),A ;SET HI BYTE OF CIA #1 TIMER A
900 LD A,$81
950 LD BC,$DC0D
1000 OUT (C),A ;ENABLE CIA #1 TIMER A IRQ
1050 LD A,$01
1100 INC C
1150 OUT (C),A ;START CIA #1 TIMER A IN CONTINUOUS MODE
1200 EI ;ENABLE Z80 INTERRUPTS
1250 ETERNAL'LOOP JP ETERNAL'LOOP
1300 * =$3232 ;IRQ ROUTINE BEGINS HERE
1350 LD BC,$DC0D
1400 IN A,(C) ;CLEAR CIA TIMER A INTERRUPT
1420 EI ;Z80 INTERRUPTS HAVE TO BE ENABLED AGAIN IN THE IRQ ROUTINE, ELSE NO MORE INTERRUPTS WILL OCCUR
1450 LD BC,$D020
1500 IN A,(C) ;READ BORDER COLOR
1550 INC A ;INCREASE BORDER COLOR BY 1
1600 OUT (C),A ;WRITE NEW BORDER COLOR
1650 RETI ;RETURN FROM IRQ
1700 * =$FFD2
1750 .BYT $BE ;CHANGE BANK 0 TO BANK 2 IN 8502->Z80 ROUTINE TO AVOID Z80 BOOT ROM AT ADR 0
1800 * =$FFEE
1850 JP $3000 ;THE Z80 RESUMES AT $FFEE, LET'S PUT A JUMP INSTRUCTION TO OUR CODE THERE
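For anyone wondering where the timer bytes $25/$40 (PAL) and $95/$42 (NTSC) come from, here is a quick Python check. It assumes the CIA timer counts the roughly 1 MHz system clock and uses the commonly quoted nominal clock rates; both the rates and the function name are my assumptions, not something taken from the listing.

```python
# Nominal 1 MHz-mode system clock rates in Hz (assumed, commonly quoted values).
PAL_CLOCK_HZ = 985_248
NTSC_CLOCK_HZ = 1_022_727


def timer_latch(clock_hz, irqs_per_second=60):
    """Return (lo, hi) bytes of the CIA timer A start value."""
    count = round(clock_hz / irqs_per_second)
    if count > 0xFFFF:
        raise ValueError("value does not fit in the 16-bit timer")
    return count & 0xFF, count >> 8


if __name__ == "__main__":
    for name, clock in (("PAL", PAL_CLOCK_HZ), ("NTSC", NTSC_CLOCK_HZ)):
        lo, hi = timer_latch(clock)
        print(f"{name}: lo=${lo:02X} hi=${hi:02X}")
```

Under those assumptions the results match the values used in the listing.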
Pretty nifty! I'll have to try that out later today.
Hmmm, where can I get this Power Assembler? I haven't found it on this site... that doesn't mean it's not there, just that I didn't see it... yet... (points to his empty cup of coffee) ;-).
If not, does anyone happen to have it in CRT or Dxx format handy? As always, only if it's legal, of course. If not, point me to a site that sells it or has the right to distribute it.
I owned it at one point, later had it as an emulator file, then deleted it by accident. Can't seem to find it on the net again, but I haven't searched very hard.
It's not difficult to use MXASS (http://www.weihenstephan.org/~michaste/mxass/) to assemble Z80 (and 6502/10/c02/ce02/sc02 and HuC6280) code targeted at the 128. The format is different, but for a short program it's not difficult to translate by hand. The docs for it have an example program demonstrating setting up the processor switch, executing a short Z80 routine, and returning to the 8502.
I know the topic has been brought up before and I've seen people argue over it. But I still would like to see how fast a program written for the 128 for operating on the Z80 would actually run. I mean something like a demo or a GUI or something like that. I realize the CPU has a few bottleneck issues, but I bet it could do some hardcore math better than the 6502 code.
I've got Z80 code for drawing to the VIC bit map. My idea was you could do killer flight sim with the 2MHz power of the Z80; I thought with that speed and all those 16-bit registers it would really fly. But no. 2MHz Z80 is about as efficient as 8502 at 1MHz. Of course it's not 'hard core' math, and I don't claim to be an expert Z80 programmer... I'll see if I can dig it up.
Quote from: Christian Johansson
The difference between RAM bank 0 and RAM bank 2 when the Z80 is active is that in RAM bank 0, the Z80 ROM (containing reset and CP/M boot code) is seen at $0000-$0FFF while in RAM bank 2, RAM is seen in the same area.
Can somebody confirm this (my C128 is dead)? I'm pretty sure the MMU has no output for Bank 2/3 (only internal registers), so how would the PLA / ROM know to vanish in Bank 2 but appear in Bank 0?
I think the only way to get full performance out of the Z80 is to turn off the VIC and use the VDC, which is one speedup technique used in the enhanced CP/M which is around here somewhere. Even then, as you pointed out, the 8502 is more efficient at many operations, so the higher clock speed doesn't help that much overall. Sometime when I have a 128 again, though, I'd like to do some real benchmarks. Or maybe Christian has done some comparison already? I'm also curious how long it takes to switch control from the 8502 to the Z80 and vice versa.
There was an 8Mhz Z80 replacement chip for the 128, manufactured by a German company if I remember correctly, but the name escapes me at the moment.
What clock speed can the VDC handle? I was messing around with MESS last night and it says that the PET 30xx, 40xx, and 80xx series all use a 6502 clocked over 7 mhz! I didn't realize any of Commodore's 8-bit systems were that fast.
I wonder if they're confusing the PET with the first Amigas, which I think were something like 7.14Mhz. Or possibly they were thinking of the Apple II series which had a number of accelerator boards available?
Quote from: nikoniko
I wonder if they're confusing the PET with the first Amigas, which I think were something like 7.14Mhz. Or possibly they were thinking of the Apple II series which had a number of accelerator boards available?
No, it was definitely the PET (CBM) line and it was definitely reporting a 6502 CPU. When you start an emulator in MESS it gives you the vitals on CPU, Memory, Sound and Video chips it's emulating.
Oh, I don't doubt that it *says* 7Mhz. I just doubt that Commodore ever made any 8-bit machines that were that fast. :)
I can't find anywhere that mentions it, and places like Zimmers (http://www.zimmers.net/cbmpics/) have those models listed at 1Mhz.
Some answers and comments:
* No, I haven't made any speed comparisons between the 8502 and the Z80, but people who have made such comparisons report that everything runs faster on the 8502. With the Z80, you can do some things more easily. You can for example do 16-bit arithmetic and you can copy a block of data with just one instruction. However, even if more instructions are needed on the 8502, it is still faster to do these things on the 8502. I have read that a 4 MHz Z80 is roughly comparable in speed with a 1 MHz 6502. This is because of the pipelining technique used in the 6502. Since the Z80 in the C128 only runs at an effective speed of 2 MHz, it is slower than the 1 MHz 8502.
* Regarding the Z80 ROM being visible with RAM bank 0 but not with RAM bank 2: I found that information on comp.sys.cbm and I have verified it myself. With RAM bank 0 and screen memory at $0400 (the normal place for screen memory), I just got garbage on the screen, but with RAM bank 2 it worked, i.e. I got RAM instead of ROM at $0400.
* Regarding the PET running at over 7 MHz, could it perhaps be the dot clock that is meant? At least in the VIC-II, I think there is a dot clock operating at around 8 MHz that is divided down to about 1 MHz for the CPU to use.
* It takes too much time for me to explain right now how to switch between the Z80 and the 8502 but basically there are routines in RAM at $FFD0 and $FFE0 (if I remember correctly) that have been copied there at startup. If you want to switch to the Z80, you can jump to the routine at $FFD0. When it switches over to the Z80, the Z80 will wake up a bit into the Z80 routine to switch in the opposite direction (that begins at $FFE0). You can change instructions in the just mentioned two routines to jump to your own code. For example, the Z80 wakes up at a RST 8 instruction that you can change to a JP to jump to your own code if you don't want CP/M to be booted (which I think RST 8 results in).
* I can add that there are some things you can only do with the 8502 but not with the Z80. There is an NMI input on the Z80 but it is connected to ground so this means that you can't get interrupts to the Z80 when pressing RESTORE or from CIA #2. Furthermore, the I/O port registers at addresses 0 and 1 are 8502 specific so you can't change anything related to those bits when the Z80 is active, i.e. you can't decide if the CPU and the VIC should see color RAM from bank 0 or 1 (a little known feature), you can't use a Datassette, you can't detect if CAPS LOCK has been pressed and you can't switch out the character ROM shadow that by default exists in all 4 VIC banks on the C128. On the C64 you always have character ROM shadow in two of the VIC banks and never in the other two but with the C128 with the 8502 active you can select if you want to have character ROM shadow in all four banks or in none of the banks.
Wow, what a post! Thanks for checking the Bank 0, Bank 2 ROM thing. I am working on a super memory map for the C128 and I would feel real stupid if I said Bank 2 is the same as Bank 0. I need to go over the schematics again... I still don't see how the hardware would know bank 2 from bank 0. Commodore never ceases to amaze me!
In my opinion (yes, there are lots!), the Z80 is slower because a memory access requires (at least) 3 clock cycles, while the 8502 only requires 1.
I noticed you are using IM 2. I always thought that was completely stupid (no offense to you). I think Commodore used it for compatibility with CP/M (duh) because common programs (SID and DDT, I think) use RST $38 for debugging. I always used IM 1 for my Z80 programs which always jumps to $0038. I used Bank 1 so I had complete control... I can't remember what's at that address in ROM Bank 0 right now (my memory map is on another PC)...
Any new Z80 programs you've made?
Actually, I tried to use IM 1 first but I didn't get it working so therefore I tried IM 2, which worked. Thanks to your post I now know that IM 1 should also work. I might try again later.
About new Z80 programs: I'm thinking about writing a VIC-II text scroller for the Z80 and posting the source code here when I get it working.
I don't know if it's necessary, but CP/M did it so I did it: when using Bank 1, disable common RAM (at least in the low area) and reprogram the MMU to use Bank 1 for zero page and page 1. Like this:
LD BC,$D50A
LD A,1
OUT (C),A ;$D50A: page 1 pointer high byte, use bank 1
DEC C
OUT (C),A ;$D509: page 1 pointer low byte, use page 1
DEC C
OUT (C),A ;$D508: page 0 pointer high byte, use bank 1
DEC C
XOR A
OUT (C),A ;$D507: page 0 pointer low byte, use page 0
DEC C
OUT (C),A ;$D506: RAM configuration register, no common RAM
Before the last line you can set A to something else; CP/M sets the top of RAM common. I think the only requirement is the bottom must NOT be common. I hope that helps.
Thank you for the information. I'll try that. I used common RAM in the low area and that might have been the problem.
Just remember that code is for Bank 1 and if you use VIC in Bank 1 then before the last line you need LD A,$40. I could only guess what it needs to be for Bank 2...
But if you use Bank 0, the instruction in ROM at address $0038 is JP $FDFD (I looked it up last night to be sure). CP/M (and if I use bank 0) store a JP in that location so the routine can be anywhere. Of course if KERNAL ROM is active that probably won't work. I say probably because I assume the KERNAL would hide the JP instruction, but the MMU is wacky with Z80 active so who knows...
Now I've got IM 1 to work :) . The problem was not related to common RAM. I found that I don't have to write to the registers you mentioned, hydrophilic ($D506-$D50A), when using RAM bank 2 since those registers are already set up in a way that works with bank 2 by the C128 Kernal. The problem was that Power Assembler uses the BASIC editor. Unfortunately, BASIC seems to overwrite the memory location with address $38 in bank 0 (bank 2) whenever you RUN a program (you compile a Power Assembler program by issuing RUN) or when issuing a SYS command. I solved that by letting the Z80 program write a JP instruction to $38 in bank 2 before interrupts are enabled (by using EI) rather than putting the JP instruction there already at compilation time.
I now think I prefer IM 1 since it is easier to set up than IM 2. With IM 2 you have to fill a whole page (plus one byte) with the same value, because you don't know from which address in that page the Z80 will fetch the handler address at an IRQ, and that doesn't appeal to me very much.
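To illustrate why the IM 2 version fills $3100-$3200 with $32, here is a rough Python model of the IM 2 lookup (a sketch, not C128 code). It assumes the usual Z80 behaviour: the pointer to the vector is I * 256 plus whatever byte happens to be on the data bus, and two bytes are read from there. The function names are made up for the example.

```python
def build_table(length):
    """Return 64 kB of zeroed memory with `length` bytes of $32 from $3100."""
    memory = bytearray(0x10000)
    for addr in range(0x3100, 0x3100 + length):
        memory[addr] = 0x32
    return memory


def im2_handler_address(memory, i_register, bus_byte):
    """Model the IM 2 lookup: read a 16-bit vector at I * 256 + bus byte."""
    pointer = (i_register << 8) | bus_byte
    lo = memory[pointer]
    hi = memory[(pointer + 1) & 0xFFFF]
    return (hi << 8) | lo


if __name__ == "__main__":
    full = build_table(257)   # $3100-$3200 inclusive, as in the listing
    targets = {im2_handler_address(full, 0x31, b) for b in range(256)}
    print(sorted(hex(t) for t in targets))
```

With 257 bytes filled, every possible bus byte leads to $3232. With only 256 bytes, a bus byte of $FF would pick up its high byte from $3200 and jump somewhere else, which is why the extra byte at $3200 is written.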
Here is the IM 1 version of the latest program I posted. Btw, do you know if you should return from the interrupt routine with RET or RETI when using IM 1? Both seem to work. I think I've read somewhere that RETI should only be used for interrupt mode 2 and that interrupt mode 1 works like a RST $38 instruction (for RST I think you return with RET) but I'm not sure.
50 REM START PROGRAM WITH BANK0:SYS65488 AFTER ASSEMBLING IT, THE NEXT LINE STARTS THE ASSEMBLER
100 SYS4000
150 .BANK 0 ;ASSEMBLE TO BANK 0
200 .ORG $3000 ;PUT THE PROGRAM AT $3000
250 .MEM ;ASSEMBLE TO MEMORY (NOT TO FILE)
300 LD SP,$2000 ;SET STACK POINTER, NOTE THAT THE Z80 HAS A 16-BIT STACK POINTER
420 LD A,195 ;OPCODE FOR THE JP INSTRUCTION
422 LD BC,$38 ;THE Z80 STARTS AT $38 AT IRQ IN INTERRUPT MODE 1
424 OUT (C),A
426 LD A,<IRQ'ROUTINE
428 INC C
430 OUT (C),A
431 LD A,>IRQ'ROUTINE
432 INC C
434 OUT (C),A
450 LD A,0
500 LD BC,$D01A
550 OUT (C),A ;DISABLE VIC II IRQs
600 LD A,$25 ;LO BYTE OF 1/60 S FOR PAL, USE $95 INSTEAD FOR NTSC
650 LD BC,$DC04
700 OUT (C),A ;SET LO BYTE OF CIA #1 TIMER A
750 LD A,$40 ;HI BYTE OF 1/60 S FOR PAL, USE $42 INSTEAD FOR NTSC
800 INC C
850 OUT (C),A ;SET HI BYTE OF CIA #1 TIMER A
900 LD A,$81
950 LD BC,$DC0D
1000 OUT (C),A ;ENABLE CIA #1 TIMER A IRQ
1050 LD A,$01
1100 INC C
1150 OUT (C),A ;START CIA #1 TIMER A IN CONTINUOUS MODE
1170 IM 1 ;SET INTERRUPT MODE 1
1200 EI ;ENABLE Z80 INTERRUPTS
1250 ETERNAL'LOOP JP ETERNAL'LOOP
1350 IRQ'ROUTINE LD BC,$DC0D
1400 IN A,(C) ;CLEAR CIA TIMER A INTERRUPT
1420 EI ;Z80 INTERRUPTS HAVE TO BE ENABLED AGAIN IN THE IRQ ROUTINE, ELSE NO MORE INTERRUPTS WILL OCCUR
1450 LD BC,$D020
1500 IN A,(C) ;READ BORDER COLOR
1550 INC A ;INCREASE BORDER COLOR BY 1
1600 OUT (C),A ;WRITE NEW BORDER COLOR
1650 RET ;RETURN FROM IRQ
1700 * =$FFD2
1750 .BYT $BE ;CHANGE BANK 0 TO BANK 2 IN 8502->Z80 ROUTINE TO AVOID Z80 BOOT ROM AT ADR 0
1800 * =$FFEE
1850 JP $3000 ;THE Z80 RESUMES AT $FFEE, LET'S PUT A JUMP INSTRUCTION TO OUR CODE THERE
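The three bytes stored at $38 follow the same pattern as the ".BYT 195,0,48" in the first program: the JP opcode followed by the target address, low byte first. A tiny Python sketch of that encoding (the helper name is hypothetical):

```python
JP_OPCODE = 0xC3  # 195 decimal, absolute JP on the Z80


def encode_jp(target):
    """Return the three bytes of 'JP target' (address stored low byte first)."""
    if not 0 <= target <= 0xFFFF:
        raise ValueError("target must be a 16-bit address")
    return bytes([JP_OPCODE, target & 0xFF, target >> 8])


if __name__ == "__main__":
    print(list(encode_jp(0x3000)))
```

The low and high bytes are what the "<" and ">" operators produce for IRQ'ROUTINE in the listing.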
Cool. And IM 1 is faster too because it doesn't have to look up the vector from the interrupt table. I've read RETI is the same as RET, but the former is designed to work with special Zilog peripheral chips to allow interrupt chaining/multiplexing. Since there are none of those in the C128, RET is probably better because the opcode is 1 byte instead of 2 (faster).
I just noticed something. You are storing the interrupt opcodes to RAM with the OUT instruction. I never tried that. People usually use LD (HL),A when storing to RAM and only use OUT for I/O chips. Obviously it works, but it is slower. The LD (HL),A opcode is only 1 byte but OUT (C),A is two.
Something I never tried: there is an undocumented OUT (C),0, which stores zero to I/O port BC. Maybe you might find it useful? Its opcode is ED 71 (or so I've read).
Actually, it was a mistake to use OUT. Yesterday evening when I had turned off my computer I thought: "Oh, I happened to use OUT instead of LD. Somebody is surely going to comment about that." and I was right about that ;) . Oh well, it shows that OUT works as well even though it shouldn't be used in this case. That might be good to know. It also shows that there are people reading what I post, which is also good :) .
I just changed the code so that it uses LD in the following way. I have tested that it works.
50 REM START PROGRAM WITH BANK0:SYS65488 AFTER ASSEMBLING IT, THE NEXT LINE STARTS THE ASSEMBLER
100 SYS4000
150 .BANK 0 ;ASSEMBLE TO BANK 0
200 .ORG $3000 ;PUT THE PROGRAM AT $3000
250 .MEM ;ASSEMBLE TO MEMORY (NOT TO FILE)
300 LD SP,$2000 ;SET STACK POINTER, NOTE THAT THE Z80 HAS A 16-BIT STACK POINTER
420 LD A,195 ;OPCODE FOR THE JP INSTRUCTION
422 LD IX,$38 ;THE Z80 STARTS AT $38 AT IRQ IN INTERRUPT MODE 1
424 LD (IX+0),A
425 LD A,<IRQ'ROUTINE
426 LD (IX+1),A
427 LD A,>IRQ'ROUTINE
431 LD (IX+2),A
450 LD A,0
500 LD BC,$D01A
550 OUT (C),A ;DISABLE VIC II IRQs
600 LD A,$25 ;LO BYTE OF 1/60 S FOR PAL, USE $95 INSTEAD FOR NTSC
650 LD BC,$DC04
700 OUT (C),A ;SET LO BYTE OF CIA #1 TIMER A
750 LD A,$40 ;HI BYTE OF 1/60 S FOR PAL, USE $42 INSTEAD FOR NTSC
800 INC C
850 OUT (C),A ;SET HI BYTE OF CIA #1 TIMER A
900 LD A,$81
950 LD BC,$DC0D
1000 OUT (C),A ;ENABLE CIA #1 TIMER A IRQ
1050 LD A,$01
1100 INC C
1150 OUT (C),A ;START CIA #1 TIMER A IN CONTINUOUS MODE
1170 IM 1 ;SET INTERRUPT MODE 1
1200 EI ;ENABLE Z80 INTERRUPTS
1250 ETERNAL'LOOP JP ETERNAL'LOOP
1350 IRQ'ROUTINE LD BC,$DC0D
1400 IN A,(C) ;CLEAR CIA TIMER A INTERRUPT
1420 EI ;Z80 INTERRUPTS HAVE TO BE ENABLED AGAIN IN THE IRQ ROUTINE, ELSE NO MORE INTERRUPTS WILL OCCUR
1450 LD BC,$D020
1500 IN A,(C) ;READ BORDER COLOR
1550 INC A ;INCREASE BORDER COLOR BY 1
1600 OUT (C),A ;WRITE NEW BORDER COLOR
1650 RET ;RETURN FROM IRQ
1700 * =$FFD2
1750 .BYT $BE ;CHANGE BANK 0 TO BANK 2 IN 8502->Z80 ROUTINE TO AVOID Z80 BOOT ROM AT ADR 0
1800 * =$FFEE
1850 JP $3000 ;THE Z80 RESUMES AT $FFEE, LET'S PUT A JUMP INSTRUCTION TO OUR CODE THERE
Btw, I read that there are variants of the LD instructions that I think should make it possible to change the following code:
420 LD A,195 ;OPCODE FOR THE JP INSTRUCTION
422 LD IX,$38 ;THE Z80 STARTS AT $38 AT IRQ IN INTERRUPT MODE 1
424 LD (IX+0),A
425 LD A,<IRQ'ROUTINE
426 LD (IX+1),A
427 LD A,>IRQ'ROUTINE
431 LD (IX+2),A
into:
422 LD IX,$38 ;THE Z80 STARTS AT $38 AT IRQ IN INTERRUPT MODE 1
424 LD (IX+0),195
426 LD (IX+1),<IRQ'ROUTINE
431 LD (IX+2),>IRQ'ROUTINE
However, when I tried that, the code compiled without errors but it didn't work.
After assembling, you might enter the ML Monitor and issue:
M 3000
to see the code generated by the assembler. Assemblers (and other programs) are known to fail silently; this will let us see what code it made so we can check that it is correct.
Also, you used LD (IX+d),A instead of LD (HL),A. I would recommend against that unless there is a _really_ good reason because it is sooo much slower. On the 8502, STA (zp),Y is pretty fast (6 cycles) and this compares favorably with LD (HL),A (7 cycles), but LD (IX+d),A takes 19 cycles!! Also, LD (HL),A is 1 byte while the IX version is 3.
What I'm trying to say is the 8502 is really good at indexing, either directly or through zero page, but the Z80 sucks. On the other hand, the primary register pairs (BC, DE, and especially HL) are pretty fast and should generally be used for 'indexing'.
But it doesn't hurt to experiment... unless you're programming a nuclear reactor or a missile or something :)
Thank you for the information :) . I thought that instead of doing "INC L; LD (HL),A;" it must be shorter to write "LD (IX+1),A;" but apparently I was wrong.
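A quick tally of the two alternatives in Python, using the byte counts and cycle counts from the standard Z80 timing tables (4 cycles for INC L, 7 for LD (HL),A, 19 for LD (IX+d),A). The table and helper name are just for illustration, and setting up the pointer register is not counted.

```python
# instruction -> (size in bytes, clock cycles)
COSTS = {
    "INC L": (1, 4),
    "LD (HL),A": (1, 7),
    "LD (IX+d),A": (3, 19),
}


def total(sequence):
    """Return (bytes, cycles) for a list of instruction names."""
    size = sum(COSTS[name][0] for name in sequence)
    cycles = sum(COSTS[name][1] for name in sequence)
    return size, cycles


if __name__ == "__main__":
    print("HL version:", total(["INC L", "LD (HL),A"]))
    print("IX version:", total(["LD (IX+d),A"]))
```

So the HL pair is shorter in memory and faster, even though it is two instructions in the source.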
Now I've written a VIC text scroller for the Z80 :D so here is the next lesson in "the Z80 school" or "how to win over Spectrum fans to the bright side" ;) . I've transcribed all source code listings in this thread from the screen of my C128, so it is a lot of work.
20 REM START PROGRAM WITH BANK0:SYS65488 AFTER ASSEMBLING IT, THE NEXT LINE STARTS THE ASSEMBLER
50 SYS4000
150 .BANK 0 ;ASSEMBLE TO BANK 0
250 .ORG $3000 ;PUT THE PROGRAM AT $3000
350 .MEM ;ASSEMBLE TO MEMORY (NOT TO FILE)
450 LD SP,$2000 ;SET STACK POINTER, NOTE THAT THE Z80 HAS A 16-BIT STACK POINTER
550 LD BC, $0038 ;THE Z80 STARTS AT $38 AT IRQ IN INTERRUPT MODE 1
650 LD A,195 ;OPCODE FOR THE JP INSTRUCTION
750 LD (BC),A
850 LD A,<IRQ'ROUTINE
950 INC C
1050 LD (BC),A
1150 LD A,>IRQ'ROUTINE
1250 INC C
1350 LD (BC),A
1655 LD BC,$0400 ;START OF SCREEN MATRIX
1660 LD DE,1000 ;MAX NUMBER OF CHARACTERS ON SCREEN
1665 - LD A,32 ;SCREEN CODE FOR SPACE, START OF CLEAR SCREEN LOOP
1670 LD (BC),A
1675 INC BC
1680 DEC DE
1682 LD A,D
1683 OR E ;16-BIT DEC DOES NOT AFFECT FLAGS, SO TEST DE FOR ZERO EXPLICITLY
1685 JR NZ,- ;END OF CLEAR SCREEN LOOP
1690 LD BC,$1000 ;START OF COLOR RAM WHEN Z80 IS ENABLED
1695 LD D,40 ;MAX NUMBER OF CHARACTERS ON FIRST ROW
1700 LD A,7 ;YELLOW COLOR
1705 - OUT (C),A ;START OF SET FIRST ROW TO YELLOW LOOP
1710 INC C
1715 DEC D
1720 JR NZ,- ;END OF SET FIRST ROW TO YELLOW LOOP
1725 XOR A ;TRICK TO SET ACCUMULATOR TO 0 WITH ONE-BYTE INSTRUCTION
1727 LD BC,53280 ;BORDER COLOR REGISTER
1730 OUT (C),A ;SET BORDER COLOR TO BLACK
1732 INC C ;INCREASE BC TO 53281 (SCREEN COLOR REGISTER)
1735 OUT (C),A ;SET SCREEN COLOR TO BLACK
1737 LD BC,$D016 ;HORIZONTAL SCROLL REGISTER
1740 OUT (C),A ;SET 38-COLUMN MODE
1750 LD A,(ACTUAL'SCROLL'TXT)
1850 LD ($0427),A ;INITIALIZE LAST CHARACTER ON FIRST ROW TO FIRST CHARACTER OF SCROLL-TEXT
2250 LD HL,ACTUAL'SCROLL'TXT ;INITIALIZE CHARACTER POINTER
2300 ;THE KERNAL HAS ALREADY SET UP RASTER IRQ IN THE VIC CHIP SO WE DON'T DO IT AGAIN
2650 IM 1 ;SET INTERRUPT MODE 1
2750 EI ;ENABLE Z80 INTERRUPTS
2850 ETERNAL'LOOP JP ETERNAL'LOOP
2950 IRQ'ROUTINE LD BC,$D019
3050 IN A,(C)
3150 OUT (C),A ;CLEAR VIC RASTER IRQ
3250 EI ;Z80 INTERRUPTS HAVE TO BE ENABLED AGAIN IN THE IRQ ROUTINE, ELSE NO MORE INTERRUPTS WILL OCCUR
3350 LD BC,$D016
3450 IN A,(C) ;READ HORIZONTAL SCROLL REGISTER IN VIC
3550 AND 7
3650 CALL Z,SCROLL'ONE'CHAR ;JUMP IF IT'S TIME TO SCROLL A WHOLE CHARACTER
3750 DEC A
3950 OUT (C),A ;WRITE NEW HORIZONTAL SCROLL VALUE TO VIC
4050 RET ;RETURN FROM IRQ
4150 SCROLL'ONE'CHAR EXX ;SWITCH TO ALTERNATE REGISTER SET TO NOT OVERWRITE CHAR POINTER IN HL
4160 LD HL,$0401 ;SOURCE ADDRESS
4250 LD DE,$0400 ;DESTINATION ADDRESS
4350 LD BC,39 ;NUMBER OF CHARACTERS TO COPY
4450 LDIR ;BLOCK-COPY INSTRUCTION TO SCROLL FIRST ROW ONE CHARACTER TO THE LEFT
4550 EXX ;SWITCH BACK TO NORMAL REGISTER SET
4850 INC HL ;INCREASE CHARACTER POINTER
4950 LD A,(HL) ;READ CHARACTER THAT POINTER POINTS TO
4960 CP @"@" ;COMPARE WITH "@" (INDICATES END OF SCROLL TEXT), THE FIRST @ INDICATES SCREEN CODE IN POWER ASSEMBLER
4965 CALL Z,RESTART'SCROLL ;JUMP IF END OF SCROLL TEXT REACHED
5050 LD ($0427),A ;WRITE NEW CHARACTER AT LAST POSITION ON FIRST ROW
5250 LD A,8 ;A DEC A INSTRUCTION AFTERWARDS RESETS THE $D016 REGISTER TO 7
5350 RET ;RETURN TO LINE NUMBER 3750
5450 RESTART'SCROLL LD HL,SCROLL'TXT ;RESET TEXT POINTER TO START OF SPACES BEFORE SCROLL-TEXT
5550 LD A,(HL) ;READ FIRST SPACE
5650 RET ;RETURN TO LINE NUMBER 5050
5750 SCROLL'TXT .SCR "                                      " ;38 SPACES, .SCR MEANS SCREEN CODES IN POWER ASSEMBLER (NOT PETSCII)
5850 ACTUAL'SCROLL'TXT .SCR "THE Z80 HAS WOKEN UP AND IS GIVING YOU THIS SCROLLY MESSAGE.@"
6050 * =$FFD2
6150 .BYT $BE ;CHANGE BANK 0 TO BANK 2 IN 8502->Z80 ROUTINE TO AVOID Z80 BOOT ROM AT ADR 0
6250 * =$FFEE
6350 JP $3000 ;THE Z80 RESUMES AT $FFEE, LET'S PUT A JUMP INSTRUCTION TO OUR CODE THERE
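The fine-scroll logic in the IRQ routine is easy to get wrong by one, so here is a small Python model of it (not C128 code; the function names are made up for illustration). It mirrors lines 3450-3950: mask the low three bits of $D016, shift the row by one character when they have reached 0 and continue from 8, then decrement.

```python
def irq_step(d016):
    """Return (new $D016 value, True if the row was shifted one character)."""
    a = d016 & 7       # AND 7
    shifted = False
    if a == 0:         # CALL Z,SCROLL'ONE'CHAR
        shifted = True
        a = 8          # the routine returns with A = 8
    a -= 1             # DEC A
    return a, shifted


def simulate(steps, d016=0):
    """Run `steps` interrupts and collect the written values and shift count."""
    values, shifts = [], 0
    for _ in range(steps):
        d016, shifted = irq_step(d016)
        values.append(d016)
        shifts += shifted
    return values, shifts


if __name__ == "__main__":
    print(simulate(16))
```

In this model the register runs 7, 6, ... 0 and the row moves one character every eighth interrupt.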
Cool
Only thing that confused me is LD A,(ACTUAL'SCROLL'TXT). Looking at it, that seems to be loading the accumulator from memory specified by a label? I've never seen a label with apostrophes before!
One optimization you could do is in the clear screen routine by using LDIR
LD HL,$0400 ;SOURCE ADDRESS
LD (HL),$20 ;fill first character with space
LD DE,$0401 ;DESTINATION ADDRESS
LD BC,999 ;NUMBER OF (remaining) CHARACTERS TO COPY
LDIR
I know that won't work for color memory at $1000 because you have to use OUT, but I always wondered whether you could use LD (HL),A if HL holds the 'normal' address, $D800.
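If it isn't obvious why an LDIR with the destination one byte above the source fills memory, here is a small Python model of the instruction (a sketch of the documented behaviour: copy one byte from HL to DE, increment both, decrement BC, repeat until BC is zero). The function name and return value are my own choices.

```python
def ldir(memory, hl, de, bc):
    """Model Z80 LDIR on a bytearray; returns the final (HL, DE, BC)."""
    while True:
        memory[de] = memory[hl]
        hl = (hl + 1) & 0xFFFF
        de = (de + 1) & 0xFFFF
        bc = (bc - 1) & 0xFFFF
        if bc == 0:
            break
    return hl, de, bc


if __name__ == "__main__":
    memory = bytearray([0xAA]) * 0x10000   # pretend RAM is full of junk
    memory[0x0400] = 0x20                  # LD (HL),$20
    ldir(memory, 0x0400, 0x0401, 999)
    print(set(memory[0x0400:0x07E8]))      # the 1000 screen bytes
```

Each byte copied becomes the source for the next copy, so the single space at $0400 ripples through all 1000 screen positions.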
Yes, in Power Assembler (Buddy) it's possible to have labels containing apostrophes. Thank you for the optimization tip!
Hi, all, first
Quote from: David Murray on January 17, 2007, 04:24 AM
I know the topic has been brought up before and I've seen people argue over it. But I still would like to see how fast a program written for the 128 for operating on the Z80 would actually run. I mean something like a demo or a GUI or something like that. I realize the CPU has a few bottleneck issues, but I bet it could do some hardcore math better than the 6502 code.
In the C128 architecture the Z80 does not perform well. From Wikipedia:
"The Z80 machine cycles are sequenced by an internal state machine which builds each M-cycle out of 3,4,5 or 6 discrete steps (i.e. clock cycles) depending on context. This avoids cumbersome asynchronous logic and makes the control signals behave consistently at a wide range of clock frequencies. Naturally, it also means that a higher frequency crystal must be used than without this subdivision of machine cycles (approximately 2-3 times higher). It does not imply tighter requirements on memory access times however, as a high resolution clock allows more precise control of memory timings and memory therefore can be active in parallel with the CPU to a greater extent (i.e. sitting less idle), allowing more efficient use of available memory performance. For instruction execution, the Z80 combines two full clock cycles into a long memory access period (the M1-signal) which would typically last only a fraction of a (longer) clock cycle in a more asynchronous design (such as the 6800, or similar)."
What does it mean? That running at 4 MHz is as 'normal' for a Z80 as running at 1 MHz is for a 6502.
Comparing the performance of the two processors is not an easy task: their architectures differ too much. What can be said is that in the majority of situations a Z80 at 4 MHz performs on average 10-30% better than a 65xx or 85xx running at 1 MHz. (We are talking about the common clock speeds of the old days; today we can have those CPUs running at several hundred MHz, but that is useless.)
Of course, because on the C128 the Z80 wastes half of its time doing nothing, you end up with a very slow CPU compared to the 8502, especially in fast mode.
So IMHO there is no real reason to do Z80 asm programming on the C128, even if you can do 16-bit math a bit more easily...
(That's my opinion, of course)
I know, bumping an old thread. ;/
Quote from: Christian Johansson on January 31, 2007, 05:18 AM
With the Z80, you can do some things more easily. You can for example do 16-bit arithmetic and you can copy a block of data with just one instruction.
So this raises the question: Would it make sense to use the Z80 to do 16-bit math and move blocks of data? Or is it faster to go ahead and use 6502 routines for these tasks?
Quote from: airship
Or is it faster to go ahead and use 6502 routines for these tasks?
You have an REU (jealous) so let it do the moving if raw speed is the concern! Seriously, the LDIR instruction takes 21 cycles/byte. These are (effectively) 2MHz cycles so about 10.5us / byte. The generic 6502 code is like
LDA (s),Y
STA (d),Y
INY
BNE loop
That's (5+6+2+3) = 16 cycles/byte. At 1MHz (VIC screen) this is about 16us/byte so the Z80 wins. But at 2MHz (VDC only) this is only 8us/byte so the 6502 wins!
Of course there are other factors to consider, like page-crossing and less-than-a-page situations. However, that was the "generic" 6502 version. You can save 2 cycles/byte by using absolute indexed addressing (which requires static addresses or self-modifying code). You can save an additional cycle/byte with zero-page relocation or 2 cycles/byte with stack-page relocation. Imagine:
LDA abs,Y
PHA
INY
BNE loop
Assuming 'abs' is page-aligned, that's 12us / byte at 1MHz, close to the Z80, or 6us / byte at 2MHz, smoking the Z80. Of course you must consider the overhead of manipulating the MMU in this case. Then again, unless your program is entirely Z80 code, you must consider the Z80 calling overhead...
It's really a design consideration. If you're moving a lot of data frequently, the Z80 sounds like a good choice, especially if using the VIC.
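The per-byte figures above can be reproduced with a few lines of Python. This only restates the cycle counts already given (21 for LDIR, 16 for the indirect loop, 12 for the absolute/PHA loop) and the clock rates mentioned in the thread; the names are illustrative.

```python
LDIR_CYCLES = 21                      # Z80, per byte copied
INDIRECT_LOOP_CYCLES = 5 + 6 + 2 + 3  # LDA (s),Y / STA (d),Y / INY / BNE
ABS_PHA_LOOP_CYCLES = 4 + 3 + 2 + 3   # LDA abs,Y / PHA / INY / BNE


def us_per_byte(cycles_per_byte, clock_mhz):
    """Microseconds per byte at the given effective clock rate."""
    return cycles_per_byte / clock_mhz


if __name__ == "__main__":
    print("Z80 LDIR @ 2 MHz:      ", us_per_byte(LDIR_CYCLES, 2))
    print("6502 indirect @ 1 MHz: ", us_per_byte(INDIRECT_LOOP_CYCLES, 1))
    print("6502 indirect @ 2 MHz: ", us_per_byte(INDIRECT_LOOP_CYCLES, 2))
    print("6502 abs/PHA @ 1 MHz:  ", us_per_byte(ABS_PHA_LOOP_CYCLES, 1))
    print("6502 abs/PHA @ 2 MHz:  ", us_per_byte(ABS_PHA_LOOP_CYCLES, 2))
```

This ignores setup, page crossings and the cost of switching processors, so treat it as a rough ranking only.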
I wonder about 16-bit math. With fast multiply routines, it seems like 6502 would beat or tie the Z80 since the Z80 is a bit slow at reading tables. Otherwise I think the Z80 would win by a fraction...
LD HL,(a1)
EX DE,HL
LD HL,(a2)
ADD HL,DE
LD (a3),HL
That's 16+4+16+11+16 = 63 cycles or about 31.5us. Compare with
CLC
LDA <a1
ADC <a2
STA <a3
LDA >a1
ADC >a2
STA >a3
That's 2+6*3 = 20 cycles(us) if all zero-page addresses or 2+6*4 = 26 cycles(us) if all non-zero page addresses. I was wrong! The 6502 smokes the Z80! (we won't even mention the fact the 6502 can do this double speed if needed).
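For readers less used to the 6502 way of doing it, here is a Python model of the byte-wise add with carry, together with the timing figures quoted above. It is only a sketch; the function and constant names are invented for the example.

```python
def add16_bytewise(a1, a2):
    """16-bit add done the 6502 way: low bytes first, carry into the high bytes."""
    lo = (a1 & 0xFF) + (a2 & 0xFF)        # CLC / LDA <a1 / ADC <a2
    carry = lo >> 8                       # carry out of the low-byte add
    hi = (a1 >> 8) + (a2 >> 8) + carry    # LDA >a1 / ADC >a2
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


Z80_ADD_CYCLES = 16 + 4 + 16 + 11 + 16    # the five Z80 instructions above
M6502_ZP_CYCLES = 2 + 6 * 3               # CLC plus six zero-page accesses
M6502_ABS_CYCLES = 2 + 6 * 4              # CLC plus six absolute accesses


def microseconds(cycles, clock_mhz):
    """Execution time in microseconds at the given clock rate."""
    return cycles / clock_mhz


if __name__ == "__main__":
    print(hex(add16_bytewise(0x12FF, 0x0001)))
    print(microseconds(Z80_ADD_CYCLES, 2), microseconds(M6502_ZP_CYCLES, 1))
```

The result wraps at 16 bits, just as the two routines do.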
Now Z80 fans might cry foul because my example uses absolute addressing while most Z80 progs are written with register addressing which is generally faster. On the other hand, the 6502 was using simple (non-indexed) addressing. Only a few extra cycles are needed by the 6502 to add indexing while the Z80 would need several extra cycles.
I worry I'm short-changing the Z80 so let's try multiply by 10.
LD HL,(s)
LD D,H
LD E,L ;DE=HL
ADD HL,HL ;*2
ADD HL,HL ;*4
ADD HL,DE ;*5
ADD HL,HL ;*10
LD (d),HL
That's 16+4*2+11*4+16 = 84 cycles or about 42us. Compare with
LDA >s
STA >t
STA >d
LDA <s
STA <t ;t = s
ASL A
ROL >d ;*2
ASL A
ROL >d ;*4
ADC <t
PHA
LDA >d
ADC >t
STA >d
PLA ;*5
ASL A
STA <d
ROL >d ;*10
Oh boy, that's 5*3+7*2+4*3+7+3+7 = 58 cycles if all zero-page addresses. Or 5*4+8*2+4*4+7+4+8 = 71 cycles if all non-zero-page addresses. So in the worst case (1MHz, all non-zp) the 6502 takes about 71us so the Z80 wins by 69%. In the best case (2MHz, all zp) the 6502 takes 58/2 = 29us so the 6502 wins by 45%. So here, again, it's going to be a design consideration.
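Both routines use the same shift-and-add idea (x*10 = (x*4 + x)*2). Here is a Python sketch that follows the Z80 register steps and re-derives the two percentages from the cycle counts above; the names are made up for illustration.

```python
def times10_shift_add(value):
    """Multiply by 10 with 16-bit wraparound, mirroring the Z80 steps."""
    hl = value & 0xFFFF
    de = hl                   # LD D,H / LD E,L
    hl = (hl + hl) & 0xFFFF   # ADD HL,HL  -> *2
    hl = (hl + hl) & 0xFFFF   # ADD HL,HL  -> *4
    hl = (hl + de) & 0xFFFF   # ADD HL,DE  -> *5
    hl = (hl + hl) & 0xFFFF   # ADD HL,HL  -> *10
    return hl


Z80_US = (16 + 4 * 2 + 11 * 4 + 16) / 2   # 84 cycles at 2 MHz
M6502_WORST_US = 71 / 1                   # all non-zero-page, 1 MHz
M6502_BEST_US = 58 / 2                    # all zero-page, 2 MHz


def percent_faster(slow_us, fast_us):
    """How much longer the slower routine takes, in percent (rounded)."""
    return round((slow_us / fast_us - 1) * 100)


if __name__ == "__main__":
    print(times10_shift_add(1234))
    print(percent_faster(M6502_WORST_US, Z80_US))
    print(percent_faster(Z80_US, M6502_BEST_US))
```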
Based on these examples, I conclude you should seriously consider the Z80 if using 1MHz mode. Otherwise stick with the 6502.
Of course there is also the issue of code size. Many bytes for the 6502 code but only about 1/2 or 1/3 as many with Z80 code.
So many factors in software design!
I'm wondering how well the Z80 would do at floating-point math...
Thanks, HP. I'm always skeptical when I hear an assertion without backup, so 'With the Z80, you can do some things more easily. You can for example do 16-bit arithmetic and you can copy a block of data with just one instruction.' set off alarms in my head. But I wasn't smart enough to do the math. I'm glad you are. :)
And yes, my REU will do data transfers quite nicely, thank you. But I'm always curious as to how the C128's 'natural resources' might be used for more power and capability. The idea of somehow leveraging the Z80, or a 1541, or whatever as a math co-processor has appeal, but I don't think it generally has legs. Still, for some applications, maybe...
Something to consider: while the Z80 is busy doing its thing, what do *you* plan to do about interrupts?
Quote
LD HL,(a1)
EX DE,HL
LD HL,(a2)
ADD HL,DE
LD (a3),HL
That's 16+4+16+11+16 = 63 cycles or about 31.5us.
Quote
Based on these examples, I conclude you should seriously consider the Z80 if using 1MHz mode. Otherwise stick with the 6502.
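The Z80 code could also be written with one instruction fewer:
LD HL,(a1)
LD DE,(a2)
ADD HL,DE
LD (a3),HL
That drops the EX DE,HL. Note, however, that LD DE,(nn) is an ED-prefixed instruction taking 20 cycles, so the total is still 16+20+11+16 = 63 cycles; the source gets one line shorter, but not faster.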
About multiplication, please consider that using tables eats memory. For a generic multiply routine it's not a suitable solution.
It's interesting to note that most Z80 systems normally run at 3-4 MHz. At that (common) speed the results are very different.
But unfortunately the design of the C128 did not allow Commodore's engineers to integrate the Z80 better.