Finding BASIC strings in ML

Started by xlar54, January 19, 2008, 10:04 PM

xlar54

FINDING BASIC STRINGS FROM ML

BASIC's string descriptors are stored in much the same way on the C64 and the C128; the only major difference is where they are stored.
You can get all the information you need about a string from its descriptor, and the descriptor is not as hard to find as you might think.

First, we need to know a little about the descriptor. Each non-array variable descriptor is 7 bytes long. Here is what is stored in these 7 bytes for a string variable:

1st byte contains the ASCII value of the first letter of the variable name. If our string were AB$, this location would contain 65 ($41).

2nd byte contains the ASCII value of the second character (letter or digit) of the variable name, or zero if there is none. So, with the variable AB$, this location should contain 66 ($42), BUT!!!

In order for BASIC to recognize this variable as a string, Bit #7 of the second byte is set. So the value here is not 66 ($42), but 66 plus 128, or 194 ($C2). If the variable were just A$, the value would be 0 plus 128, or 128 ($80).

The other non-array variables are identified as follows:

Defined FN  - Bit #7 of 1st byte set
Integer (%) - Bit #7 of both bytes set
Numeric     - Bit #7 of neither byte set

3rd byte of the descriptor contains the string length. Since the length has to fit in a single byte, this explains why a string cannot be over 255 characters long.

4th byte contains the LO-byte, and

5th byte contains the HI-byte, of the actual address in memory where the string data is stored. This address can be in the BASIC program area (when the string is a literal in the program text) or in the string storage area.

6th and 7th bytes are not used, and contain zeros.
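
To make the layout above concrete, here is a small Python sketch that decodes one 7-byte descriptor. It is only a model for checking your understanding on a PC, not code for the C64 itself. The function name decode_descriptor and the example address $9FFB are made up for illustration.

```python
def decode_descriptor(desc: bytes) -> dict:
    """Decode one 7-byte non-array variable descriptor (illustrative model)."""
    if len(desc) != 7:
        raise ValueError("a non-array descriptor is exactly 7 bytes")
    b1, b2 = desc[0], desc[1]
    # Bit 7 of each name byte encodes the variable type.
    kinds = {
        (0, 0): "numeric",
        (0, 1): "string",
        (1, 0): "function",
        (1, 1): "integer",
    }
    kind = kinds[(b1 >> 7, b2 >> 7)]
    # Strip bit 7 to recover the name; a zero second character means a one-letter name.
    name = chr(b1 & 0x7F) + (chr(b2 & 0x7F) if b2 & 0x7F else "")
    info = {"name": name, "kind": kind}
    if kind == "string":
        info["length"] = desc[2]                     # 3rd byte: length
        info["address"] = desc[3] | (desc[4] << 8)   # 4th/5th bytes: LO/HI address
    return info


# AB$ holding a 5-character string at the made-up address $9FFB
example = bytes([0x41, 0xC2, 5, 0xFB, 0x9F, 0, 0])
print(decode_descriptor(example))
```

The length and address fields are only decoded for strings, since the other variable types use bytes 3 to 7 differently.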

Now that we know what the string descriptor looks like, we need to locate it.
The storage for the non-array variable descriptors starts at the address held in the VARTAB pointer and ends just before the address held in the ARYTAB pointer.

            C-64               C-128
       ---------------    ---------------
VARTAB 45-46 ($2D-$2E)    47-48 ($2F-$30)
ARYTAB 47-48 ($2F-$30)    49-50 ($31-$32)
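
Each of these pointers is a normal LO/HI pair. As a rough illustration, this Python sketch reads them out of a 64K memory image held in a bytearray. The names POINTERS and read_pointer, and the addresses $0810 and $081E in the snapshot, are invented for the example.

```python
# Zero-page address of the LO byte of each pointer, taken from the table above.
POINTERS = {
    "C64":  {"VARTAB": 0x2D, "ARYTAB": 0x2F},
    "C128": {"VARTAB": 0x2F, "ARYTAB": 0x31},
}


def read_pointer(memory, machine, name):
    """Return the 16-bit address held in a LO/HI zero-page pointer."""
    zp = POINTERS[machine][name]
    return memory[zp] | (memory[zp + 1] << 8)


# Hypothetical C64 snapshot whose simple variables occupy $0810 up to $081E.
memory = bytearray(65536)
memory[0x2D:0x2F] = bytes([0x10, 0x08])   # VARTAB = $0810
memory[0x2F:0x31] = bytes([0x1E, 0x08])   # ARYTAB = $081E

start = read_pointer(memory, "C64", "VARTAB")
end = read_pointer(memory, "C64", "ARYTAB")
print(hex(start), hex(end), (end - start) // 7, "descriptors")
```

Dividing the distance between the two pointers by 7 gives the number of simple variables currently defined.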

With this information, all that is necessary is to write a routine that will find the descriptor of, let's say, A$, like this:

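Search memory from the location pointed to by VARTAB up to the one pointed to by ARYTAB, looking for the bytes 65 and 128 ($41 & $80). Since every descriptor is 7 bytes long, step through 7 bytes at a time and compare only the first two bytes of each one; that way a value stored inside some other variable cannot be mistaken for the name.
When these are found, the next 3 bytes contain the string length and the address, in LO/HI byte order, of the string stored in the A$ variable.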
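
Here is the search written out as a Python sketch against a simulated memory image, just to show the logic before you commit it to ML. The function find_string and the addresses used in the test image are assumptions for the example. Stepping by 7 bytes is my reading of the 7-byte descriptor size given earlier.

```python
def find_string(memory, vartab, arytab, name):
    """Return (length, address) of string variable `name` (no '$'), or None."""
    first = ord(name[0])
    # Second name byte is the character (or 0 if none) with bit 7 set for a string.
    second = (ord(name[1]) if len(name) > 1 else 0) | 0x80
    # Walk the table one 7-byte descriptor at a time.
    for d in range(vartab, arytab, 7):
        if memory[d] == first and memory[d + 1] == second:
            return memory[d + 2], memory[d + 3] | (memory[d + 4] << 8)
    return None


# Made-up image: numeric X at $0810, then A$ at $0817, table ends at $081E.
memory = bytearray(65536)
memory[0x0810:0x0817] = bytes([0x58, 0x00, 0x81, 0x00, 0x00, 0x00, 0x00])  # X = 1
memory[0x0817:0x081E] = bytes([0x41, 0x80, 5, 0x00, 0x90, 0x00, 0x00])     # A$
memory[0x9000:0x9005] = b"HELLO"

found = find_string(memory, 0x0810, 0x081E, "A")
if found:
    length, address = found
    print(length, hex(address), bytes(memory[address:address + length]))
```

In ML the same loop becomes a zero-page pointer that you add 7 to on each pass, comparing it against ARYTAB to know when to stop.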

INDIRECT Y addressing works well for this routine on the C-64. It works on the 128 too if you locate your code in BANK 1, since this is where the descriptors are stored on the C-128.

Of course, you can locate your code anywhere in memory, and use the KERNAL INDFET routine to get the descriptor.

Now you know where the string data is located and how long it is, so you can manipulate it to suit your needs. Be aware that you cannot make it any longer than it was originally defined from BASIC, or you'll clobber something else.
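
To illustrate the length limit, this Python sketch overwrites a string in place in a simulated memory image and refuses anything longer than the length byte allows. The name poke_string and the choice to pad the leftover space with blanks are my own assumptions, not something BASIC requires.

```python
def poke_string(memory, descriptor_addr, new_text: bytes):
    """Overwrite a string's data in place without growing it."""
    length = memory[descriptor_addr + 2]
    address = memory[descriptor_addr + 3] | (memory[descriptor_addr + 4] << 8)
    if len(new_text) > length:
        raise ValueError("new text is longer than the string BASIC defined")
    # Copy the new characters over the old data.
    memory[address:address + len(new_text)] = new_text
    # Blank out the unused tail so old characters do not show through.
    pad = length - len(new_text)
    memory[address + len(new_text):address + length] = b" " * pad


# Made-up image: A$ is 10 spaces stored at $9000, descriptor at $0810.
memory = bytearray(65536)
memory[0x0810:0x0817] = bytes([0x41, 0x80, 10, 0x00, 0x90, 0x00, 0x00])
memory[0x9000:0x900A] = b" " * 10

poke_string(memory, 0x0810, b"HI THERE")
print(bytes(memory[0x9000:0x900A]))
```

The descriptor itself is left alone, so BASIC still sees a 10-character string afterwards.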

If you plan to manipulate the string from ML for future use in BASIC, you should make sure you initially define it long enough to suit. Any length up to 255 is permitted. Something like this can be used:

     FORI=1TO255:A$=A$+" ":NEXT


BigDumbDinosaur

You could also take a look at $7AAF in the BASIC ROM.  This subroutine searches RAM1 for a given variable name.
x86?  We ain't got no x86.  We don't need no stinking x86!