Tips - Writing fast Commodore 64 Basic code

Basic memory layout and garbage collection

In the default memory layout of the C64, Basic memory is allocated at address 2049 to 40959. When the computer starts, this is reported as 38911 Basic bytes free. When you load or enter a Basic program, or even start executing Basic commands at the prompt, Basic will start to use this memory area for several different purposes. It is divided into five different regions, each of which can be zero bytes in size. They are always stored in this order:

Basic program storage (starts at 2049)
Normal variable storage
Array variable storage
Free memory
String heap (ends at 40959)

The string heap is located at the top of Basic memory. Whenever you create a new string value, it is added to the heap. If a string variable is assigned this value, the two-byte pointer associated with the string variable will now point to this address in the string heap. If the string variable already had a value, the old value will just be left on the heap as an orphan (nothing points to it). In this way, the string heap grows downwards until there is no more free memory. At this point, a garbage collection is triggered, which will usually shrink the string heap in size, so program execution can continue. If not enough memory can be freed up, the Basic interpreter will print an out of memory error and stop.

A garbage collection is quite disruptive as it will cause a pause in program execution. Depending on the program, this pause may last anywhere from a few tenths of a second up to a minute or more. If you want to avoid this during a certain part of your program, you can force a garbage collection event just before that part, by using the function FRE() in Basic. FRE() is used to calculate the amount of free memory, and to do this, Basic first does a garbage collection to minimize the string heap.

Also note that array variables are stored directly after normal variables. This means that whenever you create a normal variable, all array variables must be moved. For this reason, you want to start using all your normal variables first, then issue the DIM statements to setup your array variables.

Optimizations for speed

There are a number of things to keep in mind when writing code that needs to be fast.

Focus on the most important parts of the program

If you write a game, it's probably important that the inner game loop is as fast as possible. Printing a menu may not be as crucial. Spend your efforts where they matter.

Minimize the number of Basic lines

Write many Basic statements on the same line whenever possible, since it takes more time for the interpreter to go to the next line than to pass a ":" on the current line. Also, this will keep the number of program lines down, which will make jumps (GOTO etc) faster.

Spaces slow the code down

Just make it a habit to never write spaces in code, at least when speed is a factor. It saves some time (albeit not very much), and it will allow you squeeze more code into a single program line.

REM statements slow the code down

It takes time for the interpreter to read through the REM statement or just pass the line which holds the REM statement. Also, if your program starts getting big, you will start making optimizations where you use up more memory to make something faster. If your program is filled with REM statements, you will have less free memory available for this kind of optimization. To still make the code readable, you can keep a text file on the side with comments.

Keep line numbers as short as possible

All statements which refer to a line number, such as GOTO, are slower the more digits there are in the line number. Even when the jump is not to be performed, it takes more time for the interpreter to read through the number if it is longer, just to get to the next statement.

Jumps are faster if there are fewer lines of code

Program code is stored as a singly linked list. This means that from each line of code in memory there is a pointer to the beginning of the next line of code, but there is no pointer to the previous line. And there is no index on the side saying where each line starts in memory. In essence, this means that the Basic interpreter can only search forward through the program code to find a certain line. If the line number to jump to is greater than the current line number, it will search forward from the line after the current one, otherwise it will search from the beginning of the program. It checks the first line. If the line number it's looking for is higher than this, it follows the pointer to where the next line starts, checks that line in the same way etc. This means that the time it takes to find a line increases with the number of lines preceding it in the program. For this reason, you want to put the code that has to be as fast as possible as close to the beginning of the program as possible. Unless, of course, no jumps are made within this code.

Use FOR-loops to emulate DO UNTIL-loops

When you find that you need to perform a certain piece of code over and over again, as long as a certain condition is met, you can use a FOR loop to avoid having to use GOTO to go back. This is usually faster, unless it's close to the beginning of the program. This is because GOTO to a previous line, or even the current line, has to search from the beginning of the program, but a FOR loop keeps a pointer to exactly where it should go back on each iteration. To use this technique, you type something like FOR C=-1 TO 0:[Perform your in-loop actions here]:C=(Condition to stay in loop):NEXT . I.e. to read through all the numbers stored in DATA statemens until you find 999, you can use: FOR C=-1 TO 0:READ X:C=(X<>999):NEXT . This will set C to -1 each time, until X is 999 - then C will be set to 0, and the loop will end.

Myth debunked: Forward jumps are faster if they pass a multiple of 256

When the Basic interpreter encounters a jump (due to a GOTO, GOSUB, IF THEN etc), it will check if the highbyte of the target line number is larger than the highbyte of the current line number. If it is, it will start searching forward from the current line until it finds the right line number. Otherwise, it will start searching from the beginning of the program. Because of this, "500 GOTO 520" is faster than "500 GOTO 510", since 2*256=512.

PRINT beats POKE

When you want to get more than a few characters onto the screen at once, try to use PRINT rather than POKE to do it, as it will usually be a lot faster. This is mainly because a single PRINT statement can print a string up to 255 characters in length, including cursor movements, HOME, REVERSE ON/OFF etc, effectively doing the work which you would otherwise use many FOR/NEXT and POKE statements for. The printing will be executed quite quickly by the Basic interpreter.

Use short variable names

Use one character variable names whenever possible, since it takes less time for the interpreter to read them in the code. If you need more than 26 variables, make sure the ones that are used most in the code which needs to be as fast as possible only use one character.

Use floating point variables, not integers

Basic can use integer variables, like "A%". Integer values will only take up two bytes, as opposed to the seven bytes used for floating point values. Also, integer arithmetic is a lot simpler and faster than floating point arithmetic. However, this last statement does not hold true for the C64. In order to make room for floating point arithmetics, integer arithmetics had to be left out of the Basic interpreter. So, in essence, if you write "A%=B%+C%" this means "Convert B% to a float. Convert C% to a float. Add them up using floating point arithmetic. Convert the result to an integer. Store it in A%.". Since storing integer values takes up so much less space, you may still choose to use it for some large arrays, but bear in mind that you do it at the cost of some speed.

Use . instead of 0

When writing a number, you can omit the digits before the dot, after the dot, or both. If you just write the dot, it will be interpreted as the value 0. Due to the details of the implementation of the Basic interpreter, parsing "." into a number is substantially faster than parsing "0" into that very same number. So, for example "A=." is faster than "A=0"

Skip the variable name in NEXT

NEXT is faster then NEXT I. Also, NEXT:NEXT is faster than NEXT J,I.

Initialize variables in order of importance

C64 Basic generally doesn't require you to declare or initialize variables at all. However, it is wise to do so if you want to maximize performance.

When a variable is first assigned a value in the program, the variable gets stored last in the normal variable storage area (or the array variable storage area, if it's an array variable). Whenever it is accessed after this, for setting or reading its value, the interpreter has to look through all the variables preceding it to find it. This, of course, is slower the more variables have been placed before it. Thus, create a section of your program which initializes all variables, and start with the ones which have to be the fastest. Initializing a variable is as simple as assigning it a value. If its initial value should be zero, or its initial value doesn't matter, use DIM to initalize it, like this: DIM A,B,C

Initialize all normal variables before all array variables

All array variables are stored in a separate piece of continuous memory, just after the normal variables. This means that whenever you start using a new normal variable, the entire array variable storage area has to moved up in memory. This takes time. So, first initialize all non-array variables and then all array-variables.

Use variables for constants

If you need to refer to a certain constant value in a context where speed is important, consider placing it in a variable. "POKE S+K,E" is faster then "POKE 1024+K,32". This is because the interpreter needs to read the digits and parse them into a floating point number.

Minimize arithmetic operations

Try to reduce the number of arithmetic operations, especially inside a loop that has to be fast. If you have nested loops, you may be able to perform parts of the calculations outside the innermost loop.

Prefer simpler artihmetic operations

"A=B+B" is almost always faster than "A=2*B" (However, it will need to look up the value of B twice, an operation which can be expensive if B was initialized late in the program). In some cases you may be able to use subtraction instead of division too.

Don't store intermediate values

While storing intermediate values in variables may look nicer and make the code more readable, it slows the code down quite a bit. I.e. A=1024+K:POKEA,64 is a lot slower than just POKE1024+K,64.

Chained IFs are faster than ANDs

If you need to check that two conditions are both met, the natural thing may be to do i.e. "IF A=B AND C=D THEN 100" but it's faster to chain IFs, i.e. "IF A=B THEN IF C=D THEN 100" and it has the same effect. Also make sure you put the condition that is the least likely to be met first. If A<>B in the above example, Basic won't have to bother with evaluating the second condition.

Use look-up tables instead of logic/arithmetic

If you need to make several expensive logic and/or arithmetic operations to calculate a value, it is often faster to place the possible results in a lookup table (an array variable or a number of consecutive bytes in memory)

Use address 214 and 211 to place the cursor

There is a fast way to place the cursor on a line and column of your choice. POKE the line number minus one into address 214, then print a newline character (just a simple PRINT without arguments will do the trick). Then POKE the column into address 211 and print the string you want. I.e. to print HELLO on line 10, column 20, you do: POKE214,9:PRINT:POKE211,20:PRINT"HELLO"

Place DATA statements early in the code

When you perform READ for the first time, or for the first time after a RESTORE, Basic has to search through your entire code to find the first DATA statement. If performance is important at this point, you want to place your DATA statements early on in the code. Of course, this will slow down jumps in the code to line numbers which come after the DATA statements, so sometimes you'll choose to take the performance hit for READ to get faster jumps.

Don't do Bubble Sort

Many programmers learn Bubble Sort, and then they never learn any other sort algorithms. Bubble Sort has the worst performance of all sort algorithms. If you ever need to sort anything, learn how to use some better options. For Basic, I recommend Insertion Sort and Shell Sort. Insertion Sort is about as easy to learn as Bubble Sort, but a lot better. If the list to be sorted is almost in order (like a highscore list where you've just added a new score at the end, which should now find its proper place in the list), Insertion Sort is perfect. For unsorted short lists, it's usually good enough. For sorting larger amounts of data which is probably not in order, use Shell Sort. It's like an advanced version of Insertion Sort, which also happens to be quite well suited to be implemented in Basic. I have created a disk image with implementations of BubbleSort, InsertionSort and QuickSort. Here are some average sort times (in seconds) I came up with using these implementations:

(Maybe) use an index array for sorting

Whenever you assign a new value to a string, or create a temporary string in your code (i.e by using MID$), the string value is added to the string heap. When the string heap is full, a garbage collection occurs. If you have a lot of string variables, this will take a lot of time and probably be quite disturbing. One thing you can do to avoid this is to use an index array when you need to sort strings - you keep all the strings in their original order in the string array, but create an integer array on the side to hold the sorted order of the strings. This means that instead of lettng two string values trade places in an array, as you typically do in a sort algorithm, you let two numbers trade places. I.e. instead of T$=S$(I):S$(I)=S$(J):S$(J)=T$ you do T%=S%(I):S%(I)=S%(J):S%(J)=T% . This is often slower than sorting without an index array, but it does make garbage collections a lot more rare.

Think about which code path has to be the fastest

When optimizing a program, it's common to get to points where the code can perform a number of different tasks, depending on the current state of variables etc. In these cases, there is always many ways of writing the code. Make sure you think about how important different code paths are. I.e. let's say you need to write code to decide between going through path A, B or C. If the code will be going through path A nine times out of ten, you probably want to make sure path A is as fast as possible, even if it means sacrificing a bit of speed in path B and/or C.

Test your alternatives

Whenever in doubt about which way is faster, you can usually decide by writing a separate test program which does the same operation 1000 times and print how long it took. In its simplest form it can look like this: 'TI$="000000":A=.:B=3:FORI=1TO1000:A=B+B:NEXT:PRINTTI' . This will print out the number of jiffies needed to perform A=B+B 1000 times. This can then be run again but replacing B+B with 2*B to see how that performs.

Fredrik Ramsberg, 2019. microheaven.com