Both stacks in the 6809 build downwards and pop upwards. At any time the stack pointer points at the location above that into which the next byte will be pushed. So, the push sequence is; decrement the stack pointer, then enter the byte where the pointer now points. The pull (or pop) sequence is the direct opposite; the byte is read from the location pointed at by the pointer, and then the pointer is incremented,
e.g. the following are equivalent in function:
STA ,-S => PSHS A LDA ,S+ => PULS AAny information on the stack may be accessed using the indexed addressing mode. The top of the stack may always be looked at using the operand S (or U). Other bytes (or words) on the stack may be accessed using an offset from S or U. This offset may be a constant or be held in a register, A, B or D (a sixteen bit register pair),
LDA 4,S ; 5th byte down on stack LDA B,U ; U offset by BThe stack pointers themselves may be manipulated using the LEAS and LEAU instructions, which take only the indexed addressing mode. Either stack pointer may be incremented or decremented using these instructions:
LEAS ,S+ ; Increments S LEAU 4,U ; Adds 4 to UMany operations may be done directly on data stored on the stack, using the indexed mode, with U or S as the base register (after the comma). To write re-entrant code which uses locations in memory to store temporary variables, it is necessary to allocate this space on the stack to ensure that each invocation of the routine gets a new area of memory for its variables. Variables whose space is allocated way are called "automatic" variables because the actual locations used are automatically allocated each time the routine is called.
The stack may also be used to pass parameters to a routine and to receive the results. The calling routine must allocate space on the stack for the parameters and then call the subroutine. The subroutine then picks up the parameters from the stack, processes them and returns. The calling routine must then fiddle the stack, using LEAS, to restore the stack to its original state. If results are required, the calling routine must also allocate space on the stack for them, and deallocate the space when the subroutine has returned, as for the parameters. The locations for the parameters and results are then reached via a constant offset from S.
Number to be converted: $3F5A Divide by ten: $655 remainder 8 Divide by ten again: $A2 remainder 1 Divide by ten yet again: $1O remainder 2 Divide by ten one more time: $1 remainder 6 Positively the last time: $O remainder 1 So $3F5A, when converted to decimal is 16218The major drawback to this method is that the digits are not generated in a convenient order. We would like them to be generated most significant digit first, so that we can print them as we produce them, but this method yields the least significant digit first. So we need to reverse the order of the digits before printing them. Needless to say there are many different ways of achieving this end; we could store the digits as a list in memory and then print them out in reverse order, for example.
Another way would be to push the digits onto the stack as we generate them and then pop them off for printing when all the digits have been produced. Both these methods require that we maintain a digit counter to keep track of the number of digits we have generated, and the first method also requires a reserved area of memory for the use of the conversion routine. A third method will be explained below.
The routine first divides the number by ten and stacks the resulting digit. It then calls itself to print the more significant digits beiore unstacking and printing its own digit. For instance, let us print the decimal number 145 (in binary within the 6809 of course). The routine first divides by ten to yield 14 and a remainder of 5, which it stacks. It then calls itself, passing 14 as the number. The cailed routine divides 14 by ten to yield 1 and a remainder of 4, which it stacks. The third calling of the routine divides 1 by ten to get 0 and a remainder of 1. As the number is now zero, it prints the 1 and returns to its cailer. The caller pops 4, which it stacked previously and prints it before returning to its caller, which pops and prints 5. The first called routine then returns to the routine which originally called it, the task having been accomplished.
; Recursive Binary to Decimal Conversion routine. ; The number to be converted is passed in X. ; X returns as zero and A is destroyed. ; The existence of 'divten' and 'outdigit' is assumed ; but the routines are not given here. 'divten' ; divides X by ten, returning the result in X ; and the remainder in A. 'outdigit' outputs the digit ; in A to the terminal or whatever. bintodec: jsr divten ; divide X by 10 pshs a ; save remainder cmpx #0 ; finished? beq print ; if so, print digits jsr bintodec ; not done, get next digit print: puls a ; retrieve digit jsr outdigit ; print it rts ; and returnSome features of this program are worth noting:
When you start writing real programs, particularly in Machine/Assembly Code you soon discover the need for subroutines. These are sections of code which you want to use over and over again in much the same way that you use procedures in high level languages.
For example in the lab task to write software for the Pelican Crossing you need a delay routine as written in lab session 1
Set traffic lights to AMBER Delay for 10 secs Set traffic lights to RED Delay for 2 secs Set CROSS lights on and beeper on Delay for 30 secs etc.Obviously it would be wasteful in terms of programming time and memory to write out the delay routine each time it is required - ideally we want to write it out once and call it up as needed. This is exactly what we mean by a subroutine.
Jumping to the start of a subroutine is no problem at all - we have seen how branch and jump instructions can be implemented. The problem comes at the end of the subroutine when control has to return to the section of program from which it came.
Set traffic lights to AMBER Call DELAY subroutine Set traffic lights to RED Call DELAY subroutine etc.This means that when a subroutine is called the processor must somehow store the address to which control should return - in other words the address of the instruction which follows the call subroutine instruction. There are many ways in which this problem has been tackled.
1000 JSR $1200 000 0000 1003 Next Inst 0000 : 0000 : 1200 Start of SR1 001 1003 : 0000 : 0000 122A JSR $13A0 122D Next Inst 001 1003 : 0000 1235 RTS 0000 13A0 Start of SR2 002 1003 : 122D : 0000 13D8 RTSThe disadvantage of this approach is the that no matter how many registers are provided people will always want more and it adds to the processor complexity. The advantage is that it is fast since all transfers are internal to the processor. (A number of Intel processors such as 8048 have used this approach).
Since addresses are 16 bits and memory locations in 6809 can only store 8 bits two locations are need to store each return address. The exact sequence of events when a JSR is executed are. E.g. If SP is initialised to $2000
(SP)-1 -> SP $2000-1 = $1FFF -> SP (PCL) -> (SP) $03 -> $1FFF (SP)-1 -> SP $1FFF-1 = $1FFE -> SP (PCH) -> (SP) $10 -> $1FFEWhen an RTS is executed the reverse process takes place
((SP)) -> PCH ($1FFE) = $10 -> PCH (SP)+l -> SP $1FFE + 1 = $1FFF -> SP ((SP)) -> PCL ($1FFF) = $03 -> PCL (SP)+l -> SP $1FFF + 1 = $2000 -> SPUsing this technique the user can allocate as much space as is necessary for the return address stack. Notice, though that there are no set boundaries so that if you had:-
$1000 JSR $1000The whole of memory would be quickly used up by the stack. This is an obvious situation which can be easily avoided but there are more subtle ones which can have the same effect.
There is no standardisation between processors as to the order in which the H and L bytes of the PC are stored. Nor is the pre-decrement, post increment for JSR and RTS fixed.
Nevertheless this is the method of handling return addresses which is used in all 8 & 16 bit processors these days. Its only disadvantage is the length of time it takes to store the address bytes in external memory,
One solution is for the subroutine to store away the register contents in set memory locations before using them and to retrieve these values before returning. The problem is where should these values be stored? Ideally we are writing subroutines which can be used in many different programs or we may even be using other peoples or commercially available routines. If set memory locations are used they may clash with program or data areas in another application.
A better solution is to use the stack which the user always sets to be in an unwanted area of memory. At the start of the routine the contents of any registers which will be changed is stored on the stack and they are retrieved before the RTS is executed. Putting data onto the stack is known as Pushing data and retrieving it is known as Pulling. The 6809 has instructions for doing this known as Push (PSH) and Pull (PUL) - the codes for these are 34 and 35. Each of these has an additional instruction byte which indicates which registers are to be stored or retrieved as follows:-
LSBit 0 - Condition Code Register 1 - Accumulator A 2 - Accumulator B 3 - Direct Page Register 4 - Index Register X 5 - Index Register Y 6 - Stack Pointer MSBit 7 - Program CounterNotice that the first four are 8 bit registers requiring only one byte to be stored and that the last four are 16 bit registers requiring two bytes to be stored.
It is rarely necessary to store the PC and SP on the stack and obviously the more registers stored the longer the instruction takes (It's 5 cycles + 1 for each byte - 17 max).
34 Pushes all registers 34 Pushes A and X FF 12Clearly the number of bytes pushed at the start of the subroutine must equal the number of bytes pulled at the end otherwise the return address will not be correct and the program will crash. To avoid this happening the 6809 provides two stack pointers U and S, the User and System stack pointers. The JSR and RTS always use the S pointer but the Push and Pull instructions can use either. If the User stack is always used then there is little chance of the return address being corrupted. The codes given are for the system stack but the user stack works in the same way.