DECSYSTEM-20 Assembly Language Guide

Edited by:
Frank da Cruz
Chris Ryland

Columbia University Center for Computing Activities
New York, New York 10027

3 July 1980


Preface

This document is intended to be a comprehensive introduction to assembly language programming on the DECSYSTEM-20. It consists of excerpts from various DEC manuals and other documents, with the addition of programming examples and some original material. Appropriate credit is given in each chapter or section in which the material is not original.

Chapter 8 attempts to present a programming standard for Macro programs; in a sense it is the most important chapter because unless a program is clear and understandable, it will not be adaptable to new circumstances, and its usefulness and lifetime will be limited.

This is a draft. Various sections still need to be filled in or refined, and more material added. This will be done from time to time. Comments are welcome.


1. Introduction

Assembly language is a tool for writing computer programs consisting, at the source (program text) level, of actual machine instructions. It is sometimes desirable or necessary to use assembly language for two reasons:

  1. Only in assembly language can you write a program that can take advantage of all the features of a given machine. Higher-level languages purposely conceal the machine from programmers so that programs may be transported from one machine to another.

  2. You have maximum control over every aspect of the operation of your program, especially storage allocation and efficiency.

In order to use an assembler, you must first be familiar with the machine's instruction set. This is described in Chapter 2. You will notice that this chapter actually describes three different machines: the KA10, the KI10, and the KL10. You should be aware that DEC-20's are KL10's (2020's are KS10's, but these are identical to KL10's for all practical purposes).

There are several assemblers suitable for use on the DEC-20. These include Macro-20 (the standard DEC assembler), Midas (an alternative from MIT), and Fail (a fast 1-pass block structured assembler from Stanford). Only Macro-20 is supported by DEC, but the other two have certain distinct advantages. Macro-20 is described in Chapter 3.

Another thing that assembly-language programmers need to know about is monitor calls. On a timesharing system, such as the DECSYSTEM-20, there are many things that you cannot do even in assembly language, such as issue input/output instructions; only the monitor can do such things. You can ask the monitor to perform services for you by issuing a monitor call, which amounts to calling a subroutine in the monitor. A DEC-20 monitor call is called a 'JSYS' (Jump to SYStem). Information about DEC-20 monitor calls is given in Chapters 4 and 5.

There are various aids available to assembly language programmers; these include libraries of helpful macros and routines (Chapters 6 and 7), and interactive, symbolic debugging facilities (Chapter 10). In addition, there are chapters on how to write, run, and debug programs on the DECSYSTEM-20 (Chapter 9), and some sample programs (Chapter 11). Finally, as pointed out in the Preface, there is a very important chapter on programming style and standards (Chapter 8).

The major intent of this guide is to provide a consolidated resource for those who wish to write assembly language programs, not assembly language subroutines to be called from higher-level languages; such subroutines can only be written with detailed knowledge of the calling conventions and internal data representations of the given language.

The reader should have some knowledge of programming and some familiarity with DECSYSTEM-20 commands and procedures.


1.1. Basic Concepts

Before we can proceed with descriptions of the instruction set, assembler, and monitor calls, some terminology and a few basic concepts of machine organization must be introduced.

1.1.1. Terminology

Timesharing
    One way of running a computer. Timesharing allows many users to use the computer at once, seated at terminals, and to converse via the terminal with various programs on the computer.

DECSYSTEM-20
    A timesharing computer system, the subject of this manual.

DEC
    Digital Equipment Corporation. The manufacturer of the DECSYSTEM-20, and of its predecessors, the PDP-10 and PDP-6.

Operating System
    A program, or set of programs, that controls the operation of the computer. On a timesharing computer, the operating system functions include scheduling among users, allocating resources to users, controlling devices, and performing various other services for users that could not be done by the users themselves.

Tops-20
    Timesharing OPerating System-20. This is the name of the DECSYSTEM-20's operating system.

Monitor
    One component of Tops-20. It is a program that is always running, and that performs most operating system functions.

Exec
    Another component of Tops-20. It is the program by which users communicate their desires to the monitor, and through which the monitor communicates to the user.

1.1.2. Machine Organization

The computer consists of various entities. You must be aware of what some of them are in order to program in assembly language:

Memory
    This is a device that allows the computer to store and retrieve a limited amount of data very quickly. It is not permanent storage. It is often referred to as "core" memory (it is sometimes made from magnetic cores), to differentiate it from registers, disks, and other kinds of memory. Whether it's made from cores or not, it is solid-state memory, i.e. it has no moving parts.

Register
    (Also called an Accumulator) This is a special kind of solid-state memory that is much faster than ordinary memory, and that allows operations such as arithmetic to be performed on data that is stored there. The DECSYSTEM-20 has 16 registers.

Disk
    A kind of mechanical memory that is much slower than registers or core memory, but which allows long-term storage of data in files whose names are kept in directories. Disks can hold much more data than core memory.

CPU
    Central Processing Unit. This is the part of the computer that does most of the work, and where the major "intelligence" is to be found. It consists of the registers and the logic to move data to and from the registers, as well as logic to operate on the data in the registers (e.g. to do arithmetic).

Instruction
    An instruction is a code to activate some function of the CPU. Typically, it specifies what operation to perform, what data to operate on, and where to put the result. Typical operations include arithmetic (add, subtract, etc.), transfer of control, comparison of numbers, etc. Assembly language programs consist of a sequence of instructions. When the program is being executed, the instructions are kept in memory.

Address
    An address is a number that expresses the location of a quantity (instruction or data) in memory. Used as a verb, "address" means to "refer to".

Bit
    (BInary Digit) The smallest unit of storage in memory. A bit is a quantity whose value can be 0 or 1.

Word
    The major unit of storage in memory. In the DECSYSTEM-20, each word consists of 36 bits, and has its own address. You can address 262144 words of memory on the DECSYSTEM-20.

Byte
    An intermediate unit of storage; any sequence of bits within a word. A byte is often the unit of storage for a character.

Input/Output
    This is the act of transferring data between memory and some device, typically a disk or a terminal.

1.1.3. Instructions and Addressing Modes

[ this is mostly taken care of in Ch. 2 ]

1.1.4. Internal Representation of Numbers

1.1.4.1. Binary Numbers

[ to be filled in ]

1.1.4.2. Two's Complement Representation

[ to be filled in ]

1.1.4.3. Integers

[ to be filled in ]

1.1.4.4. Floating Point Numbers

[ to be filled in ]

1.1.5. Arithmetic

[ to be filled in ]

1.1.5.1. Integer Arithmetic

[ to be filled in ]

1.1.5.2. Floating Point Arithmetic

[ to be filled in ]

1.1.6. Logical Operations

[ to be filled in ]

1.1.7. Character String Manipulation

[ to be filled in ]

1.1.8. Elementary Data Structures

1.1.8.1. Tables (Arrays) and Indexing

[ to be filled in ]

1.1.8.2. Stacks

[ to be filled in ]


2. The PDP-10/DECSYSTEM-20 Instruction Set

2.1. Introduction

This chapter was written by Ralph E. Gorin at Stanford University and modified slightly at Columbia.

The PDP-10 is a general purpose stored program computer. There are four different processors (computers) in the PDP-10 family: the PDP-6, the KA10, the KI10 and the KL10. The newest of these is the KL10, which is the central processor in various DECsystem-10 and DECSYSTEM-20 configurations. (The KS10, found only in the DECSYSTEM-2020, is nearly identical to the KL10.) In general, we shall discuss the KL10 processor.

There are three principal aspects of assembly language programming: the machine instructions, the assembler, and the operating system.

The machine instructions are the primitive operations with which we write programs. Learning the instruction set means learning what operations are performed by each instruction. Programming is the art or science of combining these operations to accomplish some particular task.

The machine instructions, like everything else in a computer, are in binary. The assembler is a program that translates the mnemonic names by which we refer to instructions into the binary form that the computer recognizes. The assembler also does a variety of other chores that are essentially bookkeeping.

The operating system, or "monitor", is a special program that handles all input and output and which schedules among user programs.
For its own protection and the protection of other users the operating system places various restrictions on user programs. User mode programs are restricted to memory assigned to them by the operating system; they may not perform any machine input-output instructions, nor can they perform several other restricted operations (e.g., the HALT instruction). To facilitate user input-output and core allocation the operating system provides various monitor calls (or JSYS operations) by which a user program can communicate its wishes to the system. Essentially all programs except the operating system itself are run as user mode programs. Editors, assemblers, compilers, utilities, and programs that you write yourself are all user mode programs.

The PDP-10 is a word oriented machine. Words contain 36 data bits, numbered (left to right) 0 to 35. Every machine instruction is one word.

There are two formats for machine instructions. Most instructions have the format:

Bit       000000000 0111 1 1111 112222222222333333
Position  012345678 9012 3 4567 890123456789012345
          ________________________________________
         |         |    | |    |                  |
         |   OP    | AC |I| X  |        Y         |
         |_________|____|_|____|__________________|

In the diagram the field names are:

  - OP = operation code
  - AC = accumulator field
  - I = indirect bit
  - X = index field
  - Y = address field

Some example instructions are:

        move    1, @100         ; MOVE is the OP.  AC is 1.
                                ; @ sets the I bit.
                                ; X is zero, Y is 100.

        hrrz    17, 1(3)        ; HRRZ is the OP.  AC is 17,
                                ; Y = 1, X = 3, I = 0

        sos     foo             ; SOS is OP, FOO is symbolic
                                ; for the Y field.  AC, X, I
                                ; are 0.

All instructions without exception calculate an "effective address". The effective address may itself be used as data or it may be used to address the data or result word for a particular instruction.

The effective address computation is described by the following program. MA means memory address. PC means program counter. C(MA) means contents of the word addressed by MA.
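
For readers who would like an executable model, the field layout above and the effective address program given next can be sketched in Python. This is only an illustration, not part of the original chapter: the function names, the dictionary used as memory, and the use of opcode values 200 (MOVE) and 550 (HRRZ) in the usage lines are assumptions made for the example. Because the accumulators occupy addresses 0 to 17, C(X) is modelled simply as memory[X].

```python
# Sketch: decode a 36-bit PDP-10 instruction word and compute its
# effective address.  Bits are numbered 0 (leftmost) to 35.
# `memory` is a hypothetical dict of address -> 36-bit word; addresses
# 0-17 (octal) double as the accumulators, so C(X) is memory[X].

MASK18 = 0o777777

def decode(word):
    """Split an instruction word into its OP, AC, I, X and Y fields."""
    return {
        "op": (word >> 27) & 0o777,   # bits 0:8
        "ac": (word >> 23) & 0o17,    # bits 9:12
        "i":  (word >> 22) & 1,       # bit 13
        "x":  (word >> 18) & 0o17,    # bits 14:17
        "y":  word & MASK18,          # bits 18:35
    }

def encode(op, ac=0, i=0, x=0, y=0):
    """Inverse of decode(), used here only to build example words."""
    return (op << 27) | (ac << 23) | (i << 22) | (x << 18) | (y & MASK18)

def effective_address(word, memory):
    """Repeat the I/X/Y step until a word with I = 0 is reached."""
    while True:
        f = decode(word)
        e = f["y"]
        if f["x"] != 0:
            # Index: add C(X) and keep only bits 18:35 of the sum.
            e = (e + memory.get(f["x"], 0)) & MASK18
        if f["i"] == 0:
            return e
        # Indirect: fetch the word at E and take I, X, Y from it.
        word = memory.get(e, 0)

MOVE, HRRZ = 0o200, 0o550
memory = {3: 0o500, 0o100: encode(0, y=0o200)}
print(oct(effective_address(encode(HRRZ, ac=0o17, x=3, y=1), memory)))   # 0o501
print(oct(effective_address(encode(MOVE, ac=1, i=1, y=0o100), memory)))  # 0o200
```

The two usage lines correspond to the examples "hrrz 17, 1(3)" with register 3 holding 500, and "move 1, @100" with location 100 holding the address 200. As in the hardware, an indirect chain that never reaches a word with a zero I bit would loop forever.
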

Effective Address Calculation:

IFETCH: MA := PC;
        OP := Bits  0:8  of C(MA);
        AC := Bits  9:12 of C(MA);
EACOMP: I  := Bit  13    of C(MA);
        X  := Bits 14:17 of C(MA);
        Y  := Bits 18:35 of C(MA);
        E  := Y;
        IF X <> 0 then E := Bits 18:35 of E+C(X);
        IF I = 0 then go to DONE;
        MA := E;
        go to EACOMP;
DONE:

The effective address is an 18 bit quantity. If the I and X fields of the instruction are zero then the effective address is simply the address (Y) field of the instruction.

If X isn't zero, then the contents of the word addressed by X (i.e., the contents of a register serving as the index register for this instruction) are added to the contents of the Y field (the sum is truncated to 18 bits). This sum serves as the effective address, unless the indirect (I) bit is set.

If the I bit is set, a word is read from the address specified by X and Y, and the I, X, and Y fields of that new word are used for repeating the effective address calculation. Note that this calculation will loop until a word is read in which the I field is zero.

The result of the effective address calculation may be thought of as an instruction word where bits 0:12 are copied from the original instruction, bits 13:17 are zero, and bits 18:35 contain the effective address.

In programming the PDP-10 it is convenient to imagine that your program occupies contiguous virtual memory locations from 0 to some maximum address. All memory locations are equivalent for most purposes (but some operating systems reserve some of your space for their own purposes).

Sixteen memory locations (addresses 0 to 17 - note that addresses will appear in octal) are distinguished by their use as general purpose registers (also called accumulators or index registers). Most PDP-10 instructions address one memory operand and one accumulator (so-called "one and a half address" architecture). This means that nearly all instructions affect some accumulator.
These registers are actually implemented in high speed solid state memory rather than in slower main memory. For any purpose where it is convenient to do so, a user may reference an accumulator as memory.

Instruction classes are formed by a mnemonic class name and one or more modifier letters. The modifiers usually signify some transformation on the data or the direction of data movement or the skip or jump condition. Some functional duplications and some no-ops result from this scheme. However, despite these drawbacks, this notion of instruction classes and modifiers makes the instruction set easy to learn.

2.2. Full Word Instructions

These are the instructions whose basic purpose is to move one or more full words of data from one location to another, usually from an accumulator to a memory location or vice versa. In some cases, minor arithmetic operations are performed, such as taking the magnitude or negative of a word.

2.2.1. MOVE

The MOVE class of instructions performs full word data transmission between an accumulator and a memory location. There are sixteen instructions in the MOVE class. All mnemonics begin with MOV. The first modifier specifies a data transformation operation; the second modifier specifies the source of data and the destination of the result.

        |E no modification   |  from memory to AC
MOV     |N negate source     |I Immediate.  Source is 0,,E to AC
        |M magnitude         |M from AC to memory
        |S swap source       |S to self.  If AC<>0 to AC also

C(E) signifies the contents of E (the effective address) prior to the execution of the instruction. C(AC) signifies the contents of the AC specified. CS(E) and CS(AC) signify the contents of E or AC with left and right halves swapped. CR(AC) and CL(AC) signify the 18 bit right and left contents of the AC. PC signifies the 18 bit contents of the program counter.
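
The two-modifier scheme just described can be modelled in a few lines of Python. The following is a sketch under stated assumptions rather than a description of the hardware: words are unsigned 36-bit integers, `mem` is a hypothetical dictionary in which addresses 0 to 17 (octal) stand for the accumulators, the helper names are invented for the example, and the overflow flags are not modelled.

```python
# Sketch of the MOVE-class modifier scheme.  Words are 36-bit unsigned
# Python ints; `mem` is a hypothetical dict of address -> word in which
# addresses 0-17 (octal) are the accumulators.  Flags are not modelled.

MASK36 = (1 << 36) - 1
MASK18 = 0o777777
SIGN = 1 << 35

def swap(w):
    """CS(w): exchange the left and right halves."""
    return ((w & MASK18) << 18) | (w >> 18)

def negate(w):
    """Two's complement negation within 36 bits."""
    return (-w) & MASK36

def magnitude(w):
    """Absolute value; -2^35 has no positive counterpart and is unchanged."""
    return negate(w) if w & SIGN else w

# First modifier letter: the transformation applied to the source.
TRANSFORM = {"E": lambda w: w, "N": negate, "M": magnitude, "S": swap}

def move_class(mnemonic, ac, e, mem):
    """Model MOVx with mode '', I, M or S; return the word that was stored."""
    xform = TRANSFORM[mnemonic[3]]
    mode = mnemonic[4:]
    if mode == "I":
        source = e & MASK18          # immediate: the word 0,,E
    elif mode == "M":
        source = mem.get(ac, 0)      # from AC to memory
    else:
        source = mem.get(e, 0)       # from memory (modes '' and S)
    result = xform(source)
    if mode in ("", "I"):
        mem[ac] = result             # result to AC
    else:
        mem[e] = result              # result to memory
        if mode == "S" and ac != 0:
            mem[ac] = result         # "to self": also to AC if AC<>0
    return result

mem = {0o100: 0o000001_000002}
move_class("MOVS", 1, 0o100, mem)    # AC 1 := 2,,1
move_class("MOVNI", 2, 5, mem)       # AC 2 := -5
move_class("MOVSI", 3, 0o123, mem)   # AC 3 := 123,,0
```

Treating the mnemonic as "MOV" plus one transformation letter plus an optional mode letter is exactly why sixteen instructions can be learned as four transformations times four modes.
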

MOVE    C(AC) := C(E)
MOVEI   C(AC) := 0,,E
MOVEM   C(E)  := C(AC)
MOVES   C(E)  := C(E); if AC<>0 then C(AC) := C(E)

MOVN    C(AC) := -C(E)
MOVNI   C(AC) := -E
MOVNM   C(E)  := -C(AC)
MOVNS   C(E)  := -C(E); if AC<>0 then C(AC) := -C(E)

MOVM    C(AC) := |C(E)|
MOVMI   C(AC) := 0,,E
MOVMM   C(E)  := |C(AC)|
MOVMS   C(E)  := |C(E)|; if AC<>0 then C(AC) := |C(E)|

MOVS    C(AC) := CS(E)
MOVSI   C(AC) := E,,0
MOVSM   C(E)  := CS(AC)
MOVSS   C(E)  := CS(E); if AC<>0 then C(AC) := CS(E)

2.2.2. EXCH - Exchange

EXCH exchanges the contents of the selected AC with the contents of the effective address.

EXCH    C(AC) :=: C(E)

2.2.3. BLT - Block Transfer

The BLT instruction copies words from memory to memory. The left half of the selected AC specifies the first source address. The right half of the AC specifies the first destination address. The effective address specifies the last destination address. Words are copied, one by one, from the source to the destination, until a word is stored in an address greater than or equal to the effective address of the BLT.

Caution: BLT clobbers the specified AC. Don't use the BLT AC in the address calculation for the BLT; the results will be random. If source and destination overlap, remember that BLT moves the lowest source word first. If the destination of the BLT includes the BLT AC, then the BLT AC had better be the last destination address.

2.2.4. Programming Examples Using Fullword Instructions

In these examples, several standard PDP-10 assembly language notations are used:

[]
    Square brackets enclose a "literal". The contents of the brackets are assembled in another place, and the bracketed expression is replaced by the address of that place.

<>
    Angle brackets enclose an expression.

,,
    Separates left- and right-half quantities in a word. In a,,b, a is right-adjusted in bits 0:17, and b is right-adjusted in bits 18:35. If either quantity is too big, it is truncated on the left.

()
    Parentheses enclose an expression which denotes an index register.

.
    (pronounced "dot") Denotes the current location.

; Save all the accumulators:
        movem   17, savac+17
        movei   17, savac               ; Source is 0, destination
        blt     17, savac+16            ;  is SAVAC.

; Restore all the accumulators:
        movsi   17, savac               ; Source is SAVAC,
        blt     17, 17                  ;  destination is 0.

; Zero 100 words starting at TABLE.
        setzm   table
        move    t1, [table,,table+1]    ; Source and
        blt     t1, table+77            ;  destination overlap

; Move 77 words from TABLE thru TABLE+76 to TABLE+1 thru
; TABLE+77: BLT can't be done here because the source and
; destination overlap.  (See the description of POP.)
        move    t1, [400076,,table+76]
        pop     t1, 1(t1)               ; Store TABLE+76 into
        jumpl   t1, .-1                 ;  TABLE+77, etc.

2.3. Stack Instructions

These two instructions insert and remove full words in a pushdown list. The address of the top of the list is kept in the right half of the AC referenced by these instructions. The program may keep a control count in the left half of the AC. There are also two subroutine calling instructions (PUSHJ and POPJ) that use this same format pushdown list.

2.3.1. PUSH - Push on Stack

PUSH    C(AC) := C(AC)+<1,,1>; C(CR(AC)) := C(E)

The specified accumulator is incremented by adding 1 to each half (in the KI10 and KL10, carry out of the right half is suppressed). If, as a result of the addition, the left half of the AC becomes positive, a pushdown overflow condition results (but the instruction proceeds to completion). The word addressed by the effective address is fetched and stored on the top of the stack, which is addressed by the right half of the (incremented) accumulator.

2.3.2. POP - Pop Stack

POP     C(E) := C(CR(AC)); C(AC) := C(AC)-<1,,1>

POP undoes PUSH as follows: the word at the top of the stack (addressed by the right half of the selected AC) is fetched and stored at the effective address. Then the AC is decremented by subtracting 1 from both halves (in the KI10 and KL10, carry out of bit 18 is suppressed).
If the AC becomes negative as a result of the subtraction, a pushdown overflow results.

Often the accumulator used as the pushdown pointer is given the symbolic name P. To initialize a pushdown pointer (e.g., for N words starting at PDLIST), one might do the following:

        move    p, [iowd n, pdList]

where the IOWD pseudo-op assembles -N,,PDLIST-1. Elsewhere in the program should appear:

pdList: block   n

which defines the symbolic label PDLIST and reserves N words following it for the stack.

2.3.3. ADJSP - Adjust Stack Pointer

ADJSP   CL(AC) := CL(AC)+E; CR(AC) := CR(AC)+E

E is added algebraically, with bit 18 acting as the sign bit, to both halves of AC. If a negative E changes the count in the left half of AC from positive or zero to negative, or a positive E changes the count from negative to positive or zero, trap 2 is set.

2.4. Halfword Instructions

The halfword class of instructions performs data transmission between one half of an accumulator and one half of a memory location. There are sixty-four halfword instructions. Each mnemonic begins with H and has four modifiers. The first modifier specifies which half of the source word; the second specifies which half of the destination. The third modifier specifies what to do to the other half of the destination. The fourth modifier specifies the source of data and the destination of the result.

H halfword from |R right of source
                |L left
                        |R right of destination
                        |L left
                                |  no modification of other half
                                |Z zero other half
                                |O set other half to ones
                                |E sign extend source to other half
                                        |  from memory to AC
                                        |I Immediate
                                        |M from AC to memory
                                        |S to self.  If AC<>0 to AC also.

2.4.1. HR - Halfword Right

HRR     CR(AC) := CR(E)
HRRI    CR(AC) := E
HRRM    CR(E)  := CR(AC)
HRRS    CR(E)  := CR(E); if AC<>0 then CR(AC) := CR(E)

HRRZ    C(AC) := 0,,CR(E)
HRRZI   C(AC) := 0,,E
HRRZM   C(E)  := 0,,CR(AC)
HRRZS   C(E)  := 0,,CR(E); if AC<>0 then C(AC) := 0,,CR(E)

HRRO    C(AC) := 777777,,CR(E)
HRROI   C(AC) := 777777,,E
HRROM   C(E)  := 777777,,CR(AC)
HRROS   C(E)  := 777777,,CR(E); if AC<>0 then C(AC) := 777777,,CR(E)

HRRE    C(AC) := 777777*C18(E),,CR(E)
HRREI   C(AC) := 777777*E18,,E
HRREM   C(E)  := 777777*C18(AC),,CR(AC)
HRRES   C(E)  := 777777*C18(E),,CR(E);
        if AC<>0 then C(AC) := 777777*C18(E),,CR(E)

HRL     CL(AC) := CR(E)
HRLI    CL(AC) := E
HRLM    CL(E)  := CR(AC)
HRLS    CL(E)  := CR(E); if AC<>0 then CL(AC) := CR(E)

HRLZ    C(AC) := CR(E),,0
HRLZI   C(AC) := E,,0
HRLZM   C(E)  := CR(AC),,0
HRLZS   C(E)  := CR(E),,0; if AC<>0 then C(AC) := CR(E),,0

HRLO    C(AC) := CR(E),,777777
HRLOI   C(AC) := E,,777777
HRLOM   C(E)  := CR(AC),,777777
HRLOS   C(E)  := CR(E),,777777; if AC<>0 then C(AC) := CR(E),,777777

HRLE    C(AC) := CR(E),,777777*C18(E)
HRLEI   C(AC) := E,,777777*E18
HRLEM   C(E)  := CR(AC),,777777*C18(AC)
HRLES   C(E)  := CR(E),,777777*C18(E);
        if AC<>0 then C(AC) := CR(E),,777777*C18(E)

2.4.2. HL - Halfword Left

HLR     CR(AC) := CL(E)
HLRI    CR(AC) := 0
HLRM    CR(E)  := CL(AC)
HLRS    CR(E)  := CL(E); if AC<>0 then CR(AC) := CL(E)

HLRZ    C(AC) := 0,,CL(E)
HLRZI   C(AC) := 0
HLRZM   C(E)  := 0,,CL(AC)
HLRZS   C(E)  := 0,,CL(E); if AC<>0 then C(AC) := 0,,CL(E)

HLRO    C(AC) := 777777,,CL(E)
HLROI   C(AC) := 777777,,0
HLROM   C(E)  := 777777,,CL(AC)
HLROS   C(E)  := 777777,,CL(E); if AC<>0 then C(AC) := 777777,,CL(E)

HLRE    C(AC) := 777777*C0(E),,CL(E)
HLREI   C(AC) := 0
HLREM   C(E)  := 777777*C0(AC),,CL(AC)
HLRES   C(E)  := 777777*C0(E),,CL(E);
        if AC<>0 then C(AC) := 777777*C0(E),,CL(E)

HLL     CL(AC) := CL(E)
HLLI    CL(AC) := 0
HLLM    CL(E)  := CL(AC)
HLLS    CL(E)  := CL(E); if AC<>0 then CL(AC) := CL(E)

HLLZ    C(AC) := CL(E),,0
HLLZI   C(AC) := 0
HLLZM   C(E)  := CL(AC),,0
HLLZS   C(E)  := CL(E),,0; if AC<>0 then C(AC) := CL(E),,0

HLLO    C(AC) := CL(E),,777777
HLLOI   C(AC) := 0,,777777
HLLOM   C(E)  := CL(AC),,777777
HLLOS   C(E)  := CL(E),,777777; if AC<>0 then C(AC) := CL(E),,777777

HLLE    C(AC) := CL(E),,777777*C0(E)
HLLEI   C(AC) := 0
HLLEM   C(E)  := CL(AC),,777777*C0(AC)
HLLES   C(E)  := CL(E),,777777*C0(E);
        if AC<>0 then C(AC) := CL(E),,777777*C0(E)

2.5. Arithmetic Testing

2.5.1. AOBJ - Add One to Both Halves and Jump

The AOBJx (Add One to Both halves of AC and Jump) instructions allow forward indexing through an array while maintaining a control count in the left half of an accumulator. Use of AOBJN and AOBJP can reduce loop control to one instruction.

AOBJN   C(AC) := C(AC)+<1,,1>; if C(AC) < 0 then PC:=E
AOBJP   C(AC) := C(AC)+<1,,1>; if C(AC) >= 0 then PC:=E

Example. Add 3 to N words starting at location TAB:

        movsi   1, -N           ; Initialize register 1 to -N,,0.
        movei   2, 3            ; Register 2 gets the constant 3.
        addm    2, tab(1)       ; Add 3 to one array element.
        aobjn   1, .-1          ; Increment both the index and the
                                ; control.  Loop until the ADDM has
                                ; been done N times.

By the way, for the sake of consistency, AOBJN should have been called AOBJL and AOBJP should have been called AOBJGE. However, they weren't.

2.5.2. JUMP

The JUMP instructions compare the selected accumulator to zero and jump (to the effective address of the instruction) if the specified relation is true.

JUMP    Jump never.
JUMPL   if C(AC) < 0 then PC:=E
JUMPLE  if C(AC) <= 0 then PC:=E
JUMPE   if C(AC) = 0 then PC:=E
JUMPN   if C(AC) <> 0 then PC:=E
JUMPGE  if C(AC) >= 0 then PC:=E
JUMPG   if C(AC) > 0 then PC:=E
JUMPA   PC:=E

2.5.3. SKIP

The SKIP instructions compare the contents of the effective address to zero and skip the next instruction if the specified relation is true. If a non-zero AC field appears, the selected AC is loaded from memory.

SKIP    if AC<>0 then C(AC):=C(E); don't skip
SKIPL   if AC<>0 then C(AC):=C(E); if C(E) < 0 then skip
SKIPLE  if AC<>0 then C(AC):=C(E); if C(E) <= 0 then skip
SKIPE   if AC<>0 then C(AC):=C(E); if C(E) = 0 then skip
SKIPN   if AC<>0 then C(AC):=C(E); if C(E) <> 0 then skip
SKIPGE  if AC<>0 then C(AC):=C(E); if C(E) >= 0 then skip
SKIPG   if AC<>0 then C(AC):=C(E); if C(E) > 0 then skip
SKIPA   if AC<>0 then C(AC):=C(E); skip

2.5.4. AOS - Add One and Skip

The AOS (Add One to memory and Skip) instructions increment a memory location and compare the result to zero to determine the skip condition. If a non-zero AC field appears then the AC selected will be loaded (with the incremented data).

AOS     C(E) := C(E)+1; if AC<>0 then C(AC):=C(E)
AOSL    C(E) := C(E)+1; if AC<>0 then C(AC):=C(E);
        if C(E) < 0 then skip
AOSLE   C(E) := C(E)+1; if AC<>0 then C(AC):=C(E);
        if C(E) <= 0 then skip
AOSE    C(E) := C(E)+1; if AC<>0 then C(AC):=C(E);
        if C(E) = 0 then skip
AOSN    C(E) := C(E)+1; if AC<>0 then C(AC):=C(E);
        if C(E) <> 0 then skip
AOSGE   C(E) := C(E)+1; if AC<>0 then C(AC):=C(E);
        if C(E) >= 0 then skip
AOSG    C(E) := C(E)+1; if AC<>0 then C(AC):=C(E);
        if C(E) > 0 then skip
AOSA    C(E) := C(E)+1; if AC<>0 then C(AC):=C(E); skip

2.5.5. SOS - Subtract One and Skip

The SOS (Subtract One from memory and Skip) instructions decrement a memory location and compare the result to zero to determine the skip condition. If a non-zero AC field appears then the AC selected will be loaded (with the decremented data).

SOS     C(E) := C(E)-1; if AC<>0 then C(AC):=C(E)
SOSL    C(E) := C(E)-1; if AC<>0 then C(AC):=C(E);
        if C(E) < 0 then skip
SOSLE   C(E) := C(E)-1; if AC<>0 then C(AC):=C(E);
        if C(E) <= 0 then skip
SOSE    C(E) := C(E)-1; if AC<>0 then C(AC):=C(E);
        if C(E) = 0 then skip
SOSN    C(E) := C(E)-1; if AC<>0 then C(AC):=C(E);
        if C(E) <> 0 then skip
SOSGE   C(E) := C(E)-1; if AC<>0 then C(AC):=C(E);
        if C(E) >= 0 then skip
SOSG    C(E) := C(E)-1; if AC<>0 then C(AC):=C(E);
        if C(E) > 0 then skip
SOSA    C(E) := C(E)-1; if AC<>0 then C(AC):=C(E); skip

2.5.6. AOJ - Add One and Jump

The AOJ (Add One to AC and Jump) instructions increment the contents of the selected accumulator. If the result bears the indicated relation to zero then the instruction will jump to the effective address.

AOJ     C(AC) := C(AC)+1
AOJL    C(AC) := C(AC)+1; if C(AC) < 0 then PC:=E
AOJLE   C(AC) := C(AC)+1; if C(AC) <= 0 then PC:=E
AOJE    C(AC) := C(AC)+1; if C(AC) = 0 then PC:=E
AOJN    C(AC) := C(AC)+1; if C(AC) <> 0 then PC:=E
AOJGE   C(AC) := C(AC)+1; if C(AC) >= 0 then PC:=E
AOJG    C(AC) := C(AC)+1; if C(AC) > 0 then PC:=E
AOJA    C(AC) := C(AC)+1; PC:=E

2.5.7. SOJ - Subtract One and Jump

The SOJ (Subtract One from AC and Jump) instructions decrement the contents of the selected accumulator. If the result bears the indicated relation to zero then the instruction will jump to the effective address.
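
The AOJ and SOJ classes just described use the same eight condition suffixes as JUMP, SKIP, AOS and SOS. As a rough illustration only, the following Python sketch models the suffixes as a lookup table and applies them to AOJx and SOJx. The function names and the convention of passing in the address of the next instruction as `pc` are assumptions of the example, and the arithmetic overflow flags are not modelled.

```python
# Sketch of the eight condition suffixes and of AOJx/SOJx.
# Words are 36-bit unsigned ints; overflow flags are not modelled.
import operator

MASK36 = (1 << 36) - 1

def signed(w):
    """Interpret a 36-bit word as a two's complement integer."""
    return w - (1 << 36) if w & (1 << 35) else w

CONDITION = {
    "":   lambda a, b: False,   # no suffix: never
    "L":  operator.lt,
    "LE": operator.le,
    "E":  operator.eq,
    "N":  operator.ne,
    "GE": operator.ge,
    "G":  operator.gt,
    "A":  lambda a, b: True,    # always
}

def count_and_jump(mnemonic, ac_value, e, pc):
    """Model AOJx or SOJx; return (new C(AC), new PC).

    `pc` is the address of the next instruction in sequence.
    """
    step = 1 if mnemonic.startswith("AOJ") else -1
    new_ac = (ac_value + step) & MASK36
    if CONDITION[mnemonic[3:]](signed(new_ac), 0):
        pc = e
    return new_ac, pc

# A countdown loop: the body at 1000 runs while SOJG keeps jumping back.
ac, passes = 3, 0
while True:
    passes += 1
    ac, pc = count_and_jump("SOJG", ac, 0o1000, 0o1001)
    if pc != 0o1000:
        break
print(passes)   # 3
```

Keeping the relation in one table and the counting in another is what lets a single suffix vocabulary serve every class in this section.
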

SOJ     C(AC) := C(AC)-1
SOJL    C(AC) := C(AC)-1; if C(AC) < 0 then PC:=E
SOJLE   C(AC) := C(AC)-1; if C(AC) <= 0 then PC:=E
SOJE    C(AC) := C(AC)-1; if C(AC) = 0 then PC:=E
SOJN    C(AC) := C(AC)-1; if C(AC) <> 0 then PC:=E
SOJGE   C(AC) := C(AC)-1; if C(AC) >= 0 then PC:=E
SOJG    C(AC) := C(AC)-1; if C(AC) > 0 then PC:=E
SOJA    C(AC) := C(AC)-1; PC:=E

2.5.8. CAM - Compare Accumulator to Memory

The CAM (Compare Accumulator to Memory) class compares the contents of the selected accumulator to the contents of the effective address. If the indicated condition is true, the instruction will skip. The CAM instruction is suitable for arithmetic comparison of either fixed point quantities or normalized floating point quantities. Needless to say, for the comparison to be meaningful both C(AC) and C(E) should be in the same format (i.e., either both fixed or both floating).

CAM     no op (references memory)
CAML    if C(AC) < C(E) then skip
CAMLE   if C(AC) <= C(E) then skip
CAME    if C(AC) = C(E) then skip
CAMN    if C(AC) <> C(E) then skip
CAMGE   if C(AC) >= C(E) then skip
CAMG    if C(AC) > C(E) then skip
CAMA    skip

2.5.9. CAI - Compare Accumulator Immediate

The CAI (Compare Accumulator Immediate) class compares the contents of the selected accumulator to the effective address. If the indicated condition is true, the instruction will skip. Note that an effective address is an 18 bit quantity that is always considered to be positive.

CAI     no op
CAIL    if C(AC) < E then skip
CAILE   if C(AC) <= E then skip
CAIE    if C(AC) = E then skip
CAIN    if C(AC) <> E then skip
CAIGE   if C(AC) >= E then skip
CAIG    if C(AC) > E then skip
CAIA    skip

Skipping instructions can be combined to achieve ANDing or ORing of logical expressions, e.g. the sequence

        cail    t1, 1
        caile   t1, 1000
        jrst    bad

is equivalent to

        if C(t1) < 1 or C(t1) > 1000 then go to BAD

2.6. Fixed Point Arithmetic

In positive numbers bit 0 is zero. Bit 1 is most significant; bit 35 is least significant.
Negative numbers are the two's complement of positive numbers. Results (of ADD, SUB or IMUL) outside the range -2^35 to 2^35-1 will set overflow (PC bit 0).

2.6.1. ADD

ADD     C(AC) := C(AC) + C(E)
ADDI    C(AC) := C(AC) + E
ADDM    C(E)  := C(AC) + C(E)
ADDB    C(AC) := C(AC) + C(E); C(E) := C(AC)

2.6.2. SUB - Subtract

SUB     C(AC) := C(AC) - C(E)
SUBI    C(AC) := C(AC) - E
SUBM    C(E)  := C(AC) - C(E)
SUBB    C(AC) := C(AC) - C(E); C(E) := C(AC)

2.6.3. IMUL - Single-Word Multiply

The IMUL instructions are for multiplying numbers where the product is expected to be representable as one word.

IMUL    C(AC) := C(AC) * C(E)
IMULI   C(AC) := C(AC) * E
IMULM   C(E)  := C(AC) * C(E)
IMULB   C(AC) := C(AC) * C(E); C(E) := C(AC)

2.6.4. IDIV - Single-Word Divide

The IDIV instructions are for divisions in which the dividend is a one word quantity. AC+1 signifies the quantity (AC+1 modulo 20 octal). If the divisor is zero, overflow and no divide are set, and the AC and memory operands are not changed. The remainder will have the same sign as the dividend.

IDIV    C(AC) := C(AC) / C(E); C(AC+1) := remainder
IDIVI   C(AC) := C(AC) / E; C(AC+1) := remainder
IDIVM   C(E)  := C(AC) / C(E)
IDIVB   C(AC) := C(AC) / C(E); C(AC+1) := remainder; C(E) := C(AC)

2.6.5. MUL - Multiply

The MUL instructions produce a double word product. A double word integer has 70 bits of significance. Bit 0 of the high order word is the sign bit. In data, bit 0 of the low order word is ignored by the hardware. In results, bit 0 of the low word is the same as bit 0 in the high word. MUL will set overflow if both operands are -2^35.

MUL     C(AC AC+1) := C(AC) * C(E)
MULI    C(AC AC+1) := C(AC) * E
MULM    C(E)       := high word of product of C(AC) * C(E)
MULB    C(AC AC+1) := C(AC) * C(E); C(E) := C(AC)

2.6.6. DIV - Divide

The DIV instructions are for divisions in which the dividend is a two word quantity (such as produced by MUL). If C(AC) is greater than the memory operand, overflow and no divide are set.
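
Two of the conventions above are easy to get wrong: the sign of the IDIV remainder and the layout of the double word produced by MUL. The Python sketch below illustrates both under stated assumptions. The function names are invented for the example, a quotient truncated toward zero is assumed (it follows from the remainder taking the sign of the dividend), and the overflow flags, including the -2^35 cases, are not modelled.

```python
# Sketch of IDIV and MUL on 36-bit two's complement words.
# Overflow and no-divide flags are reduced to a None return for
# division by zero; the other overflow cases are not modelled.

MASK36 = (1 << 36) - 1
SIGN = 1 << 35

def signed(w):
    """Interpret a 36-bit word as a two's complement integer."""
    return w - (1 << 36) if w & SIGN else w

def idiv(ac, mem):
    """Return (quotient, remainder) as words, or None for no divide."""
    a, b = signed(ac), signed(mem)
    if b == 0:
        return None                  # operands are left unchanged
    q = abs(a) // abs(b)             # truncate toward zero (assumed)
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b                    # same sign as the dividend
    return q & MASK36, r & MASK36

def mul(ac, mem):
    """Return the (high, low) words of the double word product."""
    p = signed(ac) * signed(mem)
    high = (p >> 35) & MASK36        # sign bit plus upper 35 bits
    low = p & (SIGN - 1)             # lower 35 bits
    if p < 0:
        low |= SIGN                  # bit 0 of low word copies the sign
    return high, low

print(idiv((-7) & MASK36, 2) == ((-3) & MASK36, (-1) & MASK36))  # True
print(mul(1 << 34, 4))                                           # (2, 0)
```

In the second usage line the product 2^36 does not fit in one word, so it appears as 2 in the high word and 0 in the low word, each low-word bit being worth 2^35 times less than the corresponding high-word bit.
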
DIV     C(AC) := C(AC AC+1) / C(E); C(AC+1) := remainder
DIVI    C(AC) := C(AC AC+1) / E; C(AC+1) := remainder
DIVM    C(E)  := C(AC AC+1) / C(E)
DIVB    C(AC) := C(AC AC+1) / C(E); C(AC+1) := remainder; C(E) := C(AC)

2.7. Double Word Move Instructions (KI10 and KL10)

There are four double word move instructions. These are suitable for manipulating KI10 and KL10 double precision floating point numbers, and for KL10 double precision integers.

DMOVE   C(AC AC+1) := C(E E+1)
DMOVEM  C(E E+1)   := C(AC AC+1)
DMOVN   C(AC AC+1) := -C(E E+1)
DMOVNM  C(E E+1)   := -C(AC AC+1)

Note that DMOVN and DMOVNM are NOT to be used for KA10 double precision floating point numbers! If a program may have to be run on a KA10, the use of all double word instructions should be avoided.

2.8. Double Precision Integer Arithmetic (KL10 only)

There are four instructions for double precision integer arithmetic. None of these instructions has any modifier: they all operate on double (or quadruple) accumulators and double words in memory, with results to double (or quadruple) accumulators.

The format for a double word integer is the same as that produced by MUL, i.e., a 70 bit integer in two's complement, in which bit 0 of the most significant word is the sign; in operands, bit 0 of the low order word is ignored. A quadruple word has 140 bits; bit 0 of the most significant word is the sign; in operands, bit 0 in all other words is ignored. In double (and quadruple) arithmetic results, bit 0 of the low order word(s) is stored with the same value as bit 0 of the high order word.

DADD    C(AC AC+1) := C(AC AC+1) + C(E E+1)
DSUB    C(AC AC+1) := C(AC AC+1) - C(E E+1)
DMUL    C(AC AC+1 AC+2 AC+3) := C(AC AC+1) * C(E E+1)
DDIV    C(AC AC+1) := C(AC AC+1 AC+2 AC+3) / C(E E+1);
        C(AC+2 AC+3) := remainder

2.9. Floating Point Arithmetic

Single precision floating point numbers are represented in one 36 bit word as follows:

 Bit      0 00000000 011111111112222222222333333
 Position 0 12345678 901234567890123456789012345
          ______________________________________
          | |        |                           |
          |S|  Exp   |         Fraction          |
          |_|________|___________________________|

If S is zero, the sign is positive. If S is one, the sign is negative and the word is in two's complement format. The fraction is interpreted as having a binary point between bits 8 and 9. The exponent is a power of 2 represented in excess 200 (octal) notation. In a normalized floating point number bit 9 is different from bit 0, except that in a negative number bits 0 and 9 may both be one if bits 10:35 are all zero. A floating point zero is represented by a word with 36 bits of zero. Floating point numbers can represent numbers with magnitude within the range 0.5*2^-128 to (1-2^-27)*2^127, and zero.

A number in which bit 0 is one and bits 9-35 are zero can produce an incorrect result in any floating point operation. Any word with a zero fraction and non-zero exponent can produce extreme loss of precision if used as an operand in a floating point addition or subtraction.

In KI10 (and KL10) double precision floating point, a second word is included which contains in bits 1:35 an additional 35 fraction bits. The additional fraction bits do not significantly affect the range of representable numbers; rather, they extend the precision.

The KA10 lacks double precision floating point hardware, but there are several instructions by which software may implement double precision. These instructions are DFN, UFA, FADL, FSBL, FMPL, and FDVL. Users of the KL10 are strongly advised to avoid using these instructions.

In the PDP-6 floating point is somewhat different. Consult a wizard.

F floating |AD add      |              |  result to AC
           |SB subtract |R rounded     |I Immediate. result to AC
           |MP multiply |              |M result to memory
           |DV divide   |              |B result to memory and AC
                        |
                        |  no rounding |  result to AC
                        |              |L Long mode
                        |              |M result to memory
                        |              |B result to memory and AC

                   |AD add
DF double floating |SB subtract
                   |MP multiply
                   |DV divide

Note: In immediate mode, the memory operand is <E,,0>. In long mode (except FDVL) the result appears in AC and AC+1. In FDVL the AC operand is in AC and AC+1 and the quotient is stored in AC with the remainder in AC+1.

2.10. Other Floating Point Instructions

2.10.1. FSC - Floating Scale

FSC (Floating SCale) will add E to the exponent of the number in AC and normalize the result. One use of FSC is to convert an integer in AC to floating point (but FLTR, available in the KI and KL, is better). To use FSC to float an integer, set E to 233 (excess 200 and shift the binary point 27 bits). The integer being floated must not have more than 27 significant bits. FSC will set AROV and FOV if the resulting exponent exceeds 127. FXU (and AROV and FOV) will be set if the exponent becomes smaller than -128.

2.10.2. FIX - Convert Floating Point to Integer

FIX will convert a floating point number to an integer. If the exponent of the floating point number in C(E) is greater than (decimal) 35 (which is octal 243) then this instruction will set AROV and not affect C(AC). Otherwise, C(E) is converted to fixed point by the following procedure: Move C(E) to AC, copying bit 0 of C(E) to bits 1:8 of AC (sign extend). Then ASH AC by X-27 bits (where X is the exponent from bits 1:8 of C(E) less 200 octal). FIX will truncate towards zero, i.e., 1.9 is fixed to 1 and -1.9 is fixed to -1.

2.10.3. FIXR - Fix and Round

FIXR (Fix and round) will convert a number to an integer by rounding. If the exponent of the floating point number in C(E) is greater than (decimal) 35 (which is octal 243) then this instruction will set AROV and not affect C(AC).
Otherwise, C(E) is converted to fixed point by the following procedure: Move C(E) to AC, copying bit 0 of C(E) to bits 1:8 of AC (sign extend). Then ASH AC by X-27 bits (where X is the exponent from bits 1:8 of C(E) less 200 octal). If X-27 is negative (i.e., right shift) then the rounding process will consider the bits shifted off the right end of AC. If AC is positive and the discarded bits are >=1/2 then 1 is added to AC. If AC is negative and the discarded bits are >1/2 then 1 is added to AC. Rounding is always in the positive direction: 1.4 becomes 1, 1.5 becomes 2, -1.5 becomes -1, and -1.6 becomes -2.

2.10.4. FLTR - Float and Round

FLTR (FLoaT and Round) will convert C(E), an integer, to floating point and place the result in AC. The data from C(E) is copied to AC, where it is arithmetically shifted right 8 places (keeping the bits that fall off the end), and the exponent 243 is inserted in bits 1:8. The resulting number is normalized until bit 9 is significant (normalization may result in some or all of the bits that were right shifted being brought back into AC). Finally, if any of the bits that were right shifted still remain outside the AC, the result is rounded by looking at the bit to the right of the AC.

2.11. Shift Instructions

The following instructions shift or rotate the AC or the double word formed by AC and AC+1. The number of places to shift is specified by the effective address, which is considered to be a signed number modulo 256 in magnitude. That is, the effective shift is the number composed of bit 18 (the sign) of the effective address and bits 28:35 of the effective address. If E is positive, a left shift occurs. If E is negative, a right shift occurs.

LSH     Logical Shift. C(AC) is shifted as specified by E. Zero bits are shifted into the AC.

LSHC    Logical Shift Combined. C(AC AC+1) is shifted as a 72 bit quantity. Zero bits are shifted in.

ASH     Arithmetic Shift. Bit 0 is not changed. In a left shift zero bits are shifted into the right end of AC.
        In a left shift, if any bit of significance is shifted out of bit 1, AROV (overflow) is set. In a right shift, bit 0 is shifted into bit 1.

ASHC    Arithmetic Shift Combined. AC bit 0 is not changed. If E is nonzero, AC bit 0 is copied to AC+1 bit 0. C(AC AC+1) is shifted as a 70 bit quantity. In a left shift zero bits are shifted into the right end of AC+1. In a left shift, if any bit of significance is shifted out of AC bit 1 then AROV is set. In a right shift AC bit 0 is shifted into AC bit 1.

ROT     Rotate. The 36 bit C(AC) is rotated. In a left rotate, bits shifted out of bit 0 are shifted into bit 35. In a right rotate, bit 35 is shifted into bit 0.

ROTC    Rotate Combined. AC and AC+1 are rotated as a 72 bit quantity. In a left rotate AC bit 0 shifts into AC+1 bit 35 and AC+1 bit 0 shifts into AC bit 35. In a right rotate, AC+1 bit 35 shifts into AC bit 0, etc.

2.12. Byte Instructions

In the PDP-10 a "byte" is some number of contiguous bits within one word. A byte pointer is a word that describes the byte. There are three parts to the description of a byte: the word (i.e., address) in which the byte occurs, the position of the byte within the word, and the length of the byte.

A byte pointer has the following format:

 Bit      000000 000011 1 1 1111 112222222222333333
 Position 012345 678901 2 3 4567 890123456789012345
          _________________________________________
          |      |      | | |    |                  |
          | POS  | SIZE |U|I| X  |        Y         |
          |______|______|_|_|____|__________________|

- POS is the byte position: the number of bits remaining in the word to the right of the byte.
- SIZE is the byte size in bits.
- The U field is reserved for future use and must be zero.
- I, X, and Y are the same as in an instruction.

2.12.1. LDB - Load Byte

The contents of the effective address of the LDB instruction is interpreted as a byte pointer. The byte described there is loaded, right adjusted, into the AC. The rest of the AC is cleared.

2.12.2. DPB - Deposit Byte

The contents of the effective address of the DPB instruction is interpreted as a byte pointer. The byte described there is deposited from the byte of the same size at the right end of the AC. AC and the remainder of the word into which the byte is deposited are left unchanged.

2.12.3. IBP - Increment Byte Pointer

The AC field must be zero. The contents of the effective address are fetched. The POS field is changed by subtracting the SIZE field from it. If the result of the subtraction is greater than or equal to zero, the difference is stored in the POS field. If the difference is negative, 1 is added to the Y field (in the KA10 and PDP-6, if Y contains 777777 then this will carry into the X field; in the KI10 and KL10 the carry out is suppressed) and the POS field is set to 44-SIZE (44 is octal). The effect of this is to modify the byte pointer to address the next byte (of the same size) that follows the byte addressed by the original pointer, skipping over any bits that may be left over at the end of a word (when the byte size does not divide evenly into the word size, 36).

2.12.4. ILDB - Increment and Load Byte

Increment the byte pointer contained at the effective address. Then perform a LDB function using the updated byte pointer.

2.12.5. IDPB - Increment and Deposit Byte

Increment the byte pointer contained at the effective address. Then perform a DPB function using the updated byte pointer.

2.12.6. ADJBP - Adjust Byte Pointer

Fetch the byte pointer at the effective address, increment or decrement it by the number of bytes specified in the AC, then place the adjusted byte pointer in the AC. The original byte pointer is unchanged.

2.12.7. POINT - Construct a Byte Pointer

For convenience, the Macro assembler (and Fail) has a pseudo op for creating byte pointers. The POINT pseudo op has three parameters: size, address, and position. In the POINT pseudo op, the position argument specifies the bit number of the rightmost bit in the byte.
If the position field is omitted, bit number "-1" is assumed (this assembles 44 in the POS field), which doesn't address any byte, but which, when incremented once, will address the first byte in the specified word.

        POINT 7, 100(1)         440701,,100
        POINT 36, @2000,35      004420,,2000
        POINT 7, FOO            440700,,foo

2.13. Logical Testing and Modification

The TEST instructions are for testing and modifying bits in an accumulator. There are 64 instructions. Each mnemonic begins with a T and is followed by three modifiers.

Test accumulator
        |R right half immediate
        |L left half immediate
        |D direct mask
        |S swapped mask
                |N no modification
                |Z zero selected bits
                |O set selected bits to One
                |C complement selected bits
                        |  never skip
                        |N skip unless all selected bits are zero
                        |E skip if all selected bits are zero
                        |A skip always

The test operation considers two 36 bit quantities. One of these is the contents of the selected AC. The other quantity, called the mask, depends on the first modifier letter. For R the mask is <0,,E>; for L it is <E,,0>. For D the mask is C(E), and for S the mask is CS(E), the swapped contents of E.

- If the skip condition N is specified, then the test instruction will skip if the AND of the mask and the AC operand is Not equal to zero.
- If the skip condition E is specified, then the test instruction will skip if the AND of the mask and the AC operand is Equal to zero.
- If the modification code Z appears then bits that are one in the mask are made zero in the AC.
- If the modification code O appears then bits that are one in the mask are made one in the AC.
- If the modification code C appears then bits that are one in the mask are complemented in the AC.

Note that the skip condition is determined on the basis of the contents of the AC before it is modified.

The principal use for the Test instructions is in testing and modifying single bit flags that are kept in an accumulator.

2.14. Boolean Logic

There are 16 possible boolean functions of 2 variables.
The PDP-10 has 16 instruction classes (each with 4 modifiers) that perform these operations. Each boolean function operates on the 36 bits of AC and memory as individual bits.

Table of the Boolean functions

C(AC)   0 0 1 1
C(E)    0 1 0 1

SETZ    0 0 0 0   SET to Zero
AND     0 0 0 1   AND
ANDCM   0 0 1 0   AND with Complement of Memory
SETA    0 0 1 1   SET to AC
ANDCA   0 1 0 0   AND with Complement of AC
SETM    0 1 0 1   SET to Memory
XOR     0 1 1 0   eXclusive OR
IOR     0 1 1 1   Inclusive OR
ANDCB   1 0 0 0   AND with Complements of Both
EQV     1 0 0 1   EQuiValence
SETCM   1 0 1 0   SET to Complement of Memory
ORCM    1 0 1 1   OR with Complement of Memory
SETCA   1 1 0 0   SET to Complement of AC
ORCA    1 1 0 1   OR with Complement of AC
ORCB    1 1 1 0   OR with Complements of Both
SETO    1 1 1 1   SET to One

Each of the 16 instructions above has four modifiers that specify where to store the result. No modifier means result to AC. Modifier I means Immediate: the memory data is <0,,E> and the result goes to AC. M as a modifier means the result should be stored in memory. B means store the result in both memory and AC.

2.15. PC Format

JSR, JSP, and PUSHJ all store a full word that contains the PC (Program Counter) and various flags. The format of a PC word is:

 0 0 0 0 0 0 0 0 0 0 1 1 1 11111 112222222222333333
 0 1 2 3 4 5 6 7 8 9 0 1 2 34567 890123456789012345
 __________________________________________________
|A|C|C|F|F|U|I|P|A|T|T|F|D|     |                  |
|R|R|R|O|P|S|O|U|F|R|R|X|C|00000|        PC        |
|O|Y|Y|V|D|E|T|B|I|A|A|U|K|     |                  |
|V|0|1| | |R| |L| |P|P| | |     |                  |
| | | | | | | | | |2|1| | |     |                  |
|_|_|_|_|_|_|_|_|_|_|_|_|_|_____|__________________|

AROV, ARithmetic OVerflow, is set by any of the following:

- A single instruction has set one of CRY0 or CRY1 without setting them both.
- An ASH or ASHC has left shifted a significant bit out of AC bit 1.
- A MULx instruction has multiplied -2^35 by itself.
- A DMUL instruction has multiplied -2^70 by itself.
- An IMULx instruction has produced a product less than -2^35 or greater than 2^35-1.
- A FIX or FIXR has fetched an operand with exponent greater than 35.
- FOV (Floating Overflow) has been set.
- DCK (Divide ChecK) has been set.

- CRY0, CaRrY 0, if set without CRY1 being set, will set AROV. This indicates any of the following conditions:
  1. An ADDx has added two negative numbers with sum less than -2^35.
  2. A SUBx has subtracted a positive number from a negative number and produced a result less than -2^35.
  3. A SOSx or SOJx has decremented -2^35.
- If CRY0 and CRY1 are both set, this indicates that one of the following non-overflow events has occurred:
  1. In ADDx both summands were negative, or their signs differed and the positive one was greater than or equal to the magnitude of the negative summand.
  2. In SUBx the sign of both operands was the same and the AC operand was greater than or equal to the memory operand, or the AC operand was negative and the memory operand was positive.
  3. An AOJx or AOSx has incremented -1.
  4. A SOJx or SOSx has decremented a nonzero number other than -2^35.
  5. A MOVNx has negated zero.
- CRY1, CaRrY 1, if set without CRY0 being set, will set AROV. This indicates any of the following conditions:
  1. An ADDx has added two positive numbers with a sum greater than 2^35-1.
  2. A SUBx has subtracted a negative number from a positive number to form a difference greater than 2^35-1.
  3. An AOSx or AOJx instruction has incremented 2^35-1.
  4. A MOVNx or MOVMx has negated -2^35.
  5. A DMOVNx has negated -2^70.

The following conditions are also indicated in the PC word:

- FOV, Floating point OVerflow, is set by any of:
  1. In a floating point instruction other than FLTR, DMOVNx, or DFN the exponent of the result exceeds 127.
  2. FXU (Floating eXponent Underflow) has been set.
  3. DCK (Divide ChecK) has been set by FDVx, FDVRx, or DFDV.
- FPD, First Part Done, is set when the processor responds to a priority interrupt after having completed the first part of a two part instruction (e.g., ILDB). This flag is not usually of interest to the programmer.
- USER is set while the processor is in user mode. In user mode, various instruction and addressing restrictions are in effect.
- IOT, User IN-Out mode (also called IOT User), is a special mode in which some of the user mode instruction (but not addressing) restrictions are removed. In this mode a user program may perform the hardware I/O instructions.
- PUBL, Public mode, signifies that the processor is in user public mode or in exec supervisor mode [KI10, KL10 only].
- AFI, Address Failure Inhibit: if this flag is set, address break is inhibited during the execution of the next instruction [KI10, KL10 only].
- TRAP2: if bit 10 is not also set, pushdown overflow has occurred. If traps are enabled, setting this flag immediately causes a trap. At present no hardware condition sets both TRAP1 and TRAP2 simultaneously [KI10, KL10 only].
- TRAP1: if bit 9 is not also set, arithmetic overflow has occurred. If traps are enabled, setting this flag immediately causes a trap. At present no hardware condition sets both TRAP1 and TRAP2 simultaneously [KI10, KL10 only].
- FXU, Floating eXponent Underflow, is set to signify that in a floating instruction other than DMOVNx, FLTR, or DFN, the exponent of the result was less than -128 and AROV and FOV have been set.
- DCK, Divide ChecK, signifies that one of the following conditions has set AROV:
  1. In a DIVx the high order word of the dividend was greater than or equal to the divisor.
  2. In an IDIVx the divisor was zero.
  3. In an FDVx, FDVRx, or DFDV, the divisor was zero, or the magnitude of the dividend fraction was greater than or equal to twice the magnitude of the divisor fraction. In either case, FOV is also set.

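The flag layout described above can be sketched as a small decoder. The Python below is only an illustrative model, not DEC software: the names FLAG_NAMES and decode_pc_word are made up for this example, and a Python integer is assumed to stand in for a 36-bit word. Because bit 0 is the most significant bit of the word, flag bit n corresponds to the mask 1 << (35 - n).

```python
# Illustrative model only: decode the flag bits of a 36-bit PC word.
# FLAG_NAMES and decode_pc_word are hypothetical names for this sketch.

FLAG_NAMES = ["AROV", "CRY0", "CRY1", "FOV", "FPD", "USER", "IOT",
              "PUBL", "AFI", "TRAP2", "TRAP1", "FXU", "DCK"]  # bits 0..12


def decode_pc_word(word):
    """Return (list of set flag names, 18-bit PC) for a 36-bit PC word."""
    word &= (1 << 36) - 1                      # keep only 36 bits
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if word & (1 << (35 - bit))]      # bit 0 is the leftmost bit
    pc = word & 0o777777                       # right half: bits 18..35
    return flags, pc


flags, pc = decode_pc_word(0o500000_001234)
print(flags, oct(pc))   # prints ['AROV', 'CRY1'] 0o1234
```

A word whose left half is 500000 (octal) has bits 0 and 2 set, so the sketch reports AROV and CRY1, the combination described above for an addition of two positive numbers that overflowed.
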
Bits 13 through 17 of the PC word are always zero to facilitate the use of indirect addressing to return from a subroutine. Bits 18 through 35 store an address that is one greater than the address of the instruction that stores the PC. Thus, the PC word points at the instruction immediately following the subroutine call.

2.16. Program Control

2.16.1. JSR - Jump to Subroutine

JSR     C(E):=<flags,,PC>; PC:=E+1

JSR, Jump to SubRoutine, stores the PC in the word addressed by the effective address and jumps to the word following the word where the PC is stored. This is the only PDP-10 instruction that stores the PC and flags without modifying any ACs; however, it is non-reentrant, so PUSHJ is favored in most cases. The usual return from a subroutine called by a JSR is via JRST (or JRST 2,) indirect through the PC word. (See JRST)

2.16.2. JSP - Jump & Save PC

JSP     C(AC):=<flags,,PC>; PC:=E

JSP, Jump and Save PC, stores the PC and flags in the selected accumulator and jumps.

2.16.3. JSA - Jump and Save Accumulator

JSA     C(E):=C(AC); C(AC):=<E,,PC>; PC:=E+1

JSA, Jump and Save AC, stores the AC in the word addressed by the effective address. Then the left half of the AC is set to the effective address and the right half of AC is set to the return PC. Then the PC is set to one greater than the effective address. The JRA instruction unwinds this call.

The advantage of this call is that a routine may have multiple entry points (which is difficult to do with JSR) and it's easy to find (and later to skip over) arguments that follow the calling instruction (which is possible to do with PUSHJ, but not quite so convenient). Among the disadvantages of this call is that it is not reentrant, and it doesn't save flags.

2.16.4. JRA - Jump and Restore Accumulator

JRA     C(AC):=C(CL(AC)); PC:=E

JRA, Jump and Restore AC, is the return from JSA. If, e.g., a subroutine is called with JSA AC, then the return is made by: JRA AC,(AC).

2.16.5. PUSHJ - Push on stack and Jump

PUSHJ   C(AC):=C(AC)+<1,,1>; C(CR(AC)):=<flags,,PC>; PC:=E

PUSHJ (PUSH return address and Jump) is like PUSH except the data that is pushed onto the top of the stack is the PC and flags word. The PC that is stored is the PC of the instruction that follows the PUSHJ. Then the PC is set to the effective address of the instruction. Pushdown overflow results if the AC becomes positive when it is incremented.

2.16.6. POPJ - Pop stack and Jump

POPJ    PC:=CR(CR(AC)); C(AC):=C(AC)-<1,,1>

POPJ (POP return address and Jump) undoes PUSHJ. The right half of the word at the top of the stack is loaded into the PC (the flags are unchanged). Then the stack pointer is decremented as in POP. The effective address of POPJ is ignored. Pushdown overflow results if the AC becomes negative as a result of the subtraction.

2.16.7. Programming Hints Using PUSHJ and POPJ

If a subroutine called by PUSHJ AC, wants to skip over the instruction following the PUSHJ, the following sequence accomplishes that result:

        aos     (ac)            ; AC better be nonzero.
        popj    ac,

If you must restore the flags that PUSHJ saved, the following sequence should be used instead of POPJ:

        pop     ac, (ac)        ; Adjust the stack
        jrst    2, @1(ac)       ; Restore flags and PC from
                                ;  old stack top.

2.16.8. JRST - Jump and Restore

JRST, Jump and ReSTore, is an unconditional jump instruction. In JRST, the AC field does not address an accumulator. Instead, the AC is decoded to signify various things.

JRST            PC:=E
JRST 2,         PC:=E; flags are restored (see text)
JRST 10,        PC:=E; dismiss current priority interrupt
JRST 12,        PC:=E; restore flags and dismiss priority interrupt

If the AC field is zero, only a jump occurs. JRST is everyone's favorite unconditional jump instruction (the only other one is JUMPA, which is more typing; also, on older machines JUMPA was slower than JRST).

JRST 2, (i.e., JRST with AC field set to 2) signifies jump and restore flags. (The assembler also recognizes the mnemonic JRSTF for JRST 2,).
If indirection is used in JRSTF, then the flags are restored from the last word fetched in the address calculation. If indexing is used with no indirection, the flags are restored from the left half of the specified index register. If neither indexing nor indirection is used in the address calculation, the flags are restored from the left half of the JRSTF itself! In a user mode program JRSTF cannot clear USER, nor can it set IOT User (it can, however, clear IOT User).

JRST 4, JRST 10, and JRST 12 are illegal in user mode and are trapped as UUOs.

2.16.9. JFCL - Jump on Flag and Clear

The JFCL instruction is another case in which the AC field is decoded to modify the instruction. The AC field selects the four flags in PC bits 0 through 3. PC bits 0 to 3 correspond to bits 9 to 12 in the JFCL instruction. JFCL will jump if any flag selected by the AC field is a 1. All flags selected by the AC field are set to zero.

JFCL 0,         Selects no PC bits, and so is a no-op.
JFCL 17,        Clears all four flags, and jumps if any of AROV, CRY0, CRY1, or FOV is set.
JFCL 1, (JFOV)  Jumps if FOV is set and clears FOV.
JFCL 10, (JOV)  Jumps if AROV is set and clears AROV.

2.16.10. XCT - Execute

XCT, the eXeCuTe instruction, fetches the word addressed by the effective address and executes that word as an instruction. In the case of XCTing an instruction that stores a PC, the PC that is stored is the address of the instruction that follows the XCT. If the executed instruction skips, then that skip is relative to the XCT. The AC field of the XCT should be zero. [In monitor mode a nonzero AC field in an XCT is significant.]

The XCT instruction can be used to achieve the effect of a CASE statement, as in the following example:

        xct     [ call foo
                  move q1, p1
                  jfcl
                  aos q1 ](t1)

where t1 contains the case index, which should have a value (in this case) between 0 and 3.

2.16.11. JFFO - Jump if Find First One

JFFO tests the selected AC.
If C(AC)=0 then set C(AC+1) to zero and execute the next instruction. If C(AC)<>0 then set C(AC+1) to the count of the number of zero bits in C(AC) to the left of the first one bit and jump to the effective address. C(AC) is unchanged.

2.17. References

The following manual presents the instruction set of the PDP-10/DECSYSTEM-20 in complete detail:

DECsystem-10/DECSYSTEM-20 Hardware Reference Manual, Volume I, Central Processor, EK-10/20-HR-001, Digital Equipment Corporation (1978).

Historical information can be found in:

Bell, et al., "The Evolution of the DECsystem-10", CACM Jan 1978.

Macro Page 34

3. The DECSYSTEM-20 Macro Assembler

3.1. Introduction

This chapter presents excerpts from the DEC Macro-20 Reference Manual. Certain advanced (or rarely used) features, mainly those dealing with storage allocation and polish fixups, have been omitted; see the actual manual if you must have them.

Unfortunately, the Macro manual does not present the material in a tutorial fashion; this rendition of it is not much better than the original in that respect, although examples have been added here and there, and explanations are a little more thorough. Given the absence of a good tutorial, the best approach to teaching yourself Macro is to read through this chapter from beginning to end to get an idea of what kinds of things Macro does and what its syntax is, and then to study some well-written Macro programs.

At the time of this writing (3 July 1980), Macro suffers from certain limitations traceable to its origins in Tops-10 (PDP-6, really): symbols are limited to 6 characters in length, and input (.MAC) and output (.REL, usually) files are limited to 6 characters in the filename and 3 in the extension (file type).

Macro is the symbolic assembler program for the DECSYSTEM-20. The assembler reads a file of Macro statements and composes relocatable binary machine instruction code suitable for loading by Link, the system's linking loader.

Macro is a statement-oriented language; statements are in free format and are processed in two passes. In processing statements, the assembler:

1. Interprets machine instruction mnemonics.
2. Accepts symbol definitions.
3. Interprets symbols.
4. Interprets pseudo-ops.
5. Accepts macro definitions.
6. Expands macros on call.
7. Assigns memory addresses.
8. Generates a relocatable binary program file (.REL file) for input to Link.
9. Optionally generates a program listing file showing source statements, the corresponding binary code, and any errors found.
10. Optionally generates a universal (.UNV) file that can be searched during other assemblies.

The following conventions are used throughout this chapter:

1. All numbers in the examples are octal unless otherwise noted.
2. All numbers in the text are decimal unless otherwise noted.
3. The name of the assembler, Macro, is capitalized; references to user-defined macros are in lower case.

3.2. Elements of Macro

The character set recognized in Macro statements includes all ASCII alphanumeric characters and 28 special characters (ASCII 40 through 137 (octal)). Lowercase letters are not distinguished from uppercase letters. Macro recognizes several ASCII control codes including horizontal tab (^I), linefeed (^J), formfeed (^L), carriage-return (^M), and control-underscore (^_).

Macro accepts any ASCII character in quoted text, or in the text argument to the ASCII and ASCIZ pseudo-ops. The line continuation character, ^_, is always effective. Delimiters for certain pseudo-ops (such as ASCII, ASCIZ, and COMMENT) can be any nonblank, nontab ASCII character.

A Macro program consists of statements made up of Macro language elements. Separated into general types, these are:

1. Special characters.
2. Numbers.
3. Literals.
4. Symbols.
5. Expressions.
6. Macro-defined mnemonics.
7. Pseudo-ops.
8. Macros.

The format of a Macro statement is discussed later.

3.2.1. Special Characters

Some characters and combinations of characters have special interpretations. These interpretations apply only in the contexts described. In particular, they do not apply within comment fields or text strings.

Uparrow (^) is to be taken literally, e.g. ^B means the uparrow character followed by a B, not control-B.

Char(s)  (Form)  Context and Interpretation

B   (mBn)  Between 2 integer expressions, causes the binary representation of m to be placed with rightmost bit at bit n (decimal).
^B  (^Bn)  Before an integer expression, shows that n is a binary (base 2) number.
^D  (^Dn)  Before an integer expression, shows that n is a decimal number.
E   (fE+n,fE-n,fEn)  Between a floating-point decimal number and a signed decimal integer, multiplies f by the nth power of 10 (decimal).
^O  (^On)  Before an integer expression, shows that n is an octal number.
:   (sym:)  After a symbol, shows that sym is a label, i.e. that its value is to become the current value of the location counter.
::  (sym::)  After a symbol, shows that sym is a global INTERNAL label.
;   (;text)  Before the end of a line, shows that text is a comment.
;;  (;;text)  Before end of line (usually in a macro definition), shows that text is a comment to be printed in the macro definition but not at call, i.e. in macro expansion.
.   (.)  As an expression, is replaced by the current value of the location counter.
,   (,)  Among numbers and symbols, delimits operands, accumulator, arguments. In a macro call, delimits a null argument.
,,  (lhw,,rhw)  Between two expressions, delimits left halfword from right halfword.
!   (A!B)  Between two expressions, generates the logical inclusive OR of A and B.
^!  (A^!B)  Between two expressions, generates the logical exclusive OR of A and B.
&   (A&B)  Between two expressions, generates the logical AND of A and B.
^-  (^-A)  Before an expression, generates the logical complement of A (NOT A).
*   (A*B)  Between two expressions, generates the product of A and B.
/   (A/B)  Between two expressions, generates the quotient of A by B.
+   (A+B)  Between two expressions, generates the sum of A and B.
-   (A-B)  Between two expressions, generates the difference of A and B.
-   (-A)  Before an expression, generates the two's complement of the value of A.
"   ("text")  In pairs around text, shows that text is a 7-bit ASCII string, to be right justified in a field of five characters.
'   ('text')  In pairs around text, shows that text is a SIXBIT string, to be right justified in a field of six characters.
'   (arg'text, text'arg)  Adjoining a dummy argument in the body of a macro definition, concatenates the value of the argument to the text during macro expansion.
\   (\expr)  Prefixed to an expression in a macro call, directs that the argument passed be the string for the ASCII value of expr in the current radix.
\'  (\'expr)  Prefixed to an expression in a macro call, directs that the argument passed be the string whose SIXBIT code is the value of expr.
\"  (\"expr)  Prefixed to an expression in a macro call, directs that the argument passed be the string whose ASCII code is the value of expr.
^_  (Control-underscore, not uparrow underscore)  Before a CRLF, continues its argument to the next line. Does not operate across end-of-macro.
_   (A_B)  Between two expressions, shifts the binary representation of A to the left B positions (if B is negative, shift is to the right).
@   (@address)  Prefixed to an address, sets bit 13 of the instruction word, indicating indirect addressing.
%   (%arg)  As the first character of a dummy argument in a macro definition, directs that %arg be replaced by a created symbol during macro expansion; Macro will substitute a different symbol for it on each invocation of the macro.
()  Encloses index field, encloses dummy arguments in macro definition or parameters in a macro invocation, quotes characters for macro argument handling, swaps the two halves of a value.
<>  Nests expressions, encloses conditional assembly code, encloses code in REPEAT, IRP, and IRPC pseudo-ops, encloses macro body in DEFINE pseudo-op, quotes characters for macro argument handling, forces evaluation of a symbol.
[]  Delimits literals (causing the contents of the brackets to be moved to the literal pool, and the bracketed expression to be replaced by the location of its contents); delimits argument in ARRAY, .COMMON, and OPDEF pseudo-ops; quotes characters for macro argument handling.
=   (sym=expr)  Between symbol and expression, assigns the value of expr to sym.
=:  (sym=:expr)  Between symbol and expression, assigns the value of expr to sym and declares sym to be global INTERNAL.

3.2.2. Numbers

The two properties of numbers that are important in Macro are

1. The radix (base) in which the number is written.
2. How the number should be placed in memory.

You can control the interpretation of these properties by using pseudo-ops or special characters to indicate your choices.

3.2.2.1. Integers

Macro stores an integer in its binary form, right justified in bits 1 to 35 of its storage word. If you use a sign, place it immediately before the integer (if you omit the sign, the integer is assumed to be positive).

For a negative number, Macro first forms its absolute value in bits 1 to 35, then replaces it by its two's complement. Therefore a positive integer is stored with 0 in bit 0, while a negative integer has 1 in bit 0.

The largest positive integer that Macro can store is 377777 777777 (octal); the smallest (most negative) is 400000 000000 (octal).

3.2.2.2. Radix

The initial implicit radix for a Macro program is octal (base 8). The integers you use in your program will be interpreted as octal unless you indicate otherwise.

You can change the radix to any base from 2 to 10 by using the RADIX pseudo-op. The new radix will be in effect until you change it again.

Without changing the prevailing radix, you can write a particular integer in binary, octal, or decimal. To do this, prefix the integer with ^B for binary, ^O for octal, ^D for decimal. The indicated radix applies only to the single integer immediately following it. A single-digit number is always interpreted as radix 10. Thus 9 is seen as decimal 9, even if the current radix is 8.

For example, suppose the current radix is 8. Then you can write the decimal number 23 as:

        27          octal (current radix)
        ^d23        decimal
        ^b10111     binary

but you cannot write decimal 23 as ^d45-22 since the ^d applies only to the first number, 45; the 22 is still interpreted as octal. However, you can write decimal 23 as ^d<45-22>.

3.2.2.3. Floating-point Decimal Numbers

In your program, a floating-point decimal number is a string of digits with a leading, trailing, or embedded decimal point and an optional leading sign. Macro recognizes this as a mixed number in radix 10.

Macro forms a floating-point decimal number with the sign in bit 0, a binary exponent in bits 1 to 8, and a normalized binary fraction in bits 9 to 35. The normalized fraction can be viewed as follows: its numerator is the 27-bit binary number in bits 9 to 35, whose value is less than 2 to the 27th power, but greater than or equal to 2 to the 26th power. Its denominator is 2 to the 27th power, so that the value of the fraction is always less than 1, but greater than or equal to 0.5. (This value is 0 only if the entire stored number is 0.) The binary exponent is incremented by 128 so that exponents from -128 to 127 are represented as 0 to 255.

For a negative floating-point number, Macro first forms its absolute value as a positive number, then takes the two's complement of the entire word.
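
The word layout just described can be checked with a short model. The following is a minimal sketch in Python (not Macro), assuming the layout given above: sign in bit 0, exponent plus 128 in bits 1 to 8, and a 27-bit normalized fraction in bits 9 to 35. The name float_word is hypothetical, and the rounding of the fraction is an assumption; Macro's own conversion may differ in the lowest-order bits.

```python
import math

MASK36 = (1 << 36) - 1  # 36-bit storage word


def float_word(value):
    """Return the 36-bit single-word floating-point image of value."""
    if value == 0.0:
        return 0
    # frexp gives abs(value) == mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(abs(value))
    fraction = round(mantissa * (1 << 27))   # numerator over 2**27 (rounding assumed)
    if fraction == 1 << 27:                  # rounding carried out of the fraction
        fraction >>= 1
        exponent += 1
    biased = exponent + 128                  # exponents -128..127 map to 0..255
    if not 0 <= biased <= 255:
        raise OverflowError("exponent out of range")
    word = (biased << 27) | fraction
    # A negative number is the two's complement of the whole positive word.
    return (-word) & MASK36 if value < 0 else word


for number in (17.0, 153.0, -153.0):
    print(f"{number:8} {float_word(number):036b}")
```

Printing the words in binary makes them directly comparable with bit patterns written out by hand: the first bit is the sign, the next eight the biased exponent, and the remaining 27 the fraction.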
Examples: The floating point number 17.0 generates the binary word

        0 10 000 101 100 010 000 000 000 000 000 000 000

where bit 0 shows the positive sign, bits 1 to 8 show the binary exponent, and bits 9 to 35 show the proper binary fraction. The binary exponent is 133 (decimal), which after subtracting the added 128 gives 5. The fraction is equal to 0.53125 decimal. And 0.53125 times 2 to the 5th power is 17, which is the number given.

Similarly, 153. generates

        0 10 001 000 100 110 010 000 000 000 000 000 000

while -153. generates

        1 01 110 111 011 001 110 000 000 000 000 000 000

These two examples show that a negative number is two's complemented. Notice that since the binary fraction for a negative number always has some nonzero bits, the exponent field (taken by itself) appears to be one's complemented.

As in Fortran, you can write a floating point decimal number with a suffixed E+/-n, and the number will be multiplied by 10 to the +/-nth power. If the sign is missing, n is assumed to be positive.

Examples: 2840000. can be written 284.E+4 or .284E7; .0000284 can be written .284E-4 or 284.E-7.

Using this E notation with an integer (no decimal point) is not allowed, and causes an error. Therefore you can use 284.E4 but not 284E4.

Note: Macro's algorithm for handling numbers given with the E notation is not identical to Fortran's. The binary values generated by the two translators may differ in the lowest order bits.

3.2.2.4. Binary Shifting

Binary shifting of a number with Bn sets the location of the rightmost bit at bit n in the storage word, where n is a decimal integer. The shift takes place after the binary number is formed. Any bits shifted outside the range (bits 0 through 35) of the storage word are lost. For example, here are some numbers with their binary representations given in octal:

        300000 000000     ^d3b2
        000000 042000     ^d17b25
        000001 000000     1b17
        777777 777777     -1b35
        000000 000001     1b35
        777777 000000     -1b17

3.2.2.5. Underscore Shifting

You can also shift a number by using the underscore operator. If V is an expression with value n, suffixing _V to a number shifts it n bits to the left. If n is negative, the shift is to the right.

In an expression of the form W_V, W and V can be any expressions including symbols. The binary value of W is formed in a register, V is evaluated, and the binary value of W is shifted V bits when placed in storage.

An expression such as -3.75E4_^d18 is legal, but the shift occurs after conversion to floating point decimal storage format. Therefore the sign, exponent, and fraction fields are all shifted away from their usual locations. This is true also of other storage formats.

3.2.3. Literals

A literal is a character string within square brackets inserted in your source code. Macro stores the code generated by the enclosed string in a literal pool beginning with the first available literal storage location, and places the address of this location in place of the literal. The literal pool is normally at the end of the binary program. The statements

        ldb t1, [point 6, .JBVER, 17]
        LIT

are equivalent to

        ldb t1, foo
foo:    point 6, .JBVER, 17

A literal can also be used to generate a constant:

        push p, [0]             ; Generate a zero fullword.
        move q1, [3,,14]        ; Generate a word with 3 in
                                ; the left half and 14 in the right.

Multiline literals are also allowed:

getChr: ildb t2, t1                     ; Get a character.
        cain t2, 0                      ; Is it a null?
         jrst [ move t1, txtPtr         ; Yes, use this pointer
                ildb t2, t1             ; to get a new character.
                cain t2, "?"            ; Is it a question mark?
                 jrst [ move t1, txtPt2 ; Yes, use this pointer
                        ildb t2, t1     ; to get the message character,
                        jrst getHlp ]   ; and go to the help routine.
                ret ]                   ; Not a question mark, return from getChr.
        ret                             ; Not a null, return with char in t2.

The text of a literal continues until a matching closing square bracket is found (unquoted and not in a comment field).
A literal can include any term, symbol, expression, or statement, but it must generate at least one but no more than 99 words of data. A statement that does not generate data (such as a direct assignment statement or a RADIX pseudo-op) can be included in a literal, but the literal must not consist entirely of such statements.

You can nest literals up to 18 levels. You can include any number of labels in a literal, but a forward reference to a label in a literal is illegal.

If you use a dot (.) in a literal to retrieve the location counter, remember that the location counter is pointing at the statement containing the literal, not at the literal itself. In nested literals, a dot location counter references a statement outside the outermost literal. In the sequence

        jrst [ hrrz t1, v
               caie t1, op
                jrst .+1
               jrst foo ]
        skipe t3

the expression .+1 generates the address of skipe t3, not jrst foo.

Literals having the same value are collapsed in Macro's literal pool. Thus for the statements

        push p,[0]
        caml t2,[0]
        movei t1, [asciz /frotz/]

the same address is shared by the two literals [0], and by the null word generated at the end of [asciz /frotz/]. Literal collapsing is suppressed for those literals that contain errors, undefined expressions, or EXTERNAL symbols.

3.2.4. Symbols

Macro symbols include:

1. Macro-defined pseudo-ops.
2. Macro-defined mnemonics.
3. User-defined macros.
4. User-defined opdefs.
5. User-defined labels.
6. Direct-assignment symbols.
7. Dummy arguments for macros.

Macro stores symbols in three symbol tables: one for opcodes and pseudo-ops, one for macros and opdefs, and one for user-defined labels and direct-assignment symbols. An entry in one of these tables shows the symbol, its type, and its value.

Symbols are helpful in your program because:

1. Defining a symbol as a label gives a name to an address. You can use the label in debugging or as a target for program control statements.
2. In revising a program, you can change a value throughout your program by changing a single symbol definition.
3. You can give names to values to make computations clearer.
4. You can make values available to other programs.

3.2.4.1. Selecting Valid Symbols

Follow these rules in selecting symbols:

1. Use only letters, numerals, dots (.), dollar signs ($), and percent signs (%). Macro will consider any other character (including blank) as a delimiter.
2. Do not begin a symbol with a numeral.
3. If you use a dot for the first character, do not use a numeral for the second. Do not use dots for the first two characters; doing so can interfere with Macro's created symbols.
4. Make the first six characters unique among your symbols. You can use more than six characters, but Macro will use only the first six.
5. Don't choose symbols composed of 1 to 4 letters followed by a percent sign; names of this form are reserved for new monitor calls.

Examples:

        velocity    Legal, only "veloci" is heeded by Macro.
        chg.vel     Legal, only "chg.ve" is heeded by Macro.
        chg vel     Illegal, looks like two symbols to Macro.
        chgVel      Legal.
        1stNum      Illegal, begins with numeral.
        .11111      Illegal, begins with dot-numeral.
        ..1111      Unwise, could interfere with created symbols.

3.2.4.2. Defining Symbols

You can define a symbol by making it a label or by giving its value in a direct-assignment statement. Labels cannot be redefined, but direct-assignment symbols can be redefined anywhere in your program.

You can also define special-purpose symbols called OPDEFs and macros using the pseudo-ops OPDEF and DEFINE.

A label is always a symbol with a suffixed colon. A label is in the first (leftmost) field of a Macro statement and has one of the forms:

        errFound:       Macro uses only "errfou".
        case1:          Legal label.
        OK:contin:      Legal; you can use more than one label at a location.
        case2::         Double colon declares the label to be global INTERNAL.
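
The symbol-selection rules of section 3.2.4.1 lend themselves to a small checker. The following is a minimal sketch in Python (not Macro) that applies those rules literally; classify_symbol and significant are hypothetical names. That letter case is not significant is an assumption drawn from the "errFound" example above, where Macro is said to use only "errfou".

```python
import re

# Rule 1: only letters, numerals, dot, dollar sign, and percent sign.
SYMBOL_CHARS = re.compile(r"[A-Za-z0-9.$%]+")


def classify_symbol(name):
    """Return 'legal', 'illegal', or 'unwise' according to the selection rules."""
    if not SYMBOL_CHARS.fullmatch(name):
        return "illegal"            # contains a delimiter such as a blank
    if name[0].isdigit():
        return "illegal"            # rule 2: begins with a numeral
    if name[0] == "." and len(name) > 1 and name[1].isdigit():
        return "illegal"            # rule 3: dot followed by a numeral
    if name.startswith(".."):
        return "unwise"             # rule 3: may clash with created symbols
    if re.fullmatch(r"[A-Za-z]{1,4}%", name):
        return "unwise"             # rule 5: reserved for new monitor calls
    return "legal"


def significant(name):
    """Return the part of a symbol that rule 4 says Macro actually uses."""
    return name[:6].lower()         # case folding is an assumption, see above


for name in ("velocity", "chg.vel", "chg vel", "chgVel", "1stNum", ".11111", "..1111"):
    print(f"{name:10} {classify_symbol(name):8} {significant(name)}")
```

A checker of this kind is mainly useful for catching rule 4 collisions: two names that differ only after the sixth character have the same significant part and would be the same symbol to Macro.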
When Macro processes the label, the symbol and the current value of the location counter are entered in the user symbol table. A reference to the symbol addresses the code at the label.

You cannot redefine a label to have a value different from its original value. A label is relocatable if the address it represents is relocatable; otherwise it is absolute.

You can define a direct-assignment symbol by associating it with an expression. A direct assignment is in one of the forms:

        symbol=expression - The symbol and the value of the expression are entered together in the user symbol table.
        symbol=:expression - The symbol and the value of the expression are entered together in the user symbol table and the symbol is declared INTERNAL.

You can redefine a direct-assignment symbol at any time; the new direct assignment simply replaces the old definition.

Note: If you assign a multiword value using direct assignment, only the first word of the value is assigned to the symbol. For example, A=asciz/abcdefgh/ is equivalent to A=asciz/abcde/, since only the first five characters in the string correspond to code in the first word.

3.2.4.3. Symbol-table Search Order

When you use a symbol in your program, Macro looks it up in the symbol tables. Normally Macro searches the macro table first, then the opcode table, and finally the user symbol table. However, if Macro has already found an operator in the current statement and is expecting operands, then it searches the user symbol table first.

3.2.4.4. Symbol Attributes

Each symbol in your program has one of the following attributes: local, INTERNAL global, or EXTERNAL global. This attribute is determined when the symbol is defined.

A local symbol is defined for the use of the current program only. You can define the same symbol to have different values in separately assembled programs. A symbol is local unless you indicate otherwise.

A global symbol is defined in one program, but is also available for use in other programs.
Its table entry is visible to all programs in which the symbol is declared global. A global symbol must be declared INTERNAL in the program where it is defined; it can be defined in only one program. In other programs sharing the global symbol, it must be declared EXTERNAL; it can be EXTERNAL in any number of programs.

To declare a symbol as INTERNAL global, you can:

1. Use the INTERN pseudo-op, e.g. INTERN flag1.
2. Insert a colon after the "=" in a direct assignment statement, e.g. flag2=:200.
3. Use an extra colon with a label, e.g. flag3::.
4. For subroutine entry points, use the ENTRY pseudo-op, e.g. ENTRY foo.

To declare a symbol as an EXTERNAL global, you can use the EXTERN pseudo-op, e.g. EXTERN flag4.

3.2.5. Expressions

You can combine numbers and defined symbols with arithmetic and logical operators to form expressions. You can nest expressions by using angle brackets. Macro evaluates each expression (innermost nesting levels first), and either resolves it to a fullword value, or generates a Polish expression to pass to Link.

3.2.5.1. Arithmetic Expressions

An arithmetic expression can include any number or defined symbol, and any of the following operators: +, -, *, /.

Examples (in which words, x, y, and z have been defined elsewhere):

        movei t3, words/5
        addi q2,
        addi q2, <+1>*5

3.2.5.2. Logical Expressions

A logical expression can include any number or defined symbol whose value is absolute, and any of the operators &, !, ^!, ^-. Each of the binary operations &, !, and ^! generates a fullword by performing the indicated operation over corresponding bits of the two operands. For example, a&b generates a fullword whose bit 0 is the result of a's bit 0 ANDed with b's bit 0, and so forth for all 36 bits.

3.2.5.3. Evaluating Expressions

Macro has a hierarchy of operations in evaluating expressions. In an expression without nests (angle brackets), Macro performs its operations in this effective order:

1. All unary operations and shifts: +, -, ^-, ^d, ^o, ^b, b (binary shift), _ (underscore shift), E.
2. Logical binary operations (from left to right): !, ^!, &.
3. Multiplication and division (from left to right): *, /.
4. Addition and subtraction (binary operations): +, -.

You can override this hierarchy by using angle brackets to show what you want done first. For example, suppose you want to calculate the sum of a and b, divided by c. You cannot do this with a+b/c because Macro will perform the division b/c first, then add the result to a. With angle brackets you can write the expression <a+b>/c to force Macro to add a and b first, then divide the result by c.

Expressions can be nested to any level. The innermost nest is evaluated first; the outermost last. Some examples of legal expressions (assuming that a1, b1, and c are defined symbols) are:

        a1+b1/5
        <a1+b1>/5
        ^-a1&b1^!c
        ^b101b17-^d98+6

An expression given in halfword notation has each half evaluated separately in a 36-bit register. Then the 18 low-order bits of each half are joined to form a fullword. For example, the expression <4,,6>/2 generates the value 000002 000003.

3.3. Pseudo-ops

A pseudo-op is a statement that gives the assembler information to allow it to assemble your program in the desired way. For example, the pseudo-op RADIX does not generate code itself, but tells Macro how to interpret numbers in your program. The pseudo-op EXP generates one word of code for each argument given with it.

To use a pseudo-op in your program, follow it with a space or tab, and any required or optional arguments or parameters.

This section describes the use and functions of each pseudo-op. The headings for each description, if applicable, are Format, Function, Examples, and Optional Notations.

3.3.1. ARRAY

Format    ARRAY sym[expression]

Expression is an integer value (to be interpreted in the current radix) indicating the number of words to be allocated; the expression cannot be EXTERNAL, relocatable, or negative.
Note that although the expression must be in square brackets, this use of the brackets is completely unrelated to literals.

Function  Reserves a block of storage whose length is the value of the expression, and whose location is identified by the symbol. Storage is allocated along with other variable symbols in the program, usually at the end. The allocated storage is not necessarily zeroed.

3.3.2. ASCII

Format    ASCII dtextd

where d = delimiter; the first nonblank character, whose second appearance terminates the text, and text = a string of text characters to be entered.

Function  Enters ASCII text in the binary code. Each character uses seven bits. Characters are left justified in storage, five per word, with bit 35 in each word set to 0, and any unused bits in the last word set to 0.

Examples

        ASCII /This is a string/
        ascii !ps:foo.txt!
        Ascii?foo?

Optional Notations: Omit the space or tab after ASCII. This is not allowed if the delimiter is a letter, number, dot, dollar or percent sign (i.e. a possible symbol constituent), or if the delimiter character is a control character.

Right-justified ASCII can be entered by using double quotes to surround up to five characters, as in: movei t1, "A".

3.3.3. ASCIZ

Exactly like ASCII, except that a trailing null is guaranteed, even if an extra word of nulls must be added. Most Tops-20 monitor calls expect strings to be in this format.

3.3.4. BLOCK

Format    BLOCK expression

where the expression is not EXTERNAL or relocatable, and evaluates to a positive integer.

Function  Reserves, at the point of invocation, a block of locations whose length is the value of the expression. The location counter is incremented by this value. The allocated locations are not necessarily zeroed.

Note that the BLOCK pseudo-op does not generate or store code; therefore it should not be used in a literal, since this will result in overwriting the reserved space during literal pooling.
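
When sizing a block that is to hold a text string, the relevant quantity is the number of words the string occupies under the ASCII or ASCIZ packing of sections 3.3.2 and 3.3.3. The following is a minimal sketch of that packing in Python (not Macro), assuming only the layout stated there: five 7-bit characters per word, left justified, with bit 35 and any unused bits set to 0. The name ascii_words is hypothetical.

```python
def ascii_words(text, asciz=False):
    """Return the list of 36-bit words that ASCII (or ASCIZ) would generate."""
    codes = [ord(ch) for ch in text]
    if any(code > 0o177 for code in codes):
        raise ValueError("not a 7-bit ASCII character")
    if asciz:
        codes.append(0)                      # the guaranteed trailing null
    words = []
    for start in range(0, len(codes), 5):
        group = codes[start:start + 5]
        group += [0] * (5 - len(group))      # unused character positions are 0
        word = 0
        for code in group:
            word = (word << 7) | code        # five 7-bit characters, left to right
        words.append(word << 1)              # bit 35 is always 0
    return words


for word in ascii_words("ABCDE", asciz=True):
    print(f"{word >> 18:06o} {word & 0o777777:06o}")

# Number of words an ASCIZ string of 512 characters occupies.
print(len(ascii_words("x" * 512, asciz=True)))
```

Note that a five-character ASCIZ string needs two words, the second being all nulls, which is why a buffer for n characters is usually sized as n/5+1 words rather than n/5.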
Examples

        BLOCK 500
        block <^d512/5+1>
        block <$nPag_9 - fooLen>

Optional Notations: Use the pseudo-op Z inside literals.

3.3.5. BYTE

Format    BYTE bytedef ... bytedef

        bytedef = (n)expression, ..., expression
        n = byte size in bits; n is a decimal expression in the range 1 to 36.

Function  Stores values of expressions in n-bit bytes, starting at bit 0 of the storage word. The first value is stored in bits 0 to n-1; the second in bits n to 2n-1, and so forth for each given value.

If a byte will not fit in the remaining bits of a word, those bits are zeroed and the byte begins in bit 0 of the next word. If a value is too large for a byte, it is truncated on the left. If the byte size is 0 or is missing (empty parentheses), a zero word is generated.

Examples

        V=2
        BYTE (6)5,0,,101,5,V

generates the storage value 050000 010502. The two commas indicate a null argument; the 101 (octal) is too large for the byte size and is left truncated.

        byte (6)7,0,1(9)7,0,1,"A"

generates two words: 070001 007000 and 001101 000000. Notice that "A" is right-justified in its 9-bit byte. Note that a comma before a left parenthesis will generate a null byte.

3.3.6. COMMENT

Format    COMMENT dtextd

d = delimiter; the first nonblank character, whose second appearance terminates the text.

Function  Treats the text between the delimiters as a comment. The text can include carriage returns to facilitate multiline comments.

Optional Notations: Omit the space or tab after COMMENT. This is not allowed if the delimiter is a valid character for identifiers, or is a control character. Use a semicolon (;) to make the rest of the line into a comment.

Be careful not to use the delimiter in the text of the comment, and avoid the use of nonprinting delimiters.

Example:

foo:    comment |

        Subroutine Foo

        This subroutine writes a poem in the style of a given poet.

        Enter with:
          t1/ Pointer to asciz name of poet.
          t2/ Destination designator for poem.
          t3/ Pointer to asciz string describing desired topic.

        Returns:
          +1: always, with t1, t2, and t3 updated appropriately.
        |

3.3.7. DEC

Format    DEC expression, ..., expression

Function  Defines the local radix for the line as decimal, then enters the value of each given expression in a fullword of code. The location counter is incremented by 1 for each expression.

Example

        DEC 10, 938, 512, 4.5, 6.03E-5

Optional Notation: Use the EXP pseudo-op and prefix ^d to each expression that must be evaluated in radix 10.

3.3.8. DEFINE

Format    DEFINE macroName(dArgList)<macroBody>

where

- macroName is a symbolic name for the macro being defined. This name must be unique among all macro and OPDEF symbols.
- dArgList is a list of dummy arguments.
- macroBody is the source code to be assembled when the macro is invoked.

Function  Defines the macro. See the section on macros.

3.3.9. END

Format    END expression

where expression is an optional operand that specifies, in the right half, the address of the first instruction to be executed (which can be EXTERNAL) and, optionally, in the left half, the length of the entry vector.

Function  Must be the last statement in a Macro program. Statements after END are ignored. The starting address is optional and normally is given only in the main program (since subroutines are called from the main program, they should not specify a starting address).

When the assembler first encounters an END statement, it terminates pass 1 and begins pass 2. The END terminates pass 2 on the second encounter, and unallocated literals and variables (e.g. ARRAYs) are assembled at the current location.

Examples:

        end
        end start
        end <3,,entVec>
        end

3.3.10. ENTRY

Format    ENTRY symbol, ..., symbol

Each symbol is the name of an entry point in a library subroutine.

Function  Defines each symbol in the list following the ENTRY pseudo-op as an INTERNAL symbol, and puts appropriate information in the .REL file to allow the symbols to be included in an index (such as that constructed by the MAKLIB program).
Each symbol must correspond to a label of the same name. Programs referring to these symbols must, of course, declare them EXTERNAL.

3.3.11. EXP

Format    EXP expression, ..., expression

Function  Enters the value of each expression in a fullword of code.

3.3.12. EXTERN

Format    EXTERN symbol, ..., symbol

Function  Identifies symbols as being defined in other programs. EXTERNAL symbols cannot be defined within the current program. At load time, the value of an EXTERNAL symbol is resolved by Link if you load a module that defines the symbol as an INTERNAL symbol; if you do not load such a module, Link gives an error message for the undefined global symbol(s).

3.3.13. IFx Group

Gives criterion and code for conditional assembly. A symbol or expression used to define the conditions for assembly must be defined before Macro reaches the conditional statement. If the value of such a symbol or expression is not the same on both assembly passes, a different number of words of code may be generated (resulting in a phase error).

The forms of the IF pseudo-op are listed below; in the first six forms, n is the value of the given expression.

        IFE expression,<code> - assemble code if n=0.
        IFN expression,<code> - assemble code if n not = 0.
        IFG expression,<code> - assemble code if n>0.
        IFGE expression,<code> - assemble code if n>=0.
        IFL expression,<code> - assemble code if n<0.
        IFLE expression,<code> - assemble code if n<=0.
        IF1 <code> - assemble code on Pass 1.
        IF2 <code> - assemble code on Pass 2.
        IFDEF symbol,<code> - assemble code if the symbol is defined as user-defined, an opcode, or a pseudo-op.
        IFNDEF symbol,<code> - assemble code if the symbol is not defined as user-defined, an opcode, or a pseudo-op. Code is also assembled if the symbol has been referenced, but is not yet defined. This can occur during pass 1.
        IFIDN <string1><string2>,<code> - assemble code if the strings are identical.
        IFDIF <string1><string2>,<code> - assemble code if the strings are different.
        IFB <string>,<code> - assemble code if the string contains only blanks and tabs.
        IFNB <string>,<code> - assemble code if the string does not contain only blanks and tabs.

Example:

        $cc=$cc+1               ; Count a character.
        ifg <$cc-5>,<           ; Word overflowed?
          $cc=0                 ; Yes, reset character counter
          $wc=$wc+1             ; and count a word.
        >

Optional notations: Omit angle brackets enclosing code for single-line conditionals. For IFIDN, IFDIF, IFB, and IFNB only: use a nonblank, nontab character other than < as the initial and terminal delimiters for a string (as in the pseudo-ops ASCII and ASCIZ); you can then include angle brackets in the string.

3.3.14. INTERN

Format    INTERN symbol, ..., symbol

Function  Declares each given symbol to be INTERNAL global; therefore its definition, which must be in the current program, is available to other programs at load time. Each such symbol must be defined as a label, a variable, or a direct assignment symbol.

OPDEF symbols can be declared INTERNAL, and thus be made available to other programs at load time. However, if the current program has another symbol (besides the OPDEF symbol) of the same name, the INTERNAL declaration will apply to that symbol rather than to the OPDEF symbol.

3.3.15. IOWD

Format    IOWD exp1,exp2

Function  Generates a word in a special format for use by all five pushdown stack instructions (adjsp, push, pop, pushj, popj), as well as for i/o instructions (hence its name). The left half of the assembled word contains the 2's complement of the value of exp1, and the right half contains the value exp2-1.

Example: Setting up a pushdown stack.

        stkLen=100
        array stack[stkLen]
        move p, [iowd stkLen,stack]

3.3.16. IRP

Format    IRP arg,<code>

where arg is one of the dummy arguments of the enclosing macro definition; you can only use IRP in the body of a macro definition.

Function  Generates one expansion of the code enclosed in angle brackets for each subargument of the string that replaces arg. Each occurrence of arg within the expansion is replaced by the subargument currently controlling the expansion (see the section on macros).
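
The effect of IRP can be pictured as repeated text substitution. The following is a minimal sketch in Python (not Macro) of that idea only; expand_irp is a hypothetical name, subarguments are assumed to be separated by plain commas, and nested angle brackets, concatenation with ', and STOPI are not modelled.

```python
import re

# Characters that can be part of a symbol; used to match the dummy
# argument only where it stands alone as a whole symbol.
SYMBOL_CHAR = "A-Za-z0-9.$%"


def expand_irp(dummy, argument, body):
    """Return one copy of body per subargument, with dummy replaced in each."""
    subargs = [part.strip() for part in argument.split(",")]
    pattern = re.compile(
        "(?<![" + SYMBOL_CHAR + "])" + re.escape(dummy) + "(?![" + SYMBOL_CHAR + "])"
    )
    # The replacement is passed as a function so that it is taken literally.
    return [pattern.sub(lambda match, text=sub: text, body) for sub in subargs]


for line in expand_irp("arg", "t1, t2, t3", "push p, arg"):
    print(line)
```

The whole-symbol match matters: without it, a dummy argument named a would also be replaced inside an opcode such as add.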
Example:

        define sum(a,b)<
          movei q, 0
          irp a, < add q, a >
          movem q, b
        >

        sum (<x,y,z>,foo)

This invocation of sum is replaced by:

        movei q, 0
        add q, x
        add q, y
        add q, z
        movem q, foo

3.3.17. IRPC

Format    IRPC arg,<code>

Function  Generates one expansion of the bracketed code for each character of the string that replaces arg (a dummy argument of the enclosing macro definition). Each occurrence of arg within the expansion is replaced by the character currently controlling the expansion. Concatenation and line continuation are not allowed across end-of-IRPC, since a carriage return and linefeed are appended to each expansion.

Example:

        define deposit (string,bp)<
          irpc string, <
            movei q, "string"
            idpb q, bp
          >
        >

        deposit <foo>,x

expands to:

        movei q, "f"
        idpb q, x
        movei q, "o"
        idpb q, x
        movei q, "o"
        idpb q, x

3.3.18. LIT

Format    LIT

Function  Assembles literals beginning at the current address. The literals assembled are those found since the previous LIT, or since the beginning of the program, whichever is later. The location counter is incremented by 1 for each word assembled.

A literal found after the LIT is not affected. It will be assembled at the next following LIT, or at the END statement, whichever is earlier. Literals having the same value are collapsed in Macro's literal pool.

3.3.19. OCT

Format    OCT expression, ..., expression

Function  Defines the local radix for the line as octal; the value of each expression is entered in a fullword of code. The location counter is incremented by 1 for each expression. Equivalent to using EXP with each argument prefixed by ^o.

3.3.20. OPDEF

Format    OPDEF symbol[expression]

Function  Defines the symbol as an operator equivalent to the expression, giving the symbol a fullword value. When the operator is later used with operands, the accumulator fields are added, the indirect bits are ORed, the memory addresses are added, and the index register addresses are added.

An OPDEF can be declared INTERNAL, using the INTERN pseudo-op.
However, if a symbol of the same name exists, the INTERNAL declaration will apply only to that symbol, and not to the OPDEF.

Although the expression portion of an OPDEF must be in square brackets, this use of the brackets is completely unrelated to literals or literal handling.

3.3.21. POINT

Format    POINT byteSize,address,bitPosition

Function  Generates a byte pointer word for use with the machine language byte instructions adjbp, ldb, ibp, ildb, and idpb. The byteSize gives the decimal number of bits in the byte, and is assembled in bits 6 to 11 of the storage word. Address gives the location of the word containing the byte, and is assembled in bits 13 through 35 (with normal indirect, index, and offset fields). BitPosition gives the position (in decimal) of the rightmost bit in the byte. Macro places the value <35-bitPosition> in bits 0-5 of the storage word. The default bitPosition is -1, so that the byte increment instructions ibp, ildb, and idpb will operate on the first byte in the address word.

3.3.22. PRGEND

Format    PRGEND

Function  Replaces the END statement for all except the last program of a multiprogram assembly. PRGEND closes the local symbol table for the current module.

You can use PRGEND to place several small programs into one file to save space and disk accesses. The resulting binary file can be loaded in library search mode (refer to the Link and MakLib manuals for more information about this).

PRGEND is not allowed in macros.

Like END, PRGEND causes the assembly of all unassembled literals. In a file containing PRGENDs, using more than one LIT pseudo-op in any but the last program produces unpredictable results.

A starting address for the program terminated by PRGEND may optionally be given as an argument to PRGEND.

3.3.23. PRINTX

Format    PRINTX text

Function  Causes text to be output at the terminal during assembly, normally once for each pass.

Example:

        define prVal (v,msg) if2, < $$len=.-buffer prVal \$$len, ifg <.-20000>,> >

3.3.24. PURGE

Format    PURGE symbol, ..., symbol

Function  Deletes symbols from the symbol tables. Normally used at the end of a program to fix multiply defined global symbol errors that occur at Link time, or to remove unwanted symbols from DDT typeout.

If you use the same symbol for both a macro name or OPDEF and a label, a PURGE statement deletes the macro name or OPDEF. Repeating the PURGE statement then purges the label.

3.3.25. RADIX

Format    RADIX expression

Function  Sets the radix to the value of the expression, which is interpreted in decimal, and must be in the range 2 to 10. An implicit RADIX 8 statement begins each Macro program. All numerical expressions that follow (up to the next RADIX pseudo-op) are interpreted in the given radix unless another local radix is indicated via ^d, ^o, ^b, OCT, DEC, etc.

Ordinarily, numbers outside the range of the given radix are not interpreted. For example, in radix 8, the number 99 causes an error. However, a single-digit number is interpreted in any case. For example, in radix 8, the number 9 is recognized as octal 11.

3.3.26. REPEAT

Format    REPEAT expression,<code>

Function  Generates the bracketed code n times, where n is the value of the expression, and must be a nonnegative integer. REPEAT statements can be nested to any level. Line continuation is not allowed across end-of-REPEAT, since a carriage return and linefeed are appended to each expansion of the code.

Note that REPEAT 0,<code> is logically equivalent to a false conditional (and is often used to "comment out" a large section of code), and REPEAT 1,<code> is logically equivalent to a true conditional.

3.3.27. .REQUIRE

Format    .REQUIRE filespec

Function  Causes the specified file to be loaded automatically at Link time. The filespec must not include a file type, and it must be a Tops-10 (not Tops-20!) file specification.

3.3.28. SEARCH

Format    SEARCH tableName(fileName), ..., tableName(fileName)

Function  Defines a list of symbol tables for Macro to search if a symbol is not found in the current symbol table. A maximum of ten tables can be specified. Tables are searched in the order specified.

When the SEARCH pseudo-op is seen, Macro checks its internal UNIVERSAL table for a memory-resident UNIVERSAL of the specified name. If no such entry is found, Macro reads in the symbol table using the given file specification. If no file specification is given, Macro reads tableName.UNV from the connected directory, and on failure tries the same filename in UNV: and SYS:, in that order.

When all the specified files are found, Macro builds a table for the search sequence. If Macro cannot find a given symbol in the current symbol table, the UNIVERSAL tables are searched in the order specified. When the symbol is found, it is moved into the current symbol table. This procedure saves time (at the expense of core) on future references to the same symbol.

A UNIVERSAL file can search other UNIVERSAL files, provided all names in the search list have been assembled.

Optional notation: Omit the filename and its enclosing parentheses. Macro then looks on DSK:, UNV:, and SYS: (in that order) for tableName.UNV.

3.3.29. SIXBIT

Format    SIXBIT dtextd

d = delimiter (first nonblank character, whose second appearance terminates the text)

Function  Enters strings of text characters in 6-bit format. Six characters per word are left justified in sequential storage words. Any unused bits are set to zero.

Lowercase letters in SIXBIT text strings are treated as uppercase. Otherwise, only the SIXBIT character set is allowed. The SIXBIT character set consists of all the printable ASCII characters except the lowercase letters and the following symbols: `{|}~. The values of the SIXBIT characters are 0-77 (octal); they are offset from the corresponding ASCII values by octal 40, e.g. ASCII "A" is 101, and SIXBIT "A" is 41.
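
The SIXBIT conversion is simple enough to state as code. The following is a minimal sketch in Python (not Macro), assuming only the rules given above: lowercase is treated as uppercase, each code is the ASCII value less octal 40, and six characters are packed per word, left justified. The name sixbit_words is hypothetical.

```python
def sixbit_words(text):
    """Return the list of 36-bit words that SIXBIT would generate for text."""
    codes = []
    for ch in text.upper():                  # lowercase is treated as uppercase
        code = ord(ch) - 0o40                # SIXBIT is ASCII offset by octal 40
        if not 0 <= code <= 0o77:
            raise ValueError(f"{ch!r} is not in the SIXBIT character set")
        codes.append(code)
    words = []
    for start in range(0, len(codes), 6):
        group = codes[start:start + 6]
        group += [0] * (6 - len(group))      # unused bits are set to zero
        word = 0
        for code in group:
            word = (word << 6) | code        # six 6-bit characters, left to right
        words.append(word)
    return words


for word in sixbit_words("foobaz"):
    print(f"{word >> 18:06o} {word & 0o777777:06o}")
```

Because six characters fill a word exactly, a name of up to six characters becomes a single 36-bit value that can be moved or compared with one instruction.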
Historically, this was a popular format for the storage of strings whose maximum length was 6, such as filenames or program names under Tops-10, because it allowed word comparisons and transfers instead of slower byte manipulations.

Optional Notation: Right-justified SIXBIT can be entered by using single quotes to surround up to six characters, as in

        move t1, ['FOOBAZ']

3.3.30. STOPI

Ends an IRP or IRPC before all subarguments or characters are used. The current expansion is completed, but no new expansions are started. STOPI can be used with conditionals inside IRP or IRPC to end the repeat if the given condition is met.

3.3.31. SUBTTL

Format      SUBTTL String

Function    Defines a subtitle (of up to 80 characters) to be printed at the top of each page of the listing file until the end-of-listing or until another SUBTTL statement is found. The initial SUBTTL usually appears on the second line of the first page of the input file, immediately following the TITLE statement. For subsequent SUBTTL statements, the following rule applies: if the new SUBTTL is on the first line of a new page, then the new subtitle appears on that page; if not, the new subtitle appears on the next page.

Even if you do not plan to generate listing files, SUBTTL is still a useful documentation device, and is recognized by certain programming aids.

3.3.32. TITLE

Format      TITLE String

Function    Gives the program name and a title to be printed at the top of each page of the program listing. The first characters (up to 6) are the program name; the program is saved by this name unless another is explicitly given in the 'save' command. In addition, the name is used when debugging with DDT to gain access to the program's symbol table.

Only one TITLE is allowed per module (file or section of file delimited by PRGEND's). The TITLE statement usually appears as the first line of a program. If no TITLE statement is used, the assembler inserts the program name ".MAIN".

3.3.33.
XWD

Format      XWD leftHalf,rightHalf

Function    Enters two halfwords in a single storage word. Each half is formed in a 36-bit register, and the low-order 18 bits are placed in the halfword. The high-order bits are ignored.

Optional Notation: leftHalf,,rightHalf

3.3.34. Z

Format      Z accumulator, address

Function    Z is treated as if it were the null machine language mnemonic (zero opcode). An instruction word is formed with zeros in bits 0 to 8. The rest of the word is formed from the accumulator and address. If the accumulator and address fields are omitted, a zero word is assembled.

3.4. Macro Statements and Statement Processing

A Macro statement has one or more of the following: a label, an opcode, zero or more operands, and a comment. The general form of a Macro statement is:

        label:  operator operand, operand       ; Comment.

A carriage return ends the statement.

Direct assignment statements receive special handling. Processing of macros is not discussed in this section because a macro call produces text substitution. After substitution, the text is processed as described in this section. Macros are discussed in their own chapter.

3.4.1. Labels

A label is always a symbol with a suffixed colon. The assembler recognizes a label by finding the colon. If a statement has labels (you can use more than one), they must be the first elements in the statement.

A label can be defined only once; its value is the address of the first word of code generated after it.

Since a label gives an address, the label can be either absolute or relocatable.

A label is a local symbol by default, but you can declare a label to be INTERNAL global or EXTERNAL global.

3.4.2. Operators

After processing any labels, the assembler views the first group of following nonblank, nontab characters as a possible operator. An operator is one of the following:

1. A mnemonic symbol defined by Macro to stand for a machine opcode (see Chapter 2).

2. A user-defined operator, such as an OPDEF or macro invocation.
3. A Macro pseudo-op.

If the characters found do not form one of the above, Macro views them as an expression.

An operator is ended by the first non-alphanumeric character that is not a ., $, or %. If it is ended by blank or tab, operands may follow; if it is ended by a semicolon, there are no operands and the comment field begins; if it is ended by a carriage return, the statement ends and there are no operands or comments.

3.4.3. Operands

After processing labels and the operator, if any, the assembler views as operands all characters up to the first unquoted semicolon or carriage return. Unquoted commas delimit the operands.

The operator in a statement determines the number (none, one, two, or more) and kinds of permitted or required operands. Any expected operand not found is interpreted as null. An operand can be any expression or symbol appropriate for the operator.

Examples:

        loop:   move t1, x
                %print ,

In the first line, t1 and x are operands. In the second line, a macro, %print, is invoked with two operands. Each of these operands is quoted by enclosing it in angle brackets; this is necessary because each operand contains an internal comma.

3.4.4. Comments

The first unquoted semicolon in a statement begins the comment field. You can use any ASCII characters in a comment; however, angle brackets in a comment may produce unpredictable results (such as unexpected termination of a macro definition or conditional-assembly code) and should be avoided.

If the first nonblank, nontab character in a line is a semicolon, the entire line is a comment. You can also enter a full line of comment with the pseudo-op REMARK, or a multiline comment with the pseudo-op COMMENT.

Comments do not affect binary program output.

3.4.5. Statement Processing

Macro processes your program as a linear stream of data. During Pass 1 of an assembly, Macro may find references to symbols not yet defined in the user symbol table.
Whenever a symbol is defined, it is entered in the table with its value, so that on Pass 2 all definitions can be found in the table. The values then replace the symbols in the binary code that is generated.

Note: Delayed definition is allowed only for labels and direct-assignment symbols. A symbol that contributes to code generation (for example, an OPDEF, a macro, or a REPEAT index) must be defined before any reference to it.

Statement processing proceeds as follows:

1. Labels are found and entered in the user symbol table.

2. The next characters up to the first unquoted semicolon, blank, tab, comma, or equal sign are processed:

   a. Equal sign: the preceding characters form a symbol, and the following characters form an expression. The symbol and the value of the expression are entered in the user symbol table.

   b. Other delimiter: the preceding characters form an expression or an operator. If an operator, it is found in a table and assembled. If an expression, its value is assembled.

      If the operator takes operands, the next characters up to the first unquoted semicolon or carriage return form operands. Unquoted commas delimit operands. For each operand, leading and trailing blanks and tabs are ignored. Operands are evaluated and assembled for the given operator.

3. The first unquoted semicolon ends processing of the line. Any further characters up to the first carriage return are comment.

4. The first unquoted carriage return ends the statement. Any following characters begin a new statement.

3.4.6. Assigning Addresses

Macro normally (and by default) assembles statements with relocatable addresses. Assembly begins with the zero storage word and proceeds sequentially. Each time Macro assembles a word of binary code, it increments its location counter by 1.

A mnemonic operator generates one word of binary code. Direct assignment statements and some pseudo-ops do not generate any binary code. Some pseudo-ops generate one or more words of binary code.
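The counting rule above can be followed through a short fragment. This is only a sketch: the symbols t1, size, start, table, and count are names invented for the illustration, and the locations in the comments are counted by hand on the assumption that the fragment stands at the very beginning of a relocatable program.

```
t1=1                            ; direct assignment: no code, counter stays at 0
size=3                          ; direct assignment: no code, counter stays at 0

start:  move    t1, count       ; mnemonic: one word, at relocatable 0
        addi    t1, size        ; mnemonic: one word, at relocatable 1
table:  exp     1, 2, 3         ; pseudo-op: three words, at relocatable 2 to 4
count:  exp     0               ; pseudo-op: one word, at relocatable 5
```

Note that the reference to count in the first instruction precedes its definition; this is the delayed definition of a label that Pass 2 resolves.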
You can control address assignment by setting the assembler's location counter using the pseudo-ops LOC and RELOC.

You can also reference addresses relative to the location counter by using the dot symbol (.). For example, the expression .-1 used as an address refers to the location immediately preceding the current location. Use of this construction is not encouraged (see Chapter 8), because you can cause an incorrect address to be assembled by adding or removing statements within the range of a .+n expression. Labels should be used except when n = plus or minus 1.

3.4.7. Machine Instruction Mnemonics and Formats

An instruction is in one of the following forms:

        mnemonic accumulator, address
        mnemonic accumulator,
        mnemonic address

where "mnemonic" evaluates to a machine operation code (opcode), "accumulator" is an accumulator (or register) address, and "address" is a memory address, possibly modified by indexing, indirect addressing, or both.

The accumulator address can be any expression whose value is in the range 0 to 17 (octal).

The memory address gives a location in memory, and can be any expression or symbol whose value is an integer in the range 0 to octal 777777.

You can modify the memory address by indirect addressing, indexed addressing, or both. For indirect addressing, prefix an at-sign (@) to the memory address in your program. For indexed addressing, suffix an index register address in parentheses to the memory address in your program. This address can be any expression or symbol whose value is an integer in the range 1 to octal 17.

Note: To assemble the index, Macro places the index register address in a fullword of storage, swaps its halfwords (as it does to any expression in parentheses), and then adds the swapped word to the instruction word.
Example (in which t1=1, temp=100, and x=3):

        add t1, @temp(x)

This generates the following binary code:

        instruction   accumulator   indirect   index      memory address
        code                        bit        register
        -----------   -----------   --------   --------   -----------------------
        010 111 000   0 001         1          0 011      000 000 000 001 000 000

The mnemonic ADD has the octal code 270, and this is assembled into bits 0 to 8. The accumulator goes into bits 9 to 12. Since the @ appears with the memory address, bit 13 is set to 1. The index register goes into bits 14 to 17. Finally, the memory address is assembled into bits 18 to 35.

If any element is missing from a primary instruction, zeros are assembled in its instruction word field.

3.4.8. Mnemonics with Implicit Accumulators

A few mnemonics set bits in the accumulator field as well as in the instruction field. Therefore these mnemonics do not take accumulator operands, and are of the form

        mnemonic address

For example, JFOV gives the octal code 25504; JFCL gives 255. They both give the opcode 255 in bits 0 to 8, but JFOV also sets the accumulator (bits 9 to 12) to binary 0001. This makes JFOV 100 equivalent to JFCL 1,100.

3.5. Using Macros

A macro is a sequence of statements defined and named in your program. When you invoke a macro (by mentioning its name in your program), the sequence of statements from its definition is generated in place of the invocation, possibly with "arguments" plugged in.

By using macros with arguments, you can generate passages of code that are similar, but whose differences are controlled by the arguments. This saves repetition in building a source file.

3.5.1. Defining Macros

Before you can invoke a macro, you must define it. You can also redefine a macro if you wish; the new definition simply replaces the old one. To define (or redefine) a macro, use the pseudo-op DEFINE:

        DEFINE macroName (dArgList) <macroBody>

where "macroName" is the name of the macro, "dArgList" is an optional list of "dummy arguments", and "macroBody" is a sequence of statements.
The macroName is a symbol constructed according to the rules for symbols.

The optional dummy-argument list can give one or more dummy-argument symbols through which values are passed to the sequence of statements. If a macro definition has dummy arguments, they must be enclosed in parentheses. Use commas as delimiters between dummy arguments. For each dummy argument, leading and trailing spaces and tabs are ignored.

The macroBody is the sequence of statements you want to generate when you invoke the macro. The macroBody must be enclosed in angle brackets.

Here is a sample macro definition:

        define vMag (adrs,len) <        ;; Vector Magnitude macro.
                move t1, adrs           ;; Get first component.
                fmp t1, t1              ;; Square it.
                move t2, adrs+1         ;; Get second component.
                fmp t2, t2              ;; Square it
                fad t1, t2              ;; and add in square of first.
                move t2, adrs+2         ;; Third component...
                fmp t2, t2              ;; squared...
                fad t1, t2              ;; and added in to the sum.
                call fSqrt              ;; Go get the floating square root
                movem t1, len           ;; and store the result.
        >

Note the double semicolons; this device allows comments to be included in macro definitions but omitted from macro expansions.

3.5.2. Invoking Macros

You can invoke a macro by putting its name in your program. Recall that you must define the macro before you can invoke it. You can use the macroName as a label, an operator, or an operand.

If the macro's definition has dummy arguments, the macro invocation can have arguments. The arguments passed to the macro are inserted into the defined sequence of statements as it is generated. The first passed argument replaces the first dummy argument; the second passed argument replaces the second dummy argument; this treatment continues for each argument passed. Any missing arguments are passed as nulls (zeros) or filled in by default arguments (see below).

Note: if FOO is a macro with four dummy arguments, the call FOO (a,,c) passes a and c as the first and third arguments.
The second argument is passed as null; it is not considered missing and cannot be replaced by a default argument. The fourth argument is missing and will be replaced by a default argument if one has been defined; otherwise it will be passed as nulls.

After argument substitution, the defined sequence of statements replaces the macroName and the argument list in the source text. For example, suppose you have defined vMag(adrs,len) as shown in Section 3.5.1 above, and vMag is invoked in your program as follows:

        getMag: vMag coord, q1

Then the effect would be as if you had written the following code in your program in place of the invocation of vMag:

        getMag: move t1, coord
                fmp t1, t1
                move t2, coord+1
                fmp t2, t2
                fad t1, t2
                move t2, coord+2
                fmp t2, t2
                fad t1, t2
                call fSqrt
                movem t1, q1

This code is called the "macro expansion"; the invocation of vMag has been replaced by the macroBody from the definition of vMag, with the actual arguments from the invocation (coord and q1) replacing the dummy arguments from the definition (adrs and len) throughout the text of the macroBody.

3.5.3. Macro Invocation Format

In a macro invocation, delimit the macroName with one or more blanks or tabs. If the macro has arguments, the first nonblank, nontab character begins the argument list.

Each argument ends with a comma, a carriage return, or a semicolon. These three characters cannot be used within arguments unless enclosed by special quoting characters. Leading spaces and tabs are stripped from each argument unless they are within special quoting characters. Embedded spaces and tabs are not stripped.

Examples:

        %erMsg eh?,errLoc

The macroName is %erMsg; the arguments are the strings "eh?" and "errLoc".

        %erMsg ,errLoc

Here the first argument is quoted, because it contains commas.

        %print ,< x y z >

Here the first argument is quoted because it contains commas, and the second argument is quoted because it contains carriage returns.

3.5.4.
Quoting Characters in Macro Arguments

The special quoting characters for macro arguments are:

        < >     Angle brackets
        ( )     Parentheses
        [ ]     Square brackets
        " "     Doublequotes (but not single quotes (apostrophes))

Any character, including the semicolon (;), enclosed in special quoting characters is treated as a regular character. If one of the special quoting characters is to be passed as a regular character, it must be enclosed by different special quoting characters.

Here are the rules for macro argument handling. In the examples, "foo" is assumed to be a defined macro:

1. The special quoting characters are not argument delimiters. They only tell the assembler to treat the enclosed characters as regular characters.

        foo c has one argument: c.
        foo c,d has two arguments: c and c.

2. With the two exceptions explained below, special quoting characters are always included in passed arguments.

        foo a,(b,c) has two arguments: a and (b,c).
        foo [xwd 1,L1]-1(ac) has one argument: [xwd 1,L1]-1(ac).
        foo "(",0 has two arguments: "(" and 0.

   Exceptions:

   a. If the first character of the argument list is a left parenthesis, then it and its matching right parenthesis delimit the argument list. They are not treated as special quoting characters and are not included in passed arguments. All nested quoting characters except angle brackets are disabled. After stripping the outer parentheses, angle brackets are handled as described in Exception b, below.

        foo (a,b,c) has 3 arguments: a, b, and c.
        foo (?Length > 132) has one argument: ?Length > 132.
        foo ([a,b]) has two arguments: [a and b].
        foo (<a,b>) has one argument: a,b.

   b. If a left angle bracket is the first character of the argument list, or the first character after an unquoted comma, then it and its matching right angle bracket are treated as special quoting characters, but are not included in passed arguments.

        foo <a,b>,c has two arguments: a,b and c.
        foo c,<a,b> has two arguments: c and a,b.

3.5.5.
Nesting Macro Definitions

You can define a macro within the body of another macro definition. The nested macro is not defined to the assembler until the enclosing macro is invoked. See the example in 3.5.6.

3.5.6. Concatenating Macro Arguments

The apostrophe (') is the concatenation operator for macro calls. If you insert an apostrophe immediately before or after a dummy argument in the body of a macro, the assembler removes it at invocation. This removal joins (concatenates) the passed argument to the neighboring character in the generated text.

If the apostrophe precedes the dummy argument, the passed argument is suffixed to the preceding character; if the apostrophe follows the dummy argument, the passed argument is prefixed to the following character.

You can use more than one apostrophe with a dummy argument. In this case only apostrophes next to the dummy argument will be removed (at most one from each side). Other apostrophes are treated as regular characters in the macroBody.

The following example shows the treatment of apostrophes on both sides of the dummy argument, and of double apostrophes:

        define o (prefix,midfix) <
                define ocomp (suffix) <
                        prefix'o'midfix''suffix
                >
        >

The invocation "o a,j" generates:

        aoj'suffix

because when the assembler replaces "prefix" with "a", the apostrophe following is removed to form "ao". When "j" replaces "midfix", the preceding apostrophe and first following apostrophe are removed to form "aoj'suffix". Now the invocation "ocomp le" generates "aojle" since the apostrophe is removed to join "aoj" and "le".

3.5.7. Default Arguments and Created Symbols

Ordinarily, an argument missing from a macro invocation is passed as nulls. For example, the macro defined by

        define words (a,b,c) <
                exp a,b,c
        >

when invoked by "words 1,1" generates three words containing 1, 1, and 0, respectively.

You can, however, alter this handling by specifying default values other than nulls, or by using created symbols.

3.5.7.1.
Specifying Default Values

If you want a missing argument to default to some value other than nulls, you can specify the default value in your DEFINE statement. Do this by inserting the default value in angle brackets immediately after the dummy argument. For example, the macro defined by

        define words (a,b<222>,c<333>) <
                exp a, b, c
        >

when invoked by "words 1,1" generates three words containing 1, 1, and 333, respectively.

An argument passed as nulls by consecutive commas is NOT considered missing and cannot cause a default value to be supplied. Therefore missing arguments can occur only at the end of the list of passed arguments.

3.5.7.2. Created Symbols

A symbol used as a label in a macroBody must be different for each call of the macro, since duplicate labels are not allowed. Therefore for each call a different symbol for the label must be passed as an argument.

If you do not refer to such a label from outside the macro, you can simply let the assembler provide a new label for each call. This label is called a created symbol, and is of the form ..nnnn where nnnn is a 4-digit number.

To use a created symbol in place of a passed argument, use the percent sign (%) as the first character of the dummy argument in your DEFINE statement. The assembler then creates a symbol for use in the macro expansion if that argument is missing from a call to the macro.

If you provide an argument in the call, the passed argument overrides the created symbol. The argument is determined to be missing from, or present in, the macro invocation in the same way as in 3.5.7.1, i.e. only a trailing argument can be missing.

Example:

        define compare (test,save,index,%loc) <
        %loc:   move save, test
                setz index,
                came save, table(index)
                jrst %loc
        >

        compare t1, t2, t3

expands to:

        ..0001: move t2, t1
                setz t3,
                came t2, table(t3)
                jrst ..0001

while

        compare t1, t2, t4, foo

expands to:

        foo:    move t2, t1
                setz t4,
                came t2, table(t4)
                jrst foo

3.5.8.
Indefinite Repetition

The pseudo-ops IRP, IRPC, and STOPI give a convenient way to repeat all or part of a macro; you can change arguments on each repetition if you wish, and the number of repetitions can be computed at assembly time. You can use these three pseudo-ops only within the body of a macro definition.

To see how IRP works, assume the macro definition

        define doEach (a) <
                IRP a, <a>
        >

The invocation "doEach <alpha,beta,gamma>" produces the code:

        alpha
        beta
        gamma

because each subargument passed to IRP generates one repetition of the code. Notice that the range of IRP must be enclosed in angle brackets.

Using angle brackets in the invocation of doEach is critical, since they make the string "alpha,beta,gamma" a single argument for IRP. IRP then sees the commas as delimiting subarguments.

IRPC is similar to IRP, but an argument passed to IRPC generates one repetition for each character of the argument. See 3.3.17 for an example of IRPC.

STOPI ends the action of IRP or IRPC after assembly of the current expansion. You can use STOPI with a conditional assembly to calculate a stopping point during assembly. Here's an example, in which IRPC is used to generate code to copy either a whole string or the first five characters, whichever is shorter:

        define copy5 (string, dest) <
                $count=5
                irpc string, <                  ;; Get a character from String.
                        movei t1, "string"      ;; Copy it.
                        idpb t1, dest           ;; ...
                        $count=$count-1         ;; Count it.
                        ife $count, <stopi>     ;; If we've done 5, then quit.
                >
        >

3.5.9. Alternate Interpretations of Characters Passed to Macros

The normal argument passed by a macro call is simply the string of characters given with the call. Macro offers three alternate interpretations of the passed argument.

- If you prefix a backslash (\) to an expression argument, the argument passed is the ASCII numeric character string giving the value of the expression. See 3.3.23 for a useful example.
- If you prefix a backslash-apostrophe (\') to an expression argument, the argument passed is the string whose value is the SIXBIT string with the integer value of the expression.

- If you prefix backslash-doublequote (\") to an expression argument, the argument passed is the string whose value is the ASCII string with the integer value of the expression.

Illustration: a macro "foo" that just replaces itself by its argument can work in many ways.

        define foo (a) < a >

Sample invocations of foo; assume the following symbol definitions:

        Z=60
        ZZ='SIXBIT'
        ZZZ="ASCII"

        Invocation      Expansion
        ----------      ---------
        foo 60          60
        foo \60         60
        foo \'60        P
        foo \"60        0
        foo Z           Z
        foo \Z          60
        foo \'Z         P
        foo \"Z         0
        foo ZZ          ZZ
        foo \ZZ         635170425164
        foo \'ZZ        SIXBIT
        foo ZZZ         ZZZ
        foo \ZZZ        203234162311
        foo \"ZZZ       ASCII

3.6. Errors and Messages

Macro has three kinds of messages:

1. Informational messages

2. Single-character error codes

3. MCRxxx (where xxx is a 3-letter mnemonic code)

3.6.1. Informational Messages

These are found at the end of listings, and some of them are printed at your terminal. The only truly useful one is:

        UNASSIGNED DEFINED AS IF EXTERNAL

This message lists symbols that were not defined. A symbol can be undefined for several reasons:

1. You forgot to define it, e.g. you referred to a label that you forgot to include in your program.

2. You spelled it wrong either when defining it or when referring to it.

3. You are referring to a symbol defined in another module, but you forgot to declare it EXTERNAL in the current module.

3.6.2. Single-Character Error Codes

These cryptic and uninformative codes are printed at your terminal when Macro encounters various kinds of errors.

        A   Argument error in pseudo-op.
        D   The statement refers to a multiply defined symbol.
        E   Improper use of EXTERNAL symbol.
        L   Literal generates less than 1 or more than 99 words of code.
        M   Symbol has already been defined; it will retain its first definition.
        N   Number error.
        O   Opcode undefined, assembled as zeros.
        P   Phase error. In general, the assembler generates the same number of locations on Pass 1 and Pass 2. Any discrepancy causes a phase error.
        Q   Questionable. A broad class of warnings issued when the assembler finds ambiguous language.
        R   Relocation error.
        U   Undefined Symbol.
        V   Symbol used to control the assembler is undefined.
        X   Error in defining or calling a macro during Pass 1.

3.6.3. MCRxxx Messages

These appear at your terminal during assembly, and are followed by an English phrase that explains the problem, e.g.

        MCRCFU - CANNOT FIND UNIVERSAL

The MCRxxx messages should be more or less self-explanatory.

Monitor Calls Page 73

4. Introduction to Tops-20 Monitor Calls

4.1. Introduction

This document contains all the information you need to do certain basic monitor calls (mostly those for input/output) from assembly language programs on the DECSYSTEM-20. Only a few of the many existing monitor calls are described, and these are not always described in complete detail. Any information not covered here can be found in the most recent edition of the DECSYSTEM-20 Monitor Calls Reference Manual.

Knowledge of the KL10 (DECSYSTEM-20) instruction set, and of an assembler (preferably Macro-20), is assumed.

All numbers in the following text, except bit positions (which are always decimal), are octal (base 8) unless otherwise noted.

4.2. General Information

Monitor calls are invocations of subroutines in the Tops-20 monitor. A DEC-20 monitor call is known as a JSYS (Jump to System). All JSYS's have opcode 104, which is trapped by the monitor, which in turn looks at the address field of the instruction to find out which JSYS is being called.

The following conventions apply to all JSYS's:

Arguments for the JSYS are placed in accumulators (AC's). The first argument is in AC1, the second in AC2, and so forth up to a maximum of four AC's. If more than four arguments need to be passed, they are placed in an argument block pointed to by an AC.
Results are also returned in AC's 1-4; if more than 4 results are to be returned, they are returned in a block of memory whose address was specified in the calling sequence. Any AC which does not return a result should have the same contents after the call as before.

After execution of the JSYS, control is returned to the caller at one of two locations. The +1 return (calling location plus 1) is often used to indicate failure of the JSYS to perform its intended function, and an error code is stored somewhere to indicate the exact cause of the failure. The +2 return is used to indicate successful completion of the JSYS. However, some JSYS's have only a single return to the instruction following the call (+1) regardless of success or failure; this is the newer convention.

For this reason, it is generally not wise to depend on the +1/+2 convention; another mechanism has been provided which should be used instead: erjmp/ercal. Erjmp (which assembles as 'jump 16,') causes a jump to the specified location if the JSYS, which must precede it immediately, failed. Ercal ('jump 17,') works in the same way except it 'calls' rather than 'jumps'. Each JSYS, upon encountering an error, inspects the contents of the instruction following the JSYS call. If it is an erjmp or ercal, it takes the indicated action; if not, it returns +1 or +2 according to its definition.

Every JSYS error has an associated error message, which provides a one-line English description of the error, e.g.

        ?Directory access privileges required

The erring JSYS does not print the message; other JSYS's are provided to print these error strings in a standard way (on the left margin, preceded by a question mark).

In general, recovery from a JSYS error involves printing the error message and then either halting or going to some special section of code that takes corrective action.
Since these cases are overwhelmingly common, let us assume that a special macro, called %jsErr, has been provided to take care of them. It has the following format:

        %jsErr message,address

If the message is specified, it is printed as a standard error message; if not, the appropriate system-defined JSYS error message is printed. If the address is specified, control will transfer to it after the message is printed, otherwise the program will halt (but can be continued at the instruction following the %jsErr).

Here are some examples (shown without setting up the AC's):

        GTJFN                   ; JSYS to Get Job File Number.
         %jsErr                 ; Print standard message and halt on error.

        GTJFN
         %jsErr                 ; Print custom message, then halt.

        GTJFN
         %jsErr (,foo)          ; Print standard message then go to foo.

A couple more examples not using the macro:

        GTJFN
         erjmp foo              ; Go to foo on error.

        GTJFN
         ercal baz              ; Call subroutine baz on error.

        GTJFN
         ercal [setz t1         ; Execute these instructions on error.
                movei t2, 1
                ret ]           ; (ret = 'popj 17,')
        movem t1, injfn         ; Come here regardless of success or failure.

        GTJFN
         erjmp [setz t1         ; On failure, execute these instructions
                movei t2, 1     ; ...
                jrst foo ]      ; then go to foo.
        movem t1, injfn         ; Come here only on success.

Erjmp and ercal are defined in monSym (i.e. in sys:monSym.unv) for Macro programs, and predefined for Midas programs. %jsErr is defined, for Macro, in CUSYM; jsErr is defined for Midas in MAC:MacSym.mid.

4.3. Using Mnemonic Symbols

Most JSYS's include among their arguments a number of arbitrarily chosen numeric constants, including special numbers to designate the source for input or output, and bits that can be on or off depending on whether the corresponding option is selected. These bits are listed in the description of each JSYS. Of course it is possible to write the (possibly combined) bits as octal constants, but this would be extremely poor programming practice, because the resulting code would be unreadable.
Therefore, symbols have been defined to stand for each bit. These symbols have some mnemonic value, and they should always be used. The definitions appear in SYS:MONSYM.UNV, and you access them from your Macro program by including the "search monSym" directive. (Midas has the symbols built in.)

Some of the very common symbols that are not specific to any particular JSYS are:

                        Value:
        Symbol   Left Half   Right Half   Meaning
        ------   ---------   ----------   -------
        .PRIIN       0           100      Primary input (usually TTY)
        .PRIOU       0           101      Primary output (ditto)
        .FHSLF       0        400000      Fork Handle Self (current process)

When a JSYS has option bits, these have symbols of the form JJ%NNN, where JJ denotes which JSYS, and NNN denotes the option. The % is included to prevent confusion with user defined symbols (which should not be of this form), and to emphasize that the symbol stands for a JSYS bit. Examples:

        Symbol   JSYS    Bit   Meaning
        ------   ----    ---   -------
        GJ%NEW   GTJFN     1   Want JFN for a new file.
        GJ%OLD   GTJFN     2   Want JFN for an old (i.e. existing) file.
        OF%RD    OPENF    19   Want to open a file for read access.
        OF%WR    OPENF    20   Want to open a file for write access.

4.4. Source/Destination Designators

Many monitor calls operate on or transmit byte streams; a byte stream is a sequence of bytes of some size from 1 to 36 bits, most commonly 7 bit bytes (ASCII text) or 36 bit bytes (machine words). The source or destination of these bytes can be any one of several items, including a file, a terminal, or a string in the caller's address space. In these cases, a standard 36-bit quantity, called the source/destination designator, is used as a JSYS argument to declare the byte stream on which to operate. It can have various formats; these are the most common:

                        Value:
        Symbol   Left Half      Right Half   Meaning
        ------   ---------      ----------   -------
        (none)       0          a JFN        Job file number, i.e. a file.
        .PRIIN       0             100       Primary input (usually TTY).
        .PRIOU       0             101       Primary output (usually TTY).
        (none)   reasonable   address      Pointer to beginning of a string
                 left half of              in the caller's address space.
                 byte pointer.
        (none)   777777       address      Equivalent to 440700,,address,
                                           i.e. a 7-bit byte pointer.

The monitor can tell which type of designator is intended either from its format or from option settings which you provide. The last format is provided to allow you to write

        hrroi t1, [asciz/foo/]

rather than

        move t1, [point 7, [asciz/foo/]]        ; Macro

or

        move t1, [440700,,[asciz/foo/]]         ; Midas or Macro

This saves a memory reference, and it's easier to type. But it only works for JSYS's, not for byte instructions; the JSYS internally translates 777777 in the left half of a source/destination designator to 440700.

Note that JSYS's usually assume that an input byte string in memory (i.e. one pointed to by a byte pointer) ends with a zero byte; strings are usually kept in memory in this format, which is known as ASCIZ (7-bit ASCII, terminated by zero).

4.5. Setting up a JSYS Invocation

You may have noticed that some of the JSYS bits described above fall in the left half of the word, while others fall in the right half. This means that loading the AC's with the appropriate bits could require different instructions. Since programs should not depend on the actual values of the symbols for JSYS bits, a macro is provided to free you from worrying about whether a given symbol is a left-half or right-half quantity. The macro is called MOVX; it translates to the appropriate instruction based on its argument. For example, if JS%FOO = bit 35 and JS%BAZ = bit 0, and given that () swaps the halves of its contents, [] specifies the address of the enclosed quantity, and ! is the logical OR operator, then:

        movx t1, JS%FOO          (becomes)   movei t1, JS%FOO
        movx t1, JS%BAZ          (becomes)   movsi t1, (JS%BAZ)
        movx t1, JS%FOO!JS%BAZ   (becomes)   move t1, [JS%FOO!JS%BAZ]

If you would rather avoid movx, you can use the third form; it will always work, but it often costs an extra memory reference (and a little more typing).
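The choice that MOVX makes can be illustrated outside the assembler. The following is a minimal sketch in Python, not Macro-20 code: it models only the three cases shown above, and the helper names bit and movx_choice are invented for the illustration. It assumes DEC bit numbering, in which bit 0 is the leftmost bit of the 36-bit word and bit 35 the rightmost.

```python
# Model of how the shape of a 36-bit constant decides which instruction
# loads it.  Illustrative only: bit() and movx_choice() are hypothetical
# helper names, and only the three cases described in the text are covered.

HALF_MASK = 0o777777            # 18 one-bits: one halfword


def bit(n):
    """Return the value of DEC bit n (bit 0 is leftmost, bit 35 rightmost)."""
    if not 0 <= n <= 35:
        raise ValueError("bit number must be in 0..35")
    return 1 << (35 - n)


def movx_choice(value):
    """Name the instruction form that loads this constant."""
    left = (value >> 18) & HALF_MASK
    right = value & HALF_MASK
    if left == 0:
        return "movei"          # right-half quantity: immediate load
    if right == 0:
        return "movsi"          # left-half quantity: swapped immediate load
    return "move"               # fullword: fetch a literal from memory


JS_FOO = bit(35)                # stands in for the text's JS%FOO
JS_BAZ = bit(0)                 # stands in for the text's JS%BAZ

for name, value in [("JS%FOO", JS_FOO),
                    ("JS%BAZ", JS_BAZ),
                    ("JS%FOO!JS%BAZ", JS_FOO | JS_BAZ)]:
    print(f"{name:14} {value:012o}  {movx_choice(value)}")
```

Only the third case needs a word of storage for the constant, which is why the literal form costs an extra memory reference.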
One final complication: some JSYS arguments consist of fields which must take on certain values. The desired value must somehow be logically ANDed with a mask of 1's that specifies the field. For instance, the NOUT JSYS (which outputs a number according to a format which you specify) accepts, as one of its arguments, the number of columns in which to print the number. This field happens to be in bits 11-17. If you want the number to be printed in a field of, say, 5 columns, you must put a 5 right-justified in bits 11-17. This field has a name, NO%COL, and it assembles as 000177000000. It would be hard to imagine how to logically AND a 5 with this field at assembly time. Fortunately, a macro is provided for this purpose: FLD(value,mask), which in this case would be written FLD(5,NO%COL). The result here is 000005000000, which in turn can be logically ORed with other bits and fields to provide all the information needed for the JSYS to return the desired result.

The FLD and MOVX macros are defined for Macro-20 in SYS:MACSYM.UNV.

Here's a full-blown example, in which the NOUT JSYS is set up to print a number, which is stored at location foo, on the terminal in a field of 5 characters, unsigned, with leading blanks, in base-4 notation. First, the symbol definitions are reproduced from MONSYM:

        Symbol   Bit(s)   Meaning
        ------   ------   -------
        NO%MAG   0        Output the magnitude (i.e. unsigned).
        NO%LFL   2        Output leading filler specified by NO%ZRO.
        NO%ZRO   3        Output 0's as leading filler.  If off, output
                          blanks as leading filler.
        NO%COL   11-17    Number of columns to output.
        NO%RDX   18-35    Radix (2-36) of number being output.

Now here's the actual JSYS setup and call:

        movx t1, .PRIOU         ; Output goes to terminal.
        move t2, foo            ; The number is stored in foo.
        movx t3, NO%MAG!NO%LFL!fld(5,NO%COL)!fld(4,NO%RDX) ; Format.
        NOUT                    ; Number Out JSYS.
         %jserr                 ; Print message and halt on error.

This should be sufficient background for you to understand the descriptions of the specific JSYS's.

4.6.
JSYS's to Open and Close Files

A file in Tops-20 is identified by its device name, directory name, filename, file type, and generation number. These 5 items uniquely identify any file on the system that is accessible to a user. The device name identifies the device on which the file is stored (or it can be a "logical device name" that specifies a list of one or more devices and/or directories which is to be searched for the file); the directory name identifies the directory containing the file. The filename, type, and generation number identify a particular file in the given device and directory.

Tops-20 requires references to files to be via "handles" that can be contained in a few bits (a halfword, really) and do not require extensive lookup procedures for each reference. Such a handle is called a JFN (Job File Number); it is a small integer, valid within a particular job (even if the job consists of many processes) but not valid across jobs; JFN 2 in job 11 will generally be a handle on a completely different file than JFN 2 in job 18.

A JFN is associated with a file by means of the GTJFN (Get JFN) monitor call, which accepts a file specification and returns a JFN for the indicated file. The special JFN's 100 (.PRIIN) and 101 (.PRIOU) are reserved for the primary input and output designators, respectively, and are never returned by the GTJFN call.

To do i/o (input/output) to a file, you must first get a JFN for it, using GTJFN. You must then open the file by giving the JFN to the OPENF JSYS. Then you can use various monitor calls to actually transfer data. When you are finished with the file, you must close it using the CLOSF monitor call. These essential monitor calls are now described, but not in complete detail. For the more esoteric features, refer to the Monitor Calls Reference Manual.

The examples shown here are for Macro programs.
In general the only difference between Macro and Midas (for these examples) is the logical OR operator (in Macro, it's "!"; in Midas it's "\").

4.6.1. GTJFN (JSYS 20) - Get Job File Number (short form)

Returns the JFN for the specified file. Accepts the specification for the file from a string in memory or from a file (possibly a terminal) but not both. If the source is the terminal, recognition can be done. One or more fields of the file specification can be represented by a logical name. If any fields are omitted, the system will provide default values as follows:

        Device       Connected structure.
        Directory    Connected directory.
        Name         No default; must be specified.
        Type         Null.
        Generation   Highest existing (input file); next higher (output).

Accepts in AC1: Flag bits in the left half, default generation number in the right half (usually 0).
           AC2: Source designator from which to obtain the file specification (see description of flag bit GJ%FNS).

Returns:   +1:  Failure, error code in AC1.
           +2:  Success, flags in the left half of AC1, and the JFN in the right.

Flag bits:

Symbol (Bit(s))   Meaning

GJ%FOU (0)   (File OUtput) The file given is to be assigned the next higher generation number; this bit indicates that a new version of a file is to be created and is normally set if the file is for output use.

GJ%NEW (1)   The file specification must not refer to an existing file, i.e. the file must be a new file.

GJ%OLD (2)   The file specification given must refer to an existing file, i.e. the file must be an old file.

             Note: if you want to open a file in 'append' mode (i.e. you want to create a new file if none exists, or else write to the end of an existing file), don't specify GJ%OLD or GJ%NEW, and use OF%APP in the OPENF call (see below).

GJ%CFM (4)   ConFirMation from the user will be required (if GJ%FNS is on) to verify that the file specification is correct. This generally means that the user will have to type carriage return after typing the file specification.
GJ%FNS (16)  (File Name Source) The contents of AC2 are to be interpreted as follows:

             1. If this bit is on, AC2 contains an input JFN in the left half, and an output JFN in the right. The input JFN is used to obtain the file specification to be associated with the JFN. The output JFN is used to indicate the destination for printing the names of any fields being recognized. To omit either JFN, specify .NULIO (377777). This option is generally used when the file specification is being obtained from the terminal, and recognition is being performed. The JFN's will normally be .PRIIN and .PRIOU.

             2. If this bit is off, AC2 contains a pointer to an ASCIZ string in memory that specifies the file to be associated with the JFN.

GJ%SHT (17)  This bit should always be on to indicate that the long form of GTJFN is not being used (the long form requires extra input).

(none) (18-35)  The generation number of the file. Usually one of the following:

             1. .GJDEF (0) - the normal case - to indicate that the next higher generation number of the file is to be used if GJ%FOU is on, or that the highest existing generation of the file is to be used if GJ%FOU is off.

             2. .GJNHG (-1) - to indicate that the next higher generation number is to be used if no generation number is supplied.

To translate a JFN to its corresponding filename string, see the description of JFNS in the Monitor Calls Reference Manual.

Examples:

; Get a JFN from a filename string in memory.

        movx t1, GJ%OLD!GJ%SHT                  ; For an "old" file
        hrroi t2, [asciz/foo.txt/]              ; called "foo.txt",
        GTJFN                                   ; get a Job File Number.
         %jsErr                                 ; Print message and halt on error.
        hrrzm t1, iJfn                          ; Save the JFN in location "ijfn".

Note the "hrrzm", rather than "movem". This eliminates the extraneous flags that are returned in the left half, which could cause errors in some applications that don't need them (most don't).

; Get a JFN from the terminal, allowing recognition.

        movx t1, GJ%OLD!GJ%FNS!GJ%CFM!GJ%SHT    ; Old, tty i/o, confirm.
        move t2, [.PRIIN,,.PRIOU]               ; Get filename from terminal.
        GTJFN                                   ; Get the JFN.
         %jsErr (,err)                          ; Print msg & go to "err" on error.
        hrrzm t1, iJfn                          ; Save the result.

4.6.2. OPENF (JSYS 21) - Open a File

Opens the given file. No data transfer can be done until the file is open.

Accepts in AC1: JFN of the file being opened, in the right half.
           AC2: Various control information, including:

        Symbol   Bit(s)   Meaning
        ------   ------   -------
        OF%BSZ   0-5      Byte size (maximum 36 decimal)
        OF%RD    19       Allow read access.
        OF%WR    20       Allow write access.
        OF%APP   22       Allow append access.
        OF%PLN   30       Disable line number checking and consider a line
                          number as 5 characters of text.  If this bit is
                          off, line numbers will be discarded automatically.

Returns:   +1:  Failure, error code in AC1.
           +2:  Success. No values are returned.

Example:

; Open an old file for read access.

        move t1, iJfn                   ; The JFN already obtained by GTJFN.
        movx t2, fld(7,OF%BSZ)!OF%RD    ; 7-bit bytes, read access desired.
        OPENF                           ; Open the file.
         %jsErr (,fail)                 ; Print this message and
                                        ; go to location "fail" on error.

4.6.3. CLOSF (JSYS 22) - Close a File

Closes a specific file.

Accepts in AC1: Flags, normally 0, in left half, JFN in right half.

Returns:   +1:  Failure, error code in AC1.
           +2:  Success. No value is returned.

Flags:

CO%NRJ (0)   Do not release the JFN.

CO%ABT (6)   Abort any output operations currently being done. Close the file, but do not perform any cleanup operations normally associated with closing a file (e.g. do not output remaining buffers). If output to a new disk file that has not been closed (i.e. is nonexistent) is aborted, the file is closed and then expunged.

Example:

; Close a file

        move t1, ijfn           ; Get the JFN into AC1.
        CLOSF                   ; Close the file.
         %jsErr (,.+1)          ; On error, print message but continue.

4.7. File i/o JSYS's

There are three ways to do file i/o: by character (byte), by character string (byte string), and by page. Of these, only the first two will be discussed.
For paged i/o see the Monitor Calls Reference Manual.

When doing input, whether byte- or string-oriented, you must always be prepared to handle errors, and to distinguish a normal end-of-file (eof) error from other errors. A monitor call is provided that can be used for this purpose (among others):

4.7.1. GTSTS (JSYS 24) - Get file Status

Returns the status of a file associated with a JFN.

Accepts in AC1: JFN in the right half.

Returns:   +1:  Always, with JFN status word in AC2. If the JFN is illegal in any way, bit GS%NAM will be 0.

The JFN Status Word has the following format:

        Symbol   Bit   Meaning
        ------   ---   -------
        GS%OPN   0     File is open.
        GS%RDF   1     Read access is allowed.
        GS%WRF   2     Write access allowed.
        GS%EOF   8     Last read was past end of file.
        GS%NAM   10    A file specification is associated with this JFN.

If GS%OPN is 0, then the settings of the other bits are meaningless. There are other bits and fields in the JFN Status Word; for these, see the Monitor Calls Reference Manual.

When testing the status word for some (combination of) bit(s), you are faced with a situation similar to loading some combination of option bits into an AC for a JSYS call, i.e. you don't know (or at least you shouldn't depend on) whether the bits form a left-half, right-half, or fullword quantity, and therefore you don't know which test instruction to use (tl--, tr--, ts--, or td--). Of course, td-- will always work if you put the argument in square brackets, but this requires an extra memory fetch at runtime. For this reason, a set of tx-- macros has been defined that is analogous in purpose and operation to the "movx" macro described above. You can always use a tx-- without worrying about the actual value of the quantity being tested. These comments apply not just to testing the JFN status word, but to testing any word for symbolically defined bits.

Example (assume that a JSYS error has just occurred during input):

; Check for end of file.
        move t1, iJfn           ; Load the JFN of the input file.
        GTSTS                   ; Get the status of the file.
        txnn t2, GS%EOF         ; Are we at eof?
         jrst error             ; No, must be some other error.
        ; Come here on end of file.

4.7.2. Sequential Byte i/o

The following JSYS's input or output bytes sequentially.

4.7.2.1. BIN (JSYS 50) - Byte In

Inputs the next byte from the specified source. When the byte is read from a file, the file must first be opened, and the size of the byte given, with the OPENF call. When the byte is read from memory, a pointer to the byte is given; this pointer is always updated after the call (a BIN to a character string in memory is equivalent to an ildb instruction).

Accepts in AC1: Source designator (JFN or byte pointer).

Returns:   +1:  Always, with the byte right-adjusted in AC2.

If the end of file is reached, AC2 contains 0 and an erjmp or ercal instruction following the BIN call will be activated (but note that end of file is not the only error possible).

4.7.2.2. PBIN (JSYS 73) - Primary Byte In

Like BIN, but always obtains the byte from the primary input source (i.e. the .PRIIN JFN, usually the terminal). Needs no input.

Returns:   +1:  Always, with byte right-adjusted in AC1.

Various errors are possible, including end-of-file.

4.7.2.3. BOUT (JSYS 51) - Byte Out

Outputs a byte sequentially to the specified destination. When the byte is written to a file, the file must first be opened, and the size of the byte given, with the OPENF call. When the byte is written to memory, a pointer to the location in which to write the byte is given in AC1. This pointer is updated after the call.

Accepts in AC1: Destination designator (JFN or byte pointer).
           AC2: Byte to be output, right-justified.

Returns:   +1:  Always.

Various errors are possible.

4.7.2.4. PBOUT (JSYS 74) - Primary Byte Out

Like BOUT, except byte always goes to the primary output destination (i.e. the .PRIOU JFN, usually the terminal).

Accepts in AC1: Byte to be output, right-justified.

Returns:   +1:  Always.

Various errors are possible.
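The remark that a BIN from a string in memory behaves like an ildb can be made concrete with a small model. What follows is a sketch in Python rather than Macro-20: it packs an ASCIZ string five 7-bit bytes to a 36-bit word (leaving the rightmost bit of each word unused) and then steps a byte pointer through it. The names asciz_words and BytePointer are invented for the illustration, and the model ignores indexing and indirection in the byte pointer.

```python
# Model of 7-bit bytes packed five to a 36-bit word, and of stepping
# through them the way ildb (or BIN on a byte pointer) does.
# Illustrative only: asciz_words() and BytePointer are hypothetical names.

def asciz_words(text):
    """Pack ASCII text plus a terminating zero byte into 36-bit words."""
    data = [ord(ch) for ch in text] + [0]
    if any(b > 0o177 for b in data):
        raise ValueError("only 7-bit ASCII fits in a 7-bit byte")
    while len(data) % 5:
        data.append(0)                      # pad the last word with zero bytes
    words = []
    for i in range(0, len(data), 5):
        word = 0
        for j, b in enumerate(data[i:i + 5]):
            word |= b << (29 - 7 * j)       # bytes begin at bits 0, 7, 14, 21, 28
        words.append(word)
    return words


class BytePointer:
    """Address, position and size; 440700,,address is position 36, size 7."""

    def __init__(self, address, position=36, size=7):
        self.address = address
        self.position = position
        self.size = size

    def ildb(self, memory):
        """Advance to the next byte, then return it."""
        if self.position < self.size:       # no whole byte left in this word
            self.address += 1
            self.position = 36
        self.position -= self.size
        mask = (1 << self.size) - 1
        return (memory[self.address] >> self.position) & mask


memory = asciz_words("foo.txt")
pointer = BytePointer(0)
chars = []
while True:
    byte = pointer.ildb(memory)
    if byte == 0:                           # ASCIZ strings end with a zero byte
        break
    chars.append(chr(byte))
print(len(memory), "".join(chars))
```

The seven characters and their terminating zero byte need eight bytes, so the string occupies two words, and the pointer moves to the second word only after the fifth byte has been read.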
4.7.2.5. Example of Byte i/o

; This program segment reads bytes one at a time from a file whose
; JFN is stored in location "iJfn". If a byte is null (zero), it is
; discarded, otherwise it is written to the output file, whose JFN
; is stored in location "oJfn". Both files are assumed to be open.

copy:   move t1, iJfn           ; Load input JFN into t1.
        BIN                     ; Get next byte.
         erjmp tstEof           ; On error, go test for eof.
        jumpe t2, copy          ; If the character was zero, ignore it.
        move t1, oJfn           ; Load output jfn into t1.
        BOUT                    ; Write the byte to the output file.
         %jsErr                 ; Print message and halt on error.
        jrst copy               ; Loop for all bytes in file.

; Come here on error in the BIN JSYS. Check for end of file.

tsteof: GTSTS                   ; JSYS to Get file Status (JFN is in t1).
        txnn t2, GS%EOF         ; Was it an eof?
         %erMsg (,done)         ; No, print the error message and finish up.

; Here when we've copied the entire file.

done:   move t1, iJfn           ; Close the input file
        CLOSF                   ; using the CLOSF JSYS.
         %jsErr (,.+1)          ; On error, print message but continue.
        move t1, ojfn           ; And close the output file too.
        CLOSF
         %jsErr (,.+1)

; End of program fragment.

Note the %erMsg macro; it is identical to %jsErr, except that it is executed unconditionally rather than via erjmp or ercal. It is mainly intended for use after a skipping instruction or subroutine call.

4.7.3. String-oriented i/o

The following JSYS's input or output byte strings.

4.7.3.1. SIN (JSYS 52) - String In

Reads a string from the specified source into the caller's address space. The string can be a specified number of bytes or terminated with a specified byte.

Accepts in AC1: Source designator (JFN or byte pointer).
           AC2: Pointer to string in the caller's address space (destination).
           AC3: Count of number of bytes to input, or 0.
           AC4: Byte (right-justified) on which to terminate input (optional).

Returns:   +1:  Always, with updated string pointers in AC1 and AC2 (if pertinent), and updated count in AC3 (if pertinent).
The contents of AC3 control the number of bytes to read as follows:

        AC3=0   The string being read is terminated with a 0 byte.

        AC3>0   A string of the specified number of bytes is to be read or a string terminated with the byte given in AC4 is to be read, whichever occurs first.

        AC3<0   A string of minus the number of bytes is to be read.

AC4 is ignored unless AC3 contains a positive number. An end of file can be processed as in the example above.

After execution of the call, the file's pointer is updated for subsequent i/o to the file. AC2 is updated to point to the last byte read or, if AC3 contained 0, the last nonzero byte read. The count in AC3 is updated toward 0 by subtracting the number of bytes read from the number of bytes requested to be read. If the input was terminated by an end-of-file condition, AC1 thru AC3 are updated (where pertinent) to reflect the number of bytes transferred before the end of file was reached.

Various errors, besides eof, are possible (e.g. invalid JFN, file not open, device or data error, etc.).

Note: The source JFN can be .PRIIN (i.e. the terminal), but SIN is not the best JSYS to use for input of strings from the terminal, because it does not allow for editing (via DEL, ^U, ^W, and ^R) or prompting. Terminal string input is best done using the TEXTI or RDTTY JSYS (RDTTY is described below); better still, all terminal input can be done using the COMND JSYS (Chapter 5), which allows not only prompting and editing, but automatic help and recognition, as well as syntax checking for all sorts of fields (numbers, file names, keywords, etc). For TEXTI and COMND, refer to the Monitor Calls Reference Manual.

4.7.3.2. SOUT (JSYS 53) - String Out

Writes a string from the caller's address space to the specified destination. The string can be a specified number of bytes or terminated with a specified byte.

Accepts in AC1: Destination designator.
           AC2: Pointer to string to be written.
           AC3: Count of the number of bytes to be written, or 0.
           AC4: Byte (right-justified) on which to terminate output.

Returns:   +1:  Always, with updated string pointers and counts in the AC's, if pertinent.

Interpretation of the arguments is exactly the same as for SIN, and the operation is entirely analogous, but in the opposite direction.

4.7.3.3. PSOUT (JSYS 76) - Primary String Out

A short form of SOUT (there is no corresponding short form of SIN). Outputs a string sequentially to the primary output destination (usually the terminal).

Accepts in AC1: Pointer to ASCIZ (zero-terminated) string in the caller's address space.

Returns:   +1:  Always, with updated string pointer in AC1.

As usual, various errors are possible (usually an invalid byte pointer is the culprit).

4.7.3.4. Example of String I/O

This program fragment copies the input file (which is already opened in 7-bit mode and whose JFN is stored in location "iJfn") to the output file (open similarly, JFN in "oJfn"), a string at a time, where a string is considered to be a sequence of bytes terminated by a linefeed, but of maximum length 512 (decimal). Note that the linefeed character is represented symbolically; this is considered good technique because it makes the program more readable. The symbolic definitions of the (nonprinting) ASCII control characters are obtained by searching MACSYM.

maxLen=^d512                    ; Define maximum length symbolically.
buffer: block                   ; String buffer (convert bytes to words).
          :
          :
copy:   move t1, iJfn           ; Load the input file JFN.
        hrroi t2, buffer        ; Point to place in memory to put string.
        movei t3, maxLen        ; Maximum number of characters.
        movei t4, .CHLFD        ; Terminate on the CHaracter LineFeeD.
        SIN                     ; Read the string into the buffer.
         erjmp tstEof           ; On error, go test for eof.

; Now write the string into the output file. Note that t3 contains
; >.
; We can't count on the presence of the
; terminating linefeed because input may have exhausted the maximum byte
; count before a linefeed was encountered.

        move t1, oJfn           ; Load the output file JFN.
        hrroi t2, buffer        ; Point to the string to be written.
        subi t3, maxLen         ; Get negative length into t3.
        SOUT                    ; Write the string out.
         %jserr (,done)         ; Print message and finish up on error.
        jrst copy               ; Loop for all strings in file.

tstEof: GTSTS                   ; JFN of input file already in t1.
        txnn t2, GS%EOF         ; Error caused by end of file?
         %ermsg (,done)         ; No, print message and finish up.

done:   hrroi t1, [asciz /All done/]    ; Type this message
        PSOUT                           ; at the terminal.

; Now just close the files in the normal way.
; (end of program fragment)

4.7.3.5. RDTTY (JSYS 523) - Read string interactively from TTY

RDTTY is the JSYS most suited to simple terminal input; it is a subset of the TEXTI JSYS, which is much more flexible, more powerful, but more complicated to use (see the Monitor Calls Reference Manual for details on TEXTI and full details on RDTTY). We will describe the most useful options of the RDTTY JSYS.

RDTTY is most suited for interactive terminal input because it lets the user edit her/his input, on a line-by-line (or several-line) basis, before it's "eaten" by the program. All the editing characters, viz., ^R, ^W, ^U, ^L, and RUBOUT, are available for editing (though no recognition is possible, since command parsing is not being done).

There are basically two ways to call RDTTY: to read a single line (up to a carriage-return or line-feed), and to read a whole group of lines (up to a ^Z or ESCAPE).
In the single-line mode, the program receives each line as the user types it, and thus only permits editing on a single-line basis; in the line-group mode, the user can enter any number of lines (limited only by the size of your input buffer) and edit all the way back to the beginning of input (for example, MM and Mail use the line-group mode to collect the text of messages to be sent). [1]

The RDTTY JSYS reads input from the primary input port (.PRIIN), until either a break character is seen (such as ^Z or carriage-return) or the given byte count is exhausted (which means your buffer is full). Output generated as a result of editing the input text (such as echoing of deleted characters after backslashes, as in "fooee", which echoes as "foo\o\oee" on a hardcopy terminal) goes to the primary output port (.PRIOU). .PRIIN and .PRIOU normally designate the user's terminal (TTY) as both the input and the output device.

Accepts in AC1: Pointer to string in caller's address space where input is to be placed.
           AC2: Control bits in left half, as described below, and number of bytes available in the buffer (pointed to by AC1) in the right half.
           AC3: 0, if no prompt is wanted, or byte pointer to prompting text (^R) buffer.

Returns:   +1:  Failure, error code is in AC1.
           +2:  Success, updated string pointer in AC1, appropriate bits set in the left half of AC2, and updated count of available bytes in the right half of AC2.

Note that the prompting text buffer is not automatically output by RDTTY; you have to output it yourself before doing the RDTTY (but it will be output properly if ^R or ^L is typed during user input).
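Since AC2 carries flags in its left half and a byte count in its right half, both on the way in and on the way out, the number of bytes actually read has to be computed from the right half alone. The following Python sketch models just that arithmetic; it is not Macro-20 code, the helper names pack_ac2 and bytes_read are invented for the illustration, and FLAG_BIT_0 merely stands for a control flag that occupies the leftmost bit of the word.

```python
# Model of an AC that carries flags in its left half and a byte count in
# its right half, as RDTTY's AC2 does.  Illustrative only: pack_ac2() and
# bytes_read() are hypothetical names, and FLAG_BIT_0 is a stand-in flag.

HALF_MASK = 0o777777            # 18 one-bits: one halfword
FLAG_BIT_0 = 1 << 35            # leftmost bit of a 36-bit word


def pack_ac2(flags, count):
    """Combine left-half flag bits with a right-half byte count."""
    if not 0 <= count <= HALF_MASK:
        raise ValueError("count must fit in the right half (18 bits)")
    return (flags & (HALF_MASK << 18)) | count


def bytes_read(buffer_size, ac2_after):
    """Bytes stored = bytes offered minus bytes still available."""
    return buffer_size - (ac2_after & HALF_MASK)


buffer_size = 10000
ac2_before = pack_ac2(FLAG_BIT_0, buffer_size)
# Suppose the call returned with 9958 bytes still available in the buffer.
ac2_after = pack_ac2(FLAG_BIT_0, 9958)
print(f"{ac2_before:012o}", bytes_read(buffer_size, ac2_after))
```

Masking off the left half before subtracting matters, because the returned flags would otherwise be treated as part of the count.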
The control flags in the left half of AC2 are broken into two groups: those given to RDTTY, and those returned (see the Monitor Calls Manual for a complete description of the control flags).

Symbol (Bit(s))   Meaning

Given to RDTTY in AC2:

RD%BRK (0)   BReaK on ^Z or ESCAPE (i.e., line-group mode)

RD%BEL (3)   Break on End of Line (carriage return or linefeed, i.e., single-line mode)

RD%RAI (10)  RAIse user input: convert lower case to upper case

Returned from RDTTY in AC2:

RD%BTM (12)  Break character TerMinated input: if this is set, the input was terminated because a break character was seen; if not, then input was terminated because the buffer is full (as determined by the byte count given in the right half of AC2)

---------------
[1] If you need to read in a group of text lines, even a large group, it's advisable to use the line-group mode of RDTTY, since memory is 'cheap': just give RDTTY a large buffer area; this makes things much easier for the poor, fallible human out there.

4.7.3.6. RDTTY Example

The following example uses the RDTTY JSYS to collect a "note", or a group of lines about some subject, and write it all to a file (which is opened and closed elsewhere).

; Data area

maxLen==^d10000                 ; Allow 10k characters in note input.
prompt: asciz/Text of note (end with ^Z or ESCAPE): /
inBuf:  block maxLen/5 + 1      ; Input buffer.
oJfn:   block 1                 ; Output JFN.
          .
          .
          .
; Assume oJfn is set up with the output file JFN.

note:   hrroi t1, prompt        ; Prompt for the note.
        PSOUT                   ; ...
        hrroi t1, inBuf         ; Get a pointer to destination buffer.
        movx t2,                ; Break on ^Z, ESC, up to buffer size.
        hrroi t3, prompt        ; Use prompt on ^R, etc.
        RDTTY                   ; Read the note.
         %jsErr                 ; Die loudly if fails.
        hrrz t3, t2             ; Get remaining byte count.
        subi t3, maxLen         ; Subtract maxLen to get minus the
                                ; number of bytes input.
        move t1, oJfn           ; Get destination JFN,
        hrroi t2, inBuf         ; point to what we want to output,
        SOUT                    ; and output what we just read.
         %jsErr                 ; Handle failure nicely.
          .
          .

4.7.4. Number conversion JSYS's

These are similar to string i/o JSYS's in that their input or output is a string (in memory or in a file), but the string is the ASCII character representation of a number in some base, and conversion is done either from the internal form of the number to the string or vice versa.

4.7.4.1. NIN (JSYS 225) - Number In

Inputs the character-string representation of an integer number, ignoring leading spaces. This call terminates on the first character not in the specified radix. If that character is a carriage return followed by a line feed, the line feed is also input.

Accepts in AC1: Source designator (JFN or byte pointer).
           AC3: Radix (2-10) of number being input.

Returns:   +1:  Failure, error code in AC3, updated string pointer (if pertinent) in AC1.
           +2:  Success, internal two's complement binary representation of number in AC2, and updated string pointer (if pertinent) in AC1.

Various errors are possible (invalid JFN, file not open, invalid radix, first nonspace character not a digit, overflow, etc.). It is not advisable to use NIN to input a number from the terminal because NIN does not allow editing and reprompting, as does RDTTY; RDTTY should be used to get the string representation of the number into memory, and then NIN should be used to convert the string to a number; if the string contains invalid characters, the error can be caught and the user can be informed and reprompted (by a %jsErr (,foo) after the NIN, where foo is the address of the RDTTY sequence). This technique is shown in a subsequent example.

4.7.4.2. NOUT (JSYS 224) - Number Out

Outputs an integer number to a file or to a string in memory.

Accepts in AC1: Destination designator.
           AC2: Number to be output.
           AC3: Format control information as follows:

Symbol (bit(s))   Meaning

NO%MAG (0)   Output the magnitude. That is, output the number as an unsigned 36-bit number (e.g. output -1 as 777777777777 in base 8).
NO%SGN (1)   Output a plus sign for a positive number.

NO%LFL (2)   Output leading filler. If this bit is not set, trailing filler is output, and bit NO%ZRO is ignored.

NO%ZRO (3)   Output 0's as the leading filler if the specified number of columns (NO%COL) allows filling. If this bit is not set, blanks are output as leading filler if the number of columns allows filling.

NO%OOV (4)   Output on column overflow and return an error. If this bit is not set, column overflow is not output.

NO%AST (5)   Output asterisks on column overflow. If this bit is not set and NO%OOV is set, all necessary digits are output on column overflow.

NO%COL (11-17)  Number of columns (including the sign column) to output is right-justified in this field. If this field is 0, as many columns as necessary are output.

NO%RDX (18-35)  Radix (2-36) of number being output.

Returns:   +1:  Failure, error code in AC3.
           +2:  Success, updated string pointer in AC1, if pertinent.

Various errors are possible, including NOUTX2 (column overflow), plus those that are possible for NIN.

4.7.4.3. NIN/NOUT Example

Here, the user is prompted at the terminal to type in a decimal number. The number is then typed back to the user, in octal (base 8) notation, right-adjusted in a field of width 9.

prompt: asciz /Decimal number: /        ; The prompt.
numLen=^d20                     ; Length of buffer for number string.
numBuf: block numLen            ; The input buffer.
          :
          :
reTry:  hrroi t1, prompt
        PSOUT                   ; Prompt the user for a decimal number.

; Get the alleged number from the terminal into memory in string form.

        hrroi t1, numBuf        ; Point to buffer for string that user types.
        movx t2, RD%BEL!numLen  ; Break on CRLF, max length for typein.
        hrroi t3, prompt        ; Reprompting text.
        RDTTY                   ; Get string, allowing editing.
         %jsErr (,reTry)        ; Print message & reprompt on error.

; Now convert the string to a number.

        hrroi t1, numBuf        ; Point to string representation of number.
        movei t3, ^d10          ; Radix for interpretation.
        NIN                     ; Number In - do the conversion.
         %jsErr (,reTry)        ; On error, print msg and ask again.

; Now type it back, converting to octal notation. The number is still in t2.

        movx t1, .PRIOU         ; Output is to primary output destination.
        movx t3, NO%MAG!NO%LFL!NO%AST!fld(^d9,NO%COL)!fld(^d8,NO%RDX)
        NOUT                    ; Type the number.
         %jserr (,reTry)        ; On error, print message and reprompt.

; (end of program fragment)

Similar monitor calls (FLIN and FLOUT) exist for input and output of floating point numbers; see the Monitor Calls Reference Manual for these.

4.7.5. Random-access i/o

Tops-20 gives you the ability to randomly access files on disk storage, which is the only kind of device which supports this type of access (other than DECtape, which is now essentially defunct in "DEC land").

As described earlier, disk files can be considered to be merely a sequence of bytes, of some size from 1 to 36 bits (the normal sizes are 7, for text files, and 36, for binary, word-oriented files); the byte size of the file is determined at the time of the OPENF for the file (and can thus be changed from open to open, if you're daring). Each byte in the file has a label, or index, which is its position in the sequence, starting at zero. For example, the text file containing the 15 ascii bytes "This is a file." can be considered to be a sequence of bytes, as follows:

        Index   Byte   Value        Index   Byte   Value
          0     "T"    124            8     "a"    141
          1     "h"    150            9     " "    040
          2     "i"    151           10     "f"    146
          3     "s"    163           11     "i"    151
          4     " "    040           12     "l"    154
          5     "i"    151           13     "e"    145
          6     "s"    163           14     "."    056
          7     " "    040

For example, the zero'th byte of the file is an ascii "T" (octal value 124), the tenth byte is an ascii "f" (octal value 146), and the fourteenth, and last, byte is a period, "." (octal value 056).

Although it's not important to understanding the notion of random-access i/o, it's culturally interesting to note that disk files are stored in units of pages, or 512 36-bit words, or 2560 (512*5) 7-bit bytes.
Thus, the above file, which is 15 bytes long, would occupy 3 (15/5) words, or one page. Why one page? Because pages are the minimum unit of disk file storage, and thus, the above file is "wasting" 509 words of disk space (this much of the one page it uses is unoccupied). The 512 words-per-page is a hardware parameter, which we can't do much about, so we'll accept this waste as reasonable, since files are generally many pages in length (and the wasted portion is then proportionally smaller). Tops-20 keeps track of two items, when you're doing i/o with disk files: the current byte index (the one you would read next if you did a BIN JSYS, or write next if you did a BOUT), known in JSYS jargon as the 'file pointer', and the end-of-file byte index, which is simply the length of the file, in bytes (and is thus, conceptually, the non-existent byte after the last byte in the file). When you first open the file (with an OPENF), the file pointer is set depending on how you opened the file. If you opened it for reading or writing (or both), then the file pointer will be zero, i.e., you're initially positioned to read or write the first (zero'th) byte in the file. If you opened it for appending, then the file pointer is initially set to the end-of-file byte index, i.e., you're positioned to read or write the byte right after the last real byte in the file; in the case of a BIN this first read will return a zero byte and indicate you're at end of file; in the case of a BOUT this first write will simply extend the file by one byte. For example, if you had done 15 BIN JSYS's on the sample file shown above after opening it, on the 16th BIN you would be at end-of-file (i.e., there are no more bytes to read), and Tops-20 would obey an erjmp instruction after the BIN, going to where you specified (which would presumably check for a true end-of-file condition and handle it appropriately).
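The end-of-file behavior just described can be sketched as a simple read loop. This fragment is an illustration only, not taken from the DEC manuals. It assumes the file has already been opened for reading with 7-bit bytes and that its JFN is kept in a location called jfn; the names jfn, rdLoop and eof, and the choice to echo each byte to the terminal, are all assumptions made for the example.

```
rdLoop: move    t1, jfn         ; JFN of the open file (hypothetical location).
        BIN                     ; Read the byte at the file pointer into t2;
                                ;  the file pointer advances by one.
         erjmp  eof             ; Taken once the pointer reaches end-of-file.
        movx    t1, .PRIOU      ; Echo the byte in t2 to the primary output.
        BOUT
        jrst    rdLoop          ; Go back for the next byte.

; Get here on the 16th BIN of the 15-byte sample file.  A careful program
; would confirm a true end-of-file condition before going on.

eof:    move    t1, jfn
        CLOSF                   ; Close the file and release the JFN.
         jfcl                   ; This sketch ignores a failure return.
```

Note that the loop never compares the file pointer with the end-of-file index itself; it relies on Tops-20 to notice that the two have become equal.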
For a more graphic example, after 10 BIN's, say, the situation looks like:

        Index   Byte    Value
          0     "T"     124     <- byte read on the first BIN
          :      :       :
          8     "a"     141
          9     " "     040     <- byte just read on the 10th BIN
         10     "f"     146     <- file pointer (what will be
         11     "i"     151        read by the next BIN)
         12     "l"     154
         13     "e"     145
         14     "."     056
         15     -none-          <- end-of-file byte index

Thus, as mentioned in the previous example, when the file pointer becomes equal to the end-of-file byte index (15 in this example), end-of-file is said to be "reached". Of course, if you're writing a file, then the end-of-file byte index is normally the same as the file pointer, since the file pointer indexes the next byte to be read or written.

You can also mix random-access i/o with sequential i/o in any way you like, and the results are well-defined. The SFPTR JSYS simply sets the file pointer to the next byte to be read or written; once it's set, you can merrily BIN, BOUT, SIN and SOUT, and they'll take place at the file pointer position, updating the pointer appropriately. Using the sample file from before, imagine that you've done a SFPTR to the 4th byte (which is the space after "This" - why isn't it the "s" in "This"?). Then, suppose you do a BOUT of a "-" character to the file. The resulting situation will be:

        Index   Byte    Value
          0     "T"     124
          1     "h"     150
          2     "i"     151
          3     "s"     163
          4     "-"     055     <- file pointer after the SFPTR
          5     "i"     151     <- file pointer after the BOUT
          6     "s"     163
          :      :       :

In a more 'cultural' vein, again, you should know that Tops-20 considers a disk file to be a completely extensible sequence of bytes, i.e., it is possible, and quite easy, to extend a file by simply positioning to the end-of-file position (with the SFPTR JSYS, as described below) and writing more bytes with BOUTs or SOUTs. The end-of-file byte index will be updated appropriately. Even more, it's possible to position the file pointer to some point way beyond the end-of-file index, and write more bytes.
The file will simply be extended appropriately with zero bytes, up to the point at which you started writing (technically speaking, there may be some 'holes' in the file, but they will look like zero, or null, bytes when you read over them; they only matter if you don't like nulls, or are doing your own direct disk page manipulation). For example, given the sample file above, if you positioned to the 20th byte (which doesn't exist yet), and wrote the byte string "more.", then the file would simply be extended, resulting in a file with 25 bytes, with the end-of-file and file pointer equal to 25, and with the file containing

        This is a file.<0><0><0><0><0>more.

where the <0>'s represent null (zero) bytes.

There are basically four JSYS's designed for dealing with disk files in a random (not purposeless!) fashion: read the current file pointer for a file (RFPTR), set the file pointer for a file (SFPTR), read a byte from a file, given its index (RIN, for Random IN), and write a byte to a file, given the desired byte index (ROUT, for Random OUT).

4.7.5.1. RFPTR (JSYS 43) - Read File Pointer

Returns the current file pointer of the specified file.

Accepts in AC1: JFN of an open file.

Returns:  +1: Failure, error code in AC1.
          +2: Success, byte number (index) in AC2.

Actually, you can read the file pointer of a non-disk file, but it is either meaningless or the number of bytes read so far from the file (e.g., in a tape file). RFPTR can fail in various ways, but only if you don't give it a valid JFN for an open file.

4.7.5.2. SFPTR (JSYS 27) - Set File Pointer

Sets the specified file's file pointer for subsequent i/o to the file. Note that doing an SFPTR specifying a certain byte index, followed by a BIN or BOUT JSYS, is the same as doing a RIN or ROUT JSYS, respectively, specifying the same byte index.

Accepts in AC1: JFN of an open disk file.
           AC2: byte index to which the file pointer is to be set, or
                -1 to set the pointer to the current end-of-file index.

Returns:  +1: Failure, error code in AC1.
          +2: Success, file pointer has been set.

The SFPTR JSYS can fail in various ways, but only if you don't give it a valid disk file JFN.

4.7.5.3. RIN (JSYS 54) - Random byte In

Inputs a byte nonsequentially (i.e., random byte input) from the specified file. The size of the byte is that given in the OPENF call for the file.

Accepts in AC1: JFN of an open disk file.
           AC3: byte index within the file of the byte desired.

Returns:  +1: always, with the byte right-justified in AC2.

If the end of file is reached (i.e., you specify a byte index greater than or equal to the end-of-file index), a zero is returned in AC2; this is not really an error condition, but you can catch it if you want to with an erjmp or ercal after the RIN. The file's file pointer is updated for subsequent sequential i/o to the file.

Several errors are possible, but not if you give it a real JFN for a file opened in (at least) read mode (unless the file is opened for append access only, in which case you can't change the file pointer, to avoid letting you read the part of the file you're not supposed to see, e.g., a mail.txt file with protection 770404).

4.7.5.4. ROUT (JSYS 55) - Random byte Out

ROUT outputs a byte nonsequentially (i.e., random byte output) to the specified file. The size of the byte is that given in the OPENF call for the file.

Accepts in AC1: JFN for an open disk file.
           AC2: the byte to be output, right-justified.
           AC3: the byte index within the file at which to write the byte in AC2.

Returns:  +1: always, with the byte written in the file.

The file pointer after the ROUT is C(AC3)+1 (i.e., the file pointer is set to C(AC3), the byte is written, and the file pointer updated to account for the byte just written).
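As a hedged illustration of the calls in this section, the fragment below reads the byte at index 10 of the sample file and then overwrites the byte at index 4 with a hyphen. It is a sketch only: it assumes the file was opened for both reading and writing with 7-bit bytes, that its JFN is kept in a hypothetical location jfn, and that badJfn is a hypothetical error handler. None of these names come from the DEC manuals. The random read is done with SFPTR followed by BIN, which, as noted above, is the same as a RIN at that index.

```
        move    t1, jfn         ; JFN of the open disk file.
        movei   t2, ^d10        ; Byte index 10 (the "f" in "file").
        SFPTR                   ; Set the file pointer there.
         jrst   badJfn          ; +1 return: not a valid disk file JFN.
        BIN                     ; +2 return: read the byte.  t2 now holds
                                ;  "f" and the file pointer is 11.

        movei   t2, "-"         ; Byte to write, right-justified.
        movei   t3, 4           ; Byte index 4 (the space after "This").
        ROUT                    ; Write it; the file pointer is now 5.

        RFPTR                   ; Read the pointer back to confirm.
         jrst   badJfn          ; +1 return: failure.
                                ; +2 return: t2 contains 5.
```

The JFN stays in t1 throughout, since none of these calls returns anything there on success.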
ROUT will always succeed if you give it a JFN for a disk file opened for writing and ask it to write at a nonnegative byte index that is less than the upper limit for a disk file (you can have at most 512*512 pages (or 512*512*512 36-bit bytes) in a file). It may fail, however, if you try to ROUT to a file opened in append-only mode (because that would allow you to change parts of a file you weren't supposed to access, such as a mail.txt file protected at 770404).

4.8. Fork-Handling JSYS's

4.8.1. What's in a Fork?

To effectively use machine language on the DEC-20, you have to understand something about what programs are, how they are born, how they live, and how they die.

The term 'program' can be understood in many ways; most intuitively, because it's the form that we understand most easily, it means a text file in some source language, such as Macro-20 or Pascal. But this textual form of the program must be processed by a translator (an assembler or compiler) to produce a form more palatable to the actual machine. Even in this form, which is normally a '.REL' or relocatable file, it isn't quite understandable to the machine. It needs to go a further step, called linking, which involves combining it with any other program segments which it needs to interact with during execution by the machine (such as a language support package for doing input/output, etc.). Once it's linked, the result can be saved as an '.EXE' or executable file. An executable file is nothing more than a program in vestigial or 'pure potential' form, i.e., a data file with contents specifying how memory is to be filled when it is incarnated as an active, or executing, program.

We can distinguish a program in this passive state (the result of compiling, linking and saving) from a program actually in execution by the Tops-20 system.
Using this distinction, we can properly call the passive element --the EXE file-- the 'program', and the active element --the part that executes the program-- the 'process' or 'fork' (the latter is jargon used when dealing with Tops-20 processes). It may be helpful to think of the fork as a self-contained machine with a well-defined 'control panel', or interface to the outside world. This control panel has a set of 'knobs' for controlling it and objects which are dealt with via 'handles'; the knobs and handles are manipulated by other forks, via JSYS's. A fork, considered in this way, is a machine with power to follow instructions, which come from the passive program (EXE file); a fork is loaded up with its instructions by pulling in the program's pages from the EXE file. Then, it can be started, temporarily stopped (frozen), continued, stopped, etc. There also must be a way to create and destroy these fork machines, and there is. The fork can, under its own power, follow the program's instructions and stop itself (it can't start itself, for obvious reasons). From the time a fork is created until it is annihilated, it is necessarily in some 'state', just like any good mechanical machine: stopped, running, waiting for something external to it to happen, etc. A Tops-20 job is nothing more than a related collection of forks; since a fork can only come into existence by being created by another fork, each fork has a superior fork (which created it) and some number of inferior forks (which it created). It's also helpful to think of a fork's superior as its parent and its inferiors as its children. The top-level fork in each job (which has no superior) is the Tops-20 EXECutive command processor, which is responsible for creating other forks and running programs in them. The EXEC is thus the super-parent fork in each job, and is created by Tops-20 when you type a control-C on an unlogged-in terminal. 
In a graphic form, suppose you've logged in and have run EMACS as a kept editor, written a program, gotten out of EMACS and started to compile it with MIDAS. At this point, your job's fork structure looks like:

                     (W: Your EXEC fork)
                         /    #0
                        /     |
                       /      |
               (H: EMACS)  (R: MIDAS)
                   #1          #2

For convenience' sake, we label the forks with numbers (in the order in which they were created). In fork #2, the MIDAS EXE file is being executed (R: means running), and fork #1, containing the EMACS EXE and various TECO files, is temporarily halted (H: means halted), since you're not using it currently. The EXEC fork, fork #0, is waiting for the MIDAS fork to halt itself before going on and prompting you for more commands (W: means waiting for another fork).

Suppose, after running MIDAS, you Push to a new EXEC and run your program (somewhat useless, but helpful for this example). At the time your program is running, your job fork structure looks like:

                         (W: your EXEC)
                        /      #0      \
                      /        |         \
                    /          |           \
          (H: EMACS)      (H: MIDAS)      (W: Pushed-to EXEC)
              #1              #2              /    #3
                                             /
                                            /
                                  (R: your program)
                                         #4

Here you see more clearly the 'tree' structure of the forks in your job. As with most other trees encountered in computer science, this one is drawn upside-down, with the branches growing down. You can also see why Tops-20 processes are called 'forks', as the tree structure forks out at each process, just as a road forks at a junction.

To elaborate more on the superior and inferior notion, fork #0 has forks #1, #2, and #3 as inferiors, and fork #3 has fork #4 as an inferior. Of course, forks #1, #2 and #3 have fork #0 as a superior, and fork #4's superior is fork #3.
Notice the states of the various forks: the top-level EXEC is now waiting on the pushed-to EXEC to stop; your MIDAS fork has halted itself (which made the top EXEC go on and ask you for more commands, at which point you Pushed to the new EXEC); the EMACS fork's state hasn't changed, since you haven't told the top EXEC to do anything with it; the pushed-to EXEC is waiting for your running program to finish.

4.8.2. The Fork Environment

As mentioned above, each fork has a 'control panel', or interface with the external environment, i.e. the world of forks outside itself. This panel has several different areas, relating to the various kinds of interactions possible with the outside world. These areas are as follows (don't worry about the areas we haven't discussed yet):

   1. The file system, each file represented by a Job File Number (JFN);

   2. The fork world, each inferior fork represented by a relative fork handle;

   3. The software interrupt system, as described by various enabled or disabled channels and their handler addresses, as well as the global state of the interrupt system for the fork;

   4. The inter-process communication system, each channel of communication represented by a Process ID (PID);

   5. The inter-process synchronization system, each synchronization request represented by an outstanding Enqueue request (ENQ);

   6. The user-terminal interface, consisting of various mode information for the terminal apropos this fork.

Note that each area on the 'control panel' has a method of referencing the various external objects related to the area; these are called 'handles' in general (Job File Numbers, Relative Fork Handles, Process IDs, etc.). This is no coincidence: you have to refer to one of several objects --forks, files, process IDs, etc.-- when you want to do something with them --open a file, start a fork, send a message to another process, etc.-- and thus the need for a handle.
A handle is usually nothing more than a small integer uniquely identifying the particular object you want to deal with.

4.8.3. Basic Fork-Handling JSYS's

With the above introduction to forks and their environment, we can proceed to the two most basic fork-handling JSYS's: RESET and HALTF.

4.8.3.1. RESET (JSYS 147) - Reset the current fork

This JSYS, in the terms of the previous section, cleans up the interface to the external environment for this fork. It's always a good idea to use this JSYS at the start of your program to make sure there are no 'loose ends' lying around.

Returns:  +1: Always (no errors are possible).

The RESET JSYS (cf. the list above in section 4.8.2):

   1. closes all files for this fork (and any inferiors) and releases all Job File Numbers (JFNs);

   2. destroys all inferior forks and releases any Relative Fork Handles it can;

   3. resets the software interrupt system for this fork;

   4. releases all handles on inter-process communication channels (PIDs);

   5. releases all inter-process synchronization requests (ENQs);

   6. resets the terminal for this process to wake up on every character, echo input, and translate output normally.

4.8.3.2. HALTF (JSYS 170) - Halt the current fork

Halts this fork and any forks inferior to this one; this fork (the one executing this JSYS) then goes into the 'halted' state (with a 'voluntarily terminated' flavor). If this fork is resumed (with a RFORK JSYS), then it will continue with the instruction after this HALTF JSYS.

Returns:  +1: only if this fork is later resumed (no errors are possible from the HALTF itself).

4.8.3.3. Examples of RESET and HALTF

These two JSYS's are fairly simple to use in practice. An example of a complete MACRO program that prints "I'm here" and stops is:

        title   ImHere
        search  monsym

ImHere: RESET                           ; Tidy up the environment.
        hrroi   1, [asciz/I'm here/]    ; Print our
        PSOUT                           ;  message to the terminal,
        HALTF                           ;  and stop this fork.
        jrst    ImHere                  ; If this fork is continued, start over.

        end     ImHere

4.9. Miscellaneous JSYS's

4.9.1. STCMP (JSYS 540) - STring CoMParison

Compares two ASCIZ strings in memory. Letters are always considered as upper case regardless of their case within the string; e.g. "ABC" and "abc" are considered an exact match.

Accepts in AC1: Pointer to test string.
           AC2: Pointer to base string.

Returns:  +1: always, with

          AC1 containing the compare code:

                SC%LSS (bit 0)  Test string less than base string.
                SC%SUB (bit 1)  Test string is a subset of base string.
                SC%GTR (bit 2)  Test string greater than base string.

              No bits are set in AC1 if the strings are equal.

          AC2 containing base string pointer, updated such that an ildb instruction will reference the first nonmatching byte.

One string is considered less than another string if the ASCII value of the first nonmatching character in the first string is less than the ASCII value of the character in the same position in the second string.

One string is considered a subset of another string if both of the following conditions are true:

   1. From left to right, the ASCII values of the characters in corresponding positions are the same.

   2. The test string is shorter than the base string.

Two strings are considered equal if the ASCII values of the characters in corresponding positions are the same and the two strings are the same size. In this case, the contents of AC1 is 0 on return.

Example:

        move    t1, [point 7, string1]          ; Pointers to test string
        move    t2, [point 7, [asciz /FOO/]]    ;  and base string.
        STCMP                                   ; Compare them.
        jumpe   t1, equal                       ; Do this if they're equal.
        txnn    t1, SC%LSS!SC%SUB               ; Not equal.  Test string less?
         jrst   greater                         ; No, greater - go handle.

; Get here only if test string lexically less than base string.
          :
          :

5. The COMND JSYS - JSYS 544

The COMND JSYS is probably the single most attractive feature of the DECSYSTEM-20. It allows any interactive program to be totally (and automatically) fault-tolerant and helpful to its users.
The pain involved in learning to use it effectively is worth suffering, but should probably be deferred until you have become comfortable with the instruction set, Macro-20, and the monitor calls described in Chapter 4.

The following introductory material was adapted for use in this manual from a document written by Andrew R. Lowry and David S. Millman at Columbia in August 1978.

5.1. Informal Introduction

You are probably already at least partially familiar with the workings of the COMND jsys, although you may not be aware of your education. The COMND jsys is the thing that figures out what you mean when you abbreviate your commands to the EXEC, or when the EXEC finishes up your commands for you when you type an ESCAPE character. It is also what types guide words in parentheses telling you what you must type next, and it is the thing that will untiringly answer your question marks with lists of alternatives for you to choose from. In short, it is one of the things that makes learning and typing in commands to the EXEC as easy as it is.

Using the COMND jsys mainly consists of the following: you tell it what you're looking for, and it tells you what it finds. All the complexities have to do with how this communication takes place. A great deal of this is common to all the calls you can give to COMND, but some of it depends on exactly what you're asking COMND to look for.

All the information which you must supply to COMND may be broken down basically into two sets, the Command State Block (CSB) and the Function Descriptor Block (FDB). The CSB is ten words (storage locations) long and contains information about what has been typed so far, how much more can be typed without overflowing the space that has been set aside to store it, and where COMND can find other information that it needs. Most of this information must be supplied only once by the program using COMND, and from that point on COMND will update the information as it goes.
The FDB is four words long and contains information which is more specific to the type of call which is being made than is the information in the CSB. It is here that COMND finds out what type of information it should be looking for -- file name, time of day, one out of a list of possible keywords, or any other of the 24 different items it knows how to look for. Also in the FDB it finds pointers to a help string and a default string which the controlling program supplies. The help string is what will be typed if the user types a question mark, and the default string is what COMND will fill in if the user types an escape character. Finally, COMND finds several indicators in the FDB telling it how to process the request -- things like whether or not to convert lower case input into upper case, whether or not to accept indirect file specifications, etc.

The only thing left to do after the CSB and the FDB have been properly filled in is to tell COMND where they are and let it go do its work. The only questions now are, what does it do, and what are all the things it might give back?

When COMND is invoked, it starts accepting characters, usually from the job's controlling terminal. It keeps taking characters until the user types what is called an "action" character. Action characters include question mark, escape, ctrl/f and carriage return, and they are called action characters because COMND won't start doing what it is supposed to do until one of them is typed. Once an action character is encountered, COMND will determine whether or not the user has finished typing what s/he was supposed to type, and if so, will return control to the calling program. This can be done in any of three possible states. If COMND was able to interpret what was typed, and decided that it was an appropriate response, then it returns normally with the data that it received stored in a place where the program can easily retrieve it.
If COMND was unable to understand what was typed in the context of what it was told to look for, then it will return with an error indicator set so that the program will know that something went wrong. The third possibility is best shown by example.

Suppose a program desires to obtain from the user first a file name, then a time of day. This will involve two calls to COMND. When the first call is executed, suppose the user (call him Fred) types in FOO.BAR as his file name. COMND accepts this as a valid file name and returns normally to the calling program, which then executes COMND again, this time asking for a time of day. At this point, Fred decides that he really didn't want to use FOO.BAR, but wanted FOO.BAZ instead. So he deletes back to the R and types in a Z, and then he goes ahead and types in a time of day as required. Now it looks like trouble ... COMND has already told the program that Fred wanted FOO.BAR, and now it needs some way to let the program know that he changed his mind. In COMND jargon, we say that a "reparse" is needed. COMND sets a flag to let the program know what has happened, and now the program must start right from the beginning and reissue all the calls that it made to COMND for the current command line (this rather vague term will be firmed up shortly). Thus it redoes the call to get a file name, and COMND fills in FOO.BAZ this time, and then it continues with the time of day call just like before. If there had been other calls on this command line before the file name, they too would have to be reissued in the same order as they originally were called.

The term "command line" was used a few times in the preceding paragraph, and although it may have a fairly intuitive meaning, it also has a very strict meaning in the context of the COMND jsys, which will be explained now and should be kept in mind when trying to understand the reparse mechanism.
One of the calls that can be made to COMND is called "CMINI" (each different call has its own slightly mnemonic name). The CM signifies that it has something to do with COMND, and the INI stands for "INItialize". This call, unlike the rest of the calls, accepts no input from the terminal. It is used to set up initial values in the CSB and to type out the command line prompt. The end of this prompt is the farthest back that a user may delete when typing the prompted-for command. A "command line", then, consists of all responses to COMND calls made between successive CMINI's. For a reparse, the program reissues COMND calls starting with the one after the CMINI call that initiated the command line. The CMINI call should not be reissued. That would be done if an error were detected and the entire command line had to be restarted. The difference is that when CMINI is done, COMND forgets everything that was previously contained in the buffer. This is appropriate for a complete restart after an error, but not for merely backing up on a reparse.

A final general topic is the concept of multiple function calls. This refers to giving COMND more than one alternative to look for. For instance, it may be equally acceptable at a particular point in a command line for the user to type either a date, or simply a decimal integer, specifying possibly a number of days from the current date. It would be advantageous for COMND to know about the alternatives and not have to return an error if the second alternative were chosen. To accomplish this, it is possible to have several FDB's all linked together in a chain. The first one contains a pointer to the second, which contains a pointer to the third, etc. Then if COMND can't make sense of what is typed in the context of the first FDB, it goes on and tries the next FDB, and so on right down the line until it gets to an FDB that does not point to another one.
At that point, if none of the FDB's succeeded, an error condition is finally signalled and COMND returns. If one of the FDB's does fit, then not only is the normal data returned, but also the location of the FDB that succeeded, so that the program will be able to determine what ultimately happened. The same procedures apply to multiple FDB's as to single ones in regard to reparsing, error restarts, etc. Remember that when using multiple FDB's only one input item is parsed, not several. The multiple nature comes from the fact that COMND is given several choices as to how it should attempt to interpret that item.

Note that when using multiple FDB's, only one CSB is used. This fact underlines the main difference between the natures of information stored in the two blocks, the CSB being fairly call-independent, while almost all of the information in the FDB depends on exactly what call is being made.

At this point you should have an idea of what is involved in getting the COMND jsys to do your work for you. Various people have written comprehensive sets of macros or UUO's to simplify use of the COMND JSYS. Such macros appear in MACSYM, CUsym, and elsewhere. A sample program is included in Chapter 11, and a description of Columbia's COMND macro/UUO package is included in Section 7.1.

The remainder of this chapter is taken intact from the Tops-20 v4 Monitor Calls Reference Manual.

5.2. General Information

COMND - JSYS 544

Parses one field of a command that is either typed by a user or contained in a file. When this monitor call is used to read a command from a terminal, it provides the following features:

   1. Allows the input of a command (including the guide words) to be given in abbreviated, recognition (ESC and CTRL/F), and/or full input mode.

   2. Allows the user to edit his input with the DELETE, CTRL/U, CTRL/W, and CTRL/R editing keys.

   3. Allows fields of the command to be defaulted if an ESC or CTRL/F is typed at the beginning of any field or if a field is omitted entirely.

   4. Allows a help message to be given if a question mark (?) is typed at the beginning of any field.

   5. Allows input of an indirect file (@file) that contains the fields for all or the remainder of the command.

   6. Allows a recall of the correct portion of the last command (i.e., up to the beginning of the field where an error was detected) if the next command line begins with CTRL/H. The correct portion of the command is retyped, and the user can then continue typing from that point.

   7. Allows input of a line to be continued onto the next line if the user types a hyphen (-) immediately preceding a carriage return. (The carriage return is invisible to the program executing the COMND call, although it is stored in the text buffer.) The hyphen can be typed by the user while he is typing a comment. The comment is then continued onto the next line.

The COMND call allows the command line that is input to contain a comment if the comment is preceded by either an exclamation point or a semicolon and the previous field has been terminated. When the COMND call inputs an exclamation point after a field that has been terminated, it ignores all text on the remainder of the line or up to the next exclamation point. When the COMND call inputs a semicolon after a field that has been terminated, it ignores all text on the remainder of the line.

When an indirect file is given on the command line, it can be given at the beginning of any field. However, it must be the last item typed on the line, and its contents must complete the current command. The user must terminate his input of the indirect file (after any recognition is performed) with a carriage return. If he does not terminate his input, the message ?INDIRECT FILE NOT CONFIRMED is output.
Also, if the user types a question mark (instead of the file specification of the indirect file) after he types the @ character, the message

   FILESPEC OF INDIRECT FILE

is output. The indirect file itself should not contain an ESC or carriage return; if these characters are included, they will be treated as spaces. The contents of the indirect file are placed in the text buffer but are not typed on the user's terminal.

As the user types his command, the characters are placed in a command text buffer. This buffer can also include the command line prompt, if any. Several byte pointers and counts reflect the current state of the parsing of the command. These pointers and counts are as follows:

1. Byte pointer to the beginning of the prompting-text buffer (.CMRTY). This pointer is also called the CTRL/R buffer byte pointer since it indicates the initial part of the text that will be output on a CTRL/R. (The remainder of the text output on a CTRL/R is what the user had typed before he typed CTRL/R.) The buffer containing the prompt need not be contiguous with the buffer containing the remainder of the command line. Typically this pointer is to a string in the literals area.

2. Byte pointer to the beginning of the user's input (.CMBFP). This is the limit back to which the user can edit.

3. Byte pointer to the beginning of the next field to be parsed (.CMPTR).

4. Count of the space remaining in the text input buffer (.CMCNT).

5. Count of the number of characters in the buffer that have not yet been parsed (.CMINC).

The illustration below is a logical arrangement of the byte pointers and counts. Remember that the prompting-text buffer does not have to be adjacent to the text buffer.

                           <--------- .CMCNT ---------->
!=======================================================!
!  prompt  ! parsed input ! unparsed input !   unused   !
!=======================================================!
^          ^              ^<--- .CMINC --->
!          !              !
.CMRTY     .CMBFP         .CMPTR

These byte pointers and other information are contained in a command state block, whose address is given as an argument to the COMND monitor call. The .CMINI function initializes these pointers.

Parsing of a command is performed field by field and by default begins when the user types a carriage return, ESC, CTRL/F, or question mark. These characters are called action characters because they cause the system to act on the command as typed so far. A field can also be terminated with a space, tab, slash, comma, or any other nonalphanumeric character. Normally, the parsing does not begin, and the COMND call does not return control to the program, until an action character is typed. However, if B8(CM%WKF) is on in word .CMFLG when the COMND call is executed, parsing begins after each field is terminated.

The command is parsed by repeated COMND calls. Each call specifies the type of field expected to be parsed by supplying an appropriate function code and any data needed for the function. This information is given in a function descriptor block. On successful completion of each call, the current byte pointers and the counts are updated in the command state block, and any data obtained for the field is returned.

The program executing the COMND call should not reset the byte pointers in the command state block after it completes the parsing of each command. It should set up the state block once at the beginning and then use the .CMINI function when it begins parsing each line of a command. This is true because the .CMINI function implements the CTRL/H error recovery feature in addition to initializing the byte pointers in the state block and printing the prompt for the line. If the program resets the pointers, the CTRL/H feature is not possible because the pointers from the previous command are not available.
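The sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions, not code taken from the DEC manuals: the labels CMDBLK (a state block filled in once at startup), KEYTAB (a keyword table in TBLUK format), NEWCMD, PARSE, and BADCMD are hypothetical and partly defined elsewhere; the AC names are written out on the assumption that the usual convention applies; and the right half of .CMFLG is assumed to hold the address PARSE, so that reparses are dispatched there automatically.

```macro
        SEARCH  MONSYM,MACSYM   ; JSYS and COMND symbols, FLDDB. and TXNE

T1=1                            ; AC1 (assumed conventional name)
T2=2                            ; AC2 (assumed conventional name)

; CMDBLK and KEYTAB are hypothetical labels defined elsewhere.

NEWCMD: MOVEI   T1,CMDBLK       ; state block, set up once at startup
        MOVEI   T2,[FLDDB.(.CMINI)] ; type prompt, init pointers, check CTRL/H
        COMND
PARSE:  MOVEI   T1,CMDBLK       ; assumed to be the reparse dispatch address
        MOVEI   T2,[FLDDB.(.CMKEY,,KEYTAB)] ; first field: a keyword
        COMND
        TXNE    T1,CM%NOP       ; skip if the keyword parsed
         JRST   BADCMD          ; CM%NOP on: field did not parse
        MOVEI   T1,CMDBLK
        MOVEI   T2,[FLDDB.(.CMCFM)] ; last field: confirmation
        COMND
        TXNE    T1,CM%NOP       ; skip if the command was confirmed
         JRST   BADCMD
        ; ... act on the command here ...
        JRST    NEWCMD          ; next line: .CMINI again, pointers untouched

BADCMD: HRROI   T1,[ASCIZ /?Invalid command
/]
        PSOUT                   ; report the error
        JRST    NEWCMD          ; reissue .CMINI so CTRL/H can recall the line
```

Note that the program never stores into the byte pointers or counts of CMDBLK after startup; both the normal path and the error path simply return to NEWCMD.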
When a CTRL/H is input, the .CMINI function allows error recovery from the last command only if both (1) the pointer to the beginning of the user's input (.CMBFP) is not equal to the pointer to the beginning of the next field to be parsed (.CMPTR) and (2) the last character parsed in the previous command was not an end-of-line character.

The design of the COMND call allows the user to delete his typed input with the DELETE, CTRL/W, and CTRL/U keys without regard to field boundaries. When the user deletes into a field that has already been parsed, the COMND call returns to the program with B3(CM%RPT) set in word .CMFLG. This return informs the program to forget the current state of the command and to reparse from the beginning of the line. Because the complete line as typed and corrected by the user is in the text buffer, the parse can be repeated and will yield the same result up to the point of the change.

The calling sequence to the COMND call is as follows:

ACCEPTS IN AC1:  address of the command state block

           AC2:  address of the first alternative function descriptor block

RETURNS     +1:  always (unless a reparse is needed and the right half of .CMFLG is nonzero), with

                 AC1 containing flags in the left half, and the address of the command state block in the right half. The flags are copied from word .CMFLG in the command state block.

                 AC2 containing either the data obtained for the field or an error code if the field could not be parsed (CM%NOP is on).

                 AC3 containing in the left half the address of the function descriptor block given in the call, and in the right half the address of the function descriptor block actually used (i.e., the one that matched the input).

The format of the command state block is shown below.

         0                        17 18                       35
        !=======================================================!
.CMFLG  !         Flag Bits         ! Reparse Dispatch Address  !
        !-------------------------------------------------------!
.CMIOJ  !         Input JFN         !        Output JFN         !
        !-------------------------------------------------------!
.CMRTY  !              Byte Pointer to CTRL/R Text              !
        !-------------------------------------------------------!
.CMBFP  !         Byte Pointer to Start of Text Buffer          !
        !-------------------------------------------------------!
.CMPTR  !        Byte Pointer to Next Input To Be Parsed        !
        !-------------------------------------------------------!
.CMCNT  !             Count of Space Left in Buffer             !
        !-------------------------------------------------------!
.CMINC  !          Count of Characters Left in Buffer           !
        !-------------------------------------------------------!
.CMABP  !              Byte Pointer to Atom Buffer              !
        !-------------------------------------------------------!
.CMABC  !                  Size of Atom Buffer                  !
        !-------------------------------------------------------!
.CMGJB  !            Address of GTJFN Argument Block            !
        !=======================================================!

                      Command State Block

Symbol (word)   Meaning

.CMFLG (0)   Flag bits in the left half, and the reparse dispatch address in the right half. Some flag bits can be set by the program executing the COMND call; others can be set by the COMND call after its execution. The bits that can be set by the program are described following the Command State Block description. The bits that can be set by COMND are described following the Function Descriptor Block description.

The reparse dispatch address is the location to which control is automatically transferred when a reparse of the command is needed because the user edited past the current pointer (i.e., the user edited characters that were already parsed). If this field is zero, the COMND call sets B3(CM%RPT) in the left half of this word and gives the +1 return when a reparse is needed. The program must then test CM%RPT and, if on, must reenter the code that parses the first field of the command. When the reparse dispatch address is given, control is transferred automatically to that address.
The code at the reparse dispatch address should initialize the program's state to what it was after the last .CMINI function. This initialization should include resetting the stack pointer, closing and releasing any JFNs acquired since the last .CMINI function, and transferring control to the code immediately following the last .CMINI function call.

.CMIOJ (1)   Input JFN in the left half, and output JFN in the right half. These designators identify the source for the input of the command and the destination for the output of the typescript. These designators are usually .PRIIN (for input) and .PRIOU (for output).

.CMRTY (2)   Byte pointer to the beginning of the prompting text.

.CMBFP (3)   Byte pointer to the beginning of the user's input. The user cannot edit back past this pointer.

.CMPTR (4)   Byte pointer to the beginning of the next field to be parsed.

.CMCNT (5)   Count of the space remaining in the buffer after the .CMPTR pointer.

.CMINC (6)   Count of the number of unparsed characters in the buffer after the .CMPTR pointer.

.CMABP (7)   Byte pointer to the atom buffer, a temporary storage buffer that contains the last field parsed by the COMND call. The terminator of the field is not placed in this buffer. The atom buffer is terminated with a null.

.CMABC (10)  Count of the number of characters in the atom buffer. This count should be at least as large as the largest field expected to be parsed.

.CMGJB (11)  Address of a GTJFN argument block. This block must be at least 16 (octal) words long and must be writable. If a longer GTJFN block is being reserved, the count in the right half of word .GJF2 of the GTJFN argument block must be greater than four. This block is usually filled in by the COMND call with arguments for the GTJFN call if the specified function is requesting a JFN (i.e., functions .CMIFI, .CMOFI, and .CMFIL). The user should store data in this block on the .CMFIL function only.
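The layout above can be written out as storage in a program along the lines of the sketch below. Everything named here is an assumption made for illustration: the labels (CMDBLK, CMDBUF, ATMBUF, GJFBLK, SAVEP, REPARS, PARSE), the buffer sizes, and the prompt text are hypothetical. PARSE stands for the code immediately following the program's .CMINI call, and SAVEP for a location in which the program saved its stack pointer at that point. Reserving one character of each buffer for a terminating null is also an assumption, not a stated requirement.

```macro
        SEARCH  MONSYM,MACSYM   ; .PRIIN and .PRIOU are defined in MONSYM

P=17                            ; stack pointer (assumed conventional name)
BUFSIZ==50                      ; text buffer length in words (octal, assumed)
ATMSIZ==20                      ; atom buffer length in words (octal, assumed)

CMDBLK: XWD     0,REPARS        ; .CMFLG  no flags,,reparse dispatch address
        XWD     .PRIIN,.PRIOU   ; .CMIOJ  input JFN,,output JFN
        POINT   7,[ASCIZ /DEMO>/] ; .CMRTY  prompt (CTRL/R) text
        POINT   7,CMDBUF        ; .CMBFP  start of the user's input
        POINT   7,CMDBUF        ; .CMPTR  next field to be parsed
        BUFSIZ*5-1              ; .CMCNT  characters of space left
        0                       ; .CMINC  unparsed characters
        POINT   7,ATMBUF        ; .CMABP  atom buffer
        ATMSIZ*5-1              ; .CMABC  atom buffer size in characters
        GJFBLK                  ; .CMGJB  GTJFN argument block

CMDBUF: BLOCK   BUFSIZ          ; command text buffer
ATMBUF: BLOCK   ATMSIZ          ; atom buffer
GJFBLK: BLOCK   16              ; GTJFN block, 16 (octal) words
SAVEP:  BLOCK   1               ; stack pointer as of the last .CMINI

; Reparse dispatch routine: put the program back in the state it had
; just after the last .CMINI, then parse the line again from the start.
REPARS: MOVE    P,SAVEP         ; restore the stack pointer
        ; release any JFNs acquired since the last .CMINI here
        JRST    PARSE           ; resume just after the .CMINI call
```

Because the right half of .CMFLG is nonzero here, the program does not need to test CM%RPT after each COMND call.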
The flag bits that can be set by the user in the left half of word .CMFLG in the Command State Block are described below. These bits apply to the parsing of the entire command and are preserved by COMND after execution. See the end of the COMND JSYS discussion for the bits that are returned by COMND in the left half of word .CMFLG.

5.3. Bits Supplied in State Block on COMND Call

Symbol (bit)   Meaning

CM%RAI (6)   Convert lowercase input to uppercase.

CM%XIF (7)   Do not recognize the @ character as designating an indirect file; instead consider the character as ordinary punctuation. A program sets this bit to prevent the input of an indirect file.

CM%WKF (8)   Begin parsing after each field is terminated instead of only after an action character (carriage return, ESC, CTRL/F, question mark) is typed. For example, a program sets this bit if it must change terminal characteristics (e.g., it must turn off echoing because a password may be input) in the middle of a command. However, use of this bit is not recommended because terminal wakeup occurs after each field is terminated, thereby increasing system overhead. The recommended method of changing terminal characteristics within a command is to input the field requiring the special characteristic on the next line with its own prompt. For example, if a program is accepting a password, it should turn off echoing after the .CMCFM function of the main command and perform the .CMINI function to type the prompt requesting a password on the next line.

The format of the function descriptor block is shown below.

        0           8 9          17 18                       35
       !=======================================================!
       !  function   !  function   ! address of next function  !
.CMFNP !    code     !    flags    !     descriptor block      !
       !-------------------------------------------------------!
.CMDAT !              Data for specific function               !
       !-------------------------------------------------------!
.CMHLP !          Byte pointer to help text for field          !
       !-------------------------------------------------------!
.CMDEF !       Byte pointer to default string for field        !
       !-------------------------------------------------------!
.CMBRK !             Pointer to 4-word break mask              !
       !=======================================================!

5.4. Function Descriptor Block

Symbol (word)   Meaning

.CMFNP (0)   Function code and pointer to next function descriptor block (FDB).

             B0-B8(CM%FNC)     Function code
             B9-B17(CM%FFL)    Function-specific flags
             B18-B35(CM%LST)   Address of the next FDB

.CMDAT (1)   Data for the specific function, if any.

.CMHLP (2)   Byte pointer to the help text for this field. This word can be zero if the program is not supplying its own help text. CM%HPP must be set (in word 0) in order for this pointer to be used.

.CMDEF (3)   Byte pointer to the default string for this field. This word can be zero if the program is not supplying its own default string.

.CMBRK (4)   Pointer to a 4-word break mask that specifies which characters constitute end of field. Word .CMBRK is ignored unless CM%BRK (B13) is on.

The individual words in the function descriptor block are described in the following paragraphs.

5.4.1. Words .CMFNP and .CMDAT of the FDB

Word .CMFNP contains the function code for the expected field to be parsed, and word .CMDAT contains any additional data needed for that function. The function codes, along with any required data for the functions, are described below.

Symbol (code)   Meaning

.CMKEY (0)   Parse a keyword, such as a command name. Word .CMDAT contains the address of a keyword symbol table in the format described in the TBLUK monitor call description (i.e., alphabetical). The data bits that can be defined in the right half of the first word of the argument pointed to by the table entries (when B0-B6 of the first word are off and B7(CM%FW) is on) are as follows:

B35(CM%INV)  Suppress this keyword in the list output on a ?.
The program can set this bit to include entries in the table that should be invisible because they are not preferred keywords. For example, this bit can be set to allow the keyword LIST to be valid, even though the preferred keyword may be PRINT. The LIST keyword would not be listed in the output given on a ?. This bit is also used in conjunction with the CM%ABR bit to suppress an abbreviation in the output given on a ?.

B34(CM%NOR)  Do not recognize this keyword even if an exact match is typed by the user, and suppress its listing in the list output on a ?. (Refer to the TBLUK call description for more information on using this bit.)

B33(CM%ABR)  Consider this keyword a valid abbreviation for another entry in the table. The right half of this table entry points to the keyword for which this is an abbreviation. The program can set this bit to include entries in the table that are less than the minimum unique abbreviation. For example, this bit can be set to include the entry ST (for START) in the table. If the user then types ST as a keyword, it will be accepted as a valid abbreviation even though there may be other keywords beginning with ST. To suppress the output of this abbreviation in the list typed on a ?, the program must also set the CM%INV bit.

On a successful return from .CMKEY, AC2 contains the address of the table entry where the keyword was found.

.CMNUM (1)   Parse a number. Word .CMDAT contains the radix (from 2 to 10) of the number. On a successful return, AC2 contains the number.

.CMNOI (2)   Parse a guide word string, but do not return an error if no guide word is input. An error is returned only if a guide word is input that does not match the one expected by the COMND call. A guide word field must be delimited by parentheses. Word .CMDAT contains a byte pointer to an ASCIZ string. This string does not contain the parentheses of the guide word. Guide words are output if the user terminated the previous field with ESC.
Guide words are not output, nor can they be input, if the user has caused parsing into the next field.

.CMSWI (3)   Parse a switch. A switch field must begin with a slash and can be terminated with a colon in addition to any of the legal terminators. Word .CMDAT contains the address of a switch keyword symbol table. (Refer to the TBLUK monitor call description for the format of the table.) The entries in the table do not contain the slash of the switch keywords; however, they should end with a colon if the switch requires a value. The data bits CM%INV, CM%NOR, and CM%ABR defined for the .CMKEY function can also be set on this function. On a successful return, AC2 contains the address of the table entry where the switch keyword was found.

.CMIFI (4)   Parse an input file specification. This function causes the COMND call to execute a GTJFN call to attempt to parse the specification for an existing file, using no default fields. The .CMGJB address (word 11 in the command state block) must be supplied, but the GTJFN block should be empty. (Data stored in the block will be overwritten by the COMND JSYS. Also, certain GTJFN flags are set.) On a successful return, AC2 contains the JFN assigned. Hyphens are treated as alphanumeric characters for this function. See note following .CMFIL function.

.CMOFI (5)   Parse an output file specification. This function causes the COMND call to execute a GTJFN call to attempt to parse the specification for either a new or an existing file. The default generation number is the generation number of the existing file plus 1. The .CMGJB address must be supplied, but the GTJFN block should be empty. (Data stored in the block will be overwritten by the COMND JSYS. Also, certain GTJFN flags are set.) On a successful return, AC2 contains the JFN assigned. Hyphens are treated as alphanumeric characters for this function. See note following .CMFIL function.

.CMFIL (6)   Parse a general (arbitrary) file specification.
This function causes the COMND call to execute a GTJFN to attempt to parse the specification for the file. The .CMGJB address must be supplied, but data stored in certain words of the GTJFN block will be overwritten by the COMND JSYS and certain GTJFN flags will be set (see note below). On a successful return, AC2 contains the JFN assigned. Hyphens are treated as alphanumeric characters for this function.

Note that portions of the GTJFN block used by functions .CMOFI, .CMIFI, and .CMFIL are controlled by COMND. The following list shows which words are under the control of COMND and which words are under the control of the user:

GTJFN Word(s)      Controlled by   Characteristics

.GJGEN             COMND           1. .CMOFI sets flags GJ%FOU, GJ%MSG, and GJ%XTN and clears all other flags.
                                   2. .CMIFI sets flags GJ%OLD and GJ%XTN and clears all other flags.
                                   3. .CMOFI and .CMIFI zero the right half of word .GJGEN.
                                   4. .CMFIL sets flag GJ%XTN and clears GJ%FCM.

.GJSRC             COMND           None

.GJDEV - .GJJFN    COMND/USER      Functions .CMIFI and .CMOFI give COMND control of these words. .CMFIL gives the user control of these words.

.GJF2 - .GJATR     COMND           None

.CMFLD (7)   Parse an arbitrary field. This function is useful for fields not normally handled by the COMND call. The input, as delimited by the first nonalphanumeric character, is copied into the atom buffer; the delimiter is not copied. Note the following:

             1. This function will parse a null field.
             2. Hyphens are treated as alphanumeric characters for this function.
             3. No validation is performed (such as filename validation).
             4. No standard help message is available (see below).
             5. The FLDBK. and BRMSK. macros may be used for including other characters in the field (like "*").

.CMCFM (10)  Confirm. This function waits for the user to confirm the command with a carriage return and should be used at the end of parsing a command line.

.CMDIR (11)  Parse a directory name. Login and files-only directories are allowed.
Word .CMDAT contains data bits for this function. The currently defined bit is as follows:

             B0(CM%DWC)  Allow wildcard characters

On a successful return, AC2 contains the 36-bit directory number.

.CMUSR (12)  Parse a user name. Only login directories are allowed. On a successful return, AC2 contains the 36-bit user number.

.CMCMA (13)  Comma. Sets B1(CM%NOP-no parse) in word .CMFLG of the command state block and returns if a comma is not the next item in the input. Blanks can appear on either side of the comma. This function is useful for parsing a list of arguments.

.CMINI (14)  Initialize the command line (e.g., set up internal monitor pointers, type the prompt, and check for CTRL/H). This function should be used at the beginning of parsing a command line but not when reparsing a line. Otherwise, the CTRL/H feature will not work.

To use this function, the user first moves the appropriate data into the command state block and then issues .CMINI. If, at any time during the parsing of a line, an error occurs, .CMINI is issued again to reinitialize the line. However, for the second through Nth invocation of .CMINI for a given line, the user should not alter the byte pointers and character counts in the command state block. To do so would disable the CTRL/H feature. This feature allows the user program, on parsing a bad atom, to print an error message, reissue the prompt, and parse the command line again without forcing the user to retype the entire line.

If .CMINI reads a CTRL/H character, .CMINI will reset all byte pointers and character counts except the .CMINC count to their original state. .CMINI will set the .CMINC count to the number of characters in the buffer up to the bad atom. These characters are output to the terminal and parsed again. Control then passes to the reparse address (if provided) and normal parsing resumes. The effect on the program is as if the bad atom had never been typed.

.CMFLT (15)  Parse a floating-point number.
On a successful return, AC2 contains the floating-point number.

.CMDEV (16)  Parse a device name. On a successful return, AC2 contains the device designator.

.CMTXT (17)  Parse the input text up to the next carriage return, place the text in the atom buffer, and return. If an ESC or CTRL/F is typed, it causes the terminal bell to ring (because recognition is not available with this function) and is otherwise ignored. If a ? is typed, an appropriate response is given, and the ? is not included in the atom buffer. (A ? can be included in the input text if it is preceded by a CTRL/V.)

.CMTAD (20)  Parse a date and/or time field according to the setting of bits CM%IDA and CM%ITM. The user must input the field as requested. Any date format allowed by the IDTIM call can be input. If a date is not input, it is assumed to be the current date. If a time is not input, it is assumed to be 00:00:01. When both the date and time fields are input, they must be separated by one or more spaces. If the fields are input separately, they must be terminated with a space or carriage return. Word .CMDAT contains bits in the left half and an address in the right half as data for the function. The bits are:

             B0(CM%IDA)  Parse a date
             B1(CM%ITM)  Parse a time
             B2(CM%NCI)  Do not convert the date and/or time to internal format.

The address in the right half is the beginning of a 3-word block in the caller's address space. On a successful return, this block contains data returned from the IDTNC call executed by COMND if B2(CM%NCI) was on in the COMND call (i.e., if the input date and/or time field was not to be converted to internal format). If B2(CM%NCI) was off in the COMND call, on a successful return, AC2 contains the internal date and time format.

.CMQST (21)  Parse a quoted string up to the terminating quote. The delimiters for the string must be double quotation marks and are not copied to the atom buffer.
A double quotation mark is input as part of the string if two double quotation marks appear together. This function is useful if the legal field terminators and the action characters are to be included as part of a string. The characters ?, ESC, and CTRL/F are not treated as action characters and are included in the string stored in the atom buffer. Carriage return is an invalid character in a quoted string and causes B1(CM%NOP) to be set on return.

.CMUQS (22)  Parse an unquoted string up to one of the specified break characters. Word .CMDAT contains the address of a 4-word block of 128 break character mask bits. (Refer to word .RDBRK of the TEXTI call description for an explanation of the mask.) The characters scanned are not placed in the atom buffer. On return, .CMPTR is pointing to the break character. This function is useful for parsing a string with an arbitrary delimiter. The characters ?, ESC, and CTRL/F are not treated as action characters (unless they are specified in the mask) and can be included in the string. Carriage return can also be included if it is not one of the specified break characters.

.CMTOK (23)  Parse the input and compare it with a given string ("token"). Word .CMDAT contains the byte pointer to the given string. This function sets B1(CM%NOP) in word .CMFLG of the command state block and returns if the next input characters do not match the given string. Leading blanks in the input are ignored. This function is useful for parsing single or multiple character operators (e.g., + or **).

.CMNUX (24)  Parse a number and terminate on the first non-numeric character. Word .CMDAT contains the radix (from 2 to 10) of the number. On a successful return, AC2 contains the number. This function is useful for parsing a number that may not be terminated with a nonalphabetic character (e.g., 100PRINT FILEA). Note that non-numeric identifiers can begin with a digit (e.g., 1SMITH as a user name).
When a non-numeric identifier and a number appear as alternates for a field, the order of the function descriptor blocks is important. The .CMNUX function, if given first, would accept the digit in the non-numeric identifier as a valid number instead of as the beginning character of a non-numeric identifier.

.CMACT (25)  Parse an account string. The input, as delimited by the first nonalphanumeric character, is copied into the atom buffer; the delimiter is not copied. No verification is performed nor is any standard help message available.

.CMNOD (26)  Parse a network node name. A node name consists of up to six alphanumeric characters followed by 2 colons ("::"). Lowercase characters are converted to uppercase characters. The node name is copied into the atom buffer without the colons. Note that this function does not verify the existence of the node.

In addition to the .CMFNP word of the function descriptor block containing the function code in bits 0-8 (CM%FNC), this word also contains function-specific flag bits in bits 9-17 (CM%FFL) and the address of another function descriptor block in bits 18-35 (CM%LST). The flag bits that can be set in bits 9-17 (CM%FFL) are as follows:

Symbol (bit)   Meaning

CM%PO (14)   The field is to be parsed only and the field's existence is not to be verified. This bit currently applies to the .CMDIR and .CMUSR functions and is ignored for the remaining functions. On return, COMND sets B1(CM%NOP-no parse) only if the field typed is not in the correct syntax. Also, data returned in AC2 may not be correct.

CM%HPP (15)  A byte pointer to a program-supplied help message for this field is given in word 2 (.CMHLP) of this function descriptor block.

CM%DPP (16)  A byte pointer to a program-supplied default string for this field is given in word 3 (.CMDEF) of this function descriptor block.

CM%SDH (17)  The output of the default help message is to be suppressed if the user types a question mark.
(See below for the default messages.)

The address of another function descriptor block can be given in bits 18-35 (CM%LST) of the .CMFNP word. The use of this second descriptor block is described below.

Usually one COMND call is executed for each field in the command. However, for some fields, more than one type of input may be possible (e.g., after a keyword field, the next field could be a switch or a filename field). In these cases, all the possibilities for a field must be tried in an order selected to test unambiguous cases first. When the COMND call cannot parse the field as indicated by the function code, it does one of two things:

1. It sets the current pointer and counts such that the next call will attempt to parse the same input over again. It then returns with B1(CM%NOP) set in the left half of the .CMFLG word in the command state block. The caller can then issue another COMND call with a function code indicating another of the possible fields. After the execution of each call, the caller should test the CM%NOP flag to see if the field was parsed successfully.

2. If an address of another function descriptor block is given in CM%LST, the COMND call moves to this descriptor block automatically and attempts to parse the field as indicated by the function code contained in B0-B8(CM%FNC) in word .CMFNP of that block. If the COMND call fails to parse the field using this new function code, it moves to a third descriptor block if one is given. This sequence continues until either the field is successfully parsed or the end of the chain of function blocks is reached. Upon completion of the COMND call, AC3 contains the addresses of the first and last function blocks used.

By specifying a chained list of function blocks, the program can have the COMND call automatically check all possible alternatives for a field and not have to issue a separate call for each one.
In addition, if the user types a question mark, a list is output of all the alternatives for the field as indicated by the list of function descriptor blocks.

5.4.2. Word .CMHLP of the FDB

This word contains a byte pointer to a program-supplied help text to be output if the user types a question mark when entering his command. The default help message is appended to the output of the program-supplied message if B17(CM%SDH) is not set. If B17(CM%SDH) is set, only the program-supplied message is output. If this word in the descriptor block is zero, only the default message is output when the user types a question mark. Bit 15(CM%HPP) must be set in word 0 (.CMFNP) of the function descriptor block for this pointer to be used.

The default help message depends on the particular function being used to parse the current field. The table below lists the default help message for each function available in the COMND call.

5.4.3. Default Help Messages

Function                  Message

.CMKEY (keyword)          ONE OF THE FOLLOWING
                          followed by the alphabetical list of valid keywords. If the user types a question mark in the middle of the field, only the keywords that can possibly match the field as currently typed are output. If no keyword can possibly match the currently typed field, the message
                          KEYWORD (NO DEFINED KEYWORDS MATCH THIS INPUT)
                          is output.

.CMNUM (number)           The help message output depends on the radix specified in .CMDAT in the descriptor block. If the radix is octal, the help message is
                          OCTAL NUMBER
                          If the radix is decimal, the help message is
                          DECIMAL NUMBER
                          If the radix is any other radix, the help message is
                          A NUMBER IN BASE nn
                          where nn is the radix.

.CMNOI (guide word)       None

.CMSWI (switch)           ONE OF THE FOLLOWING
                          followed by the alphabetical list of valid switch keywords. The same rules apply as for the .CMKEY function. (See above.)

.CMIFI (input file)       The help message output depends on the settings of
.CMOFI (output file)      certain bits in the GTJFN call. If bit GJ%OLD is off
.CMFIL (any file)         and bit GJ%FOU is on, the help message is
                          OUTPUT FILESPEC
                          Otherwise, the help message is
                          INPUT FILESPEC

.CMFLD (any field)        None

.CMCFM (confirm)          CONFIRM WITH CARRIAGE RETURN

.CMDIR (directory)        DIRECTORY NAME

.CMUSR (user)             USER NAME

.CMCMA (comma)            COMMA

.CMINI (initialize)       None

.CMFLT (floating point)   NUMBER

.CMDEV (device)           DEVICE NAME

.CMTXT (text)             TEXT STRING

.CMTAD (date)             The help message depends on the bits set in .CMDAT in the descriptor block. If CM%IDA is set, the help message is
                          DATE
                          If CM%ITM is set, the help message is
                          TIME
                          If both are set, the help message is
                          DATE AND TIME

.CMQST (quoted)           QUOTED STRING

.CMUQS (unquoted)         None

.CMTOK (token)            None

.CMNUX (number)           Same as .CMNUM

.CMACT (account)          None

.CMNOD (node)             NODE NAME

5.4.4. Word .CMDEF of the FDB

This word contains a byte pointer to the ASCIZ string to be used as the default for this field. For this pointer to be used, bit 16 (CM%DPP) must be set in word 0 (.CMFNP) of the descriptor block. The string is output to the destination, as well as copied to the text buffer, if the user types an ESC or CTRL/F as the first non-blank character in the field. If the user types a carriage return, the string is copied to the atom buffer but is not output to the destination.

When the caller supplies a list of function descriptor blocks, the byte pointer for the default string must be included in the first block. The CM%DPP bit and the pointer for the default string are ignored when they appear in subsequent blocks. However, the default string can be worded so that it will apply to any of the alternative fields. The effect is the same as if the user had typed the given string.

Defaults for fields of a file specification can also be supplied with the .CMFIL function. If both the byte pointer to the default string and the GTJFN defaults have been provided, the COMND default will be used first and then, if necessary, the GTJFN defaults.
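A field that may be either a keyword or a decimal number, with a default supplied in the first block only, might be set up as in the following sketch. The labels (CMDBLK, CNTTAB, FDBKEY, FDBNUM, GOTKEY, GOTNUM, BADFLD) are hypothetical and partly defined elsewhere. The example also assumes that the LST argument of the FLDDB. macro takes the address of the next descriptor block, and that a non-blank DEFM argument causes the macro to set CM%DPP and build the byte pointer; both assumptions should be checked against the copy of MACSYM in use.

```macro
        SEARCH  MONSYM,MACSYM   ; COMND symbols, FLDDB. and TXNE

T1=1                            ; AC names (assumed conventional)
T2=2
T3=3

        MOVEI   T1,CMDBLK       ; command state block (defined elsewhere)
        MOVEI   T2,FDBKEY       ; first alternative in the chain
        COMND
        TXNE    T1,CM%NOP       ; skip if one of the alternatives parsed
         JRST   BADFLD          ; no: error code is in AC2
        HRRZ    T3,T3           ; keep right half of AC3: block that matched
        CAIN    T3,FDBNUM       ; was it the number block?
         JRST   GOTNUM          ; yes: AC2 holds the number
        JRST    GOTKEY          ; no: AC2 holds the keyword table entry address

; The default string appears in the first block only; CNTTAB is assumed
; to be a TBLUK-format table that contains the keyword ALL.
FDBKEY: FLDDB.(.CMKEY,,CNTTAB,,<ALL>,FDBNUM) ; keyword, default ALL, then number
FDBNUM: FLDDB.(.CMNUM,,^D10)    ; decimal number, end of the chain
```

The keyword block is placed first so that the default string, which must be given in the first block, is a valid keyword.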
NOTE: The function descriptor block, whose address is given in AC2, can be set up by the FLDDB. and FLDBK. macros defined in MACSYM. (See end of COMND section for a description of these macros.)

5.4.5. Word .CMBRK of the FDB

This word contains a pointer to a 4-word user-specified mask that determines which characters constitute end of field. The leftmost 32 bits of each word correspond to a character in the ASCII collating sequence (in ascending order). If the bit is on for a given character, typing that character will cause the COMND JSYS to treat the characters typed so far as a separate field and parse it according to the function being used. CM%BRK (B13) must be on in the first word of the function descriptor block or COMND will ignore word .CMBRK.

Ordinarily, the user would rely on COMND's default masks (varying according to function) to specify which characters signal end of field and thus would not be concerned with word .CMBRK of the function block. However, for special purposes such as allowing "*" or "%" to be part of a field rather than a field delimiter, the user must specify his own mask. (In this example, the bits for "*" and "%" would be off in the mask word.) The user may inspect COMND's default masks (defined in MONSYM) for help in designing a custom mask. The following is a list of the COMND functions that use masks:

Mask Symbols        COMND Function   Changeable by User

KEYB0. - KEYB3.     .CMKEY           Yes
DEVB0. - DEVB3.     .CMDEV           Yes (only if parse-only)
FLDB0. - FLDB3.     .CMFLD           Yes
EOLB0. - EOLB3.     .CMTXT           Yes
KEYB0. - KEYB3.     .CMSWI           Yes
User specified      .CMDAT           Yes
USRB0. - USRB3.     .CMUSR           No
FILB0. - FILB3.     .CMFIL           No
FILB0. - FILB3.     .CMIFI           No
FILB0. - FILB3.     .CMOFI           No
internal            .CMNUM           No
FILB0. - FILB3.     .CMDIR           No
internal            .CMFLT           No
ACTB0. - ACTB3.     .CMACT           No

COMND will ignore any break masks that are specified for functions that do not allow user-modified masks.
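As a worked illustration of the bit layout, consider allowing "*" inside a .CMFLD field. The sketch below assumes that the BRMSK. and FLDBK. macros and the FLDB0. through FLDB3. symbols behave as described in this chapter, and that FLDBK. turns on CM%BRK when a mask address is supplied; the labels WLDMSK and FDBWLD are hypothetical.

```macro
        SEARCH  MONSYM,MACSYM   ; FLDB0.-FLDB3., BRMSK. and FLDBK.

; Each mask word covers 32 characters: word 0 holds ASCII codes 0-37
; (octal), word 1 holds 40-77, word 2 holds 100-137, and word 3 holds
; 140-177.  "*" is ASCII code 52 (octal) = 42 (decimal), so its bit is
; in word 1 at bit position 42-32 = 10 (decimal).  Turning that bit off
; stops "*" from ending the field.

WLDMSK: BRMSK.(FLDB0.,FLDB1.,FLDB2.,FLDB3.,<*>,) ; .CMFLD mask with "*" allowed
FDBWLD: FLDBK.(.CMFLD,,,,,WLDMSK) ; arbitrary field parsed with that mask
```

The program would then give the address FDBWLD in AC2 on the COMND call for that field.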
Note that specifying a zero mask with CM%BRK set will cause the TTY
line buffer to fill up and generate an error.

On a successful return, the COMND call returns flag bits in the left
half of AC1 and preserves the address of the command state block in the
right half of AC1. These flag bits are copied from word .CMFLG in the
command state block and are described as follows.

5.5. Bits Returned on COMND Call

     Symbol (bit)    Meaning

     CM%ESC (0)      An ESC was typed by the user as the terminator
                     for this field.
     CM%NOP (1)      The field could not be parsed because it did not
                     conform to the specified function(s). An error
                     code is returned in AC2.
     CM%EOC (2)      The field was terminated with a carriage return.
     CM%RPT (3)      Characters already parsed need to be reparsed
                     because the user edited them. This bit does not
                     need to be examined if the program has supplied a
                     reparse dispatch address in the right half of
                     .CMFLG in the command state block.
     CM%SWT (4)      A switch field was terminated with a colon. This
                     bit is on if the user either used recognition on
                     a switch that ends with a colon or typed a colon
                     at the end of the switch.
     CM%PFE (5)      The previous field was terminated with an ESC.

When a field cannot be parsed, B1(CM%NOP) is set in AC1, and one of the
following error codes is returned in AC2. Note that if a list of
function descriptor blocks is given and an error code is returned, the
error is associated with the last function descriptor block in the
list.

     NPXAMB:  ambiguous
     NPXNSW:  not a switch - does not begin with slash
     NPXNOM:  does not match switch or keyword
     NPXNUL:  null switch or keyword given
     NPXINW:  invalid guide word
     NPXNC:   not confirmed
     NPXICN:  invalid character in number
     NPXIDT:  invalid device terminator
     NPXNQS:  not a quoted string - does not begin with double quote
     NPXNMT:  does not match token
     NPXNMD:  does not match directory or user name
     NPXCMA:  comma not given
     COMX18:  invalid character in node name
     COMX19:  too many characters in node name

5.6. Macros

Several macros (defined in MACSYM) are available to make using the
COMND JSYS more convenient. These macros are as follows:

5.6.1. FLDDB.(TYP,FLGS,DATA,HLPM,DEFM,LST)

where:
     TYP  = function type
     FLGS = function flags
     DATA = function-specific data
     HLPM = help message
     DEFM = default text
     LST  = additional invocations of the FLDDB. macro (used only if
            multiple function blocks are required)

This macro generates function descriptor blocks for COMND. For example,
the following code would perform a .CMINI function:

     MOVEI T1,STEBLK            ; Get address of COMND state block
     MOVEI T2,[FLDDB.(.CMINI)]  ; Get address of function block
     COMND

The following code would perform a .CMKEY function (assuming that the
keyword table started at address CMDTAB):

     MOVEI T1,STEBLK            ; Get address of COMND state block
     MOVEI T2,[FLDDB.(.CMKEY,,CMDTAB, ,)]
     COMND

5.6.2. FLDBK.(TYP,FLGS,DATA,HLPM,DEFM,BRKADR,LST)

This is exactly the same as FLDDB. except that a provision has been
made for the address of the first word of a 4-word character mask
(BRKADR). This version is for use when a user-specified character mask
is required.

5.6.3. BRMSK.(INI0,INI1,INI2,INI3,ALLOW,DISALLOW)

where:
     INI0     = first word of character mask
     INI1     = second word of character mask
     INI2     = third word of character mask
     INI3     = fourth word of character mask
     ALLOW    = characters to allow in the mask
     DISALLOW = characters to disallow in the mask

This macro generates 4-word character masks for use with those COMND
functions that allow the user to specify his own mask. For example, the
following code would allow "*" in the predefined mask for the .CMFLD
function (FLDB0. through FLDB3.):

     BRMSK.(FLDB0.,FLDB1.,FLDB2.,FLDB3.,<*>,)

5.6.4. FLDBK.

The BRMSK. macro may also be invoked within the FLDBK. macro:

     FLDBK.(TYP,FLGS,DATA,HLPM,DEFM,[
            BRMSK.(INI0,INI1,INI2,INI3,ALLOW,DISALLOW)],LST)

The COMND call causes other monitor calls to be executed, depending on
the particular function that is requested.
Failure of these calls usually results in the failure to parse the
requested field. In these cases, the relevant error code can be
obtained via the GETER and ERSTR monitor calls.

   - Any TBLUK error can occur on the keyword and switch functions.
   - Any NIN/NOUT and FLIN/FLOUT error can occur on the number
     functions.
   - Any GTJFN error except for GJFX37 can occur on the file
     specification functions.
   - Any IDTNC error can occur on the date/time function.
   - Any RCDIR or RCUSR error can occur on the directory and user
     functions.
   - Any STDEV error can occur on the device function.

5.7. Errors

Generates an illegal instruction interrupt on error conditions below.

COMND ERROR MNEMONICS:

     COMNX1:  invalid COMND function code
     COMNX2:  field too long for internal buffer
     COMNX3:  command too long for internal buffer
     COMNX5:  invalid string pointer argument
     COMNX8:  number base out of range 2-10
     COMNX9:  end of input file reached
     COMX10:  invalid default string
     COMX11:  invalid CMRTY pointer
     COMX12:  invalid CMBFP pointer
     COMX13:  invalid CMPTR pointer
     COMX14:  invalid CMABP pointer
     COMX15:  invalid default string pointer
     COMX16:  invalid help message pointer
     COMX17:  invalid byte pointer in function block

MACSYM Page 124

6. MACSYM System Macros

This chapter was written by Dan Murphy, Digital Equipment Corporation,
July 1976.

6.1. Introduction

MACSYM is a file of standard macro and symbol definitions for use with
TOPS20 machine language programs. Use of these definitions is
recommended as a means of producing more consistent and readable MACRO
sources. Some of the definitions were obtained from C.MAC; others will
be added if they are generally useful.

MACSYM is available on SYS: in two forms, MACSYM.UNV and MACREL.REL.
The first is the universal file of macro and symbol definitions; the
second is a file of small support routines used by certain of the
facilities (e.g., stack variables).
The universal file is normally obtained at assembly time by the source
statement

     SEARCH MACSYM

The object file, if necessary, may be obtained by the source statement

     .REQUIRE SYS:MACREL

This instructs LINK to load the object file along with the main
program. The file is loaded only once even if the .REQUIRE appears in
several source modules, and no explicit LINK command need be given.

Certain conventions are observed regarding the construction of symbols
as follows: ("x" represents any alphanumeric)

     xxxxx.    an opdef or macro definition
     .xxxxx    a constant value
     xx%xxx    a mask, i.e., a bit or bits which specify a field.

Symbols containing multiple periods may be used internally by some
macros. Symbols containing "$" are not used or defined by DEC and are
reserved for customer use.

6.2. Definitions

The following definitions are available in MACSYM and are arranged into
groups as shown.

6.2.1. Standard Program Version

This macro assembles the standard contents of .JBVER.

     PGVER. VERS,UPDAT,EDIT,CUST

where
     VERS  is the major version number
     UPDAT is the update or minor version number (1=A, 2=B, ...)
     EDIT  is the edit number
     CUST  is the customer/SWS edit code (1=SWS, 2-7=customer)

A word constructed from these quantities is assembled into absolute
location .JBVER (137); the current assembly location is restored.

6.2.2. Miscellaneous Constants (Symbols)

     .INFIN = 377777,,777777    ;plus infinity
     .MINFI = 400000,,0         ;minus infinity
     .LHALF = 777777,,0         ;left half
     .RHALF = 0,,777777         ;right half
     .FWORD = 777777,,777777    ;full word

6.2.3. Control Characters (Symbols)

Symbols are defined for all control character codes 0 to 37 and
175-177. The following are the commonly used characters; see source
listing for others.

     .CHBEL = 07     ;bell
     .CHBSP = 10     ;backspace
     .CHTAB = 11     ;tab
     .CHLFD = 12     ;linefeed
     .CHFFD = 14     ;formfeed
     .CHCRT = 15     ;carriage return
     .CHESC = 33     ;escape
     .CHDEL = 177    ;delete (rubout)

6.2.4. PC Flags (Mask Symbols)

     PC%OVF = 1B0    ;overflow
     PC%CYO = 1B1    ;carry 0
     PC%CY1 = 1B2    ;carry 1
     PC%FOV = 1B3    ;floating overflow
     PC%BIS = 1B4    ;first part done (byte increment suppress)
     PC%USR = 1B5    ;user mode
     PC%UIO = 1B6    ;user IO mode
     PC%LIP = 1B7    ;last instruction public
     PC%AFI = 1B9    ;address failure inhibit
     PC%ATN = 1B10   ;apr trap number
     PC%FUF = 1B11   ;floating underflow
     PC%NDV = 1B12   ;no divide

6.2.5. Macros to Manipulate Field Masks

Many of the symbols in MACSYM and MONSYM define flag bits and fields. A
field mask is a full-word value with a single contiguous group of 1's
in the field. E.g., 000000,,777000 defines a field consisting of bits
18-26. The following macros may be used in expressions to deal with
these masks.

6.2.5.1. WID(MASK)

Width - computes the width of the field defined by the mask, i.e., the
number of contiguous 1-bits. Value is not defined if mask contains
non-contiguous 1-bits.

6.2.5.2. POS(MASK)

Position - computes the position of the field defined by the mask. The
position of a field is always represented by the bit number of the
rightmost bit of the field regardless of the width of the field. This
is sufficient to specify the entire field in the case of flags (1-bit
fields).

6.2.5.3. POINTR(LOC,MASK)

Byte pointer - constructs a byte pointer to location LOC which
references the byte defined by MASK, e.g.,

     POINTR(100,77) = POINT 6,100,35 = 000600,,100

6.2.5.4. FLD(VAL,MASK)

Field value - Places the value VAL into the field defined by MASK,
e.g.,

     FLD(3,700) = 0,,000300

6.2.5.5. .RTJST(VAL,MASK)

Right-justify - Shift VAL right such that the field defined by MASK is
moved to the low-order bits of the word, e.g.,

     .RTJST(300,700) = 3

6.2.5.6. MASKB(LBIT,RBIT)

Mask - construct a mask word which defines a field from bit LBIT to bit
RBIT inclusive. E.g.,

     MASKB(18,26) = 0,,777000

6.2.6. Instructions Using Field Masks (Macros)

The following mnemonics are similar to certain machine instructions
used to move and test bits and fields. These macros select the most
efficient instruction for the mask being used.

6.2.6.1. MOVX AC,MASK

Load AC with constant. MASK may be any constant; this assembles one of
the following instructions: MOVEI, MOVSI, HRROI, HRLOI, or MOVE
literal.

6.2.6.2. TXmn AC,MASK

where m is: N, Z, O, C
      n is: E, N, A, null

There are 16 definitions of this form which include all of the
modification and testing combinations of the test instructions, i.e.,
TXNN, TXNE, TXO, TXON, etc. A TL, TR, or TD literal is assembled as
appropriate.

6.2.6.3. IORX AC,MASK; ANDX AC,MASK; XORX AC,MASK

These are equivalent to certain of the TX functions but are provided
for mnemonic value.

6.2.6.4. JXm AC,MASK,ADDRESS

This is a set of four definitions which jump to ADDRESS if the field
specified by MASK meets a certain condition. The condition (m) may be:

     E - jump if all masked bits are 0
     N - jump if not all masked bits are 0
     O - jump if all masked bits are 1
     F - jump if not all masked bits are 1 (false)

These macros will assemble into one, two, or three instructions as
necessary to effect the specified result, e.g.

     JXN T1,1B0,FOO = JUMPL T1,FOO

     JXE T1,770,FOO = TRNN T1,770
                      JRST FOO

6.2.7. Data Structure Facility (Macros)

This set of macros provides a comprehensive facility for the definition
and use of data structures. It is an extension of some of the
techniques represented by the field mask facilities above. Typically, a
data structure definition will include some information about the
location of the data in memory as well as its position within a word.
These facilities are intended to provide the following advantages:

   - Data items may be referenced more mnemonically, e.g., two data
     items in the same word would be given different names rather than
     merely being known as the left half or right half of the word.
   - Should the need arise, storage formats may be changed without
     incurring the expense of a search of the code to change each
     reference.

6.2.7.1. DEFSTR and MSKSTR

     DEFSTR NAME,LOCATION,POSITION,SIZE
     MSKSTR NAME,LOCATION,MASK

These macros both define a data structure called NAME. LOCATION
specifies the memory location of the desired word and consists of
address, index, and indirect fields in the usual form, i.e.,
@address(index). Any of the fields may be omitted if not needed, and
the entire location argument may be null in some circumstances. The
remaining arguments define the desired field. DEFSTR specifies the
field in terms of its position (right-most bit number) and size (number
of bits), while MSKSTR specifies the field by a full-word mask as
described earlier. Normally, the actual storage to be used is declared
separately, e.g., by a BLOCK statement.

As a simple example, consider an array of full-word data items. We wish
to use the name FOO for the data itself, so we declare the actual
storage by some other name, e.g.,

     FOO1:  BLOCK n

Then we declare the structure by

     DEFSTR FOO,FOO1(FOOX),35,36

This says that we declare a data item called FOO, that the items are
addressed by FOO1(FOOX) (assuming that the index is kept in register
FOOX), that the items are 36-bit quantities with the rightmost bit in
bit 35 (i.e., full words).

If instead, we wish to declare that each word of FOO1 consists of an
item in the left half and two 9-bit items in the right half, we could
write:

     DEFSTR FIRSTD,FOO1(FOOX),17,18    ; LH item.
     DEFSTR SECOND,FOO1(FOOX),26,9     ; One 9-bit item
     DEFSTR THIRDD,FOO1(FOOX),35,9     ; Another 9-bit item.

Data items defined with DEFSTR or MSKSTR may be referenced in a general
way. At each instance, additional location information may be given if
necessary. A set of reference functions (macros) is defined for most
common operations, some affecting AC and memory, others only memory.
For example, the LOAD function loads a data item into an AC and is
written as

     LOAD AC,NAME,LOC

where
     AC   is the AC to be loaded
     NAME is the structure name as defined with DEFSTR
     LOC  is location specification in addition to that declared in
          the structure definition. This field may be null in some
          cases.

Taking the example definitions above, we may write

     LOAD T1,FOO

which would assemble into

     MOVE T1,FOO1(FOOX)

or

     LOAD T1,SECOND = LDB T1,[POINT 9,FOO1(FOOX),26]
     LOAD T1,FIRSTD = HLRZ T1,FOO1(FOOX)

Note that the macro compiles the most efficient instruction available
to reference the specified field.

The optional third argument is provided to allow some of the location
information to be specified at each instance. For example, if the
definition is

     DEFSTR FOO,FOO1,35,36

Then the index may be specified at each instance, e.g.,

     LOAD T1,FOO,(XX)
     LOAD T2,FOO,(T1)

The specification given in the definition is concatenated with the
specification given in the reference.

The following reference functions are presently defined:

     LOAD AC,NAME,LOC     load data item into AC
     STOR AC,NAME,LOC     store data item from AC into memory

The data item is right justified in the AC.

     SETZRO NAME,LOC      set the data item to zero
     SETONE NAME,LOC      set the data item to all ones
     SETCMP NAME,LOC      complement the data item
     INCR NAME,LOC        increment the data item
     DECR NAME,LOC        decrement the data item

For functions not specifically provided, the following may be used:

     OPSTR OP,NAME,LOC
     OPSTRM OP,NAME,LOC

OP is any machine instruction written without an address field. It will
be assembled such as to reference the specified data structure. OPSTR
is used if memory is not modified, OPSTRM is used if memory is
modified. E.g.,

     OPSTRM <ADDM T1,>,FOO

to add the quantity in T1 to the data item FOO.
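The arithmetic behind the field-mask macros and the LOAD and STOR reference functions can be checked with a short model. The following Python sketch is only an illustration under stated assumptions: it uses PDP-10 bit numbering (bit 0 leftmost, bit 35 rightmost), the function names are hypothetical stand-ins for MASKB, POS, WID, FLD, .RTJST, LOAD and STOR, and it models only the values involved, not the instructions that the macros select.

```python
WORD_BITS = 36
FULL_WORD = (1 << WORD_BITS) - 1

def maskb(lbit, rbit):
    """Mask covering bits lbit..rbit inclusive (bit 0 is the leftmost)."""
    width = rbit - lbit + 1
    return ((1 << width) - 1) << (WORD_BITS - 1 - rbit)

def pos(mask):
    """Bit number of the rightmost 1-bit of the mask."""
    low = mask & -mask                  # isolate the lowest set bit
    return WORD_BITS - low.bit_length()

def wid(mask):
    """Number of 1-bits; meaningful only for a contiguous mask."""
    return bin(mask).count("1")

def fld(val, mask):
    """Place val into the field defined by mask."""
    return (val << (WORD_BITS - 1 - pos(mask))) & mask

def rtjst(val, mask):
    """Shift val right so the field defined by mask is right-justified."""
    return val >> (WORD_BITS - 1 - pos(mask))

def load(word, position, size):
    """Value of the field a DEFSTR with this position and size names."""
    mask = maskb(position - size + 1, position)
    return rtjst(word & mask, mask)

def stor(word, value, position, size):
    """Copy of word with the named field replaced by value."""
    mask = maskb(position - size + 1, position)
    return (word & ~mask & FULL_WORD) | fld(value, mask)
```

With these definitions, maskb(18, 26) gives octal 777000 and fld(3, 0o700) gives octal 300, matching the MASKB and FLD examples in the text, and load(word, 26, 9) extracts the field that the SECOND structure above names.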
The following test and transfer functions are presently defined:

     JE NAME,LOC,ADDR     jump to ADDR if data is 0
     JN NAME,LOC,ADDR     jump to ADDR if data is not 0

The following test and transfer functions take a list of structure
names (surrounded by angle-brackets) or a single structure name. They
compile code to test each data item in the order given, and will stop
as soon as the result of the function is known (e.g., AND encounters a
false term).

     JOR NAMLST,LOC,ADDR      jump to ADDR if any data item is true
                              (non-0)
     JAND NAMLST,LOC,ADDR     jump to ADDR if all data items true
                              (non-0)
     JNOR NAMLST,LOC,ADDR     jump to ADDR if all data items false (0)
     JNAND NAMLST,LOC,ADDR    jump to ADDR if any data item is false
                              (0)

These functions optimize multiple fields in the same word if they are
adjacent in the structure list. If the final location is an
accumulator, further optimization is done.

As a final example of the data structure facility, consider the typical
case of data organized into unit blocks with pointers to other blocks.
Such a block may appear as

     Flag 1 Flag 2          Code        List pointer
       !   !                 !               !
       V   V                 V               V
     +---+---+---------+---------+-------------------------+
     !   !   !/////////!         !                         !
     +---+---+---------+---------+-------------------------+
     !                additional node data                 !
     +-----------------------------------------------------+
     !                      ''''''''                       !

We assume that n-word blocks will be allocated from a free pool at
execution time. The structure of the block is declared as follows:

     MSKSTR FLAG1,0,1B0
     MSKSTR FLAG2,0,1B1
     DEFSTR CODE,0,17,9
     DEFSTR LINK,0,35,18
     DEFSTR NODDAT,1,35,36

Note that the location field contains only the offset address of the
word within the block; the address of the block will be specified in an
index at each reference. References would appear as follows:

     LOAD T1,LINK,(T1)                 ;step to next node in list
     STOR T2,CODE,(T1)                 ;set new block code
     JE FLAG1,(T1),FLOFF               ;jump if flag1 is off
     JAND <FLAG1,FLAG2>,(T1),FLGSON    ;jump if flag1 and
                                       ; flag2 are both on

6.2.8. Subroutine Conventions (Macros/opDefs)

The following definitions are used to make subroutine mechanics more
mnemonic. Reference is made to these conventions elsewhere in this
document.

6.2.8.1. CALL address

Call subroutine at address; equivalent to

     PUSHJ P,address

6.2.8.2. RET

Return from subroutine; equivalent to

     POPJ P,

6.2.8.3. RETSKP

Return from subroutine and skip; equivalent to

     JRST [AOS 0(P)
           RET]

6.2.8.4. CALLRET address

Call the subroutine at address and return immediately thereafter;
equivalent to

     CALL address
     RET
     RETSKP

CALLRET assembles as JRST but should be treated as if it assembles into
several instructions and cannot be skipped over.

6.2.8.5. AC Conventions

The facilities described here assume in some cases the following
accumulator naming conventions:

     AC1-AC4         temporary, may be used to pass and return values
     AC0,AC5-AC15    preserved, i.e., saved and restored if used by
                     subroutine
     AC16            temporary, used as scratch by some MACSYM
                     facilities
     AC17            stack pointer

6.2.9. Named Variable Facilities (Macros and Runtime Code)

A traditional deficiency of machine language coding environments is the
lack of facilities for named transient storage ("automatic", etc.).
Sometimes, permanent storage is assigned (e.g., by BLOCK statements)
when no recursion is expected. More often, ACs are used for a small
number of local variables. In this case, the previous contents must
usually be saved, and a general mnemonic (e.g., T1, A, X) is usually
used. In some cases, data on the stack is referenced, e.g.,

     MOVE T1,-2(P)

but this is completely non-mnemonic and likely to fail if additional
storage is added to or removed from the stack. The facilities described
here provide local named variable storage. Two of these allocate the
storage on the stack; the third allocates it in the ACs.

6.2.9.1. STKVAR namelist

This statement allocates space on the stack and assigns local names.
The list consists of one or more symbols separated by commas. Each
symbol is assigned to one stack word.
If more than one word is needed for a particular variable, then a size
parameter may be given enclosed with the symbol in angle-brackets.
E.g.,

     STKVAR
     STKVAR >

Variables declared in this way may be referenced as ordinary memory
operands, e.g.,

     MOVE T1,AA
     DPB T1,[POINT 6,BB,5]

Each variable is assembled as a negative offset from the current stack
location, e.g.,

     MOVE T1,AA = MOVE T1,-2(P)

Hence, no other index may be given in the address field. Indirection
may be used if desired.

There is no explicit limit to the scope of the variables defined by
STKVAR, but the following logical constraints must be observed:

  1. The stack pointer must not be changed within the logical scope of
     the variables, e.g., by PUSH or PUSHJ instructions. This also
     implies that the variables may not be referenced within a local
     subroutine called from the declaring routine.

  2. The declaring routine must return with a RET or RETSKP. This will
     cause the stack storage to be automatically deallocated.

STKVAR assumes that the stack pointer is in P, and it uses .A16 (AC16)
as a temporary.

6.2.9.2. TRVAR namelist

This statement allocates stack space and assigns local names. It is
equivalent to STKVAR except that it uses one additional preserved AC
and eliminates some of the scope restrictions of STKVAR. In particular,
it uses .FP (AC15) as a frame pointer. .FP is set up (and the previous
contents saved) at the same time as the stack space is allocated, and
references to the variables use .FP as the index rather than P. This
allows additional storage to be allocated on the stack and allows the
variables to be referenced from local subroutines. Note that all such
subroutines (i.e., all variable references) must appear after the
declaration in the source. STKVAR may be used within TRVAR, e.g., by a
local subroutine.

STKVAR and TRVAR declarations are normally placed at the beginning of a
routine. They need not be the first statement.
If a routine has two or more entry points, a single declaration may be
placed in the common path, or several identical declarations may be
used in each of the separate paths. Care must be taken that control
passes through exactly one declaration before any variables are
referenced. E.g.,

     ;MAIN ROUTINE

     ENT1:   TXO F,FLAG      ;entry 1, set flag
             JRST ENT0       ;join common code

     ENT2:   TXZ F,FLAG      ;entry 2, clear flag
     ENT0:   TRVAR           ;common code, declare locals
             ..
             CALL LSUBR      ;call local subroutine
             ..
             RET

     ;LOCAL SUBROUTINE

     LSUBR:  STKVAR          ;local subroutine, declare
                             ; locals
             MOVE T1,AA      ;reference outer routine
                             ; variable
             MOVEM T1,CC     ;reference local variable
             ..
             RETSKP          ;skip return

6.2.9.3. ASUBR namelist

This statement is used to declare formals for a subroutine. The
namelist consists of from one to four variable names. The arguments are
passed to the subroutine in ACs T1 to T4, and values may be returned in
these same ACs. ASUBR causes these four ACs to be stored on the stack
(regardless of how many formals are declared), and defines the variable
names as the corresponding stack locations. The return does not restore
T1-T4. The same frame pointer AC is used by ASUBR and TRVAR, hence
these declarations may not be used within the same routine. Scope rules
are the same as for TRVAR.

6.2.9.4. ACVAR namelist

This statement declares local storage which is allocated from the set
of preserved ACs. An optional size parameter may be given for each
variable. The previous contents of the ACs are saved on the stack and
automatically restored on the next return. Variables declared by ACVAR
may be referenced as ordinary AC operands.

6.2.10. Miscellaneous

6.2.10.1. TMSG string

Type literal string; uses AC1, outputs to primary output. E.g.,

     TMSG

6.2.10.2. JSERR

Handle unexpected JSYS error; type "?JSYS ERROR: message". This is a
single instruction subroutine call which returns +1 always.

6.2.10.3. JSHLT

Handle unexpected fatal JSYS error; same as JSERR except does HALTF
instead of returning.

6.2.10.4. MOD.(DEND,DSOR)

Modulo - In assembly-time expression, gives remainder of DEND divided
by DSOR; e.g., MOD.(10,3) = 1.

Columbia Page 136

7. Columbia Macros and Packages

The items described in this chapter are peculiar to Columbia's
DECSYSTEM-20. Programs that use these facilities are not transportable
to other DECSYSTEM-20's (except in their .EXE form) unless the
appropriate library files are taken, too.

7.1. Utility UUO Package for Macro-20

[ Programs and text by Chris Ryland, 1978. ]

Preliminary Specs

Note: all of these UUOs have a general restriction that must be
observed: no strings addressed as arguments may live in the ACs.
Further, none of the COMND functions may use FLDDBs that address
indirectly through ac's t1-t4, .fp or p.

7.1.1. Formatted Printing Package

     %print , <
             addr of arg1
             addr of arg2
             ...>

This expands into a call on the %uprint uuo, with argument

     [[point 7,[asciz/string/], addr of arg1, addr of arg2, ...]

If you understand that the arguments are just part of a literal, then
you can understand why they're in this format, and how to extend it;
e.g., you might also say

     %print ,

or call the %uprint uuo directly, if the format string is a variable,
e.g.,

     %uprint [exp fmstr, arg1-addr, arg2-addr]

The semantics of this beast are: the characters in the format string
are output sequentially, until an escape character `%' is seen; then,
an argument descriptor is eaten from the format string (see below for
the definition of the argument descriptor), and one or more arguments
are eaten from the argument list, and used for output, as directed by
the descriptor. The basic idea here is that each argument descriptor
item directs special output handling for a group of argument items
(usually one).
To make this discussion more concrete, here is an example of how this
macro might be used:

     %print <Here's a number: %d, and the time: %@n%/>, <
             [^d234]
             [ot%day!ot%fdy]
     >

What happens here is that "Here's a number: " is printed on the primary
output, and then the argument descriptor %d is processed, which slurps
up the next argument, [^d234] (remember, all arguments are actually
addresses of the object in question), and prints it as a decimal
number. Then, ", and the time: " is printed --nothing special here--,
and the argument descriptor %@n is hit; this descriptor, mnemonic for
`the time as of now', has a @ modifier (described in detail below)
which causes the print package to pick up the next argument from the
list and use it as the date/time format value (again, what's actually
there is a literal, since arguments are always addresses of the value
to be used). Finally, the arg descriptor `%/' is seen, which means
print a CRLF, and we're all done (because we hit the end of the asciz
format string).

Each argument descriptor is of the form `%<@>' followed by a descriptor
character. The `@' means pick up additional data to modify the action
of the descriptor character, from the argument list (this data is eaten
just like data that is output; see below). Then, the action denoted by
the descriptor character is taken, which results in one or more data
items being eaten from the argument list and output according to that
descriptor. These constant references to `eating' are to graphically
state that when an argument is used, it disappears from the argument
list. Thus, you can think of the argument list as being eaten one
argument at a time, from the top to the bottom (or left to right,
depending on how you coded it).

Note that this type of output differs from Fortran-style formatting, in
that it is format-driven, not argument-driven. E.g., in Fortran, each
list item (argument) is taken in turn, and the next format item
selected to be used as the output specification. In this package, just
the reverse is done; the format items cause argument-handling.
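The format-driven behaviour described above can be sketched as a toy model. The following Python code is not the UUO package and does not reproduce its behaviour in detail: the function name uprint_model is hypothetical, only the descriptors %%, %/, %d and %o are handled, and the @ modifier is treated as a minimum field width purely for illustration (in the package itself the modifier value is handed to the underlying JSYS as a format word).

```python
def uprint_model(fmt, args):
    """Toy model of format-driven output.

    The format string drives the loop; each descriptor eats arguments
    from the front of the list. Returns the output text and whatever
    arguments were left uneaten.
    """
    pending = list(args)                # arguments not yet eaten
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        i += 1
        if ch != "%":
            out.append(ch)              # ordinary characters pass through
            continue
        modifier = None
        if i < len(fmt) and fmt[i] == "@":
            modifier = pending.pop(0)   # '@' eats one argument first
            i += 1
        if i >= len(fmt):
            raise ValueError("format string ends inside a descriptor")
        item = fmt[i]
        i += 1
        if item == "%":
            out.append("%")
        elif item == "/":
            out.append("\r\n")          # carriage return, line feed
        elif item in "do":
            value = pending.pop(0)      # the descriptor eats its argument
            text = format(value, "d" if item == "d" else "o")
            # Assumption for this sketch only: the modifier is a field width.
            out.append(text.rjust(modifier or 0))
        else:
            raise ValueError("descriptor not modelled: %" + item)
    return "".join(out), pending
```

Because the format string is in control, an argument list longer than the format requires is simply left partly uneaten, which is why the sketch returns the remaining arguments alongside the text.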
If any Jsys errors occur during printing, then if the %print is
followed by an erjmp or ercal, the jump or call is taken, just as in a
jsys invocation. Otherwise, a fatal error occurs. The equivalent, but
skipping, UUO is %prSkp; it returns +2 on success (or +3 if it has an
erjmp or ercal after it).

The various argument descriptors, also known as format items, are:

     %%   print a `%'
     %!   ignore all following characters until a `!' is seen, at
          which point formatting resumes normally. This is designed to
          allow formats to nicely cross line boundaries.
     %{   print a <
     %}   print a > (these last two are for non-paired <>)
     %/   print a Carriage-Return/Line-feed pair
     %=   use the argument as a destination designator for the
          remainder of the output for this %print call; note that any
          JFN top-of-stack is then ignored for the rest of the %print
          (and is NOT updated after the %print is done).
     %_   print a Horizontal Tab
     %^   print a Form-Feed (^L)
     %c   print the name of the Connected directory, with punctuation
          (str:<...>); use any @ modifier value as a directory number,
          and print its name instead
     %d   print a Decimal number; use any @ modifier as the NOUT
          format. If no radix is given in the @ modifier case, decimal
          radix is used.
     %e   with no modifier, print the last error message encountered
          by this process; with a modifier, use the argument value as
          an error number to print symbolically.
     %?   do error synchronization: clear terminal input and wait for
          terminal output to drain, and print a newline, followed by a
          "?". Rest of this %print will go to the physical terminal
          device.
     %f   print the Floating (single-precision) value of the argument;
          use any @ modifier as the FLOUT format
     %h   print the ascii cHaracter value of the argument
     %i   like %d, but print +Inf if negative (for printing positive
          numbers)
     %j   print the name of the file as given by the Jfn argument; use
          any @ modifier value as the JFNS format
     %n   print the date and time of Now; use any @ modifier value as
          the ODTIM format
     %o   print the argument as an (unsigned) Octal number; use any @
          modifier value as the NOUT format. If no radix is given in
          the @ modifier case, octal radix is used.
     %s   print the argument as an asciz String (of byte size as given
          by the argument's byte size); -1 in the left half of the
          argument means treat it as 7-bit asciz (NB: the argument is
          really the address of a byte pointer, not a byte pointer
          itself; see examples below)
     %t   print the date and Time as given by the argument; use any @
          modifier value as the ODTIM format
     %u   print the user's login ID, with no punctuation; with a @
          modifier, print the user name of the given user number
     %v   print the deVice name for the designator given as argument
     %x   print the argument as a siX-bit value.

7.1.2. %prPush and %prPop

Warning: this facility is not implemented yet!

These two instructions push and pop the %print UUO's output JFN stack,
respectively. The top-of-stack entry (or .priou if the stack is empty)
is used as the destination designator, and is updated after each %print
(unless a %= or %? is used in the format string, see above), so that a
byte pointer may be effectively used as the output destination. %prPush
takes an argument which is an address of the output designator (a JFN
or a byte pointer). %prPop simply pops off the top of stack and
discards it. (Note that you can't have indexed or indirected byte
pointers, as this is doable but infinitely hairy for the UUO package.
Also note that since the destination designator itself is updated after
each %print, you can't use a literal (e.g. a byte pointer) for the
argument to %prPush, unless you don't mind modifying pure data (which
you should!).)

Some examples are:

     sPtr:   point 7, buffer     ; Byte pointer to a memory buffer.
              :
             %prPush sPtr        ; Use buffer as general
              :                  ;  output area.
             %print              ; Output to it.
              :                  ;  (sPtr now points at last char).
             %prPop              ; Get rid of this destination
              :                  ;  now that we're done.

7.1.3. COMND-Jsys-Made-Easy Package

This set of macros implements an easy access route to the COMND Jsys;
familiarity with COMND and its functions IS required, though - we're
only trying to ease the pain of using it, not learning about it.

Following are the macros used to invoke the different functions of the
COMND Jsys as well as ancillary tasks such as setting up various
control blocks, getting information out of some of the data structures
deliberately hidden to ease use of COMND, etc.

The naming conventions used are designed to follow as much as possible
the names of each of the COMND functions, with slightly varied
punctuation that corresponds to CUsym conventions. E.g., the .cmkey
function of COMND is invoked with the %cmkey macro; induction should
get you the rest.

Some philosophy about error-handling: any COMND function can fail in
two ways (actually, three, but the third is merged into the first to
make your job all that much easier): the function can't be parsed with
the given input, or the user deletes input back into an already-parsed
field. We treat these two `errors' uniformly: each invocation of a
function either returns normally if no error occurs, or takes an erjmp
or ercal path (if provided) if an error is found. Thus, a simple,
uniform method of handling parse errors and reparse conditions is
provided: each subroutine of the main parse routine can always return
non-skip on any error (real or reparse), or skip on success.
The main parse routine can worry about whether a real error occurred, and print an appropriate message, restarting the parse from scratch, or whether just a reparse is needed, restarting from the first parse step.

When an error occurs, t1 contains the flags from COMND (note in the success case that t1 isn't modified), and t1 is not set. If a real COMND Jsys error occurs (i.e., not a user parse error, but something like a COMND internal buffer overflow), the parse error bit will be set, and the error return will happen as usual; this is done to make COMND errors be treated uniformly.

Note that each COMND function, if it succeeds, returns a value in t2 (even if it doesn't return any useful result). Also, if any COMND function is invoked with more than one FDB (i.e., alternate FDBs are used), then upon return t3 contains what it normally does after the COMND Jsys (q.v.): the FDB actually used, and the first FDB given.

%cmini (prompt, flags, iojfn, gjfblk)

This macro prepares everything for a command parse. All the arguments are optional; their use is:

prompt  a text string used as the prompt; e.g., <>. If not supplied, the prompt `>' is used. Note that if a `>' is required on the end of this prompt, then you must use the form <>. Blame macro.
flags   the flags destined for the (left half of the) CSB .cmflg word, such as raise all input, wake on every field, etc.
iojfn   the pair for the parse.
gjfblk  the address of the GTJFN argument block, used in the .cmifi, .cmofi, .cmfil functions (and it must be supplied if you plan to use these functions). Its length must be at least .gjln (defined in CUsym).

This function will only fail if some horrible mistake has been made, usually by the COMND package, so expect success. Note that this UUO only does a .CMini COMND Jsys on the second and subsequent invocations, until a %cmres UUO is done, at which point it will re-initialize everything (see the %cmres UUO description below for an explanation).

7.1.3.1.
%cmRes

This UUO, automatically done by %setUp at normal program startup, resets all COMND parsing information, so that the next %cmini UUO will cause a full setup of the Command State Block (set up the prompt, reset all the buffer pointers, etc.). It should be invoked whenever you intend to start a whole new section of parsing (e.g., changing the prompt or the set of commands, such as subcommand mode). In the simplest case, you never have to worry about it, as it's automatically done at startup. In the most usual case where it would be needed, a subroutine called to do some subsidiary parsing (e.g., the GetOK routine in mac:), the safest approach is to do a %cmres before the subroutine's %cmini, and a %cmres after finishing its parsing job. This guarantees that the caller won't get hurt.

7.1.3.2. %cmKey (keytab, help, default, flags)

This macro invokes the .cmkey COMND function; upon success, t2 contains the address of the table entry where the parsed keyword was found. All but the first argument are optional.

keytab  address of a TBLUK keyword table
help    a literal string (usually enclosed in <> if it contains anything other than alphanumerics and spaces) that will be used as the help message. If not given, the default help message is used.
default a literal string that will be used as the default keyword if none is supplied; if not given, no default is possible.
flags   flags, such as suppress default help, etc.

Note that these arguments are just used to build a function descriptor block, and thus the usual things happen in their absence or presence. Many of the other macros use the same structure for their arguments, and the same comments apply as here, so they will usually be elided.

If you need to use a hand-crafted function descriptor block, you can call the COMND package UUO directly, with the address of the FDB, as in %comnd [flddb. .cmkey,...].
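To picture what happens when %cmKey is handed a keyword table, the following Python sketch models the matching step only. It is not DEC-20 code and it is not the COMND package's implementation. It assumes that matching ignores case and that a unique abbreviation is accepted; the function name lookup_keyword, the table contents, and the NOTIFY entry are all hypothetical, chosen for illustration.

```python
# Model only: a real keyword table is a block of words in memory,
# not a Python dict, and the real lookup is done by the monitor.

def lookup_keyword(table, typed):
    """Classify `typed` against the upper-case keywords in `table`.

    Returns (status, keyword), where status is one of
    "exact", "abbreviation", "ambiguous" or "no match".
    """
    key = typed.upper()                  # assumption: case is ignored
    if key in table:
        return ("exact", key)
    # Every keyword that begins with what the user typed.
    matches = sorted(k for k in table if k.startswith(key))
    if len(matches) == 1:
        return ("abbreviation", matches[0])
    if matches:
        return ("ambiguous", None)
    return ("no match", None)


# Keyword -> associated data (here just a routine name as a string).
# NOTIFY is a hypothetical entry added so that "no" is ambiguous.
COMMANDS = {
    "MUMBLE": "domum",
    "NOODLE": "donood",
    "NOTIFY": "donoti",
    "ZORK": "dungeo",
}

if __name__ == "__main__":
    for typed in ("zork", "mu", "no", "frob"):
        print(typed, lookup_keyword(COMMANDS, typed))
```

In this model the "ambiguous" and "no match" outcomes correspond to the cases where the real macro would take its error path, and the two successful outcomes correspond to a normal return with the table entry identified.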
Rather than give each COMND function as above, we will let you induce on the `base step' above and assume that the remaining functions are invoked similarly. What follows are those functions that do not map directly to a COMND function.

7.1.3.3. %cmgab bp

This function asks the COMND interface package to get the current contents of the atom buffer into the string pointed to by the byte pointer given as argument; this cannot fail. Note that the atom buffer can be quite long (how long depends on the current implementation of the COMND package, but a reasonable size would be 100 words), so be wary of extremely long atoms. The byte pointer is updated. Note that `bp' is the address of where the byte pointer can be found; i.e., the effective address of this UUO is the address of the byte pointer. BEWARE: if you supply the argument as a literal, the byte pointer gets updated in the literal pool; if you use the same literal later, it will not be pointing where you think!

7.1.3.4. %comnd flddb

This is the general COMND function interface, for doing things not directly supported by this package. `flddb' is the address of a function descriptor block. It returns the data as the COMND Jsys does, except that t1 is not used to return the parse flags (see %cmgfg below, if you want to do this).

7.1.3.5. %cmgfg flag

This function gets the flags from the .cmflg word of the Command State Block into the word addressed by `flag'; e.g., to get the parse flags into t1, a %cmgfg t1 will do just fine.

Some notes about using these COMND support macros in a structured fashion:

- The basic idea, as hinted at in the description above, is that each COMND function either returns successfully if the parse succeeded (including no reparse needed), or takes an erjmp/ercal path if one is provided.
  Thus, each subroutine that is doing some `piece' of the parsing can, at each step in its job, simply return non-skip on error, or go on if each parse step succeeds, returning skip when it finally finishes successfully. Only the top-level parse routine has to worry about whether a reparse is needed, or an actual parse error occurred; in the former case, the routine only needs to start the parse over (without re-initializing); in the latter, an error message can be issued, and the parse started over from scratch.

- There are several macros to help with this philosophy of parsing:

  %pret
        To be used in a parse subroutine after each COMND function; it just returns non-skip if a parse error occurs.

  %errep errlab, replab
        To be used in a situation (either top-level or a subroutine) where an error or reparse must be handled specially. Mostly useful in the top-level parse routine.

  %merrep errlab, replab
        Mostly like %errep, but it prints a parse error message before going to errlab; this is useful in the top-level parse routine.

Note that an erjmp or ercal after a COMND function invocation is usually sufficient for handling most errors; e.g., after a %cmfil invocation in a parse subroutine, if any errors occur later in the same routine, an erjmp to a cleanup segment (that releases the jfn gotten by the %cmfil) is quite sufficient for handling both noparse and reparse errors.

7.2. CUrel Utility Subroutines

CUrel.rel is an indexed library that can be searched for the handlers for the Columbia UUOs for COMND Jsys calls and formatted printing (described in CUUOs.doc) and for the routines described below.

7.2.1. Helper

Types the desired help file at the job's controlling terminal. Actually, it will type any 7-bit ASCII file, but the error messages all refer to help files.

Input:   t2/ 7-bit byte pointer to ASCIZ filespec.

Effects: If the specified file is found and accessible then it is typed, otherwise an appropriate error message is typed.
Returns +1 always.

Calling sequence:

        search CUsym
        extern helper
          :
        %setup
          :
        move t2, [point 7, [asciz\HLP:FILE.HLP\]]
        call helper
          :

- F. da Cruz, CUCCA, 1978

7.2.2. GetOK

Get affirmative or negative response to a question, using the COMND Jsys (help is given on '?', recognition on ESC). A default answer can be specified, which will be supplied automatically if the user types carriage return alone in response to the question.

Input:   t1/ 7-bit byte pointer to ASCIZ string posing the question.
         t2/ zero -- no default answer.
             positive (nonzero) -- default answer is "yes".
             negative -- default is "no".

Returns: +1 if response was negative.
         +2 if response was affirmative.

Caution: Don't call this routine while processing another COMND Jsys (i.e. after .CMINI but before .CMCFM).

Example:

        move t1, [point 7, [asciz\Really delete all your files? \]]
        move t2, [-1]           ; default is 'no'.
        call getok
         jrst dont              ; answer was no, don't do it.
        ; code here will be executed if answer was 'yes'.

- F. da Cruz, C. Ryland, CUCCA, 1978

7.2.3. Gfcpg

This routine will allocate a page to the user. This is useful, for instance, when PMAPping is to be done to a single page in memory.

Returns: +1: error, no free core pages;
         +2: success, page number in t1.

Example:

        call %gfcpg
         %ermsg ,nopage         ; do this on +1 return
        ; come here when a page has been successfully allocated

- George Lotridge (DEC), CUCCA, 1977

7.2.4. pagMgr

A page management facility. Gives greater functionality than gfcpg at a slightly greater cost in overhead. Allows allocation and deallocation of consecutive blocks of pages. Keeps an internal 'own' page table for page management.

Call with:

Function codes in t1:

  0: Get pages, searching from 770 -> 0.
  1: Get pages, searching from 0 -> 770.
  2: Free pages.
  3: Initialize the pages-in-use vector. (This is done automatically the first time this routine is called, before the selected function is executed.)
Arguments in t2:

  For get-page functions: Left half contains number of consecutive pages to get, right half contains starting page number to search from.

  For free-page function: Left half contains number of consecutive pages to free, right half contains starting page number to free from.

  (In these functions, if the number of pages given is 0, it will be treated as 1.)

  For reinitialize function: t2 is not examined.

Returns:

  +1: Not enough pages available; t2 contains maximum number available of type requested. If there was an error on initialization (e.g. invalid argument), a message is typed at the terminal, and t1 is set to -1.

  +2: t1/ Address of block of pages in right half, and page number in the left.
      t2/ page count.

- Joel Rosenblatt, CUCCA, 1978.

7.2.5. Subbp

Subroutine to subtract two byte pointers, i.e. to tell the number of bytes between the byte pointed to by the first one and the byte pointed to by the second one. The two byte pointers must point to bytes of the same size. Indirection (@) and indexing are handled properly.

Call with:

  t1/ First byte pointer.
  t2/ Second byte pointer.

Returns:

  +1 if the byte sizes are different, with t1-t3 unchanged, or else
  +2 with:
      t1/ Unchanged.
      t2/ Unchanged.
      t3/ The number of bytes of the specified bytesize in the string pointed to by the first byte pointer (in t1) up to, but not including, the byte pointed to by the second byte pointer (in t2).

Example:

; assume a SIN has just been done to get a string into location
; 'buffer'.  SIN returns the updated byte pointer in t2.  This
; call to Subbp will tell how many characters were in the string.

        move t1, [point 7, buffer]      ; Point to beginning of buffer.
        call subbp                      ; (t2 already has the pointer to the end).
         %ermsg ,error                  ; Do this on error.
        movem t3, count                 ; Save the byte count.

- F. da Cruz, CUCCA, November 1977

7.2.6. Rescan

Allow arguments to be passed to programs via the Exec command line. Look in the rescan buffer for the name of the calling program followed by any arguments.
If the first field found in the rescan buffer is not the same as the program name, or if the program name matches but there are no arguments after it then this routine returns +2 with no other effect. Otherwise it returns +1 to indicate that special handling (usually the setting of a flag) can be done.

Enter with: t1/ Byte pointer to asciz program name.

Returns: +1: If arguments found in rescan buffer, with updated pointer in t1.
         +2: otherwise.

Example:

        extern Rescan
        %trnOff rscFlg                          ; Assume no rescan args.
        move t1, [point 7, [asciz/foo/]]        ; Name of this program.
        call rescan                             ; Rescan args on command line?
         %trnOn rscFlg                          ; Yes, turn on the flag.

If arguments were detected then subsequent requests for tty input will be satisfied by the data in the rescan buffer until the rescan buffer is exhausted or the program issues a .CMINI (or otherwise clears the input buffer), at which time tty input will automatically revert to the physical tty. The caller should skip over the first .CMINI after return from this routine, which has already issued a .CMINI.

The contents of the rescan buffer are discarded if the first field found on rescan does not match the program name passed in t1.

- Jeff Langer, CUCCA, April 1979

7.3. CUsym MACSYM Augmentation Macros

The CUsym macros and documentation were written by George L. Lotridge of Digital Equipment Corporation (while he was assigned to Columbia University as a resident software specialist) and Chris Ryland of Columbia.

CUsym contains a whole set of symbol and macro definitions to augment MONSYM and MACSYM. Included are the standard register definitions, macros for interfacing to the UUO package (which supports standard I/O, simple uses of the COMND Jsys, etc.), and generally any macro which has been found to be useful and which is missing from MONSYM and MACSYM (a working knowledge of which is assumed).
NOTE: you should have a good feel for the contents of the MACSYM document (6) and the Macro coding standards document (8) before using this package.

A word about naming conventions: all names in this module are of the form %symbol; this will hopefully sidestep any name conflicts with a SEARCHing program. DEC has reserved names with % and . in them, but their use of % is restricted to other than the first character, so we're safe. (Actually, a few of our Useful Symbols, below, use a "." as their first character, which is also DEC-reserved, but they're simple and few enough to cause no problems.) Also, a few of the 'hidden' symbols used herein (e.g., the stack, or global symbols in the support package) begin with "%%".

7.3.1. Accumulator Support

Accumulator (register) definitions (these conform to the DEC coding standard). These must be used exclusively, unless specifically redefined at the start of a module with the %DefAC macro (see below).

p=:17           ; Stack pointer
cx=:16          ; Call/Return temporary
.sac=:16        ; CU/MacSym utility reg
f=:0            ; Flag register (preserved)
t1=:1           ; General temp and Jsys registers:
t2=:2           ;  never preserved
t3=:3           ;  ...
t4=:4           ;
q1=:5           ; First set of preserved regs
q2=:6           ;  (must be preserved by callee
q3=:7           ;  across a call)
p1=:10          ; Second set of preserved regs
p2=:11          ;  (ditto)
p3=:12          ;
p4=:13          ;
p5=:14          ;
p6=:15          ; NB: not usable with TrVar MacSym facility
.fp=:15         ; Frame pointer for TrVar facility

7.3.2. %DefAC

Define an alternate name for one of the registers; this macro should be used whenever registers are re-defined, and the new definition should be made in terms of one of the definitions above. This macro purges the old name, thus preventing multiple names for one register.

        define %defac(new, old)

7.3.3.
Useful Symbols

.prjfn=<.priin,, .priou>        ; Symbol for usual primary JFN pair
.null==0                        ; General nothing value
.nil==0                         ; General nothing pointer
.True==1                        ; General boolean truth value
.False==0                       ;  and its complement

; Lengths of various Jsys control blocks; omitted from MONSYM, sadly.

.acln           ; Length of ACCES arg block
.ckln           ; Length of CHKAC arg block
.cmln           ; Length of Command State Block
.cmfln          ; Length of Function Descriptor Block
.cdln           ; Length of CRDIR arg block
.cjln           ; Length of CRJOB arg block
.jiln           ; Length of GETJI arg block
.gjln           ; Length of (long form) GTJFN arg block
.ipln           ; Length of ipcf packet descriptor block
.rsln           ; Length of SFTAD arg block
.rdln           ; Length of TEXTI arg block

7.3.4. UUO Package OPDEFs and Interface Symbols

%uprin          ; Print UUO
%comnd          ; COMND interface UUO
%ucmin          ; COMND initializer UUO
%cmgfg          ; Get COMND flags UUO
%cmgab          ; Get COMND atom buffer UUO
%nuuo           ; Add-new-UUO UUO
%cmres          ; COMND reset UUO
%prPush         ; JFN-stack push (%print package) UUO
%prPop          ; JFN-stack pop (ditto) UUO

; length of COMND interface UUO atom buffer:
%atmbl==^d250/5+1       ; in words

extern %csb             ; command state block, for you hackers

7.3.5. Setup Environment Macros

7.3.5.1. %setEnv

%SetEnv is a macro that must be used as the first thing after your Title and Search CUsym statements; it sets up the CUsym, MonSym, and MacSym environments properly. Its use is envisioned as:

        title Baz - tweak the frob's runtime
        search CUsym            ; (No MAC: needed!)
        %setenv                 ; Set up our environment
        ...

7.3.5.2. %setUp

%SetUp is a macro that you should use as the first executable action in your program; it Resets the execution environment, sets up a stack, clears the flag register F, sets up for UUO calls, sets up for COMND parsing (resetting) (and starts off your code in %Pure mode).

7.3.6. Storage Declaration Macros

%Pure, %Impure, %Routine

Use these macros to declare what sort of storage follows them: either %pure code (or read-only data) or %impure data (read-write).
Thus, before beginning a new logical section of code or data, always use one of them to declare what follows (if you don't use them, you may be surprised!). It goes without saying that %Pure should precede all code (which is NEVER impure). Using this pair of macros is a good way of keeping impure data, which belongs to a routine, physically (on the written page, that is) together with the routine. The alias %routine for %pure exists for purely mnemonic purposes; its use is suggested, as in:

        %routine
openit: stkvar >

7.3.7. General-Purpose Macros

7.3.7.1. %Stack

This macro creates the stack area, and loads P with the stack pointer. Its argument, the stack height, is optional, and defaults reasonably.

7.3.7.2. %Version

This macro builds a standard DEC version word from its arguments. In order, its arguments are the major version, the edit number, the minor version, and the customer version. Omitted fields default to zero.

7.3.7.3. %Clear

This macro takes three args, two of which are optional. The first, non-optional, argument is the starting address of the area to be cleared. The next is the number of locations to clear, which defaults to one. The last argument is the desired filler, which defaults to zero.

7.3.8. Macros Used for Common Primary I/O

Note that since these macros use %print, any string argument shouldn't include `%'s, or things may get very confusing. Also note that %typnum's first argument, the address of the number, must conform to the %print argument standard (q.v.).

7.3.8.1. %typeCR(string)

Types the given literal string at your terminal, followed by a carriage return (CR). Example:

        %typeCR

7.3.8.2. %crType(string)

Like %typeCR, but types the CR before, instead of after, the string.

To type a literal string with no CR, either before or after, use the MACSYM macro TMSG.

7.3.8.3. %typNum(num,cols,rdx)

Types the number in location 'num' at your terminal. The two trailing arguments are optional.
cols    Field width in which to print the number. Default is 0, i.e. use only as many character positions as are necessary to type the number.
rdx     Radix in which to type the number. Default is 10 (decimal).

7.3.8.4. %crlf

Types a carriage-return/linefeed sequence at your terminal.

7.3.8.5. %tab

Types a horizontal tab at your terminal.

7.3.9. JSYS Support Macros

7.3.9.1. %jsErr

Macro to be used after a Jsys that either returns +1 on error or always returns +1 (i.e., all but two of the Jsysi); %JSerr types the user's error message (if given) or the Jsys error that caused it to be invoked (if no message is given); it then either halts (if no address is supplied) or goes to the address (if given).

Note: this is similar to the MacSym macro JSERR, but it only works after a Jsys, since it is invoked by an erjmp; %JSerr has the advantage, though, that it ALWAYS works after a Jsys, which JSERR doesn't. Note also that both %Jserr and %ErMsg, if they halt and are continued, will simply return to the point after the invocation of the %Jserr or %ErMsg.

Both of these macros, since they use %print, can produce customized error messages (e.g., %jsErr ,, will print an error-synchronized message, with a string argument and the monitor error message in brackets, and then halt).

7.3.9.2. %erMsg

This macro, which, in contrast to %JSerr, is designed to be used in a non-Jsys skip context, will print a message (if given) or the last fork error (if not given); finally, it either jumps to an address (if given), or halts (if not given). See %Jserr comments for more info.

7.3.10. Local Label Support Macros

The intent of this set of macros is to provide a facility usually available in good assemblers (hint, hint): local labels.
The idea, due to Knuth, is that instead of agonizing over choosing a label for each little local motion within some code, you simply plant one of nine local labels, of the form %N, and refer to the next local label %N by %NF, and the previous local label %N by %NB. A simple example of all this is:

some:   stkvar
        txne t1,gj%old          ; Does he want an old file?
         jrst %1f               ; Yes, go handle it
        txz t1,gj%fou           ; No, reset this
        setom from              ;  and set 'from' flag
%1      call foo                ; Continue with processing
         jrst %1b               ; Failed: try it again
        ...                     ; etc etc

These macros (internally) use symbols of the form %n% and %n%m, where n ranges from 1 to 9, and m from 0 to 777, so be wary.

7.3.10.1. %Cat(a,b)

Useful macro that just returns its two arguments (as text strings), concatenated.

7.3.11. COMND JSYS Support Macros

7.3.11.1. %Ptr(string)

Build a standard 7-bit ASCIZ pointer to a literal string.

7.3.11.2. %table and %tbEnd

%table is used to start a keyword table definition; %tbEnd ends a keyword table definition. Suggested use is as in the following example, which also illustrates %key.

cmtb:   %table                          ; Keywords for frotz program
        %key Mumble,domum,cm%inv        ; mumble command (invisible)
        %key Noodle,donood              ; noodle command
        %key Zork,dungeo                ; invoke dungeon command
        %tbend                          ; End of this keyword table

7.3.11.3. %key(name, data, flags)

This macro takes three arguments: an (alphanumerics only!) name, the data to be associated with the name, and an (optional) flag value. It creates either a flag-less keyword (the normal case), or, if flags are given, a keyword with flags in the first word (and cm%fw set). Thus, the result is a TBLUK table entry, suitable for use by the .CMkey COMND Jsys function. Note that all %Key words in a table must be bracketed by %table and %TbEnd macros (see above).

7.3.11.4.
%Flddb (typ, flgs, data, hlpm, defm, lst)

This macro is useful for building function descriptor blocks that don't contain just literal strings for the help and default components; otherwise, it's the same as the MONSYM flddb. macro.

7.3.11.5. %Handlr(p,e), %PrsAdr, %EvlAdr

Macros to support structured parse/evaluation; %Handlr builds a structure comprised of the parse routine address (p) and evaluation routine address (e), for a given keyword (it should be in a literal in the %Key macro); %prsAdr and %evlAdr are the DEFSTR structures for accessing these two elements of a structure, respectively. An example of all this:

          :
        %cmkey comtab,                  ; Get a top-level keyword
        %merrep restart, repars         ; Usual error handling
        hrrz t2, (t2)                   ; Pick up data value from keyword
        load t3, %evladr, (t2)          ; Get evaluation routine (in t3, so
                                        ;  the structure address in t2 survives)
        movem t3, evaler                ; Save its address
        load t2, %prsadr, (t2)          ; Now, get parse routine
        call (t2)                       ; And call it
         %jmerrep restart, repars, restart ; Handle errors
          :                             ; Continue

; Pure data for main parse

        %table                                  ; Main command table
        %key bletch, [%handlr(bletcm, doblet)]  ; Bletch mode
        %key mumble, [%handlr(mumbcm, domumbl)] ; Mumble mode
        %tbend
          :

7.3.11.6. %CMxxx Macros to Invoke .CMxxx COMND Functions

See the CUUOS document (7.1) for information about using these. They are listed here for convenience:

- %cmIni (prompt, flags, ioJfn, gjfBlk): Initialize parse.
- %cmKey (keyTab, help, defalt, flags): Parse a keyword.
- %cmNum (radx, help, defalt, flags): Parse a number.
- %cmNoi (guide-word): Parse guide words.
- %cmSwi (swTab, help, defalt, flags): Parse a switch.
- %cmIfi (help, defalt, flags): Parse an input filespec.
- %cmOfi (help, defalt, flags): Parse an output filespec.
- %cmFil (help, defalt, flags): Parse a general filespec.
- %cmFld (help, defalt, flags): Parse a "field".
- %cmCfm (help, flags): Get confirmation (CR).
- %cmDir (data, help, defalt, flags): Parse a directory name.
- %cmUsr (help, defalt, flags): Parse a user name.
- %cmCma (help, flags): Parse a comma.
- %cmFlt (help, defalt, flags): Parse a floating-point number.
- %cmDev (help, defalt, flags): Parse a device name.
- %cmTxt (help, defalt, flags): Parse a text string.
- %cmTad (tadBlk, help, defalt, flags): Parse time and date. Note that the Time-and-Date flags belong to the first argument, since they're part of the data to the function.
- %cmQst (help, defalt, flags): Parse a quoted string.
- %cmUqs (brkTab, help, defalt, flags): Parse an unquoted string.
- %cmTok (token, help, defalt, flags): Parse a token. The token as given should be a string in double quotes, as in %cmtok "*"; if you need some other form, use the %comnd UUO bare.
- %cmNux (radix, help, defalt, flags): Parse a number.
- %cmAct (help, defalt, flags): Parse an account string.
- %cmNod (help, defalt, flags): Parse a network node name.

7.3.12. Macros to Handle COMND Errors

7.3.12.1. %pret

For use in secondary parsing subroutines. Handle a parse error or reparse by just returning non-skip.

7.3.12.2. %errep errlab, replab

For use in top-level command parser. Handle an error by going to errlab, and a reparse by going to replab.

7.3.12.3. %merrep errlab, replab

For use in top-level command parser. Handle an error by giving an error message and going to errlab, and a reparse by going to replab.

7.3.12.4. Macros for Fail-Return from Parsing Routines

These macros help with error- and reparse-handling after a parse subroutine call (which is expected to return skip on success, and non-skip on failure); three sorts of errors can be expected from such subroutines: parse error, reparse needed, other type of failure (usually a semantic problem). Thus, these macros have three dispatch addresses, corresponding to these three errors. Note that the method used here assumes that if neither the parse error nor the reparse flag is set in the command state block, then the error is of type `other'.

7.3.12.5.
%jerrep errlab, replab, othrlb

Handle a skip-return error condition as described above.

7.3.12.6. %jmerrep errlab, replab, othrlb

Handle a skip-return error condition as described above, printing an error message on a parse error.

Note! t1 is clobbered by %errep, %merrep, %jerrep, %jmerrep.

7.3.13. Flag-Handling Macros

All of the following flag-handling macros use register F, the preserved flag register. F is assumed to be the flag register for any program that uses these macros. Note that %SetUp clears F, thus initializing flag management.

7.3.13.1. %Flags(aFlg,bFlg,cFlg,...)

This macro takes a list of flag names, and assigns a flag value to each name (within a 36-bit word). It can be used more than once, but no more than 36 flags can be defined.

7.3.13.2. %trnOn & %trnOff

These macros take a flag quantity (one or more flags ORed together), and turn them on or off, respectively (with no skipping).

7.3.13.3. %TrOnS & %TrOfS

Like %TrnOn and %TrnOff, but skip always afterwards.

7.3.13.4. %SkpOn & %SkpOff

These macros take a flag quantity, and will skip if ALL the flags are on or off, respectively.

7.3.13.5. %AnyOn & %AnyOff

These macros take a flag quantity, and will skip if ANY of the flags are on or off, respectively.

7.3.14. CUuos (CUCCA Utility UUOs) Interface

7.3.14.1. %print, %prSkp

Formatted-print macros: output the arguments according to the format string. %PrSkp returns skipping (+2 instead of +1, but handling a following erjmp/ercal properly). See 7.1 for details.

A note about arguments: the argument list is really nothing more than a sequence of addresses, so if you choose to use addressing forms such as address(index), be sure and use the form so that macro will be happy with the address. The same applies to address forms such as , etc.

8. Macro-20 Programming Standards and Conventions

8.1.
Introduction

This document was adapted in April 1979 at the Columbia University Computer Center from the document TOPS20 Coding Standards and Conventions, 17 January 1974, revised 8 September 1976, written by the DEC Tops-20 monitor group for its own use. This version departs from the original in certain ways, by additions and deletions, and by modifications of certain details, but the net effect is about the same.

These standards and specifications apply in specific detail to Macro-20 programs, and in spirit to other DEC-20/DEC-10 assemblers (Macro-10, Midas, Fail). They do not apply in their entirety to assembly language routines written to be called from other languages, which usually have different subroutine calling conventions, and may have registers dedicated to different uses, precluding the use of certain MACSYM facilities. Familiarity with Macro-20 and DEC-20 monitor calls is assumed.

A standard of programming style and conventions is especially important for the assembly language programmer because assembly language lacks high-level language facilities such as control and data structures, scoping of variables, etc., without which the programmer must exercise tremendous discipline to write a clear, easily modifiable program. It is hoped that this standard will make assembly language programming easier by relieving the programmer of the burden of deciding at every turn how to format a statement, how to use registers, how to pass parameters to a subroutine, and so forth.

Like any standard, this one contains many elements that are arbitrary and capricious, and few people will agree with every detail. The bulk of the standard comes to us as a fait accompli, having been in use at DEC for several years and forming the basis for a very large amount of code. Other facets have been added after long experience both in writing programs and in reading and modifying code from various installations.
It is felt the rules of style and usage given here can result in programs that are as clear as Macro-20 programs can be.

8.2. Statements

The general form of a statement should be:

label:  opcode ac, @addr(x)     ; Comment.

where:

1. Tabstops are assumed to be set every 8 spaces.

2. The label begins at the left margin.

3. The opcode begins at the first tab stop. There should be one tab before the opcode. If the label and colon(s) occupy 8 or more spaces, the opcode should start at the first tab stop on the next line. Exceptions to this rule apply in multi-line literals and following skipping instructions, JSYS's, and subroutine calls (see below).

4. One space (a blank, not a tab) should follow the opcode unless there are no other fields except the comment to be specified in the statement, in which case sufficient tabs should follow to put the comment field at the 4th tab stop. Space is preferred to tab for a number of reasons: spaces take up less space and cause fewer overflows into the comment field; an instruction is typed the same way in the program text as it is in DDT (DDT will not accept a tab in an instruction); tabbing does not achieve vertical alignment of equivalent fields since various quantities can follow the opcode (register, address, and others); tabs would line up the operands with the opcodes of multiline literals; using tab prevents the operands of an indented opcode from reflecting the indentation and sometimes forces the operand forward an additional tab stop. This standard attempts to help the whole instruction to be perceived visually as a single unit; the use of tab tends to visually separate opcode and operands for no useful reason. Vertical alignment of operand fields such as produced by tab in this context appears to be mainly a carryover from punched-card oriented processors where parsing was done by field rather than on a character stream.

5.
When any field is not used by the instruction, it may be omitted along with its directly related punctuation. A field which is affected by an instruction should NOT be defaulted to 0 by omitting it.

6. The semicolon which begins the comment should be at the 4th tab stop. There should be one or more tabs preceding the semicolon as necessary to place the semicolon at the 4th tab stop unless the preceding fields extend to or beyond the 4th tab stop. In this case, one space (blank) should be used to separate the last preceding field and the semicolon. The semicolon should be followed by a space.

The instruction which follows a skipping monitor or subroutine call should be indented 1 additional space (1 space beyond the first tab stop) to indicate the possibility of that instruction being skipped. Indenting in this manner should also be done following skipping machine instructions; when several skipping instructions appear consecutively, each instruction that could be skipped should be indented 1 space.

8.3. Comments

A comment on the same line as a statement should begin at the 4th tab stop as described above. When a comment consists of a single sentence or phrase which requires more than one line, the subsequent lines should have two spaces between the semicolon (at the 4th tab stop) and the comment to indicate to the reader that the several lines are part of one logical statement.

A comment on a line by itself should begin at the left margin. A group of comment lines (1 or more lines) should be preceded and followed by a blank line. A long comment (10 or more lines) may be enclosed within a REPEAT 0 or COMMENT pseudo-op and the semicolons omitted.

Extensive commenting of source listings is strongly encouraged:

1. Routines, modules, sections, macro definitions, etc., should be described at their beginning. See also requirements for subroutine comments below.

2. Comments should appear on almost every statement line.
As the reader views the listing page, the comments (aligned at the 4th tab stop) should appear as a running commentary on what the code is doing. These on-line comments should explain the logical procedure being carried out, not just describe the obvious action of the instruction. Humorous or irrelevant comments (e.g. "; Oops...", "; Oh well...") are discouraged since they provide no information to the reader.

Comments should be written in plain English, without jargon, following normal rules of capitalization, punctuation, and grammar. A reader should be able to read the comments without seeing the code and obtain a coherent understanding of what the program is doing.

When a variable or other mnemonic symbol is referred to in the comments, an English phrase rather than the mnemonic itself should frequently be used (e.g. "last page address" rather than "LPGADR").

Comments, particularly routine headers, should describe why non-obvious actions are being taken and/or what assumptions are being made (e.g. "Get here only when ...").

Comments should exist on three levels. The highest level is a sentence or paragraph that describes the operation of an entire program or routine, and appears at the head of the program or routine (at the top of a page), usually enclosed by the COMMENT pseudo-op so that each line need not begin with a semicolon. The first top-level comment in a file should include the author's name, organization (and department), the date, and possibly a copyright notice. The lowest level is the per-statement comment at the 4th tab stop. The intermediate level is a one- or two-line comment, beginning on the left margin, which describes the purpose or operation of a group of statements. There should never be more than 8 or 10 lines of code without such an intermediate comment.

Think of any program you write as if it were a letter to a stranger who will have to fix all the bugs you left in it, add new functionality, or translate it to another language.
It should be possible to read through a program on four levels:

1. At the top level only. Reading only the highest-level comments should give the reader a broad idea of the purpose and structure of the program.

2. At the intermediate level. Reading the top- and intermediate-level comments should show you not only what the program does, but what algorithm was used.

3. At the lowest comment level. Including these comments in a reading of the program adds detailed information about the actual implementation of the algorithm.

4. At the code level. Including the code itself brings you down to the bit-twiddling level. It should never be necessary to read code unless you want to learn how things are done, or you are modifying the program.

Example showing intermediate- and low-level comments:

; Get a JFN for the input file.

        movx t1, GJ%SHT!GJ%OLD  ; Using short form GTJFN, for an old file
        hrroi t2, [asciz /foo.txt/] ; called foo.txt,
        GTJFN                   ; get a Job File Number.
         erjmp filErr           ; On error go to the file-error handler.
        hrrzm t1, inJfn         ; No error, save the JFN (without flags).

; Now open the file.

        movx t2, fld(7,OF%BSZ)!OF%RD ; In 7-bit byte read mode,
        OPENF                   ; open the file.
         erjmp filErr           ; On error go to the file-error handler.

; File is open.  Now begin processing.

        :
        :

The style of capitalization of the opcodes, JSYS bits, and variables is a matter of taste, but consistent use of a mixture of upper and lower case to enhance the readability of the program is strongly encouraged. In this example, MONSYM symbols (JSYS names and bits) are capitalized, instructions and register names are in lower case, and variable names formed from multiple words have capital letters at word breaks (e.g. filErr = file error).

8.4. Pagination of Source Programs

Source listings should be divided into pages by formfeed (control-L) characters. A CRLF should precede and follow each formfeed.
Source files should be arranged so that major modules, subroutines, etc., begin at the top of a page. Only when a subroutine is a quarter page or less in total size should it begin other than at the top of a page. Judicious use should be made of blank lines (they should be inserted to emphasize logical boundaries, but without large gaps). Garish graphic devices like lines across the page, boxes around headings, etc., should be avoided since they waste the programmer's time (both in typing them in and in waiting for them to be typed out) without adding any useful information to the program; the principal formatting devices for separating things and getting attention should be blank lines and formfeed characters.

It should rarely be necessary for flow of control to cross a listing page. That is, the last instruction on each page would normally be an unconditional transfer of control not preceded by a skipping instruction. An unbroken sequence of instructions longer than one listing page is strong evidence of insufficient subroutinization. However, when a sequence of instructions does cross a page, the last line on the preceding page and the first line on the following page should be a comment line of the form

        ; ..

where the semicolon appears at the first tab stop directly under the preceding opcode. E.g.,

        move t1, foo            ; Comment
        ; ..
^L
; Comment about continuation.

        ; ..
label:  movem t1, baz           ; Comment

The page break should occur at a natural juncture in the program logic.

8.5. Other Assembler Functions

For top-level macro definitions, the DEFINE should appear at the left margin and be followed by one space. The name of the macro being defined should appear next, followed by one space. The dummy argument list, if any, should appear next, followed by the open angle-bracket.
E.g.,

define macnam (a,b,c)<

or

define macnam <

A comment or CRLF should follow the open angle-bracket, and the body of the macro definition should begin on the next line, except when the entire macro definition is on one line in order to be used as part of an expression. The terminating right angle bracket should be on a line by itself, on the left margin. Example:

define getSum (a,b,c)<          ;; Macro to get a sum into an AC.
        move a, b               ;; Load contents of b into AC a,
        add a, c                ;; and add contents of c.
>

Macro calls generally do not require parentheses surrounding the arguments. The exception is a macro call appearing within an expression, e.g.

        aa+foo(xx,yy)+bb

In some cases, however, the parentheses improve clarity, e.g. "foo (,x)" shows more clearly that the first argument to foo is omitted than does "foo ,x".

Angle-brackets should be used to quote any argument containing non-alphanumeric characters; this is especially true of character strings with embedded blanks. Examples (where %jsErr is a macro name):

        GTJFN
        %jsErr , r
        OPENF
        %jsErr (,r)

Top level assembler conditionals should be indented 3 spaces from the left margin (so that tag and comment lines may always be leftmost). Lower level conditionals should be indented 3 or more spaces. The terminating angle-bracket of a conditional should appear:

1. immediately following the last instruction if the conditional is only one line long;

2. on a separate line indented the same amount as the pseudo-op which began the conditional.

Example:

tag:
   ife ftFoo,                   ; If feature Foo is selected just do this.
   ifn ftFoo,<                  ; If feature Foo is not selected...
        move t1, mumble         ; then do this,
      ifn ftBaz,<               ; and if feature Baz is selected...
        add t1, baz             ; do this
        move t2, frotz          ; and this too.
      >                         ; End of ifn ftBaz.
        jrst frotzH             ; Go to the frotz handler.
   >                            ; End of ifn ftFoo.

A closing angle bracket should NEVER appear in a comment (i.e. following a semicolon on the same line).
The coding should be correct even if the assembler were made to ignore angle-brackets in comments.

Assembly listing options are provided for those who want listings, but experience has shown that assembly listings are rarely useful; it is usually sufficient to work from a source listing. However, all assemblies should begin with SALL, which has been shown to produce the most readable assembly listings. In addition, all source files should have a TITLE, and SUBTTL's for each logical section of the program (e.g. symbol definitions, initialization, main program, support routines, data area).

In general, macros worth defining at all are worth defining on a system-wide basis. Therefore, localized, special-purpose macros are discouraged.

8.6. Instruction Mnemonics

The standard KL-10 instruction mnemonics as defined by the DECsystem-10/20 Hardware Reference Manual should be used throughout. No abbreviated opcodes should be used. Macro or opdef definitions should be made to define a useful mnemonic which is related to a function being performed in the code. See the subroutine conventions below for examples.

8.7. Variables and Structures

Use of the stack variable and data structure facilities in MACSYM is recommended. See MACSYM documentation. Because of these facilities, the following should be observed:

1. Explicit pushing and popping of quantities is rarely done.

2. Explicit referencing of the stack, e.g. as -n(p), is never done.

3. Fields within data blocks or tables are not referenced by halfword instructions or explicit byte pointers but rather by the MACSYM facilities LOAD, STOR, etc.

4. Flags can be defined with DEFSTR, MSKSTR, or as full-word parameters, and can be referenced with the TXmn MACSYM macro. Flags should never be defined as half-word quantities which require the programmer to remember whether to use TL or TR.
Other macros may be defined to operate on single-bit flags, which should be held in register 0; in this case too, no instruction should depend on the actual value or location of a flag.

8.8. Subroutines

All AC's should be preserved over a subroutine call unless an explicit statement to the contrary appears in the subroutine description. ACs are changed over a subroutine call only when values are to be returned to the caller. The allocation of ACs for all inter- and intra-module subroutine calls should be:

ACs 1,2,3,4     Passing arguments and returning results.

ACs 0, 5-15     Preserved, not changed by subroutine (or saved and restored if necessary).

AC 16           Temporary, used by MACSYM call/return procedure and reserved for use by other call/return procedures.

AC 17           Global stack pointer.

Call and return should be effected by 'pushj p,' and 'popj p,' respectively. A set of assembler mnemonics has been defined for subroutine mechanics as follows:

call            (= pushj p,) should be used to call subroutines, e.g. 'call subr'.

ret             (= popj p,) should be used to return +1 from subroutines.

retskp          should be used to return +2 from subroutines. Retskp is equivalent to:

                        jrst [  aos (p)
                                popj p,]

callret         may be used to call a subroutine and return immediately thereafter. It is equivalent to

                        call subr
                        ret

                or

                        call subr
                         ret
                        retskp

Note that 'callret' is not guaranteed to be a single instruction; therefore it may not be skipped over. The other returns above are guaranteed to be single instructions.

These mnemonics are used to emphasize the FUNCTION being performed (calling, returning) rather than the mechanics of the function (pushing, jumping, etc.). Also, these mnemonics could continue to be used even if a more general calling standard were adopted at some time in the future.

Return may also be effected by transferring control to one of the global labels R or RSKP, e.g.

        jumpe t1, r             ; Equivalent to "jumpe t1, [ret]".
        jumpn t1, rskp          ; Equivalent to "jumpn t1, [retskp]".
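
To illustrate the conventions above, here is a minimal sketch of a skip-returning subroutine and its caller. It is only an illustration, not part of the original standard: the names isDigt, chr, digVal, and notDig are hypothetical, while call, ret, and retskp are the mnemonics described above and t1 is assumed to be defined as AC 1.

```
; Classify the current character.

        move t1, chr            ; Get the character to be tested,
        call isDigt             ; and see whether it is a decimal digit.
         jrst notDig            ; It is not (+1 return), handle it elsewhere.
        movem t1, digVal        ; It is (+2 return), save the digit's value.
        :
        :

; isDigt (hypothetical) - test whether t1 holds an ASCII decimal digit.
; Returns +1 if not, with t1 unchanged; +2 if so, with the digit's value in t1.

isDigt: caig t1, "9"            ; Is the character above "9"?
         caige t1, "0"          ; No, is it below "0"?
         ret                    ; Out of range, so return +1.
        subi t1, "0"            ; In range, convert the character to its value
        retskp                  ; and return +2.
```

The caller indents the instruction that follows the call by one space because the +2 return skips it. Inside the routine, ret can safely sit in a skippable position because it is guaranteed to be a single instruction.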
The general temporaries should be used for passing arguments to subroutines and returning values. AC1 (t1) should be used for a single argument routine, ACs 1 and 2 for a two-argument routine, etc.

When subroutines modify or depend on global data, including flags, such effects should be clearly and completely documented.

A routine defined to return caller +2 (skip) on success and caller +1 (noskip) on failure is acceptable. Returns greater than caller +2 are considered very bad form.

8.9. AC Definitions

The following mnemonics have been chosen to be consistent with the AC-use conventions above. The preserved ACs are divided into three groups, f intended for Flags, and q1-q3 and p1-p6 intended for general use. The ACs within each group are consecutive.

         0 - f          10 - p1
         1 - t1         11 - p2
         2 - t2         12 - p3
         3 - t3         13 - p4
         4 - t4         14 - p5
         5 - q1         15 - p6
         6 - q2         16 - cx
         7 - q3         17 - p

The programmer should assume that each group (Tn, Qn, Pn) is in ascending order, e.g. that t2 = t1+1, but that the specific assignment of numbers may change. Explicit numeric offsets from AC symbols (e.g. T1+1) should NEVER be used. Instructions which use more than one AC (e.g. div, jffo) must be given an AC operand such that the other AC(s) implicitly affected are in the same group. E.g. t3 (and t4) is OK for idiv because t3+1=t4, but q3 is not because q3+1=??.

AC0 is almost universally used as a flag register, and should not be used for any other purpose.

There is a facility in MACSYM to save and automatically restore ACs. The indicated ACs are saved on the stack at the point of execution and a dummy return is placed on the stack which causes these ACs to be restored automatically when the current routine returns. Use of this facility eliminates the need for matching push/pop pairs at the entry and exits of routines and eliminates the bugs which often arise from an unmatched push or pop. The facility is:

        saveAC <a,b,c>

in which a, b, and c are AC names; one or more may be specified.
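
As an illustration of the save-and-restore facility, the following sketch shows how a routine might preserve two ACs it uses for its own purposes. It is a minimal example under stated assumptions, not code from the original standard: cntNz, cntNz1, table, and tblLen are hypothetical names, and the MACSYM definitions of saveAC, ret, and the AC mnemonics are assumed to be available.

```
; cntNz (hypothetical) - count the nonzero words in a table.
; Returns +1 always, with the count in t1.

cntNz:  saveAC <q1,q2>          ; Preserve the ACs used for index and count.
        movsi q1, -tblLen       ; Make an AOBJN pointer to the table,
        setz q2,                ; and start the count at zero.
cntNz1: skipe table(q1)         ; Top of loop.  Is this word zero?
         aoj q2,                ; No, count it.
        aobjn q1, cntNz1        ; Step to the next word, loop until done.
        move t1, q2             ; Return the count in t1.
        ret                     ; Return +1; q1 and q2 are restored on the way.
```

No push or pop appears anywhere in the routine, so a later change that adds another exit cannot leave the stack unbalanced.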
Defining a different mnemonic for a preserved AC may be of value when the AC is used for a specific function by a large body of code. However, it offers the possibility of confusion because two different symbols may refer to the same AC unbeknownst to the programmer. In smaller programs, use of certain ACs can be restricted to specific functions, and a global definition is appropriate. A very large program, however, usually cannot accommodate a sufficient number of dedicated ACs. Therefore, when a specific function-oriented AC definition is made, it should be explicitly decided which modules should use the definition (by "module" is meant a separately-compiled section of code). Within these modules, the usual name for the AC must be purged so that there is no possibility of using two different symbols for the same AC. Only preserved ACs may be used for special definitions.

Parameters to subroutines may be passed in functionally defined ACs in the following cases:

1. On an intra-module call where the contents of the AC are appropriate to its function definition.

2. On an inter-module call where the same definition exists in both modules and the AC is being used for its intended function.

A parameter may NOT be passed in a preserved AC unless both caller and callee know it by the same name, and that name must be a specific one related to the function which the AC is performing.

The procedure for declaring a functionally defined AC is:

        defAC newAC,oldAC

This must be done at the beginning of an assembly, and it causes newAC to take on the value of oldAC. OldAC must be the mnemonic for one of the regular preserved ACs, and this mnemonic will be purged and therefore unavailable for use in the current assembly.

An AC with a special definition should not be used for other purposes; e.g. "JFN" should not be used to hold some quantity other than a JFN merely because it happens to be available.

8.10.
Subroutine Documentation

The following is a suggested format for documenting the calling sequence of a subroutine. A description of this sort should appear at the beginning of every subroutine, no matter how short.

srName: comment |

One or more lines describing what the subroutine does.

Enter with:
        t1/     Description of first argument.
        t2/     Description of second argument.
         :
         :
        global foo/     Description of required global variable.

Returns:
        +1: Conditions giving this return, with:
                t1/     Value(s) returned.
                t2/     More value(s) returned.
                 :
                 :
                global foo/     Description of how it was affected.
        +2: Conditions and values as above.
|

Notes:

1. The arguments, if any, should be documented as the contents of registers and/or variables as shown. If there are more than 4 arguments, then the address of an argument list should be passed.

2. It is absolutely essential that all global inputs and outputs (variables, tables, flags, etc., that are not declared within the subroutine) be noted; failure to do so hides crucial assumptions and effects from readers, and makes modification a very tricky business.

3. An example of argument setup, call, and +1/+2 handling should be given if necessary for clarification.

4. The return(s) should be noted as shown; "Always" or "Never" may be used as the condition where appropriate; the +2 return need not be shown if it does not exist; values returned should be described in the same form as arguments.

8.11. Multi-line Literals

The use of multi-line literals is encouraged as a technique for making code more readable and easier to follow. The following additional rules apply:

1. The opening bracket for a multi-line literal should occur in the position that the first character of the address field would have appeared if the instruction had an ordinary address, e.g.

        skipge foo
         jrst [

2. The first and all following instructions within the literal should begin at the second tab stop, e.g.
        jrst [  move t1, mumble ; Comment
                jrst fie ]      ; Comment

The tab between the open bracket and the first opcode may be omitted if the line position is already at or beyond the second tab stop, e.g.

        GTJFN
         ercal [move t1, mumble

If the first opcode is beyond the second tab stop, it is better to start the enclosed code on a new line, e.g.

        jumpge t1, [
                move t1, mumble

3. The closing bracket should follow the last field of the last instruction (as above), and should be before the comment on the same line. It should have a blank before it to set it off from the preceding token.

4. Nesting of multi-line literals to a depth greater than two is discouraged because of awkward formatting problems.

5. Labels should not appear in multi-line literals.

6. No hard and fast rules can be given as to when to use or not use multi-line literals. However, a literal longer than about 10 lines becomes suspect.

7. Use of ".+1" is legal in a literal to return to the main sequence.

8.12. Flow of Control - Branch Conventions

In general, jumps should be to labels forward (down the page) from the point of branch except for loops. Tops of loops should be identified by comment.

The expressions ".+1" and ".-1" are the only legal uses of "." (this location). All other potential uses should be avoided in favor of an explicitly defined label.

"Global" jumps should be avoided altogether. Higher-level languages do not permit them, and with good reason. The only exceptions are jumps to well defined and published exit sequences, e.g. R, RSKP (see subroutine conventions, above).

8.13. Numbers

In general, there should be no occasion to use a literal number in in-line code. All parameters, bit definitions, etc., should be defined mnemonically at appropriate places. It is much easier to err in the direction of too little use of mnemonics than too much; therefore, when in the slightest doubt, define a mnemonic.

8.14.
Sharability

When two or more people are running the same (copy of a) program, they can share the memory pages it occupies. This cuts down on page faults and makes the program, and the system, run more efficiently. Most higher level languages produce sharable code automatically, but the assembly language programmer must take special pains to do so.

To write sharable programs, you must collect all your impure data together in one place, segregating it from actual program code and pure data (a word is impure if it can be modified at runtime). Each user gets a private copy of any page that has been written into, so sharability is greatest when all the impure data has been collected into the fewest possible pages. Pure data (e.g. command or dispatch tables) can be freely mingled with code provided there is no way for control to pass into it.

8.15. Living in an Imperfect World

Much of the current DECSYSTEM-20 software was written before the existence of this standard and therefore does not conform to it. A great deal of systematic editing has already been done to improve conformance, but obvious irregularities exist. In general, new code being added should conform exactly to this standard even if being integrated with old code. The following are some specific problems which may arise and the recommended solutions:

8.15.1. AC Mnemonics

Some code uses absolute numeric ACs. If new code is being integrated into a sequence which uses numeric ACs, it is desirable that the existing code be edited to use the standard mnemonics, particularly for the preserved ACs. If the programmer cannot take the time to do that, then the mnemonics T1-T4 should be used for ACs 1-4, and other ACs should be referenced in the same way as is done by the existing code.

Some code uses mnemonics A,B,C,D for the temporary ACs. These same mnemonics should be used for new code integrated into this existing code, or all references can be edited to use the standard mnemonics.
You may write some code using the standard mnemonics for preserved ACs and then discover that the module into which you wish to put this code has redefined some of these ACs. The solution is one or a combination of the following:

1. Move the new code to a module which does not redefine the preserved ACs.

2. Use different preserved ACs -- ones which have not been redefined. (Note it is not acceptable to use an AC with a special definition for other than its special purpose.)

Clearly, code which needs some of the special definitions must be placed in a module which has these ACs defined and must therefore use only the other preserved ACs. Note that a value which usually resides in a special AC need not ALWAYS reside there; for example, it can be placed in one of t1-t4 as a subroutine argument.

8.15.2. Stack Handling

Use of the several stack variable facilities defined in MACSYM is recommended. Some old code, however, uses explicit PUSH and POP and references of the form -n(p), and when anything more than trivial modifications must be made to such code, it is most strongly recommended that the code be edited to use STKVAR or TRVAR. Failing that, references must be consistent with the existing code.

How To Page 172

9. How to Write, Assemble, and Run Programs

Macro programs must be entered into the computer by means of a text editor. The best such editor is EMACS, which not only surpasses other editors in power and flexibility, but understands the syntax of Macro programs; to take advantage of this understanding you should put EMACS in Macro mode (MM Macro Mode).

Macro programs are fully compatible with the Exec Load-Class commands. These are:

COMPILE
        The COMPILE command produces a relocatable object program from one or more source programs. The resulting object file has the extension .REL. If there is already a .REL file of the given name that is newer than any of the source files, no compilation will be done unless the /COMPILE switch is included.
LOAD
        The LOAD command invokes LINK, the system linkage-editor and loader, to bring compiled modules into memory, resolving any unresolved references, and produce an executable core image. The LOAD command will recompile any module whose most recent .REL file is older than the most recent source (.MAC), i.e., any source file you've edited since the last LOAD. If you want to have a directly-executable (.EXE) version of your program on the disk, you should issue the SAVE or CSAVE command after the LOAD command completes. LOAD does not start the program; you must use the START command to do this.

EXECUTE
        The EXECUTE command does what the LOAD command does and then STARTs execution of your program.

DEBUG
        The DEBUG command acts like the EXECUTE command, but it also loads DDT (the source-level assembly language debugger) into memory and starts DDT instead of the program. It is equivalent to LOAD followed by DDT.

If you repeatedly edit your program with EMACS, EDIT, or Otto, you can issue a special command which saves the .MAC file, exits from the editor, and then tells the Exec to reissue its last load-class command, with the 'saved arguments'. In EDIT the command is 'g'; in Otto it is 'go'; in EMACS it must be assigned to a key in your EMACS.INIT file.

All the LOAD-class commands have the following features:

1. Recognizing that your program is in Macro, provided the extension (filetype) of the source file is .MAC.

2. Saving their arguments, or if no arguments were specified, recalling the arguments of the last LOAD-class command in which arguments were specified. For instance, if you have already issued the command

        LOAD foo,mylib:bar

you can achieve the same effect in subsequent compilations by merely typing

        LOAD

The same effect is achieved by exiting the editor with a 'go'-type command, as described above.

3. Taking arguments from an indirect file.
You can put repeatedly used arguments in a file as they would appear in the command and then give a command like 'load @mac'. The '@' indicates that the command should look to the file for its arguments.

4. Gathering files together to produce one program. This is done by separating the files to be worked on by either commas or plus signs on the command line; a comma means to consider the files as separate modules (each surrounded by TITLE and END pseudo-ops), and a plus sign means to concatenate the adjoining files.

5. Passing switches to the compilers and LINK.

The load-class commands themselves have many switches. Type any load-class command followed by a slash and then a question mark to see what they are.

DDT Page 174

10. Interactive Debugging of Assembly Language Programs

This chapter was adapted from the DECsystem-10 Utilities Manual and the Rutgers University IDDT manual.

DDT (Dynamic Debugging Tool) is a program that allows you to test and debug assembly language programs interactively using the same symbols (labels and register names) that are used in the program. You can invoke DDT in either of two ways: by typing 'DDT' after the completion of a LOAD command (or after GETting an .EXE file), or by using the DEBUG command (see Chapter 9). In either case, the effect is the same: DDT is loaded into pages 770-777 of your address space (your program should not use these pages) and started. You may then type DDT commands to examine storage locations, set breakpoints, execute selected sections of the program, single-step through the program, search for certain instructions or data, modify code or data, and so forth.

DDT commands are purposely terse, and therefore somewhat cryptic. You can use DEL (RUBOUT) and ^U to edit commands; these have the expected effect, but echo as "XXX". In the following descriptions, a dollar sign ($) indicates that the ESC (ESCAPE or ALTMODE) character should be typed. All numbers are in octal unless otherwise noted.

10.1.
Typeout Modes

You can select how DDT will display the contents of memory. Normally (i.e. by default) it tries to interpret each word as an instruction, decoding the opcode, accumulator, indirect, index, and address fields and substituting appropriate values from the program's (or DDT's own) symbol table, e.g. 202067,,1000 (octal) would be typed by DDT as

        MOVEM T1, @FOO(Q3)

assuming that T1 is defined to be 1, Q3 to be 7, and FOO to be 1000 in the program's symbol table. You can tell DDT to interpret words in other ways, either on a temporary or prevailing basis. The following commands are used to set the type-out modes:

$S      Symbolic instruction.
$C      Numeric, in current radix.
$F      Floating point.
$T      7-bit ASCII text.
$6T     SIXBIT text.
$5T     RADIX50.
$H      Halfwords, two addresses.
$nO     Bytes (of n bits each), separated by commas.

10.1.1. Address Mode Typeout

The following commands are used to set the address modes in typeout of symbolic instructions and halfwords:

$R      Relative to symbolic address, e.g. FOO+17
$A      Absolute numeric address, e.g. 4023.

10.1.2. Radix Typeout

You can also determine the radix (base, e.g. decimal, octal) in which the numeric parts of a displayed word are typed, using the following command:

$nR     Change radix to n (n>1).

10.1.3. Prevailing vs Temporary Modes

The typeout-mode commands shown above may be started with a single ESC ($), or two ESC's ($$):

$$      Set the prevailing type-out or address mode or a prevailing radix (e.g. $$10R - set prevailing numeric radix to 10).

$       Set the temporary typeout mode; display words or fields in this mode until carriage return is typed (e.g. $2R - set the temporary numeric radix to 2).

CR (carriage return)
        Terminate temporary modes, and revert to prevailing modes.

Initial modes are: $$S (symbolic instructions), $$R (relative addresses), and $$8R (radix 8 [octal]).

10.2. Storage Words

The following commands are used to examine storage words.
If you type ESC between the address and the delimiter, the effective address of the typed quantity will be calculated first, e.g. 1(p)$/ will print the contents of the word after the one pointed to by p.

adr/    Open and examine the contents of adr in current type-out mode.
adr!    Open, but inhibit the type-out.
adr[    Open and examine word as a number in current radix.
adr]    Open and examine word as symbolic instruction.
;       Retype last quantity typed (useful after setting a temporary type-out mode, e.g. $6T;).
adr\    Open in current mode, but don't change location counter.

10.2.1. Related Storage Words

The following are used to examine related storage words:

(linefeed or ^J)
        Close the current word (making any modifications typed in) and open the following word.

^ (circumflex) or ^H (backspace)
        Close current word (with modifications) and open adr-1.

^I (tab)
        Close current word (with modifications), open the word specified by the address of the current word, and set the location counter to that place.

\       Same as ^I, but location counter is not changed.

CR (carriage return)
        Close word (with modifications) and revert type-out to prevailing mode.

10.2.2. Retyping In Modes Other than the Prevailing Mode

Each of the following commands specifies the mode in which DDT should immediately retype the last expression typed by DDT or the user. Neither the temporary nor the prevailing mode is altered.

=       Retype as halfwords in current radix.

_ (underscore)
        Retype as a symbolic instruction (address part determined by $A or $R).

/       Type out, in current type-out mode, the contents of the location specified by the address in the open instruction word, and open that location, but do not change the location counter.

[       Retype as a number, and open the contents of the location specified by the open instruction, but do not change the location counter.

]       Same as above, but type out as a symbolic instruction.

10.3.
Typing In These are shown by example: ADD AC1,@DATE(17) to type in instruction simply type it in symbolically. DDT Page 177 402,,202 to type in halfwords. 1234 to type in octal values. 99. to type in fixed point decimal integer. 101.11 to type in .. 77.0E+2 ... a floating-point number. "/ABCDE/ ASCII input (/ is delimiter; up to 5 characters). "A$ ASCII input (one character, right justified). $"/ABCDEF/ SIXBIT input (/ is delimiter; up to 6 characters). $"Q$ one SIXBIT character. 10.4. Symbols A symbol is is defined in DDT as a string of up to six letters and numbers including the special characters ., %, and $. Characters after the sixth are ignored. They are defined in a table kept with the program you are DDTing, called the "symbol table". Programs have only one symbol table unless their authors have taken explicit action to create more than one. Such actions include compiling and/or linking the program from more than one source and/or .REL file. foo$: Permit reference to local symbols within a module with title "foo" (open the symbol table of foo). You can refer to symbols in a program without issuing this command, but DDT always appends "#" to the symbol on typeout in this case to show that it's only guessing. If a program has more than one symbol table, you should use this command to let it know which table to use. n>FOO+23 where breakpoint number 7 was set at location FOO+23 (for instance, by the command FOO+23$7B). 10.5.1. Proceeding from a Breakpoint After you have poked around to your satisfaction, you may continue the program using the following commands: $P Proceed from a breakpoint. Keep executing until another (or the same) breakpoint is encountered, or the program halts. n$P Proceed as above, but pass this breakpoint n-1 times (i.e. ignore it until the nth encounter). $$P Proceed from a breakpoint... n$$P ...and thereafter proceed automatically. 10.5.2. 
Single Stepping After a breakpoint has been encountered, you may wish to single-step through your program, rather than continuing with $P. DDT provides the following commands for single-stepping: $x Execute the current instruction, type out the new contents of any locations affected by the instruction, type out the next instruction, and wait. No breakpoints are moved or removed. n$x Like $x, but do it for n instructions. n$$x Same as $x, but execute instructions unconditionally, without typing anything out, until PC reaches either .+1 or .+2. This is useful for executing debugged subroutines or UUO's. 10.5.3. Conditional Breakpoints You can insert a conditional instruction, or a call to a closed subroutine, using the following command: $nB+1/instruction Insert a conditional instruction or call a conditional routine, when breakpoint n is reached. When the breakpoint is reached, this instruction or subroutine is executed. If the instruction (or subroutine) does not skip, the program either stops or continues based on the contents of the proceed counter (located at $nB+2). If the instruction or subroutine skips, the program stops. If the subroutine causes a double skip return, the program proceeds. This works as follows: DDT Page 181 skipe $nB+1 ; Conditional break instruction? xct $nB+1 ; Yes, execute it (it may skip). sosg $nB+2 ; Decrement proceed counter. jrst ; If greater than 0 then break jrst ; else proceed. 10.5.4. Starting the Program $G Start at .JBSA (the normal starting address). adr$G Start or continue at a specific address. 10.6. Searching There are three kinds of searches: word search, not-word search, and effective address search. In the following commands, ac are respectively: lower limit, upper limit, searchword; a defaults to 0, b defaults to "end". The word search and not-word search compare each storage word with the word being searched for in those bit positions where the mask, located at $M, has ones. 
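The comparison under the mask can be sketched in Python. This is only an illustration of the rule just stated, not DDT's implementation: the function name `word_search`, the list-of-integers memory model, and the `negate` flag are assumptions made for the example.

```python
WORD_MASK = (1 << 36) - 1  # a PDP-10 word is 36 bits wide


def word_search(memory, target, mask=WORD_MASK, lo=0, hi=None, negate=False):
    """Return addresses in [lo, hi] whose contents match `target` under `mask`.

    Only bit positions where `mask` has ones take part in the comparison.
    With negate=True the sense is inverted, as in a not-word search.
    """
    if hi is None:
        hi = len(memory) - 1  # stands in for the default upper limit ("end")
    hits = []
    for addr in range(lo, hi + 1):
        equal = (memory[addr] & mask) == (target & mask)
        if equal != negate:
            hits.append(addr)
    return hits


# Four hypothetical 36-bit words, written in octal (left half, right half).
memory = [0o254000001000, 0o202067001000, 0o000000001000, 0o777777777777]

# Full mask: only an exact 36-bit match is reported.
exact = word_search(memory, 0o202067001000)

# Mask covering the right half only: every word whose right half is 1000.
right_half = word_search(memory, 0o1000, mask=0o777777)

# Mask of zero: no bits are compared, so every word in the range matches.
everything = word_search(memory, 0, mask=0)
```

With a mask of zero no bit positions are compared, which is why every location in the range is reported.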
The mask word contains all ones unless otherwise set by you. All words in the given range that show equality to the search object under the mask are typed out.

    a<b>c$W     Search for word containing c.
    a<b>c$N     Search for word not containing c.
    a<b>c$E     Search for a word containing effective address c.
    n$M         Set the mask to n.
    $M/         Examine the mask.

You may use the word search command to list the range of locations from 'first' to 'last', as follows:

    0$M first<last>0$W

This sets the mask to 0 and then performs a word search for 0. Remember to put the mask back to -1 (or whatever its previous value was) before continuing.

10.6.1. Zeroing Memory

    $$Z     Zero memory except DDT, locations 20-137, and the symbol table.

first Break caused by conditional break instruction.

    >>              Break because proceed counter exhausted.
    U               Undefined symbol cannot be assembled.
    # (as in FOO#)  The symbol has been interpreted from a symbol table you have not formally opened.
    number,,number  Half-word number type-out.
    #1.234E+27      Unnormalized floating-point number.
    123.            Decimal point indicates decimal number.
    ?               Illegal command or all 8 breakpoints used.
    XXX             Rubout, ^U echo.

10.8. Miscellaneous DDT Commands

10.8.1. Immediate Mode Instruction Execution

    instr$X     Execute the instruction in immediate mode.

10.8.2. Execute Indirect Command File

    $Y          Read and execute a command file called BATCH.DDT.
    $"/name/$Y  Read and execute a command file called name.DDT.

10.8.3. Patch

A patch is a section of code or data inserted into an .EXE file, usually to correct a bug. At the appropriate location, an instruction is replaced by an unconditional jump to the new code; the instruction that was displaced is included in the new code, and the new code usually terminates with a jump back to the original sequence. You can use DDT to insert a patch with the following commands:

    $<      Patch before the currently open location.
    $$<     Patch after the currently open location.
    $>      End the patch.
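The general mechanics of a patch (replace one instruction with a jump to the new code, run the displaced instruction inside the patch, then jump back) can be sketched in Python. This is a toy model under stated assumptions, not what DDT does internally: memory is a dictionary, instructions are tuples, `apply_patch` is a name invented for the example, JRST stands in for the unconditional jump, and the patch returns only to the location following the patched one.

```python
def apply_patch(memory, loc, new_code, patch_area, before=True):
    """Install `new_code` as a patch at `loc`, using free space at `patch_area`.

    Instructions are modelled as tuples such as ("JRST", address); this is a
    toy representation chosen for the sketch, not a real word encoding.
    """
    displaced = memory[loc]
    # For a patch "before", the new code runs first and the displaced
    # instruction follows it; for a patch "after", the order is reversed.
    body = new_code + [displaced] if before else [displaced] + new_code
    body = body + [("JRST", loc + 1)]  # jump back to the original sequence
    for offset, instruction in enumerate(body):
        memory[patch_area + offset] = instruction
    # Only now is the original location overwritten with the jump to the patch.
    memory[loc] = ("JRST", patch_area)
    return patch_area + len(body)  # first free word after the patch


# A hypothetical three-instruction program and a one-instruction patch.
memory = {
    0o100: ("MOVE", 1, 0o200),
    0o101: ("ADD", 1, 0o201),
    0o102: ("MOVEM", 1, 0o202),
}
next_free = apply_patch(memory, 0o101, [("SUB", 1, 0o203)], patch_area=0o1000)
```

Because the original location is written last, abandoning the patch part way through leaves the program itself untouched.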
When you begin a patch, DDT will open the first location in the patch area (an area set aside automatically by Macro, or by other means, for the installation of patches). The patch area is defined by the symbol PAT.. or PATCH or C(.JBFF), whichever is found first. Alternatively, you can type a single symbol preceding the patch begin command (as in "FF$<"), and it will be taken as the beginning of the patch area.

If you are doing a patch after the current location, DDT will insert your original instruction and then open the next location. You may now proceed to enter your patch using linefeeds to enter the new instructions or data. When you have finished the patch, type $> and DDT will:

1. Close the current location, if any.

2. If a patch "before" was being done, DDT will insert the instruction from the original location.

3. DDT will insert JUMPA 1,LOC+1 and JUMPA 2,LOC+2, where LOC is the original patch location. Thus skipping instructions may be patched.

Note: the original location is not changed until the patch completion command is given. Thus, you can give up or restart the patch at any time. DDT remembers the parameters of the most recent patch begin command and uses them at patch completion, whereupon they are forgotten. A second patch completion will produce an error indication.

10.9. Sample DDT Session

Here's a very short DDT session, just to get you started. A Macro program is compiled and loaded, DDT is called in, a breakpoint is set, and the program is started and runs until the breakpoint is encountered. A location is examined and its contents replaced, and the program is then continued. This short example illustrates the most useful DDT commands. In the example, after DDT is started, DDT typeout is shown in upper case; your commands are shown in lower case.

@load foo                       ; Compile and load the program.
MACRO:  Foo
LINK:   Loading
EXIT
@ddt                            ; Start DDT.
DDT                             ; DDT's greeting.
foo$:                           ; Open Foo's symbol table.
subr$b  $g                      ; Set a breakpoint and start.
$1B>>SUBR                       ; The breakpoint is encountered.
./   MOVEM T1, FROB             ; Display the instruction at the breakpoint.
t1/   5   -1                    ; Display contents of t1, then replace
                                ;  it with -1.
$x                              ; Now execute the instruction.
   T1/   -1   FROB/   -1        ; The affected locations are displayed.
SUBR+1/   MOVE T1,Q1   $p       ; The next instruction is shown, and
 .                              ;  the program is continued using $P.
 .                              ;  (the session continues...
 .                              ;  ...)
^Z                              ; Exit from DDT
@                               ;  back to the Exec.

10.10. IDDT (Invisible DDT)

This section is extracted from a document from Rutgers University, describing IDDT, a program that has a long lineage, starting at BBN and including at least SRI, MIT, and Rutgers. All of the elementary commands and features have been included. For further details, see the complete IDDT manual.

IDDT is a debugger which will handle programs with multiple forks and ones which use high core (which is used by regular DDT) as well as other programs which DDT can't handle. It has many of the same commands as the standard DDT and ordinarily may be used without regard to the fact that it is a different debugger.

The primary feature of IDDT is that it operates on user programs which run in any inferior fork of IDDT. Thus, an errant user program cannot destroy the debugger or its symbol table because the debugger is in a totally different address space. This relation between the program being debugged and IDDT is much the same as the relation between current user programs (including IDDT) and the Exec. Because of this, IDDT must simulate many of the services ordinarily provided by the EXEC, such as GET, LINK, RUN, etc.

10.10.1. Using IDDT

IDDT may be called into service either before or after programs have been loaded into memory. This is done by typing the Exec command IDDT. This command causes the EXEC to splice a fork containing IDDT in between itself and the program to be debugged.
This operation is done in a way that preserves the state of the user's program including its fork structure. It is possible to ^C out of a running program and get IDDT. If this is done, a $P (Proceed) command will resume running the user program. The EXEC command "NO IDDT" will unsplice the fork containing IDDT in the event the user wishes to continue his program without having an IDDT above it.

A fairly common practice is to get IDDT first and use it to load the program to be debugged. One of three IDDT commands may be used to load the object program: $L (run LINK in the user fork), ;L (Loadgo (RUN) named file), or ;Y (Yank (GET) the named file). The first of these is essentially the same as the EXEC command LINK. The second is comparable to RUN, while the last is similar to GET. In addition, the ;M (Merge the named file) command will simulate the MERGE EXEC command.

The $L command causes IDDT to run LINK in the user's address space. Upon completion, LINK may return control to IDDT. At this point IDDT will have the LOADER's symbol table. In order to switch to the symbols of the program which was loaded, the ;S (Symbol) command should be typed. ;S tells IDDT to look for a standard symbol table pointer in location 116 (.JBSYM).

10.10.2. EXEC-like Features

For convenience, IDDT has several commands which provide the same services as some EXEC commands. These are:

    ;A      Give information about page n or, if n is missing, about all pages (like the INFO MEMORY command).
    ;F      Fork information about fork n or, if n is missing, about all forks (like the INFO FORK command).
    ;J      JFN information on JFN n or, if n is missing, about all JFNs (like INFO FILES).
    ;;J     Job information (like INFO JOB and INFO FORK).
    ;I      Interrupt status of PSI system (like INFO PSI-STATUS).
    ;V      "View cell" - sets address, contents of which is displayed when ^T (control T) is typed. If n is missing, clear "view cell".
    ;Y      Yank - analogous to EXEC command GET.
    ;M      Merge - analogous to EXEC command MERGE.
    ;L      Loadgo - analogous to EXEC command RUN.
    $$G     Go - analogous to EXEC command START.
    $$1G    Analogous to EXEC command REENTER.
    $$nG    Analogous to EXEC command START at offset n of entry vector.
    $P      Proceed - analogous to EXEC command CONTINUE.
    ;U      Unget (SAVE 0 777)

a>105.

10.10.5. Fork Handles

IDDT's attention can be shifted to any fork in the program being debugged using the ;;F command. In the form n;;F, fork n becomes current and all examines, deposits, breakpoints, etc. pertain to it. n is a small number (actually the low bits of IDDT's fork handle on the fork in question). 0 always means the top fork of the debuggee. In the form mQQZZ$5E

This command will stop after typing five instructions lying between locations "FOO" and "BAR" which have an effective address of "QQZZ".

10.10.12. Single Stepping

There are two flavors of single stepping, $Y and $J:

    $J      Just fetches the next instruction and executes it, so if that next instruction is a subroutine call, the entire subroutine will be executed, and if that instruction is a jrst, the program will be continued.
    $Y      Fetches the next instruction and, if it is a jump of any sort, interprets it so that in fact only one instruction is executed at a time; if the instruction is a subroutine call, the program counter will be set to the beginning of the subroutine. (This is most like $X in other DDTs.)

Both commands have the same syntax:

    $J              Single steps.
    <num>$J         Will step num times.
    <num>(<loc>)$J  Will step num instructions or until the contents of location loc are changed, in which case it will stop and say

                        (WP)PC   LOC/   new contents of loc

                    where pc is the instruction after the one that attempted to change the value of loc. Note, however, that the contents of loc will not be changed.
    $$J             Is equivalent to infinity(<loc>)$J, i.e. it will single step until an attempt is made to change the contents of loc.
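The stepping rule with a watched location can be illustrated with a small Python model. Everything here is an assumption made for the sketch: the toy instruction set (`STORE` and `NOP` tuples), the name `step_with_watch`, and the choice to treat a store of an unchanged value as no change. It is not a description of how IDDT is implemented.

```python
def step_with_watch(program, memory, pc, count, watch=None):
    """Execute up to `count` toy instructions starting at `pc`.

    Stops early if an instruction would change the contents of `watch`;
    in that case the store is suppressed. Returns (pc, attempted_value),
    where pc is the instruction after the last one fetched and
    attempted_value is None if no watched store was attempted.
    """
    for _ in range(count):
        if pc >= len(program):
            break
        op = program[pc]
        pc += 1
        if op[0] == "STORE":
            _, addr, value = op
            if addr == watch and memory.get(addr) != value:
                return pc, value  # stop; the watched word is left unchanged
            memory[addr] = value
    return pc, None


program = [("STORE", 10, 1), ("NOP",), ("STORE", 20, 5), ("STORE", 10, 2)]
memory = {20: 0}

# Step up to ten instructions, watching location 20.
stopped_pc, attempted = step_with_watch(program, memory, 0, 10, watch=20)
```

Here stepping stops at the third instruction, the reported PC is that of the instruction after it, and location 20 still holds its old value.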
If the verbose switch is on, the instruction being executed and any AC's or the view cell (see above) that change will be typed out. Both commands are extremely slow.

10.10.13. Other Commands

    ^L      Blanks the screen.
    $.      Returns current PC.
    ;.      Inserts the value of the internal symbol into the expression.

Current internal symbols are:

    SYMOFS  Offset from symbol printed as in + (default value is 777 octal).
    PC      Current PC (see also $.) (can be set with $0G).

Example:

    ;.PC/   10010       displays the contents of the PC register

11. Programming Examples

The programs in this chapter are purely pedagogical in nature and don't do anything useful.

11.1. Binary Search Program

        title   binSrc - Demonstration of binary search.

$VERNO=1                        ; Program version number
$EDNO=2                         ; Program edit number

comment \

Most recent update: 10:23am Thursday, 19 April 1979

Prompts user for a number, looks it up in a list containing the even
numbers from 2 to 12 (with some duplicates).

H. Eskin, F. da Cruz, CUCCA, April 1979
\

        search  CUsym           ; Obtain Columbia macros, symbols, etc.
        %setEnv                 ; Search Monsym, Macsym, initialize things.

numLen=^d20                     ; Buffer for number typein.

; Standard 3-word entry vector

entVec: jrst    start           ; Start address
        jrst    reEntr          ; Reentry address
        %version ($VERNO,$EDNO) ; Standard version number
evlen=.-entvec                  ; Entry vector length

reentr: jrst    start           ; reentry handling (nothing special)

start:  %setUp                  ; RESET, set up stack, etc.
        hrroi   t1, [asciz /Number to search for: /]
        call    getNum
        call    search
         %print < No exact match%/>
        %print  < %d found at %o%/>,
        jrst    start

search: comment |

Binary search routine

Looks the given number up in a list, which is presumed to be in ascending
numeric order (with duplications allowed). Returns values appropriately
for insertion of a new element.

Enter with:
  t1/ Number to search for.
  global list/ list of numbers to be searched, in ascending order.
  global top/ address of the word after the last element in the list.

Returns:
  +1: if number not found,
  +2: if number found, with (in either case):
  t1/ Number that was found;
  t2/ The address of the greatest number less than or equal to the
      desired number. If no such number, then the address of the
      word before the beginning of the list.
|
Lo=p1                           ; Mnemonic aliases for registers
hi=p2                           ; (not normally recommended).
PURGE p1,p2                     ; (to avoid duplicate names for same ACs).

        saveAC                  ; Save these registers.

; Set up initial conditions.

        movei   Lo, list-1      ; Point before search list
        move    hi, top         ; and after it.

; This loop does the binary search.

loop:   cail    lo, -1(hi)      ; Is the list null?
         jrst   nFound          ; Yes, then the element has not been found.
        move    q1, Lo          ; No, calculate midpoint: set probe to Lo
        add     q1, hi          ; ... + hi
        idivi   q1, 2           ; ... / 2.
        camn    t1, (q1)        ; Exact match?
         jrst   found           ; Yes, done.

; No match, adjust boundaries of list appropriately.

        caml    t1, (q1)        ; If search key is not less than this one
         skipa  Lo, q1          ; then move low bound and skip
        move    hi, q1          ; else move high bound,
        jrst    loop            ; and go back and try again.

; Get here when the list is used up (and the search object not found).

nFound: move    t1, (Lo)        ; Greatest element that was less than search
        move    t2, Lo          ; key was found at this address.
        ret                     ; Return indicating failure to find key.

; Get here only on exact match. Must check for duplicates.
; Put aobjn counter in left half of probe, then search.

found:  movn    t2, top         ; Negative Address of 1st element not in list.
        add     t2, q1          ; Calculate -number of elements to be checked.
        hrl     q1, t2          ; Make aobjn pointer.

next:   camn    t1, 1(q1)       ; Toodle down the list until no match
         aobjn  q1, next        ; or no more elements.

fin:    move    t1, (q1)        ; Return the value that was found
        hrrz    t2, q1          ; and its address.
        retskp                  ; Skip return to indicate that it was found.

p1=Lo                           ; Restore normal AC mnemonics
p2=hi                           ; ...
PURGE hi, Lo                    ; ...
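For readers who find the control flow easier to follow in a higher-level notation, the search routine above can be paraphrased in Python. This is a sketch that mirrors the routine's logic as read from the listing, not a translation checked against a running system: it uses list indices instead of addresses, returns a (found, index) pair instead of skip and non-skip returns, and the name `binary_search` is chosen for the example.

```python
def binary_search(items, key):
    """Search the ascending list `items` for `key`.

    Returns (found, index). On success, index is that of the last element
    equal to key. On failure, index is that of the greatest element less
    than key, or -1 if every element is greater than key.
    """
    lo, hi = -1, len(items)          # one before the list, one past its end
    while lo < hi - 1:               # elements remain strictly between lo and hi
        probe = (lo + hi) // 2
        if items[probe] == key:
            # Exact match: walk up through any duplicates.
            while probe + 1 < len(items) and items[probe + 1] == key:
                probe += 1
            return True, probe
        if key < items[probe]:
            hi = probe               # key lies below the probe
        else:
            lo = probe               # key lies above the probe
    return False, lo


# The same data as the list in the program's impure section.
numbers = [0, 2, 2, 2, 4, 6, 8, 10, 10, 12]

hit = binary_search(numbers, 2)      # duplicates: the last matching index is reported
miss = binary_search(numbers, 5)     # not present: index of the greatest smaller element
```

As in the assembly routine, the failure result is the position after which a new element could be inserted to keep the list in order.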
getNum: comment |

Get a number from the terminal, allowing editing.

Enter with:
  t1/ pointer to asciz prompt.
  global numBuf defined to be of length numLen.

Returns +1 always, with
  t1/ number that was typed.

This routine catches all format errors and reprompts the user until a
valid integer is typed.
|
        saveAC                  ; Since not used for passing args.
        move    q1, t1          ; Save pointer to prompt.

reTry:  move    t1, q1          ; And restore it in case of error.
        PSOUT                   ; Issue the first prompt.
        hrroi   t1, numBuf      ; Point to buffer for string that user types.
        movx    t2, RD%BEL!numLen ; Break on CRLF, max length for typein.
        move    t3, q1          ; Reprompting text.
        RDTTY                   ; Get string, allowing editing.
         %jsErr                 ; Errors are fatal, but very unlikely.

; Now convert the string to a number. We don't do the NIN directly from
; the terminal because NIN does not allow editing.

        hrroi   t1, numBuf      ; Point to string representation of number.
        movei   t3, ^d10        ; Radix for interpretation.
        NIN                     ; Number In - do the conversion.
         %jsErr (,reTry)        ; On error, print msg and ask again.
        move    t1, t2
        ret                     ; All OK, return.

        %impure                 ; impure data

numBuf: block   numLen          ; For number typein.

        radix   ^d10
list:   0
        2
        2
        2
        4
        6
        8
        10
        10
        12
top:    .

^L
        end

; - EMACS editing modes -
; local modes:
; mode:Macro
; comment start:;
; comment rounding:+1
; end:

11.2. COMND Example

        title   Foo - A small Exec

$VERNO=1                        ; Program version number
$EDNO=1                         ; Program edit number

comment \

most recent update: 11:04am Wednesday, 30 May 1979

This program is written to demonstrate various things:

1. Use of the COMND JSYS via the Columbia UUOs, with secondary parsing
   done by subroutines, which return success or failure indications to
   the main parsing sequence. This is the recommended method for writing
   most interactive programs.

2. The normal convention for "rescan" entry into a program. If the
   program is invoked by typing "foo xxx" to the Exec, where "xxx" is a
   Foo command, the program executes that command and then halts; in
   other words, it behaves as though "foo xxx" was an Exec command. On
   the other hand, if the program is invoked by typing "foo", the user
   is prompted for commands until the "exit" command is given.

3. The $dir routine shows how to do filename stepping with the GNJFN JSYS.

4. The %print UUO is demonstrated in various places. The fact that it
   assembles into one word in line is shown to be handy; it can be used
   in "case" statements, it can be skipped over, etc.

F. da Cruz, CUCCA, May 1979
\

        search  CUsym           ; Obtain macros, symbols, etc.
        %setEnv                 ; Search Monsym, Macsym, initialize things.
        extern  rescan          ; Routines from CUrel.

; Symbol and flag definitions

        %flags

; Standard 3-word entry vector

entVec: jrst    start           ; Start address
        jrst    reEntr          ; Reentry address
        %version ($VERNO,$EDNO) ; Standard version number
evLen=.-entVec                  ; Entry vector length

reEntr: jrst    start           ; Nothing special on reentry.

start:  %setUp                  ; RESET, set up stack, etc.
        call    init            ; Initialize.
         jrst [ HALTF           ; On failure, halt
                jrst . ]        ; and don't allow continuation.

; Get here after successful initialization, or upon continuation.

cont:   call    main            ; Do the work.
        HALTF                   ; and stop the program.
        %trnOff xitFlg          ; (In case they continue
        jrst    cont            ; ...)

        subttl  Initialization

init:   comment | Initialize the program. |

; Pass rescan argument (if any) to command parser

        %trnOff xitFlg          ; Turn off the exit flag
        move    t1, [point 7, [asciz/Foo/]] ; Name of this program.
        call    rescan          ; Do rescan processing.
         %trnOn xitFlg          ; Rescan arg was found, turn on flag.
        retskp

        subttl  Main program - highest level command parser.

main:   stkVar  temp            ; Allocate local temporary variable on stack.
        %skpOff xitFlg          ; Rescan entry?
         jrst   repars          ; Yes, don't set up prompt.
restar: %skpOff xitFlg          ; If we get here with the exit-flag on, there
         ret                    ; was an error in the rescan line, so exit.

; Come here when there were no arguments on the Exec command line.

        %cmIni  <>,,,gjfBlk     ; Issue prompt, specify GTJFN block address.
         %jsErr                 ; Handle errors.

; This is the reparse address. Come here after any parse error, including
; deletion into a field previously parsed (this is done automatically by
; the %merrep and %jmerrep macros).

repars: %cmKey  cmdTab,         ; Get major command.
         %merrep restar, repars ; Handle any parse errors.
        hrrz    t2, (t2)        ; Get address of associated dispatch word
        hrrzm   t2, temp        ; (we'll need it again soon)
        load    t1, %prsAd, (t2) ; and secondary parse routine address
        call    (t1)            ; which we call to parse the next field(s).
         %jmerrep restar, repars, restar ; Handle error return.

; Get here after all fields have been successfully parsed.

        move    t2, temp        ; Get command table word back again,
        load    t1, %evlAd, (t2) ; and from it, the action routine address.
        call    (t1)            ; Call the action routine.
         nop                    ; (ignore failure)
        %skpOff xitFlg          ; Was it an exit command?
         ret                    ; Yes, exit.
        jrst    restar          ; No, keep going.

; This is the major command table. Note that although it is data and
; not code, it is still pure storage, and can be kept with the routine
; that uses it.

cmdTab: %table
        %key    , [.dir,,$dir]
        %key    , [.echo,,$echo]
        %key    , [.exit,,$exit]
        %key    , [.help,,$help]
        %key    , [.info,,$info]
        %tbEnd

        subttl  Secondary Parsing Routines and Action Routines.

.dir:   comment | Parse rest of "directory" command. |

        movx    t1, CZ%NCL!.FHSLF ; Release all non-open JFNs
        CLZFF                   ; ...
        %cmNoi                  ; Issue guide words.
         %pret                  ; Fail-return on error.

; Get file specification for directory listing.

        %cmFil  ,,CM%SDH
         %pret                  ; Fail-return on error.
        move    q1, t2          ; Save directory JFN in q1.
        %cmCfm                  ; Get confirmation.
         %pret                  ; Fail-return on error.
        retskp                  ; Return successfully.

$dir:   comment | Execute the "directory" command. |

        setzi   p1,             ; File counter.
        move    t1, q1          ; JFN of first file.

; Loop to get the next file that matches the given specification
; and to print its name at the terminal.

$dLoop: aos     p1              ; Count the file.
        hrrzs   t1              ; Clear out the wildcard flags,
        %print  < %j%/>,        ; and print the filename.
        move    t1, q1          ; Get back the original JFN.
        GNJFN                   ; Get the Next JFN that matches.
         jrst   $dDone          ; If none, then all done.
        jrst    $dLoop          ; Loop for all files.

; Come here when no more files match the given specification.

$dDone: %print  <%/ %d File>,   ; Say how many files.
        caie    p1, 1           ; If other than one,
         %print                 ; then pluralize.
        retskp                  ; Return successfully.

.echo:  comment | Parse rest of "echo" command. |

        %cmNoi                  ; Issue noise word.
         %pret                  ; Fail-return on error.
        %cmTxt                  ; Get a text line.
         %pret                  ; Fail-return on error.
        move    t1, [point 7, buf] ; Get the text line into buf
        %cmGab  t1              ; from the atom buffer.
        %cmCfm                  ; Get confirmation.
         %pret                  ; Fail-return on error,
        retskp                  ; otherwise return successfully.

$echo:  comment | Execute the "echo" command. |

        %print  <%s>,<[point 7, buf]> ; Print the text line.
        retskp

.exit:  comment | Parse the rest of the "exit" command. |

        %cmNoi                  ; Issue guide words.
         %pret                  ; Fail-return on error.
        %cmCfm                  ; Get confirmation.
         %pret                  ; Fail-return on error,
        retskp                  ; otherwise, return successfully.

$exit:  comment | Execute the "exit" command. |

        %trnOn  xitFlg          ; Just turn on the exit flag,
        retskp                  ; and return successfully.

.help:  comment | Parse the rest of the "help" command. |

        %cmNoi                  ; Issue noise words.
         %pret                  ; Fail-return on error.
        %cmKey  hlpTab, , help  ; Get command to help with.
         %pret                  ; Fail-return on error.
        hrrz    p1, (t2)        ; Get address of help text.
        %cmCfm                  ; Get confirmation.
         %pret                  ; Fail-return on error,
        retskp                  ; otherwise return successfully.

$help:  comment | Execute the "help" command. |

        %print  <%s>,<[point 7, (p1)]> ; Print the help text.
        retskp

hlpTab: %table                  ; Table of help commands.
        %key    , $$dir
        %key    , $$echo
        %key    , $$exit
        %key    , $$help
        %key    , $$info
        %tbEnd

$$dir:  asciz |
Type "directory xxx", where "xxx" is a file specification. If there is
a file that matches that specification, its name will be typed. If the
specification is "wild", i.e. contains "*" or "%" characters, the names
of all files that match will be typed.
|

$$echo: asciz |
The 'echo' command types back what you type as its argument, e.g.
"echo xxx" types "xxx" at your terminal.
|

$$exit: asciz |
The 'exit' command halts the program. You can continue by typing the
Exec 'continue' command.
|

$$help: asciz |
The 'help' command gives information about the various Foo commands.
Type "help xxx" where 'xxx' is any Foo command. Type "help ?" to see
what commands have help available.
|

$$info: asciz |
The 'information' command prints information about various things.
Type 'information ?' to see what things are available.
|

.info:  comment | Parse rest of "information" command. |

        %cmNoi                  ; Noise word.
         %pret                  ; Fail-return on error.
        %cmKey                  ; Get keyword.
         %pret                  ; Fail-return on error.
        hrrz    p1, (t2)        ; Get info routine index.
        %cmCfm                  ; Get confirmation,
         %pret                  ; Fail-return on error,
        retskp                  ; otherwise return successfully.

infTab: %table                  ; Info command table.
        %key    , 0
        %key    , 1
        %key    , 2
        %tbEnd

$info:  comment | Execute the "information" command. |

        xct [   %print < You are connected to %c> ; Print the quantity
                %print < %n>                      ; given by the
                %print < You are logged in as %u> ; "case index"
            ](p1)                                 ; in p1.
        retskp                  ; Return successfully

        subttl  Impure Data Section

        %impure                 ; Declare it impure for sharability.

; GTJFN block for %cmFil. Some of the words are filled in by COMND,
; so this block must be in the impure section.

gjfBlk: GJ%OLD!GJ%IFG!GJ%FLG!GJ%ACC!.GJALL ; Flag bits,,generation number.
        .PRIIN,,.PRIOU          ; Input,,output JFNs.
        0                       ; Default device (none).
        0                       ; Default directory (none).
        point 7, [asciz/*/]     ; Default Filename (wild).
point 7, [asciz/*/] ; Default filetype (wild). 0 ; Default protection (none). 0 ; Default account (none). 0 ; JFN to assign (none). 0 ; No extended block. block ^d10 ; Other args not used either. buf: block ^d500/5+1 ; Buffer for "echo" string. ^L end ; Address,,length of entry vector. ; - EMACS editing modes - ; local modes: ; mode:Macro ; comment start:; ; comment rounding:+1 ; end: Assembly Language Guide Page 204 Index %clear 152 %cmRes 141 %erMsg 84, 153 %jsErr 74, 153 %print 136 %prSkp 137 %setEnv 151 %setUp 141, 151 .EXE 97, 174 .MAC 34, 172 .PRIIN 75, 78, 79, 83, 88 .PRIOU 75, 78, 79, 83, 88 .REL 34, 51, 97, 172, 177 .REQUIRE 57, 124 .RTJST 126 .UNV 34, 57 Accumulator 3, 8, 9, 12, 15, 73, 132, 149, 159, 165, 167, 174 Accumulators, Restoring 167 Accumulators, Saving 11, 167 ACVAR 134 ADD 19, 28 Address 4, 8, 62, 159, 174, 175 Address Space 184 ADJBP 25 ADJSP 12 AND 26, 36, 77 ANDCA 26 ANDCB 26 ANDCM 26 AOBJ 15 AOJ 17, 28, 67 AOS 16, 28 Appending to a File 79, 80 Argument 37, 63, 64, 67, 70, 73, 163, 168 Arithmetic 19, 20, 21 Arithmetic Expression 45 Arithmetic Shift 23 AROV 27 Array 15, 37, 47 ASCII 35, 36, 47, 70, 75, 101, 174, 177 ASCIZ 48, 76, 79, 101 ASH 22, 23, 27 ASHC 23, 27 Assembler 7, 34, 97 Assembly Language 2, 159 ASUBR 134 BIN 83 Binary 7, 35, 38 Bit 4, 8, 19, 26, 40, 75, 124, 174 BLOCK 48 Assembly Language Guide Page 205 BLT 10, 11 Boolean Logic 26 BOUT 83 Breakpoint 179, 186 Byte 4, 24, 48, 58, 75, 93, 174 Byte Pointer 24, 25, 55, 75, 126, 146, 165 CAI 18 CALL 131, 165 CALLRET 131, 166 CAM 18 Case Statement 32 Character 4 CLOSF 78, 81 Closing a File 81 Command Initialization 114, 140 Comment 36, 49, 59, 61, 64, 105, 159, 160, 168 COMND 85, 102, 136, 140, 144, 149, 154 COMPILE 172 Concatenation 37, 67 Conditional Assembly 164 Confirmation 79, 114 Continuation 35, 37, 105 CPU 4 CUrel 144 CUsym 104 CUUOs 136 Data Structure 128, 165 DCK 28, 29 DDT 58, 160, 172, 174 DDT Type-in 176 DDT Type-out 174, 182 Debug 172 DEC 3, 50 Decimal 34, 36, 38, 177 
DECR 129 DECsystem-10 33 DECSYSTEM-20 3 Default Argument 64, 67 DEFINE 50, 63 DEFSTR 128, 165 Destination 12, 75 Device 78, 114, 138 DFAD 22 DFDV 22, 29 DFMP 22 DFSB 22 Difference 36 Directory 4, 78, 114, 138 Disk 3 DIV 20, 29 Divide 19, 20, 29 DMOVE 20 DMOVN 28 DMUL 27 Double Precision 20, 21, 22 Double Word 20, 23 Assembly Language Guide Page 206 DPB 24 Dummy Argument 63, 64 Effective Address 8, 175 EMACS 172 END 50, 54, 55 End of File 81, 82, 83, 85, 93, 96 ENTRY 45, 51 EQV 26 ERCAL 73, 137 ERJMP 73, 137 Error Message 73, 138, 153 Errors 71 Excess 200 21 EXCH 10 Exec 3, 97, 172, 184, 185, 187 Execute 172 EXP 51, 54 Exponent 21, 22, 23, 39 Expression 10, 37, 40, 45, 60, 70, 178 EXTERN 51 FAD 21 Fail 2, 159 Failure Return 73, 157 FDV 21, 29 FDVR 29 Field 77, 113, 126 File 4, 75, 77, 78, 93 File Name 78 File Pointer 93 File Type 78 FIX 22, 28 Fixed Point Arithmetic 19, 20 FIXR 23, 28 Flag 157, 165 FLD 77, 126 FLIN 92 Floating Point 21, 23, 36, 39, 92, 114, 138, 177 Floating Point Arithmetic 21 Floating Point Overflow 28 Floating Point Zero 21 FLOUT 92 FLTR 22, 23 FMP 21 Fork 97, 184 Fortran 39 FOV 28, 29 Fraction 21, 39 FSB 21 FSC 22 Generation Number 78, 79 GetOK 144 Global 44 GTJFN 78, 140 GTSTS 82 Guide Word 112 Assembly Language Guide Page 207 Halfword 12, 36, 46, 59, 76, 165, 174 HALTF 100 Handle 99 Help Message 105, 116, 144 HRROI 76 IBP 25 IDDT 184 IDIV 19, 29 IDPB 25, 55 IFx 51 ILDB 25, 55 IMUL 19, 28 INCR 129 Indefinite Repetition 69 Indentation 160 Index 8, 15, 37, 62, 159, 174 Indirect 8, 29, 30, 32, 37, 62, 159, 174 Indirect File 105, 173 Input File 79, 112 Input/Output 4, 81, 82, 84 Instruction 4, 7, 165 Instruction Modifier 9, 27 Integer 22, 23, 38, 90, 91 Integer Arithmetic 19, 20 INTERN 52 INTERNAL 36, 37, 44 IOR 26 IOWD 12, 53 IPB 55 IRP 53, 58, 69 IRPC 54, 58, 69 JAND 130 JFCL 32, 63 JFFO 33 JFN 78, 99 JFNS 79, 138 JFOV 63 JNAND 130 JNOR 130 JOR 130 JRA 30 JRST 31, 132 JSA 30 JSERR 135, 153 JSHLT 135 JSP 27, 30 JSR 27, 30 JSYS 7, 73 JSYS Error 
Recovery 74 JUMP 15 JXm 127 KA10 2, 7, 20, 21, 25 Keyword 111, 141, 154 KI10 2, 7, 11, 25 Assembly Language Guide Page 208 KL10 2, 7, 11, 25, 73 KS10 2, 7 Label 36, 43, 59, 159 LDB 24 Link 34, 51, 57, 124, 172 Linked List 130 LIT 40, 54, 55 Literal 10, 37, 40, 54, 136, 169 LOAD 129, 165, 172 Load-Class Commands 172 LOC 62 Local 44 Local Label 153 Location Counter 36, 41, 62 Logical Complement 36 Logical Expression 18, 45 Logical Shift 23 Logical Testing 26 Loop 15 Lowercase 35 LSH 23 LSHC 23 Macro Expansion 65 Macro-20 2, 34, 73, 159 Macros 35, 36, 37, 50, 59, 61, 63, 77, 124, 163 MACSYM 77, 86, 104, 124, 149, 165 MAKLIB 51 Mask 26, 77, 124, 126, 127 MASKB 127 Memory 3, 9, 12, 62, 97 Midas 2, 159 MOD. 135 Module 167 Modulo 135 Monitor 3, 7 Monitor Call 2, 7, 43, 73 MONSYM 75, 77, 149 MOVE 9, 20 MOVM 28 MOVN 28 MOVX 76, 127 MSKSTR 128, 165 MUL 20, 27 Multiply 19 NIN 90, 92 NOUT 77, 91, 92, 138 Number 111, 116, 152, 170 Number Conversion 22, 90 OCT 54 Octal 34, 36, 38, 138, 177 Opcode 59, 60, 62, 73, 159, 162, 174, 178 OPDEF 55, 60, 61, 165 OPENF 78, 80 Opening a File 80 Assembly Language Guide Page 209 Operand 59, 60 Operating System 3, 7 Operator 60 OPSTR 130 OPSTRM 130 OR 36, 77, 78 ORCA 26 ORCB 26 Output File 79, 112 Overflow 11, 12, 19, 20, 22, 23, 27, 31 Page 93, 145 Pagination of Programs 162 Pass 34, 61, 72 Patch 182 PBIN 83 PBOUT 83 PC 8, 19, 27, 31, 32, 125, 179, 188 PDP-10 3, 7, 33 PDP-6 3, 7, 21, 25 Phase 51, 72 POINT 25, 55 POINTR 126 POP 11, 12, 165, 171 POPJ 165 POS 126 PRGEND 55, 58 PRINTX 56 Process 97 Product 36 Program 97 Program Control 15, 30, 42, 73, 162, 170 Program Counter 27 Program Version Number 152 Prompt 88, 105, 140 Pseudo-op 47, 60 PSOUT 86 PURGE 56 PUSH 11, 12, 133, 165, 171 PUSHJ 27, 31, 133, 165 Quadruple Word 20 Quotient 36 R 166 Radix 38, 56, 175 RADIX50 174 Random-access Input/Output 93 RDTTY 85, 87, 90 Recognition 78, 79, 105 Reentrant 30, 170 Register 3, 9 RELOC 62 Remainder 19 REPEAT 56, 61 Rescan 147 RESET 99 RET 131, 133 
RETSKP 131, 133
Return 165
RFORK 100
RFPTR 95
RIN 96
ROT 23
Rotate 23
ROTC 24
Round 23
ROUT 96
RSKP 166
Save 187
Scale 22
SEARCH 57, 124, 149
Sequential Input/Output 82, 94
SETA 26
SETCA 26
SETCM 26
SETCMP 129
SETM 26
SETO 26
SETONE 129
SETZ 26
SETZRO 129
SFPTR 94, 95
Sharability 170
Shift 23, 37, 40, 126
Sign 21, 38, 39, 91
SIN 84, 147
SIXBIT 36, 37, 57, 70, 138, 174, 177
SKIP 16
Skip Return 31, 73, 166
Skipping Instruction 160
SOJ 17, 28
SOS 16, 28
Source 12, 75
Source/Destination Designator 75
SOUT 85
Stack 11, 12, 31, 132, 151, 165, 171
Standards 159
Statement 34, 59, 61, 159
STCMP 101
STKVAR 132, 171
STOPI 58, 69, 70
STOR 165
String 75, 115
String Input/Output 84, 86
Style 159
SUB 19, 28
Subroutine 45, 73, 142, 165, 168
SUBTTL 58, 164
Sum 36
Swap 37
Switch 112
Symbol 37, 42, 75, 124, 150, 170, 177
Symbol Table 42, 44, 55, 57, 58, 61, 177
Symbol, Created 67, 68
Terminal 75, 78, 87, 99
TEST Instructions 26
TEXTI 85
Timesharing 3
TITLE 58, 164, 177
TMSG 134, 152
Token 115
Tops-10 34, 57, 58
Tops-20 3, 73
TRVAR 133, 171
Two's Complement 19, 21, 36, 38
TXmn 82, 127, 165
Unconditional Jump 31
UNIVERSAL 57
UUO 7, 144, 150
WID 126
Word 4, 7, 19, 21, 38, 40, 175
XCT 32
XOR 26
XWD 59
Z 59

Table of Contents

1. Introduction 2
1.1. Basic Concepts 3
1.1.1. Terminology 3
1.1.2. Machine Organization 3
1.1.3. Instructions and Addressing Modes 4
1.1.4. Internal Representation of Numbers 4
1.1.4.1. Binary Numbers 4
1.1.4.2. Two's Complement Representation 5
1.1.4.3. Integers 5
1.1.4.4. Floating Point Numbers 5
1.1.5. Arithmetic 5
1.1.5.1. Integer Arithmetic 5
1.1.5.2. Floating Point Arithmetic 5
1.1.6. Logical Operations 5
1.1.7. Character String Manipulation 5
1.1.8. Elementary Data Structures 5
1.1.8.1. Tables (Arrays) and Indexing 5
1.1.8.2. Stacks 6
2. The PDP-10/DECSYSTEM-20 Instruction Set 7
2.1. Introduction 7
2.2. Full Word Instructions 9
2.2.1. MOVE 9
2.2.2. EXCH - Exchange 10
2.2.3. BLT - Block Transfer 10
2.2.4. Programming Examples Using Fullword Instructions 10
2.3. Stack Instructions 11
2.3.1. PUSH - Push on Stack 11
2.3.2. POP - Pop Stack 12
2.3.3. ADJSP - Adjust Stack Pointer 12
2.4. Halfword Instructions 12
2.4.1. HR - Halfword Right 13
2.4.2. HL Halfword Left 14
2.5. Arithmetic Testing 15
2.5.1. AOBJ - Add One to Both Halves and Jump 15
2.5.2. JUMP 15
2.5.3. SKIP 16
2.5.4. AOS - Add One and Skip 16
2.5.5. SOS - Subtract One and Skip 16
2.5.6. AOJ - Add One and Jump 17
2.5.7. SOJ - Subtract One and Jump 17
2.5.8. CAM - Compare Accumulator to Memory 18
2.5.9. CAI - Compare Accumulator Immediate 18
2.6. Fixed Point Arithmetic 19
2.6.1. ADD 19
2.6.2. SUB - Subtract 19
2.6.3. IMUL - Single-Word Multiply 19
2.6.4. IDIV - Single-Word Divide 19
2.6.5. MUL - Multiply 20
2.6.6. DIV - Divide 20
2.7. Double Word Move Instructions (KI10 and KL10) 20
2.8. Double Precision Integer Arithmetic (KL10 only) 20
2.9. Floating Point Arithmetic 21
2.10. Other Floating Point Instructions 22
2.10.1. FSC - Floating Scale 22
2.10.2. FIX - Convert Floating Point to Integer 22
2.10.3. FIXR - Fix and Round 23
2.10.4. FLTR - Float and Round 23
2.11. Shift Instructions 23
2.12. Byte Instructions 24
2.12.1. LDB - Load Byte 24
2.12.2. DPB - Deposit Byte 24
2.12.3. IBP - Increment Byte Pointer 25
2.12.4. ILDB - Increment and Load Byte 25
2.12.5. IDPB - Increment and Deposit Byte 25
2.12.6. ADJBP - Adjust Byte Pointer 25
2.12.7. POINT - Construct a Byte Pointer 25
2.13. Logical Testing and Modification 26
2.14. Boolean Logic 26
2.15. PC Format 27
2.16. Program Control 30
2.16.1. JSR - Jump to Subroutine 30
2.16.2. JSP - Jump & Save PC 30
2.16.3. JSA - Jump and Save Accumulator 30
2.16.4. JRA - Jump and Restore Accumulator 30
2.16.5. PUSHJ - Push on stack and Jump 31
2.16.6. POPJ - Pop stack and Jump 31
2.16.7. Programming Hints Using PUSHJ and POPJ 31
2.16.8. JRST - Jump and Restore 31
2.16.9. JFCL - Jump on Flag and Clear 32
2.16.10. XCT - Execute 32
2.16.11. JFFO - Jump if Find First One 33
2.17. References 33
3. The DECSYSTEM-20 Macro Assembler 34
3.1. Introduction 34
3.2. Elements of Macro 35
3.2.1. Special Characters 35
3.2.2. Numbers 38
3.2.2.1. Integers 38
3.2.2.2. Radix 38
3.2.2.3. Floating-point Decimal Numbers 39
3.2.2.4. Binary Shifting 40
3.2.2.5. Underscore Shifting 40
3.2.3. Literals 40
3.2.4. Symbols 42
3.2.4.1. Selecting Valid Symbols 43
3.2.4.2. Defining Symbols 43
3.2.4.3. Symbol-table Search Order 44
3.2.4.4. Symbol Attributes 44
3.2.5. Expressions 45
3.2.5.1. Arithmetic Expressions 45
3.2.5.2. Logical Expressions 45
3.2.5.3. Evaluating Expressions 46
3.3. Pseudo-ops 47
3.3.1. ARRAY 47
3.3.2. ASCII 47
3.3.3. ASCIZ 48
3.3.4. BLOCK 48
3.3.5. BYTE 48
3.3.6. COMMENT 49
3.3.7. DEC 50
3.3.8. DEFINE 50
3.3.9. END 50
3.3.10. ENTRY 51
3.3.11. EXP 51
3.3.12. EXTERN 51
3.3.13. IFx Group 51
3.3.14. INTERN 52
3.3.15. IOWD 53
3.3.16. IRP 53
3.3.17. IRPC 54
3.3.18. LIT 54
3.3.19. OCT 54
3.3.20. OPDEF 55
3.3.21. POINT 55
3.3.22. PRGEND 55
3.3.23. PRINTX 56
3.3.24. PURGE 56
3.3.25. RADIX 56
3.3.26. REPEAT 56
3.3.27. .REQUIRE 57
3.3.28. SEARCH 57
3.3.29. SIXBIT 57
3.3.30. STOPI 58
3.3.31. SUBTTL 58
3.3.32. TITLE 58
3.3.33. XWD 59
3.3.34. Z 59
3.4. Macro Statements and Statement Processing 59
3.4.1. Labels 59
3.4.2. Operators 60
3.4.3. Operands 60
3.4.4. Comments 61
3.4.5. Statement Processing 61
3.4.6. Assigning Addresses 62
3.4.7. Machine Instruction Mnemonics and Formats 62
3.4.8. Mnemonics with Implicit Accumulators 63
3.5. Using Macros 63
3.5.1. Defining Macros 63
3.5.2. Invoking Macros 64
3.5.3. Macro Invocation Format 65
3.5.4. Quoting Characters in Macro Arguments 66
3.5.5. Nesting Macro Definitions 67
3.5.6. Concatenating Macro Arguments 67
3.5.7. Default Arguments and Created Symbols 67
3.5.7.1. Specifying Default Values 68
3.5.7.2. Created Symbols 68
3.5.8. Indefinite Repetition 69
3.5.9. Alternate Interpretations of Characters Passed to Macros 70
3.6. Errors and Messages 71
3.6.1. Informational Messages 71
3.6.2. Single-Character Error Codes 72
3.6.3. MCRxxx Messages 72
4. Introduction to Tops-20 Monitor Calls 73
4.1. Introduction 73
4.2. General Information 73
4.3. Using Mnemonic Symbols 75
4.4. Source/Destination Designators 75
4.5. Setting up a JSYS Invocation 76
4.6. JSYS's to Open and Close Files 77
4.6.1. GTJFN (JSYS 20) - Get Job File Number (short form) 78
4.6.2. OPENF (JSYS 21) - Open a File 80
4.6.3. CLOSF (JSYS 22) - Close a File. 81
4.7. File i/o JSYS's 81
4.7.1. GTSTS (JSYS 24) - Get file Status 82
4.7.2. Sequential Byte i/o 82
4.7.2.1. BIN (JSYS 50) - Byte In 83
4.7.2.2. PBIN (JSYS 73) - Primary Byte In 83
4.7.2.3. BOUT (JSYS 51) - Byte Out 83
4.7.2.4. PBOUT (JSYS 74) - Primary Byte Out 83
4.7.2.5. Example of Byte i/o 84
4.7.3. String-oriented i/o 84
4.7.3.1. SIN (JSYS 52) - String In 84
4.7.3.2. SOUT (JSYS 53) - String Out 85
4.7.3.3. PSOUT (JSYS 76) - Primary String Out 86
4.7.3.4. Example of String I/O 86
4.7.3.5. RDTTY (JSYS 523) - Read string interactively from TTY 87
4.7.3.6. RDTTY Example 89
4.7.4. Number conversion JSYS's 90
4.7.4.1. NIN (JSYS 225) - Number In 90
4.7.4.2. NOUT (JSYS 224) - Number Out 91
4.7.4.3. NIN/NOUT Example 92
4.7.5. Random-access i/o 93
4.7.5.1. RFPTR (JSYS 43) - Read File Pointer 95
4.7.5.2. SFPTR (JSYS 27) - Set File Pointer 95
4.7.5.3. RIN (JSYS 54) - Random byte In 96
4.7.5.4. ROUT (JSYS 55) - Random byte Out 96
4.8. Fork-Handling JSYS's 97
4.8.1. What's in a Fork? 97
4.8.2. The Fork Environment 99
4.8.3. Basic Fork-Handling JSYS's 99
4.8.3.1. RESET (JSYS 147): Reset the current fork 99
4.8.3.2. HALTF (JSYS 170) - Halt the current fork 100
4.8.3.3. Examples of RESET and HALTF 100
4.9. Miscellaneous JSYS's 100
4.9.1. STCMP (JSYS 540) - STring CoMParison 101
5. The COMND JSYS - JSYS 544 102
5.1. Informal Introduction 102
5.2. General Information 104
5.3. Bits Supplied in State Block on COMND Call 109
5.4. Function Descriptor Block 110
5.4.1. Words .CMFNP and .CMDAT of the FDB 111
5.4.2. Word .CMHLP of the FDB 117
5.4.3. Default Help Messages 118
5.4.4. Word .CMDEF of the FDB 119
5.4.5. Word .CMBRK of the FDB 119
5.5. Bits Returned on COMND Call 120
5.6. Macros 121
5.6.1. FLDDB.(TYP,FLGS,DATA,HLPM,DEFM,LST) 121
5.6.2. FLDBK.(TYP,FLGS,DATA,HLPM,DEFM,BRKADR,LST) 122
5.6.3. BRMSK.(INI0,INI1,INI2,INI3,ALLOW,DISALLOW) 122
5.6.4. FLDBK. 122
5.7. Errors 123
6. MACSYM System Macros 124
6.1. Introduction 124
6.2. Definitions 124
6.2.1. Standard Program Version 125
6.2.2. Miscellaneous Constants (Symbols) 125
6.2.3. Control Characters (Symbols) 125
6.2.4. PC Flags (Mask Symbols) 125
6.2.5. Macros to Manipulate Field Masks 126
6.2.5.1. WID(MASK) 126
6.2.5.2. POS(MASK) 126
6.2.5.3. POINTR(LOC,MASK) 126
6.2.5.4. FLD(VAL,MASK) 126
6.2.5.5. .RTJST(VAL,MASK) 126
6.2.5.6. MASKB(LBIT,RBIT) 127
6.2.6. Instructions Using Field Masks (Macros) 127
6.2.6.1. MOVX AC,MASK 127
6.2.6.2. TXmn AC,MASK 127
6.2.6.3. IORX AC,MASK; ANDX AC,MASK; XORX AC,MASK 127
6.2.6.4. JXm AC,MASK,ADDRESS 127
6.2.7. Data Structure Facility (Macros) 128
6.2.7.1. DEFSTR and MSKSTR 128
6.2.8. Subroutine Conventions (Macros/opDefs) 131
6.2.8.1. CALL address 131
6.2.8.2. RET 131
6.2.8.3. RETSKP 131
6.2.8.4. CALLRET address 131
6.2.8.5. AC Conventions 132
6.2.9. Named Variable Facilities (Macros and Runtime Code) 132
6.2.9.1. STKVAR namelist 132
6.2.9.2. TRVAR namelist 133
6.2.9.3. ASUBR namelist 134
6.2.9.4. ACVAR namelist 134
6.2.10. Miscellaneous 134
6.2.10.1. TMSG string 134
6.2.10.2. JSERR 135
6.2.10.3. JSHLT 135
6.2.10.4. MOD.(DEND,DSOR) 135
7. Columbia Macros and Packages 136
7.1. Utility UUO Package for Macro-20 136
7.1.1. Formatted Printing Package 136
7.1.2. %prPush and %prPop 139
7.1.3. COMND-Jsys-Made-Easy Package 140
7.1.3.1. %cmRes 141
7.1.3.2. %cmKey (keytab, help, default, flags) 141
7.1.3.3. %cmgab bp 142
7.1.3.4. %comnd flddb 142
7.1.3.5. %cmgfg flag 142
7.2. CUrel Utility Subroutines 144
7.2.1. Helper 144
7.2.2. GetOK 144
7.2.3. Gfcpg 145
7.2.4. pagMgr 145
7.2.5. Subbp 146
7.2.6. Rescan 147
7.3. CUsym MACSYM Augmentation Macros 149
7.3.1. Accumulator Support 149
7.3.2. %DefAC 150
7.3.3. Useful Symbols 150
7.3.4. UUO Package OPDEFs and Interface Symbols 150
7.3.5. Setup Environment Macros 151
7.3.5.1. %setEnv 151
7.3.5.2. %setUp 151
7.3.6. Storage Declaration Macros 151
7.3.7. General-Purpose Macros 151
7.3.7.1. %Stack 151
7.3.7.2. %Version 152
7.3.7.3. %Clear 152
7.3.8. Macros Used for Common Primary I/O 152
7.3.8.1. %typeCR(string) 152
7.3.8.2. %crType(string) 152
7.3.8.3. %typNum(num,cols,rdx) 152
7.3.8.4. %crlf 153
7.3.8.5. %tab 153
7.3.9. JSYS Support Macros 153
7.3.9.1. %jsErr 153
7.3.9.2. %erMsg 153
7.3.10. Local Label Support Macros 153
7.3.10.1. %Cat(a,b) 154
7.3.11. COMND JSYS Support Macros 154
7.3.11.1. %Ptr(string) 154
7.3.11.2. %table and %tbEnd 154
7.3.11.3. %key(name, data, flags) 154
7.3.11.4. %Flddb (typ, flgs, data, hlpm, defm, lst) 155
7.3.11.5. %Handlr(p,e), %PrsAdr, %EvlAdr 155
7.3.11.6. %CMxxx Macros to Invoke .CMxxx COMND Functions 155
7.3.12. Macros to Handle COMND Errors 156
7.3.12.1. %pret 156
7.3.12.2. %errep errlab, replab 156
7.3.12.3. %merrep errlab, replab 157
7.3.12.4. Macros for Fail-Return from Parsing Routines 157
7.3.12.5. %jerrep errlab, replab, othrlb 157
7.3.12.6. %jmerrep errlab, replab, othrlb 157
7.3.13. Flag-Handling Macros 157
7.3.13.1. %Flags(aFlg,bFlg,cFlg,...) 157
7.3.13.2. %trnOn & %trnOff 157
7.3.13.3. %TrOnS & %TrOfS 158
7.3.13.4. %SkpOn & %SkpOff 158
7.3.13.5. %AnyOn & %AnyOff 158
7.3.14. CUuos (CUCCA Utility UUOs) Interface 158
7.3.14.1. %print, %prSkp 158
8. Macro-20 Programming Standards and Conventions 159
8.1. Introduction 159
8.2. Statements 159
8.3. Comments 160
8.4. Pagination of Source Programs 162
8.5. Other Assembler Functions 163
8.6. Instruction Mnemonics 165
8.7. Variables and Structures 165
8.8. Subroutines 165
8.9. AC Definitions 167
8.10. Subroutine Documentation 168
8.11. Multi-line Literals 169
8.12. Flow of Control - Branch Conventions 170
8.13. Numbers 170
8.14. Sharability 170
8.15. Living in an Imperfect World 171
8.15.1. AC Mnemonics 171
8.15.2. Stack Handling 171
9. How to Write, Assemble, and Run Programs 172
10. Interactive Debugging of Assembly Language Programs 174
10.1. Typeout Modes 174
10.1.1. Address Mode Typeout 175
10.1.2. Radix Typeout 175
10.1.3. Prevailing vs Temporary Modes 175
10.2. Storage Words 175
10.2.1. Related Storage Words 176
10.2.2. Retyping In Modes Other than the Prevailing Mode 176
10.3. Typing In 176
10.4. Symbols 177
10.4.1. Special DDT Symbols 178
10.4.2. Arithmetic Operators 178
10.4.3. Field Delimiters In Symbolic Type-ins 178
10.5. Breakpoints 179
10.5.1. Proceeding from a Breakpoint 180
10.5.2. Single Stepping 180
10.5.3. Conditional Breakpoints 180
10.5.4. Starting the Program 181
10.6. Searching 181
10.6.1. Zeroing Memory 181
10.7. Special Characters 182
10.8. Miscellaneous DDT Commands 182
10.8.1. Immediate Mode Instruction Execution 182
10.8.2. Execute Indirect Command File 182
10.8.3. Patch 182
10.9. Sample DDT Session 183
10.10. IDDT (Invisible DDT) 184
10.10.1. Using IDDT 184
10.10.2. EXEC-like Features 185
10.10.3. View Cell 186
10.10.4. Breakpoints 186
10.10.5. Fork Handles 186
10.10.6. Escape character 186
10.10.7. RUBOUT and Type-in Editing 186
10.10.8. Interface with the Exec 187
10.10.9. Saving a Core Image 187
10.10.10. Single Instruction Executes 187
10.10.11. Search Commands 187
10.10.12. Single Stepping 188
10.10.13. Other Commands 188
11. Programming Examples 189
11.1. Binary Search Program 189
11.2. COMND Example 194
Index 204