A Beginner's Guide to Forth                      

                                          by                                  

                                      J.V. Noble                              

       Contents 

       0. Preliminaries 
         0.1 Forth in space 
         0.2 What is Forth? 
         0.3 Forth resources and further reading 

       1. Getting started 

       The structure of Forth 

       2. Extending the dictionary 

       3. Stacks and reverse Polish notation (RPN) 
         3.1 Manipulating the parameter stack 
         3.2 The return stack and its uses 

       4. Fetching and storing 

       5. Comparing and branching 

       6. Documenting and commenting Forth code 

       7. Arithmetic operations 

       8. Looping and structured programming 

       9. CREATE ... DOES> (the pearl of Forth) 
         9.1 Defining "defining" words 
         9.2 Run-time vs. compile-time actions 
         9.3 Dimensioned data (intrinsic units) 
         9.4 Advanced uses of the compiler 


       0. Preliminaries 

       Forth is an unusual computer language that has probably been applied 
       to more varied projects than any other. It is the obvious choice when 
       the project is exceptionally demanding in terms of completion sched- 
       ule, speed of execution, compactness of code, or any combination of 
       the above. 

       It has also been called "...one of the best-kept secrets in the com- 
       puting world." This is no exaggeration: large corporations have pur- 
       chased professional Forth development systems from vendors such as 
       Laboratory Microsystems, Inc., Forth, Inc. or MicroProcessor Engineer- 
       ing, Ltd. and sworn them to secrecy. 

       Some speculate (unkindly) that corporate giants prefer to hide their 
       shame at using Forth; but I believe they are actually concealing a 
       secret weapon from their rivals. Whenever Forth has competed directly 
       with a more conventional language like C it has won hands down, pro- 
       ducing smaller, faster, more reliable code in far less time. I have 
       searched for examples with the opposite outcome, but have been unable 
       to find a single instance. 


       0.1 Forth in Space 

       A recent query on the comp.lang.forth conference on Internet yields 
       some insight into Forth's significance to space exploration: 

       <PMORTENS@ESTEC.BITNET> writes: 
       > Is Forth suitable for use in embedded systems onboard a satellite 
       (for 
       > instance a micro-satellite) ? 
       > 
       > Regards, 
       > Peter Mortensen 

       Two replies: 

       Julian V. Noble at University of Virginia 
       jvn@fermi.clas.Virginia.EDU  

       | Of course Forth is appropriate for embedded applications on 
       | even the smallest satellites. 
       | 
       | You should contact the NASA group in charge of the Shuttle Solar 
       | Backscatter Ultraviolet software. Bob Caffrey or Tom Riley should 
       | be able to help you. 
       ---------------------------------------------------------------------- 

       John Hayes at Applied Physics Laboratory of Johns Hopkins University 
       john@netnews.jhuapl.edu  

       | Forth is excellent for space. I am enclosing a list of space appli- 
       | cations of Forth that has been gathered by Jim Rash of NASA/Goddard 
       | Space Flight Center. Some of the items in the list are for GSE used 
       to 
       | test spacecraft, but many are flight applications.  I was personally 
       | involved in several of these. 
       | 
       | John R. Hayes                   john@aplcomm.jhuapl.edu 
       | Applied Physics Laboratory 
       | Johns Hopkins University 
       | 
       | (The list of applications is given in the file F_SPACE.DOC.) 
       ---------------------------------------------------------------------- 


       0.2 What, exactly, is Forth? 

       The file Forth_IS.DOC contains an essay by a young Forth expert, Phil 
       Koopman, written for the Second History of Programming Languages Con- 
       ference. It represents the clearest and most succinct description of 
       the language I know of. 

       Another good description of Forth can be found in the August 1980 BYTE 
       Magazine. (This issue also contains an article by Charles Moore, the 
       inventor of Forth, describing the history of the language.) 

       The rest of this review briefly summarizes the main ideas of Forth. It 
       makes no pretense to complete coverage of standard Forth programming 
       methods. This summary is not intended to be a programmer's manual! 


       0.3 Forth resources and further reading 

       Suppose the reader is stimulated to try Forth -- how can he 
       proceed? 

       Several excellent Forth texts and references are available: 

         Starting FORTH (SF) and Thinking FORTH (TF) 

           [L. Brodie, SF, 2nd ed., Prentice-Hall, NJ (1986); 
                       TF, Prentice-Hall, NJ (1984)] 


         FORTH: a Text and Reference (FTR) 

           [M.Kelly and N.Spies, FTR, Prentice-Hall, NJ (1986)]. 


         Mastering Forth (MF) 

           [M. Tracy, et al., Brady Books, NY (1989)] 

         Scientific Forth: a modern language for scientific programming 

           [J.V. Noble, Mechum Banks Publishing, Ivy, VA (1992)] 


       Read SF or FTR or both before trying to use a Forth system. (Or 
       at least read one concurrently.) 

       Some books are available through 

         Mountain View Press 
         Box 429 Route 2 
         La Honda, CA 94020  USA 
         voice/fax/modem (via menu)  (415) 747-0760 

       as well as 

         Miller Microcomputer Services 
         61 Lake Shore Road 
         Natick, MA 01760-2099 


       Forth discussions abound on bulletin boards, notably 

        GEnie  ( respond FIG to the ! prompt ) 

        CompuServe ( GO Forth ) 

        Internet ( comp.lang.forth ) 


       What other Internet/UseNet Forth mailing lists are there? 
                                                              
           From: ns@maccs.dcss.mcmaster.ca (Nick Solntseff)          
           Subject: Forth Education List                             
           Message-ID: <1991Jul4.201259.11734@maccs.dcss.mcmaster.ca> 
           Date: Thu, 4 Jul 1991 20:12:59 GMT                        
                                                              
         Nicholas Solntseff                                          
         Department of Computer Science and Systems                  
         McMaster University                                         
         Hamilton, Ontario                                           
         Canada                                                      
         L8S 4K1                                                     
         1-416-525-9140 xtn 3443 

       Some journals that deal in Forth: 

         The Computer Journal 
         P. O. Box 12 
         S. Plainfield, NJ  07080-0012 
         - U. S. A. - 
         Phone: (US access) 908-755-6186 


         Dr. Dobb's Journal has the occasional article or two. 
         PO Box 56188 
         Fort Collins, CO 80525 
         Voice:  303-225-1410 
         Fax:    303-225-1075 
       FIG publishes a journal Forth Dimensions whose object is the 
       exchange of programming ideas and clever tricks. 

         Forth Interest Group 
         P.O. Box 2154 
         Oakland, CA 94621 


       The Institute for Applied Forth Research, Inc. 

       70 Elmwood Avenue 
       Rochester, NY 14611 

       Tel. (716) 235-0168 

       publishes the refereed Journal of FORTH Application and Research, that 
       serves as a vehicle for more scholarly and theoretical papers dealing 
       with Forth. They also publish the proceedings of the annual Rochester 
       Forth Conferences. 


       The Association for Computing Machinery (ACM) Special Interest Group 
       for Forth (SIGForth) 

         SIGForth/ACM 
         1515 Broadway 
         New York, NY 10036 

       holds annual conferences, and publishes the SIGForth Newsletter. 


       Finally, an attempt to codify and standardize Forth is underway, whose 
       object is to produce an ANS Forth and standard extensions. 


       1. Getting started 

       We will use eForth for these illustrations. Copy the contents of the 
       directory EForth to your hard disk or an empty (formatted) diskette, 
       switch to that drive/(sub)directory, and type in 

           EFORTH86 <cr> 

       The compressed eForth files will then decompress themselves. 

       Now start the eForth system by typing 

           EFORTH <cr> 

       It should respond by writing something like 

           eForth v1.01 


       Since most Forths are case-sensitive and use upper case for their most 
       common words, set the CapsLock key NOW. 

       Now type 

           BYE  <cr>. 

       The system immediately returns to the DOS prompt. 


       What just happened? Forth is an interactive programming language con- 
       sisting entirely of subroutines, called "words". 

       A word is executed (interactively) by naming it. We have just seen 
       this happen: BYE is a Forth subroutine meaning "exit to the operating 
       system". So when we entered BYE it was executed, and the system re- 
       turned control to DOS. 


       Say EFORTH again to re-start Forth. Now we will try something a little 
       more complicated. Enter 

           2 17  +  .  <cr> 19  ok 

       What happened? Forth is interpretive. An "outer interpreter" continu- 
       ally loops, waiting for input from the keyboard or mass storage de- 
       vice. The input is a sequence of text strings separated from each 
       other by blank spaces --ASCII 32decimal = 20hex-- the standard Forth 
       delimiter. 

       When the outer interpreter encounters text it first looks for it in 
       the "dictionary" (a linked list of previously defined subroutine 
       names). If it finds the word, it executes the corresponding code. 

       If no dictionary entry exists, the interpreter tries to read the input 
       as a number. 

       If the input text string satisfies the rules defining a number, it is 
       converted to a number and stored in a special memory location called 
       "the top of the stack" (TOS). 


       In the above example, Forth interpreted 2 and 17 as numbers, and 
       pushed them both onto the stack. 

       "+" is a pre-defined word as is ".", so they were looked up and exe- 
       cuted. 

       "+" added 2 to 17 and left 19 on the stack. 

       The word "." (called "emit") removed 19 from the stack and displayed 
       it on the standard output device (in this case, CRT). 

       We might also have said 

           HEX    0A  14  * . <cr>  C8 ok 

       (Do you understand this? Hint: DECIMAL means "switch to decimal arith- 
       metic", whereas HEX stands for "switch to hexadecimal arithmetic".) 

       If the incoming text can neither be located in the dictionary nor in- 
       terpreted as a number, Forth issues an error message. Try it: say X 
       and see 

           X X ? 

       or say THING and see 

           THING THING ? 


       2. Extending the dictionary 

       The compiler is one of Forth's most endearing features. Unlike 
       all other high-level languages, the Forth compiler is part of the 
       language. That is, its components are Forth words available to 
       the programmer, that can be used to solve his problems. 

       In this section we discuss how the compiler extends the 
       dictionary. 

       Forth uses special words to create new dictionary entries, i.e., 
       new words. The most important are ":" ("start a new definition") 
       and ";" ("terminate the definition"). 

       Let's try this out: enter 

           : *+    *  +  ;  <cr>  ok 

       What happened? The word ":" was executed because it was already 
       in the dictionary. The action of ":" is 

         > Create a new dictionary entry named "*+" and switch from 
           interpret to compile mode. 

         > In compile mode, the interpreter looks up words and 
           --rather than executing them-- installs pointers to 
           their code. (If the text is a number, rather than 
           pushing it on the stack, Forth builds the number 
           into the dictionary space allotted for the new word, 
           following special code that puts it on the stack 
           when the word is executed.) 

         > The action of "*+" will be to execute sequentially 
           the previously-defined words "*" and "+". 

         > The word ";" is special: when it was defined a bit 
           was turned on in its dictionary entry to mark it as 
           IMMEDIATE. Thus, rather than writing down its 
           address, the compiler executes ";" immediately. The 
           action of ";" is first, to install the code that 
           returns control to the next outer level of the 
           interpreter; and second, to switch back from compile 
           mode to interpret mode. 

       Now try out "*+" : 

           DECIMAL   5 6 7 *+ .  <cr> 47  ok 

       This example illustrated two principles of Forth: adding a new word to 
       the dictionary, and trying it out as soon as it was defined. 


       3. Stacks and reverse Polish notation (RPN) 

       We now discuss the stack and the "reverse Polish" or "postfix" arith- 
       metic based on it. (Anyone who has used a Hewlett-Packard calculator 
       should be familiar with the concept.) 

       Virtually all modern CPU's are designed around stacks. Forth effi- 
       ciently uses its CPU by reflecting this underlying stack architecture 
       in its syntax. 


       But what is a stack?  As the name implies, a stack is the machine ana- 
       log of a pile of cards with numbers written on them. Numbers are 
       always added to the top of the pile, and removed from the top of the 
       pile. The Forth input line 

           2 5 73 -16 <cr> ok 

       leaves the stack in the state 

             cell #   contents 


               0       -16        (TOS) 

               1        73        (NOS) 

               2         5 

               3         2 


       where TOS stands for "top-of-stack", NOS for "next-on-stack", etc. 

       We usually employ zero-based relative numbering in Forth data struct- 
       ures (such as stacks, arrays, tables, etc.) so TOS is given relative 
       #0, NOS #1, etc. 

       Suppose we followed the above input line with the line 

           + - * . <cr> xxx ok 

       what would xxx be? The operations would produce the successive stacks 

            cell#  initial     +        -          *      . 

              0      -16      57      -52       -104 
              1       73       5        2 
              2        5       2 
              3        2                                empty 
                                                        stack 

       The operation "." (EMIT) displays -104 to the screen, leaving the 
       stack empty. That is, xxx is -104. 


       3.1 Manipulating the parameter stack 

       Forth systems incorporate (at least) two stacks: the parameter stack 
       and the return stack. 

       A stack-based system must provide ways to put numbers on the stack, to 
       remove them, and to rearrange their order. Forth includes standard 
       words for this purpose.  

       Putting numbers on the stack is easy: simply type the number (or in- 
       corporate it in the definition of a Forth word). 

       The word DROP removes the number from TOS and moves up all the other 
       numbers. (Since the stack usually grows downward in memory, DROP mere- 
       ly increments the pointer to TOS by 1 cell.) 

       SWAP exchanges the top 2 numbers. 

       DUP duplicates the TOS into NOS. 

       ROT rotates the top 3 numbers. 


       These actions are shown below (we show what each word does to the ini- 
       tial stack) 

             cell | initial | DROP    SWAP     ROT      DUP 

               0  |   -16   |  73      73        5      -16 
               1  |    73   |   5     -16      -16      -16 
               2  |     5   |   2       5       73       73 
               3  |     2   |           2        2        5 
               4  |         |                             2 

       Forth includes the words OVER, TUCK, PICK and ROLL that act as shown 
       below (note PICK and ROLL must be preceded by an integer that says 
       where on the stack an element gets PICK'ed or ROLL'ed): 

             cell | initial | OVER    TUCK    4 PICK    4 ROLL 

               0  |   -16   |  73      -16        2        2 
               1  |    73   | -16       73      -16      -16 
               2  |     5   |  73      -16       73       73 
               3  |     2   |   5        5        5        5 
               4  |         |   2        2        2 

       Clearly, 1 PICK is the same as DUP, 2 PICK is a synonym for OVER, and 
       2 ROLL means SWAP. 


       3.2 The return stack and its uses 

       We have remarked above that compilation establishes links from the 
       calling word to the previously-defined word being invoked. The linkage 
       mechanism --during execution-- uses the return stack (rstack):  the 
       address of the next word to be invoked is placed on the rstack, so 
       that when the current word is done executing, the system knows to jump 
       to the next word. 

       In addition to serving as a reservoir of return addresses (since words 
       can be nested, the return addresses need a stack to be put on) the 
       rstack is where the limits of a DO...LOOP construct are placed. 
       (Note, eForth is non-standard: it does not include DO...LOOP.) 

       The user can also store/retrieve to/from the rstack. This is an ex- 
       ample of using a component for a purpose other than the one it was 
       designed for. Such use is discouraged for novices since it adds the 
       spice of danger to programming. See "Note of caution" below. 

       To store to the rstack we say >R , and to retrieve we say R> . The 
       word R@ copies the top of the rstack to the TOS. 


       Why use the rstack when we have a perfectly good parameter stack to 
       play with? Sometimes it becomes hard to read code that performs com- 
       plex gymnastics on the stack. The rstack can reduce the complexity. 

       Alternatively, VARIABLEs --named locations-- provide a place to store 
       numbers off the stack, again reducing the gymnastics. Try this: 

           \ YOU DO THIS            \ EXPLANATION 

           VARIABLE X <cr>  ok      \ create a named storage location X; 
                                    \ X executes by leaving its address 

           3 X ! <cr>  ok           \ ! ("store") expects a number and 
                                    \ an address, and stores the number to 
                                    \ that address 

           X @  . <cr>  3 ok        \ @ ("fetch") expects an address, and 
                                    \ places its contents in TOS. 

       However, Forth encourages using as few named variables as possible. 
       The reason: since VARIABLEs are typically global --any subroutine can 
       access them-- they often cause unwanted interactions among parts of a 
       large program. 

       Although Forth can make variables local to the subroutines that use 
       them (see "headerless words" in FTR), the rstack can often replace 
       local variables: 

         > The rstack already exists, so it need not be defined anew. 

         > When the numbers placed on it are removed, the rstack shrinks, 
           reclaiming some memory. 


       A note of caution: since the rstack is critical to execution we mess 
       with it at our peril. If we use the rstack for temporary storage we 
       must restore it to its initial state. A word that places a number on 
       the rstack must remove it --using R> or RDROP-- before exiting that 
       word. Since DO...LOOP also uses the rstack, for each >R folowing DO 
       there must be a corresponding R> or RDROP preceding LOOP. Neglecting 
       these precautions will probably crash the system. 


       4. Fetching and storing 

       As we just saw, ordinary (16-bit) numbers are fetched from memory to 
       the stack by @ ("fetch"), and stored by ! (store). 

       @ expects an address on the stack and replaces that address by 
       its contents using, e.g., the phrase   X  @ 

       ! expects a number (NOS) and an address (TOS) to store it in, and 
       places the number in the memory location referred to by the address, 
       consuming both arguments in the process, as in the phrase   3 X  ! 

       Double length (32-bit) numbers can similarly be fetched and stored, by 
       D@ and D!, if the system has these words. 

       Positive numbers smaller than 255 can be placed in single bytes of 
       memory using C@ and C!. This is convenient for operations with strings 
       of ASCII text, for example screen and keyboard I/O. 


       5. Comparing and branching 

       Forth lets you compare two numbers on the stack, using relational 
       operators ">", "<", "=" . Ths, e.g., the phrase 

                2 3 > <cr> ok 

       leaves 0 ("false") on the stack, because 2 (NOS) is not greater than 3 
       (TOS). Conversely, the phrase 

                2 3 < <cr> ok 

       leaves -1 ("true") because 2 is less than 3. 

       Notes: In some Forths "true" is +1 rather than -1. 

              Relational operators consume both arguments and leave a "flag" 
              to show what happened. 

       (Many Forths offer unary relational operators "0=", "0>" and "0<". 
       These, as might be guessed, determine whether the TOS contains an 
       integer that is 0, positive or negative.) 

       The relational words are used for branching and control. For example, 

           : TEST     CR   0 =  NOT  IF  ." Not zero!"   THEN  ; 

           0 TEST <cr>  ok     ( no action) 
           -14 TEST <cr> 
           Not zero!  ok 

       The word CR issues a carriage return (newline). Then TOS is compared 
       with zero, and the logical NOT operator (this flips "true" and 
       "false") applied to the resulting flag. Finally, if TOS is non-zero, 
       IF swallows the flag and executes all the words between itself and the 
       terminating THEN. If TOS is zero, execution jumps to the word 
       following THEN. 

       The word ELSE is used in the IF...ELSE...THEN statement: a nonzero 
       value in TOS causes any words between IF and ELSE to be executed, and 
       words between ELSE and THEN to be skipped. A zero value produces the 
       opposite behavior. Thus, e.g. 


           : TRUTH    CR   0 =  IF  ." false"  ELSE  ." true"  THEN  ; 

           1 TRUTH <cr> 
           true  ok 

           0 TRUTH <cr> 
           false  ok 

       Since THEN is used to terminate an IF statement rather than in its 
       usual sense, some Forth writers prefer the name ENDIF. 

       6. Documenting and commenting Forth code 

       Forth is sometimes accused of being a "write-only" language, i.e. some 
       complain that Forth is cryptic. This is really a complaint against 
       poor documentation and untelegraphic word names. Unreadability is 
       equally a flaw of poorly written FORTRAN, PASCAL, C, etc. 

       Forth offers programmers who take the trouble tools for producing ex- 
       ceptionally clear code. 


       6.1 Parenthesized remarks 

       The word ( -- a left parenthesis followed by a space -- says "disre- 
       gard all following text until the next right parenthesis in the 
       input stream". Thus we can intersperse explanatory remarks within 
       colon definitions. 


       6.2 Stack comments 

       A particular form of parenthesized remark describes the effect of a 
       word on the stack. In the example of a recursive loop (GCD below), 
       stack comments are really all the documentation necessary. 

       Glossaries generally explain the action of a word with a 
       stack-effect comment. For example, 

           ( adr -- n) 

       describes the word @ ("fetch"): it says @ expects to find an address 
       (adr) on the stack, and to leave its contents (n) upon completion.  
       The corresponding comment for ! would be 

           ( n adr -- ) . 


       6.3 Drop line (\) 

       The word "\" (back-slash followed by space) has recently gained favor 
       as a method for including longer comments. It simply means "drop ev- 
       erything in the input stream until the next carriage return". Instruc- 
       tions to the user, clarifications or usage examples are most naturally 
       expressed in a block of text with each line set off by "\"  . 


       6.4 Self-documenting code 

       By eliminating ungrammatical phrases like CALL or GOSUB, Forth pre- 
       sents the opportunity --via telegraphic names for words-- to make code 
       almost as self-documenting and transparent as a readable English or 
       German sentence. Thus, for example, a robot control program could con- 
       tain a phrase like 

           2 TIMES   LEFT EYE  WINK 

       which is clear (although it sounds like a stage direction for Brun- 
       hilde to vamp Siegfried). It would even be possible without much dif- 
       ficulty to define the words in the program so that the sequence could 
       be made English-like:  WINK  LEFT EYE 2 TIMES . 


       7. Arithmetic operations 

       The 1979 or 1983 standards require that a conforming Forth system con- 
       tain a certain minimum set of pre-defined words. These consist of 
       arithmetic operators + - * / MOD /MOD */ for (usually) 16-bit signed- 
       integer (-32767 to +32767) arithmetic, and equivalents for unsigned (0 
       to 65535), double-length and mixed-mode (16- mixed with 32-bit) arith- 
       metic. The list will be found in the glossary accompanying your 
       system, as well as in SF and FTR. 

       Try this example of a non-trivial program that uses arithmetic and 
       branching to compute the greatest common divisor of two integers using 
       Euclid's algorithm: 

           : TUCK   ( a b -- b a b)   SWAP  OVER  ; 
           : GCD    ( a b -- gcd)  ?DUP  IF  TUCK  MOD  GCD  THEN  ; 

       The word ?DUP duplicates TOS if it is not zero, and leaves it alone 
       otherwise. If the TOS is 0, therefore, GCD consumes it and does 
       nothing else. However, if TOS is unequal to 0, then GCD TUCKs TOS 
       under NOS (to save it); then divides NOS by TOS, keeping the remainder 
       (MOD). There are now two numbers left on the stack, so we again take 
       the GCD of them. That is, GCD calls itself. However, if you try the 
       above code it will fail. A dictionary entry cannot be looked up and 
       found until the terminating ";" has completed it. So in fact we must 
       use the word RECURSE to achieve self-reference, as in 

           : TUCK   ( a b -- b a b)   SWAP  OVER  ; 
           : GCD    ( a b -- gcd)  ?DUP  IF   TUCK  MOD  RECURSE   THEN  ; 

       Now try 

           784 48 GCD .  16 ok 


       8. Looping and structured programming 

       Forth has several ways to loop, including the implicit method of re- 
       cursion, illustrated above. Recursion has a bad name as a looping 
       method because in most languages that permit recursion, it imposes 
       unacceptable running time overhead on the program. Worse, recursion 
       can --for reasons beyond the scope of this Introduction to Forth-- be 
       an extremely inefficient method of expressing the problem. In Forth, 
       there is virtually no excess overhead in recursive calls because Forth 
       uses the stack directly. So there is no reason not to recurse if that 
       is the best way to program the algorithm. But for those times when 
       recursion simply isn't enough, here are some more standard methods. 

       8.1 Indefinite loops 

       The construct 

           BEGIN xxx ( -- flag)  UNTIL 

       executes the words represented by xxx, leaving TOS (flag) set to 0 (F) 
       --at which point UNTIL terminates the loop-- or to -1 (T) --at which 
       point UNTIL jumps back to BEGIN. Try: 

           : COUNTDOWN    ( n --) 
                BEGIN  CR   DUP  .  1 -   ?DUP   0  =   UNTIL  ; 

           5 COUNTDOWN 
           5 
           4 
           3 
           2 
           1  ok 

       A variant of BEGIN...UNTIL is 

           BEGIN xxx ( -- flag) WHILE  yyy  REPEAT 

       Here xxx is executed, WHILE tests the flag and if it is 0 (F) leaves 
       the loop; whereas if flag is -1 (T) WHILE executes yyy and REPEAT then 
       branches back to BEGIN. These forms can be used to set up loops that 
       repeat until some external event (pressing a key at the keyboard, 
       e.g.) sets the flag to exit the loop. They can also used to make end- 
       less loops (like the outer interpreter of Forth) by forcing the flag 
       to be 0 in a definition like 

           : AGAIN    0  [COMPILE]  UNTIL  ; IMMEDIATE 

           : ENDLESS   BEGIN xxx 0 AGAIN ; 


       8.2 Definite loops 

       Most Forths allow indexed loops using DO...LOOP (or +LOOP or /LOOP). 
       These are permitted only within definitions 

           : BY-ONES   ( n --)   0 TUCK  DO   CR  DUP  .  1 +  LOOP   DROP  ; 

       The words CR  DUP  .  1 +  will be executed n times as the lower 
       limit, 0, increases in unit steps to n-1. 

       To step by 2's, we use the phrase 2 +LOOP to replace LOOP, as with 

           : BY-TWOS   ( n --)   0 TUCK 
                DO   CR  DUP  .  2 +    2 +LOOP    DROP  ; 


       8.3 FOR...NEXT 

       eForth does not possess DO...LOOP, since these words should be coded 
       for efficiency, and the best way to do that will vary from one CPU to 
       another. (Of course there is nothing to prevent you from adding 
       DO...LOOP to eForth.) 

       Instead eForth provides a simple countdown loop (like COUNTDOWN 
       above). Thus we could have written 

           : COUNTDOWN   ( n -- )  DUP  FOR  CR  DUP  .  1 -  NEXT  DROP  ; 


       8.4 Structured programming 

       As a reaction to program flow charts resembling a plate of spaghetti, 
       often seen in early languages like FORTRAN and assembler, N. Wirth 
       invented the Pascal language. His intent was to eliminate line labels 
       and direct jumps (GOTOs), thereby forcing control flow to be clear and 
       direct. The ideal was subroutines and functions that perform a single 
       task, hence have single entries and exits. Unfortunately, programmers 
       insisted on GOTOs so many Pascals and other modern languages now have 
       them. Worse, the ideal of short subroutines that do one thing only is 
       unreachable in such languages because the method for calling them and 
       passing arguments imposes a large overhead. Thus execution speed re- 
       quires minimizing calls, which in turn means longer, more complex sub- 
       routines that perform several related tasks. Today structured program- 
       ming seems to mean little more than writing code with nested IFs in- 
       dented by a pretty-printer. 

       Paradoxically, Forth is the only truly structured language in common 
       use, although it was not designed with that as its goal. In Forth word 
       definitions are subroutine calls. The language contains no GOTO's so 
       it is impossible to write "spaghetti code". Forth also encourages 
       structure through short definitions. Little speed penalty is incurred 
       in breaking a long procedure into many small ones (this is called 
       "factoring"). Each word has one entry and one exit point, and can be 
       written to perform one job. 


       8.5 "Top-down" design 

       "Top-down" programming is a doctrine that one should design the entire 
       program from the general to the particular: 

         > Make an outline, flow chart or whatever, taking a broad overview 
           of the whole problem. 

         > Break the problem into small pieces (decompose it). 

         > Then code the individual components. 

       The natural programming mode in Forth is "bottom-up" rather than "top- 
       down" --the most general word appears last, whereas the definitions 
       must progress from the primitive to the complex. This leads to a some- 
       what different approach from more familiar languages: 

         > In Forth, components are specified roughly, and then as they are 
           coded they are immediately tested, debugged, redesigned and 
           improved. 

         > The evolution of the components guides the evolution of the outer 
           levels of the program. 


       9. CREATE ... DOES> (the pearl of FORTH) 

       Michael Ham has called the word pair CREATE...DOES>, the "pearl of 
       Forth". CREATE is a component of the compiler, whose function is to 
       make a new dictionary entry with a given name (the next name in the 
       input stream) and nothing else. DOES> assigns a specific run-time ac- 
       tion to a newly CREATEd word. 


       9.1 Defining "defining" words 

       CREATE finds its most important use in extending the powerful class of 
       Forth words called "defining" words. The colon compiler  ":"  is such 
       a word, as are VARIABLE and CONSTANT. 

       The definition of VARIABLE in high-level Forth is simple 

           : VARIABLE  CREATE   2 ALLOT ; 

       We have already seen how VARIABLE is used in a program. 


       Forth lets us define words initialized to contain specific values: for 
       example, we might want to define the number 17 to be a word. CREATE 
       and "," ("comma") can do this: 

           17 CREATE SEVENTEEN  ,  <cr>  ok 

       Now test it via 

           SEVENTEEN @ .  <cr>  17 ok . 


       Remarks: 

         > The word , ("comma") puts TOS into the next 2 bytes of the dic- 
           tionary and increments the dictionary pointer by 2. 

         > A word "C," ("see-comma") exists also -- it puts a byte-value into 
       the next byte of the dictionary and increments the pointer by 1. 


       9.2 Run-time vs. compile-time actions 

       In the preceding example, we were able to initialize the variable 
       SEVENTEEN to 17 when we CREATEd it, but we still have to fetch it to 
       the stack via SEVENTEEN @ whenever we want it. This is not quite what 
       we had in mind: we would like to find 17 in TOS when SEVENTEEN is 
       named. The word DOES> gives us the tool to do this. 

       The function of DOES> is to specify a run-time action for the "child" 
       words of a defining word.  Consider the defining word CONSTANT , de- 
       fined in high-level (of course CONSTANT is usually a machine-code 
       primitive, for  speed) Forth by 

           : CONSTANT  CREATE  ,  DOES>  @  ; 

       and used as 

           53 CONSTANT PRIME  <cr> ok 

       Now test it: 

           PRIME . <cr>  53  ok . 


       What is happening here? 

         > CREATE (hidden in CONSTANT) makes an entry (named PRIME , the 
           first word in the input stream following CONSTANT). Then "," 
           places the TOS (the number 53) in the next two bytes of the dic- 
           tionary. 

         > Then DOES> (inside CONSTANT) appends the actions of all words be- 
           tween it and ";" (the end of the definition) --in this case, "@"-- 
           to the child word(s) defined by CONSTANT. 

       9.3 Dimensioned data (intrinsic units) 

       Here is an example of the power of defining words and of the distinc- 
       tion between compile-time and run-time behaviors. 

       Physical problems generally work with quantities that have dimensions, 
       usually expressed as mass (M), length (L) and time (T) or products of 
       powers of these.  Sometimes there is more than one system of units in 
       common use to describe the same phenomena. 

       For example, U.S. or English police reporting accidents might use 
       inches, feet and yards; while Continental police would use centimeters 
       and meters. Rather than write different versions of an accident ana- 
       lysis program it is simpler to write one program and make unit conver- 
       sions part of the grammar. This is easy in Forth. 

       The simplest method is to keep all internal lengths in millimeters, 
       say, and convert as follows: 

                : INCHES  254   10  */ ; 
                : FEET   [ 254 12 * ] LITERAL  10  */ ; 
                : YARDS  [ 254 36 * ] LITERAL  10  */ ; 
                : CENTIMETERS   10  * ; 
                : METERS   1000  * ; 

       Note: This example is based on 16-bit integer arithmetic.  The word */ 
             means "multiply the third number on the stack by NOS, keeping 32 
             bits of precision, and divide by TOS". That is, the stack com- 
             ment for */ is ( a b c -- a*b/c). 


       The usage would be 

                10 FEET  .  <cr>  3048 ok 

       These are more definitions than necessary, of course, and the tech- 
       nique generates unnecessary code. A more compact approach uses a de- 
       fining word, UNITS : 

           : D,  ( hi lo --)   SWAP  , ,  ; 
           : D@  ( adr -- hi lo)   DUP  @   SWAP   2 +  @   ; 
           : UNITS  CREATE  D,   DOES> D@  */ ; 

       Then we could make the table 

                254 10        UNITS INCHES 
                254 12 *  10  UNITS FEET 
                254 36 *  10  UNITS YARDS 
                10  1         UNITS CENTIMETERS 
                1000  1       UNITS METERS 

                \ Usage: 
                10 FEET  . <cr>  3048  ok 
                3 METERS . <cr>  3000  ok 
                \ ....................... 
                \ etc. 

       This is an improvement, but Forth permits a simple extension that 
       allows conversion back to the input units, for use in output: 

           VARIABLE  <AS>    0 <AS> ! 
           : AS  -1 <AS> ! ; 
           : UNITS  CREATE  D,  DOES>  D@  <AS> @ 
                   IF  SWAP  THEN  */  0 <AS> ! ; 

           \ UNIT DEFINITIONS REMAIN THE SAME. 
           \ Usage: 
           10 FEET .  <cr>  3048  ok 
           3048 AS FEET .  <cr>  10  ok 

       9.4 Advanced uses of the compiler 

       Suppose we have a series of push-buttons numbered 0-3, and a word WHAT 
       to read them. That is, WHAT waits for input from a keypad: when button 
       #3 is pushed, for example, WHAT leaves 3 on the stack. 

       We would like to define a word BUTTON to perform the action of pushing 
       the n'th button, so we could just say: 

           WHAT BUTTON 

       In a conventional language BUTTON would look something like 

           : BUTTON  DUP  0 =  IF  RING  DROP  EXIT  THEN 
                     DUP  1 =  IF  OPEN  DROP  EXIT  THEN 
                     DUP  2 =  IF  LAUGH DROP  EXIT  THEN 
                     DUP  3 =  IF  CRY   DROP  EXIT  THEN 
                     ABORT" WRONG BUTTON!"   ; 

       That is, we would have to go through two decisions on the average. 

       Because the programmer has full, interactive access to the Forth 
       compiler, Forth makes possible a much neater algorithm. The components 
       we shall use are CREATE, which we have already discussed, and two new 
       words, "]" and "[" . 

       The word "[" switches from compile mode to interpret mode while com- 
       piling. (If the system is interpreting it changes nothing.) The word 
       "]" switches from interpret to compile mode. 

       Barring some error-checking, the "definition" of the colon compiler 
       ":" is just 

           : :   CREATE  ]  DOES>  doLIST  ; 

       and that of ";" is just 

           : ;   next  [  ;  IMMEDIATE 


       Another use for these switches is to perform arithmetic at compile- 
       time rather than at run-time, both for program clarity and for easy 
       modification, as we did in the first try at dimensioned data (that is, 
       phrases such as 

           [ 254 12 * ] LITERAL 

       and 

           [ 254 36 * ] LITERAL 

       which allowed us to incorporate in a clear manner the number of 
       tenths of millimeters in a foot or a yard. 


       To define BUTTON so it never needs more than one decision, we will use 
       "]" and "[" to create an "action table". WHAT will provide an index 
       into that table, and BUTTON will choose (and EXECUTE) the appropriate 
       action. 

       This program can be tried out with eForth: 

           : RING   CR  ." ding dong!"  ; 
           : OPEN   CR  ." yoo hoo"     ; 
           : LAUGH  CR  ." ha ha!"      ; 
           : CRY    CR  ." boo hoo!"    ; 

       We now CREATE the action-table BUTTONS 

           CREATE BUTTONS   ]  RING  OPEN   LAUGH   CRY  [ 

       We define the BUTTON as: 

           : BUTTON  ( n --) 0 MAX  3 MIN  2*  BUTTONS +   @ EXECUTE ; 

       If, as before, I push #3, then the action CRY will be executed. Try 
       it: 

           3 BUTTON  <cr> 
           boo hoo!  ok 

       Note how the phrase 0 MAX  3 MIN protects against an out-of-range arg- 
       ument. Although the Forth philosophy is not to slow the code with un- 
       necessary error checking (because words are checked as they are de- 
       fined), when programming a user interface, some form of error handling 
       is vital. It is usually easier to prevent errors as we just did, than 
       to provide for recovery after they are made. 

       How does the action-table method work? 

         > CREATE BUTTONS makes a dictionary entry BUTTONS. 

         > "]" then  turns on the compiler, so that the previously-defined 
           word-names RING, etc. are looked up in the dictionary and compiled 
           into the table rather than being executed. 

         > Finally, "[" returns to interactive mode so that the next colon 
           definition (BUTTON) can be processed. 

         > The table BUTTONS now contains addresses of the code fields 
           (CFA's) of the various actions of BUTTON . 

         > 2 * then multiplies by 2 to get the offset into the table BUTTONS 
           of the desired CFA. 

         > BUTTONS +  then adds the base address of BUTTONS to get the abso- 
           lute address where the desired CFA is stored. 

         > @ fetches the CFA for EXECUTE to execute. 

         > EXECUTE then executes the word corresponding to the button pushed. 
           Simple!