NOVICE.TXT --- copyright (c) 2009 by Hugh Aguilar

This is documentation for the NOVICE.4TH file. The code is intended to be a starter-kit for Forth novices. The ANS-Forth standard that came out in 1994 had several problems. For one thing, it continued to support and encourage obsolete programming techniques such as CREATE DOES>, and it failed to provide programmers with the tools they need to get started writing programs. 

The NOVICE.4TH file is intended to be a work-around for some of the ANS-Forth limitations. Here I have provided the basic tools that are needed to begin writing even the simplest programs. Most of these work-arounds are beyond the ability of the novice to write. Without such basic tools available, the novice is largely unable to get started in programming. Advanced Forth programmers can write code such as I have provided in NOVICE.4TH, but this doesn't help the novices become advanced unless this code is made available to them. By providing this code to the novices, I hope to pave the way toward them becoming advanced.

The novice should be aware that some of the words provided in NOVICE.4TH are state-smart. For example, the words generated by FIELD are state-smart. This is a problem because the programmer might obtain the XT of such a word using ['] and try to compile it with COMPILE, (the comma is part of the word's name), which will result in a bug. Hopefully this problem won't arise so long as the user of NOVICE.4TH is aware of it.

The following describes some of the tools provided in NOVICE.4TH. Only the more important words are described here, so the novice should examine the source-file to find out what else is in there.

*******************************************************************************
*******************************************************************************
*******************************************************************************

: try ( -- )        \ stream: marker_name

This word recompiles a source-file. The source-file should have a MARKER at the beginning prior to any code, with the marker name being the source-file name. 

For example, if your source-file is called MY-BIG-FAT-PROGRAM.4TH, your file should have this line near the top:

marker my-big-fat-program.4th

Interpretively, when you want to compile your program, you type this:

try my-big-fat-program.4th

If your program hasn't been compiled yet, this will compile it. If it has been compiled already, this will remove the existing code from memory and recompile it.

*******************************************************************************
*******************************************************************************
*******************************************************************************

: macro: ( -- )     \ stream: name      \ this works like colon, but inlines the code

This word can replace colon. When the defined word is used, inline code is generated rather than a function call. There are two reasons for doing this. One is that inlined code executes faster than function calls. Another is that you can factor out code involving the r-stack (including words such as >R and R@), which helps to make your functions less cluttered and more readable.

The macros can be used interpretively assuming that all of the words in them can be used interpretively (not >R etc.). The macros are just string expansion.

Note that MACRO: always creates an immediate word; you don't have to use IMMEDIATE after the definition. See the definition of RR@ for an example of how MACRO: is used.
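
The "string expansion" idea can be sketched in standard Forth. The following is a well-known idiom rather than the MACRO: source from NOVICE.4TH; the name TEXT-MACRO and its delimiter syntax are assumptions made for illustration only.

```forth
\ Hypothetical TEXT-MACRO: a sketch of string-expansion macros.
\ This is NOT the MACRO: word from NOVICE.4TH; the syntax differs.
: text-macro ( "name" "<char>text<char>" -- )
    :                          \ start a colon definition, parsing the name
    char parse                 \ the next token's first char delimits the body text
    postpone sliteral          \ compile the body text as a string literal
    postpone evaluate          \ the new word EVALUATEs that text when it runs
    postpone ;  immediate ;    \ finish the new word and make it immediate

text-macro square " dup *"

: area ( side -- side^2 )  square ;   \ DUP * is compiled inline here
```

Because SQUARE is immediate and just evaluates its text, it compiles DUP * directly into AREA when used inside a definition, and executes DUP * when typed interpretively.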

The word COMMENT is used for commenting out sections of code.

: comment ( delimiter -- )

The programmer must provide it with a character delimiter like this:

char & comment

bad code...

&

You can use any delimiter that you want, but you must make sure this character does not occur within the code that you are commenting out. The ampersand is generally a good choice.
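
A multi-line skip of this kind can be sketched in standard Forth with PARSE and REFILL. The name SKIP-UNTIL is hypothetical, and this is not necessarily how COMMENT is written in NOVICE.4TH.

```forth
\ Hypothetical SKIP-UNTIL: discard input up to and including the delimiter,
\ reading further lines with REFILL as needed.
: skip-until ( delimiter -- )
    begin
        dup parse +            \ address just past the text that PARSE consumed
        source + =             \ equal to end of line? then no delimiter was found
    while
        refill 0= if  drop exit  then    \ out of input: give up quietly
    repeat
    drop ;

char & skip-until
    anything here is skipped, including undefined-words
&
```

If the delimiter is found on the current line, PARSE stops short of the end of the input buffer and the loop ends; otherwise the next line is fetched and searched.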

*******************************************************************************
*******************************************************************************
*******************************************************************************

We have some miscellaneous support words. These include some constants:

1 chars     constant c                  \ the size of a char
1 cells     constant w                  \ the size of a word, either an integer or pointer
2 cells     constant d                  \ the size of a double word
1 floats    constant f                  \ the size of a float

: b. ( n -- )                           \ display as a 32-bit binary number

The B. word is provided as a convenience for displaying unsigned integers as binary numbers.

: item ( -- )       \ stream: name      \ this was the word used in Forth-83 systems
    0 value ;
    
: dupd ( a b -- a a b )             \ "dupe deep" --- this is borrowed from Factor

: not ( n -- true|false )           \ this isn't a bit-wise operation; it results in -1 or 0

: non-negative? ( n -- ? )          \ is the argument zero or positive?

: dsqrt ( Darg -- Uroot )           \ square root

: gcd ( u v -- denominator )        \ u and v are unsigned; returns the greatest common divisor

: dgcd ( Du Dv -- Ddenominator )    \ du and dv are unsigned; returns the greatest common divisor

0 constant nil                      \ a pointer to nowhere

: alloc ( size -- adr )

: dealloc ( adr -- )                \ the adr has to have been provided by ALLOCATE or RESIZE

: realloc ( adr size -- new-adr )   \ the adr has to have been provided by ALLOCATE or RESIZE

: allocation ( adr -- size )        \ the adr has to have been provided by ALLOCATE or RESIZE

ALLOC, DEALLOC and REALLOC just allocate, free and reallocate memory on the heap. They will abort with an error message if they fail. For ALLOC, this usually means that the heap is full, which usually means that you have a memory leak. For DEALLOC and REALLOC, this usually means that you gave them an address that is neither on the heap nor NIL.

Note that ANS-Forth has a serious design flaw in that the size of the allocated block is not accessible to the user. We know that this value is stored internally, because FREE and RESIZE have access to it, but the ANS-Forth designers decided to prevent the programmer from having access to this important information. Because of this, I had to rewrite ALLOCATE, FREE and RESIZE in order to be able to have an ALLOCATION function.
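
One common way to make the block size available is to store it in a hidden cell in front of the block. The sketch below assumes that approach and uses hypothetical names (SIZED-ALLOC, SIZED-ALLOCATION, SIZED-FREE); it is not necessarily how NOVICE.4TH does it. Note that the returned address is cell-aligned but not guaranteed to be float-aligned.

```forth
\ Hypothetical sketch: remember the size in one extra cell before the block.
: sized-alloc ( size -- adr )
    dup cell+ allocate throw   \ ask for one extra cell for the header
    tuck !                     \ store the requested size in the header
    cell+ ;                    \ hand back the address just past the header

: sized-allocation ( adr -- size )
    1 cells - @ ;              \ the size sits one cell below the user address

: sized-free ( adr -- )
    1 cells - free throw ;     \ free the real block, header included
```

For example, 100 SIZED-ALLOC returns an address for which SIZED-ALLOCATION gives back 100.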

macro: nth-float ( index array -- element-adr )     \ this is for simple one-dimensional float arrays

macro: rover ( a b c -- a b c a )       \ rotate over
    
macro: rev ( a b c -- c b a )           \ reverse

: find-substr ( adr cnt target-adr target-cnt -- sub-adr sub-cnt true | false )
        
: replace-str { adr cnt new-adr new-cnt -- }

macro: fnear ( -- )                     \ float: n -- integer   \ better than FROUND because it actually rounds

macro: fint? ( -- ? )                   \ float: n --

macro: fcbrt ( -- )                     \ float: n -- cube-root 

1.e fasin 2.e f*    fconstant pi
pi f2*              fconstant 2pi
    
macro: deg>rad ( -- )                   \ float: degrees -- radians 
    
macro: rad>deg ( -- )                   \ float: radians -- degrees 
    
macro: cm>inches ( -- )                 \ float: cm -- inch

We also have some support for creating counted strings:

: <cstr ( -- )                          \ this initializes a counted string to be constructed

: char+cstr ( char -- )                 \ this appends a character onto our counted string

: <+cstr> ( adr cnt -- )                \ this appends an adr/cnt string onto our counted string

: +cstr ( str -- )                      \ this appends a counted string onto our counted string

macro: -cstr ( n -- )                   \ remove N characters from CSTR

: pad-cstr ( size -- )                  \ boosts CSTR> to SIZE, padding with blanks as necessary

: -leading ( adr cnt -- new-adr new-cnt )       \ like -TRAILING except removes leading spaces

macro: concat ( str1 str2 -- str )      \ concatenates the two counted strings

: cstr> ( -- str )                      \ this concludes the construction of our counted string

For example, you could enter this code:

: ttt ( -- )
    <cstr  c" hello " +cstr  c" world" +cstr  cstr> count type ;
    
When TTT is executed, it will print out: "hello world"

A ring-buffer is used for the strings. It is possible to nest <CSTR strings inside of each other.
Unfortunately, you can only have one <CSTR at each level.

If you want to have more than one <CSTR in use at a time, you have to use CSTR-NEXT after CSTR>. This is done in <STRING> and CONCAT.
CSTR-NEXT only works at the top level however. If you use CSTR-NEXT inside of a nested <CSTR, the upper level will be messed up.
If you need multiple <CSTR strings inside of a nested <CSTR, then you must use HSTR to make copies of the strings.
This is admittedly a messy arrangement, but it was the best that I could figure out.

: hstr ( str -- hstr )                  \ hstr is a copy in the heap

HSTR makes a copy of any counted string and stores it in the heap. These copies should be removed from the heap with DEALLOC once they are no longer needed, lest they eventually fill up the heap. Forth doesn't have garbage collection due to Forth's emphasis on speed, so the programmer has to deallocate unused heap objects himself. If you want a language that does support garbage collection, you would be better off with Factor (www.factorcode.org) than Forth --- but Factor is an order of magnitude slower than Forth in many applications.
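
Copying a counted string into the heap can be sketched with the standard ALLOCATE and MOVE words. HEAP-COPY is a hypothetical name; the real HSTR presumably goes through ALLOC, so this sketch pairs with FREE rather than with DEALLOC.

```forth
\ Hypothetical HEAP-COPY: duplicate a counted string (count byte included).
: heap-copy ( str -- hstr )
    dup c@ 1+ chars            \ bytes to copy: the characters plus the count
    dup allocate throw         \ get a heap block of that size
    dup >r  swap move  r> ;    \ copy the string and return the new address

: demo ( -- )
    c" hello world" heap-copy  \ make the heap copy
    dup count type             \ use it like any counted string
    free throw ;               \ release it when done, or the heap leaks
```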

Finally, we have a few words provided for convenience:

: concat ( str1 str2 -- hstr )          <cstr  swap +cstr  +cstr  cstr> hstr ;

: ," ( -- )                             \ stream: string"

: ccompare ( str1 str2 -- -1|0|1 )      \ like COMPARE except for counted-strings

: get-str ( -- str )                    \ used for user-interface input
    
: get-float ( -- )                      \ float: -- n

: string ( delimiter -- )               \ runtime: -- adr cnt

The STRING word is like S" except that it takes a user-supplied delimiter, rather than always using the " character. Unlike S", STRING is state-smart, so be careful. We also have S| which is like S" except that it uses a | delimiter rather than a " delimiter.

*******************************************************************************
*******************************************************************************
*******************************************************************************

item ~#soft                             \ TRUE implies that ~# shouldn't output zeros

: ~# ( Da -- Db )                       \ needs ~#SOFT set
        
: ~. ( D -- D )                         \ needs ~#SOFT set
    ~#soft not if  [char] . hold  then ;

\ ~# is used for suppressing extraneous digits to the right of the decimal point.        
\ ~. is used for suppressing an extraneous decimal point.

: comma ( d -- d )                      \ output a comma in a pictured number

: dot ( d -- d )                        \ output a dot (decimal point) in a pictured number

: _# ( d -- d_less_digit )              \ digit if number is non-zero, else a blank

: _comma ( d -- d )                     \ comma if number is non-zero, else a blank

\ _# and _comma are used to blank-out the leftmost of the number if the digits are zero.

: book-sign ( d neg? -- d )             \ output the sign as either a + or a - (bookkeeping style)
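
To show why the blanking words are useful, here is a sketch using only the standard pictured-number words. MY-COMMA, MY-DOT and GROUPED are hypothetical names chosen so as not to clash with the words above.

```forth
\ Hypothetical helpers built on the standard <# # #S HOLD #> words.
: my-comma ( d -- d )  [char] , hold ;
: my-dot   ( d -- d )  [char] . hold ;

: grouped ( d -- adr cnt )     \ format as n,nnn.nn
    <#  # #  my-dot            \ two digits, then the decimal point
        # # #  my-comma        \ three digits, then the comma
        #s  #> ;               \ whatever digits remain

1234567. grouped type          \ prints 12,345.67
123.     grouped type          \ prints 0,001.23
```

The second case is the problem: a plain # always produces a digit, so small numbers come out padded with zeros and a stray comma. Words such as _# and _comma exist to blank those positions instead.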

*******************************************************************************
*******************************************************************************
*******************************************************************************

: <scientific>  { prec engineering? | sgn exp -- adr cnt }       \ float: n -- 

\ The e is optional, which prevents these numbers from being used in Forth source-code.  >FLOAT still works though.

The <SCIENTIFIC> word is used for converting floats into strings. A design flaw in ANS-Forth is that we have F. and E. for displaying floats, but we don't have any way to convert floats into strings. It would make more sense for the standard to provide a way to convert floats into strings, and not provide F. and E., because the latter are trivial to implement if the former is available --- TYPE can be used to display a string.

In general, you don't use <SCIENTIFIC> itself, but use one of the following:
    
10 constant max-prec    \ 11 works sometimes, but fails if the most significant digit is large.  10 always works.

: scientific ( -- adr cnt )                         \ float: n --

: max-scientific ( -- adr cnt )                     \ float: n --

: engineering ( -- adr cnt )                        \ float: n --

: max-engineering ( -- adr cnt )                    \ float: n --

SCIENTIFIC and ENGINEERING both use PRECISION, the same as F. and E. do. MAX-SCIENTIFIC and MAX-ENGINEERING use the maximum possible precision. This is MAX-PREC (10) decimal digits.

*******************************************************************************
*******************************************************************************
*******************************************************************************

0.0e fconstant impossible
1.0e fconstant certain

macro: prob-not ( -- )              \ float: prob -- prob-not
    
macro: prob-and ( -- )              \ float: prob1 prob2 -- prob
    
macro: prob-xor ( -- )              \ float: prob1 prob2 -- prob
    
macro: prob-ior ( -- )              \ float: prob1 prob2 -- prob
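
The usual formulas behind words like these can be sketched as follows, assuming the two events are independent. That assumption, and the names P-NOT, P-AND, P-IOR and P-XOR, are mine; the definitions in NOVICE.4TH may differ.

```forth
\ Hypothetical sketch, assuming independent events.
: p-not ( -- )  ( F: p -- 1-p )        1e0 fswap f- ;
: p-and ( -- )  ( F: p q -- p*q )      f* ;
: p-ior ( -- )  ( F: p q -- p+q-pq )
    fover fover f*             \ p q pq
    frot frot f+               \ pq p+q
    fswap f- ;                 \ (p+q) - pq
: p-xor ( -- )  ( F: p q -- p+q-2pq )
    fover fover f* 2e0 f*      \ p q 2pq
    frot frot f+               \ 2pq p+q
    fswap f- ;                 \ (p+q) - 2pq
```

With two fair coins (0.5e0 each), P-AND gives 0.25, P-IOR gives 0.75 and P-XOR gives 0.5.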

: rnd ( ulimit -- uvalue )              \ random result in the range: (0,ulimit)    

: bias ( ulimit -- uvalue )             \ in the range: (0,ulimit) --- tends toward zero

: frnd ( -- fvalue )                    \ float in the range: (0,1)

: ?rnd ( -- ? )                         \ flag: [false,true]

: init-seed ( -- )                      \ initializes the prng seed from the time&date

Our pseudo-random numbers use the LC53 method. This is the linear-congruential method using constants that I found through experimentation to provide good randomness. Note that these constants are much bigger than are typically used. This is because Forth has mixed-precision arithmetic. Languages such as C++ lack mixed-precision arithmetic, which requires them to use tiny constants in order to avoid overflowing single-precision arithmetic. LC53 in Forth is just as fast, and much more random.

The intention is that LC53 should be used for games and simulations on computers with hardware multiply and divide (any desktop computer). LC53 is not suitable for use in encryption.
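
The role of mixed-precision arithmetic can be seen in a small sketch. It uses the well-known Park-Miller constants (multiplier 16807, modulus 2147483647) purely for illustration; these are NOT the LC53 constants, and the LCG- names are hypothetical.

```forth
\ Hypothetical sketch of a multiplicative congruential generator.
\ Park-Miller constants are used here, not the LC53 constants.
2147483647 constant lcg-modulus
16807      constant lcg-mult
variable lcg-seed   1 lcg-seed !       \ the seed must stay in 1..modulus-1

: lcg-next ( -- u )
    lcg-seed @ lcg-mult um*    \ single * single gives a double-cell product
    lcg-modulus um/mod drop    \ double / single: keep only the remainder
    dup lcg-seed ! ;

: lcg-rnd ( ulimit -- uvalue ) \ value in the range [0,ulimit)
    lcg-next swap mod ;
```

UM* produces a double-cell product that cannot overflow, and UM/MOD reduces it back to a single cell. This is what allows the multiplier and modulus to be nearly as large as a cell without any special tricks.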


*******************************************************************************
*******************************************************************************
*******************************************************************************

: :name ( str wid -- )                  \ like colon except takes its name as a parameter

This is the word that I rely on to replace CREATE DOES>. It creates a colon word. Unlike colon however, it gets the name of this word as a parameter, rather than obtaining the name from the input stream. Also, :NAME takes the word-list as a parameter rather than use the current word-list. If you want to use the current word-list, just use GET-CURRENT as your WID parameter.

We also have these words:

: :2name ( prefix-str suffix-str wid -- )       \ used for suffixing or prefixing names

: :3name ( prefix-str mid-str suffix-str wid -- )

: :name! ( str wid -- )                         \ like :NAME but with a ! suffix
    
: :name@ ( str wid -- )                         \ like :NAME but with a @ suffix

The :2NAME and :3NAME words concatenate strings together to construct the name. These are provided as a convenience to the programmer, as prefixing or suffixing a string is typical. We also have :NAME! and :NAME@, which provide the ! and @ suffixes. Other languages (including Factor) use GET and SET, but in Forth we use @ and !. 

For examples of the use of :NAME, see the definitions of 1ARRAY, 2ARRAY, etc., that use :NAME internally. Compare these implementations to arrays implemented with CREATE DOES>, such as described in the "Starting Forth" book.
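
The general idea of a colon definition whose name comes from a string can be sketched in standard Forth by building the text ": name" and handing it to EVALUATE. This is an assumption-laden illustration, not the :NAME source: the names :NAMED, NAME-BUF and ADDER are hypothetical, there is no word-list parameter, and names longer than 78 characters are not checked for.

```forth
\ Hypothetical :NAMED: open a colon definition named by a string.
create name-buf  80 chars allot

: :named ( adr cnt -- )
    s" : " name-buf swap move            \ the buffer starts with ": "
    >r  name-buf 2 chars +  r@ move      \ append the requested name
    name-buf  r> 2 +  evaluate ;         \ run ": name", leaving us compiling

\ A defining word that generates a colon word with N as a literal inside it.
: adder ( n adr cnt -- )
    rot >r                               \ stash N: colon may leave items on the stack
    :named
    r> postpone literal  postpone +      \ body of the new word is: N +
    postpone ; ;

5 s" add5" adder                         \ same effect as typing  : add5  5 + ;
```

The generated ADD5 is an ordinary colon word with the parameter compiled as a literal; nothing has to be fetched from memory at run-time, as it would be in a CREATE DOES> word.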

*******************************************************************************
*******************************************************************************
*******************************************************************************

As a demonstration of how :NAME works, I implemented ring buffers using :NAME. My algorithm for the ring buffers is purposely simple to make the demonstration clear. The novice can implement a more efficient algorithm if he wants.

We have two words:

: <wbuf> { records name | base limit data past used -- }

: wbuf ( records -- )                                   \ stream: name

This is typical. The defining word in the pointy brackets takes the name as a string on the stack. The defining word without the pointy brackets takes the name out of the input stream.

If the name is BUF, the defining word will generate the following colon words:

: INIT-BUF ( -- )                   \ initializes the ring buffer (called automatically when the buffer is defined)
: ROOM-BUF ( -- empty-slots )       \ is there room for more data?
: DATA-BUF ( -- occupied-slots )    \ is there data available?
: BUF! ( record -- )                \ store a datum in the buffer
: BUF@ ( -- record )                \ fetch a datum from the buffer

To develop a word like <WBUF>, you would first write it for a single instance. You would write the colon words described above and you would get a single buffer called BUF. After your code is debugged, you would write the <WBUF> defining word to generate these colon words at compile-time. You could then use your defining word to define multiple instances of the code, each with a unique name.

*******************************************************************************
*******************************************************************************
*******************************************************************************

: { ( -- )     \ creates a string of LOCALS| and evaluates it

One of the biggest blunders made in the ANS-Forth standard was that the LOCALS| word accepts its parameters backwards. I have written a word { that fixes this problem. My locals are modeled after the Johns Hopkins locals. I haven't seen any source-code from Johns Hopkins however, so the implementation is my own.

As an example, consider this word:

: ttt { aaa bbb ccc | ddd eee -- fff ggg hhh } 
    aaa . bbb . ccc . ddd . eee . ;

The { and } look like a comment, but they are actually code that defines the locals. The local variables AAA, BBB and CCC are initialized from the parameter stack. The local variables after the vertical bar (DDD and EEE) are initialized to zero. Everything after the double-hyphen is ignored. The programmer typically indicates here what values are returned on the parameter stack, but the compiler just considers everything between the -- and the } to be a comment. The idea is that the entire { ... } should look like a stack-picture comment, even though it is actually code.

When TTT is executed with 1, 2 and 3 on the parameter stack (1 2 3 TTT), it prints out: 1 2 3 0 0

There are many more examples of local variables to be found in our source-code.
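
For contrast, here is what "backwards" means with the standard LOCALS| word. The word MINUS-ANS is a hypothetical example; it needs only the ANS-Forth Locals extension word set.

```forth
\ With LOCALS| the names are listed in the reverse of the stack picture:
\ the first name receives the TOP of the stack.
: minus-ans ( a b -- a-b )
    locals| b a |              \ b is listed first because it is on top
    a b - ;

10 3 minus-ans .               \ prints 7
```

Writing locals| a b | here would silently swap the two values and give -7, which is exactly the kind of mistake that the { ... } notation avoids.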

*******************************************************************************
*******************************************************************************
*******************************************************************************

: bit-field { offset size | name -- new-offset }        \ stream: name      

The BIT-FIELD word is used to define records whose fields are some number of bits. As a convention, the field names should be preceded with a dot, reminiscent of Pascal field names.

For example, consider the following code:

0
    3 bit-field .aaa
    5 bit-field .bbb
    8 bit-field .ccc
drop

This entire record fits inside of one single-precision integer. Consider this code.

create vvv  0 ,

17 vvv .bbb!

vvv @ b.

vvv .bbb@ .

Notice that there is no space between the .BBB and the !. This is because .BBB! is a colon word, as is .BBB@. There actually is no .BBB word defined.

After we store the number 17 into the .BBB field, we print out VVV as a binary number and we get: 10001000. The value 17, as a binary number, is 10001. The 000 in the lower bits is the .AAA field, which is three bits wide. When we use .BBB@ to obtain the contents of the .BBB field, we get 17.
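
The shifting and masking behind such accessors can be sketched in standard Forth. The words BITS! and BITS@ below are hypothetical and take the offset and size at run-time; the generated words presumably have these values built in. In the example above the second field starts at bit 3 and is 5 bits wide.

```forth
\ Hypothetical sketch of bit-field access with explicit offset and size.
: field-mask ( offset size -- mask )
    1 swap lshift 1-           \ SIZE one-bits in the low positions
    swap lshift ;              \ moved up to the field's position

: bits! ( n adr offset size -- )
    2dup field-mask >r         \ keep the mask on the return stack
    drop rot swap lshift       \ shift the value up to the field position
    r@ and                     \ discard any bits that don't fit the field
    over @ r> invert and or    \ merge with the other fields, left unchanged
    swap ! ;

: bits@ ( adr offset size -- n )
    >r >r  @  r> rshift        \ bring the field down to bit 0
    1 r> lshift 1- and ;       \ and mask off the higher fields

variable port   0 port !
17 port 3 5 bits!              \ port now holds binary 10001000 (decimal 136)
port 3 5 bits@ .               \ prints 17
```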

As a default, bit-field records can be as large as a single-precision integer. Sometimes this is a problem however. If you are using bit-field records to represent I/O ports, those I/O ports may be only eight bits wide. For example, on the PIC24 processor, the single-precision integers are typically 16-bit, whereas the I/O ports are 8-bit. In this case, you want to limit your bit-field records to eight bits so that when you read or write to the fields, you don't affect the adjacent I/O ports. In many cases, reading an I/O port is not a benign operation --- it can have an effect on the hardware. Similarly, writing a value into an I/O port is generally not a benign operation either, even if that value is the same as the value previously written to that I/O port --- this too can have an effect on the hardware. We have these words for specifying the size of your records:

: use-word-records ( -- )

: use-byte-records ( -- )

The default is to use word-sized records. Only go with USE-BYTE-RECORDS if you are dealing with 8-bit I/O ports. I don't provide support for double-word sized records. That would be more complicated than I want to mess with, and I doubt that anybody really needs them. Something like multi-word bit-field records would really be best implemented in assembly-language --- any Forth implementation would be too slow to be practical. Rather than have a double-word bit-field record, you should use two word-sized records; the downside is that you might have a few wasted bits at the top of each because no field can overlap the two words. Live with it.

*******************************************************************************
*******************************************************************************
*******************************************************************************

: field ( offset size -- new-offset )           \ stream: name      

The FIELD word is used to define records whose fields are some number of bytes. As a convention, the field names should be preceded with a dot, reminiscent of Pascal field names. These are comparable to the STRUCT records in C. For example, consider the following code:

0
    w field .aaa
    f field .bbb
    d field .ccc
constant rrr

This defines an object called RRR, that has fields .AAA, .BBB and .CCC, of type word, float and double (remember the W, F and D constants that we defined earlier). We can use it like this:

create sss  rrr allot

1.0e sss .bbb f!

sss .bbb f@ f.

Here we store the value 1.0 into the .BBB field, then fetch it out again and print it.

We can derive objects from existing objects like this:

rrr
    w field .ddd
    c field .eee
constant rrr-child

This results in the same record as if we had done this:    

0
    w field .aaa
    f field .bbb
    d field .ccc
    w field .ddd
    c field .eee
constant rrr-child

This isn't exactly OOP, but it is a step in that direction.

*******************************************************************************
*******************************************************************************
*******************************************************************************

: 1array ( dim1 size -- )

This word defines a one-dimensional array whose elements are of SIZE bytes. For example, consider the following code:

10 w 1array aaa

^aaa .

lim-aaa .

aaa-dim .

0 aaa .
1 aaa .
9 aaa .

We define our array AAA, and this generates several colon words:
1.) ^AAA ( -- adr )         Returns the base-address of the array.
2.) LIM-AAA ( -- adr )      Returns the limit-address of the array.
3.) AAA-ZERO ( -- )         Fills the entire array with zeros. Words, doubles and floats will become zero.
4.) AAA-SIZE ( -- size )    Returns the size of the record.
5.) AAA-DIM ( -- dim1 )     Returns the dimension of the array (10 in the example above).
6.) AAA ( index -- adr )    Returns the address of an element given the index.

The LIM-AAA and ^AAA words are primarily provided so that they can be used in DO loops. For example, consider the following code:

: truthify-word-array ( lim adr -- )        \ sets every element to TRUE
    do  true I !  w +loop ;
    
lim-aaa ^aaa truthify-word-array

This is similar to AAA-ZERO except that it fills each element with TRUE (-1) rather than zero. Note that TRUTHIFY-WORD-ARRAY only works on word arrays. If you try to use it on an array of some other type, chaos will result (especially if that type's size is not a multiple of W). 

We also have:

: <1array> ( dim1 size name -- )

This is similar to 1ARRAY except that it takes the name of the array as a parameter rather than obtaining it from the input stream. 

If you want to program in an OOP-like manner and develop an array that is only for words, and which supports TRUTHIFY, this can be done. Write your defining word so that it removes the name from the input stream (using BL WORD and storing the name in a local variable NAME) and gives this name to <1ARRAY>. Also have your defining word generate a colon word like this:

NAME C" -TRUTHIFY" GET-CURRENT :2NAME

Use similar techniques for defining your colon word as I did when I defined the xxx-ZERO words. The result will be that you can define an array called AAA and get a colon word AAA-TRUTHIFY generated automatically. The process isn't as smooth as you might find in languages such as C++, but it can be done. The upside is that the code generated by most Forth compilers will be faster executing than the code generated by most C++ compilers. OOP, as implemented in C++, involves a lot of behind-the-scenes machinations in regard to the virtual-method-table (VMT) that tend to degrade the speed a lot.

Note that unlike in an OOP system, my implementation of arrays doesn't allow for the arrays to be in the heap. The arrays are stored statically in the dictionary. It would be possible to implement arrays that get allocated in the heap rather than allotted in the dictionary. This isn't difficult, so it might be a good exercise for the novice to try. If you were going to implement matrices, you would likely want them to be in the heap, as they get created dynamically during program execution as the result of the various arithmetic functions done to matrices. A mathematically inclined reader might want to give this a shot.

We have support for arrays from one to six dimensions. I'm not really sure who uses six-dimensional arrays, but we have them. If you need arrays of more dimensions than this, you will have to write your own. Look at my source-code and you will see an obvious pattern --- you can use the lower-dimensioned arrays as a guide for writing higher-dimensioned arrays. These are the defining words provided at this time:

: <1array> ( dim1 size name -- )
: <2array> ( dim1 dim2 size name -- )
: <3array> ( dim1 dim2 dim3 size name -- )
: <4array> ( dim1 dim2 dim3 dim4 size name -- )
: <5array> ( dim1 dim2 dim3 dim4 dim5 size name -- )
: <6array> ( dim1 dim2 dim3 dim4 dim5 dim6 size name -- )

: 1array ( dim1 size -- )
: 2array ( dim1 dim2 size -- )
: 3array ( dim1 dim2 dim3 size -- )
: 4array ( dim1 dim2 dim3 dim4 size -- )
: 5array ( dim1 dim2 dim3 dim4 dim5 size -- )
: 6array ( dim1 dim2 dim3 dim4 dim5 dim6 size -- )

If you are using a processor that lacks a hardware multiply, you should try to use dimensions that are powers of two. This will result in faster executing code as a shift will be compiled rather than a multiply. 

Note that the code for five and six dimensional arrays is commented out. This is because Win32Forth doesn't allow any more than 12 local variables. Assuming that your Forth system does allow more local variables (SwiftForth and Gforth do), then you can use the five and six dimensional arrays.


*******************************************************************************
*******************************************************************************
*******************************************************************************

We also have an alternative array definer that uses Iliffe vectors.

You can define a two-dimensional array of words in either of these ways:
W 0 3 5 C" TEST" <ARY>
W 0 3 5 ARY TEST

Assuming that I and J are indexes, an element can be accessed like this:
I J TEST                \ this provides the element address, which you can @ or ! to.
In the above, I is in the range [0,3) and J is in the range [0,5). They are in the same order as the dimensions were.

It is also possible to access the element like this:
<TEST J -> I TEST>      \ this provides the element address, which you can @ or ! to.
In the above, I is in the range [0,3) and J is in the range [0,5). They are in the opposite order from the dimensions.

The [] only works for word-sized elements. 
You should always use the xxx> word (even for words) because it takes into account the size of the element.

Arrays of dimensions other than two are defined the same way --- you can have as many dimensions as you want.
<ARY> figures out how many dimensions there are by looking for the 0 that sits under the non-zero dimensions.

This is a classic example of the problem with CREATE DOES> words being limited to *one* action per data type.
The CREATE DOES> version of the Iliffe vectors relied on [] for resolving the final level, which had to be word elements.
There is no way to do this with CREATE DOES> because you can't have a type-specific version of [] (such as my xxx> word).
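
For readers who have not met Iliffe vectors: instead of multiplying the row index by the row length, the array keeps a table of row addresses and looks the row up. The sketch below is a fixed 3 by 5 array of words in standard Forth, with hypothetical names; it is not the ARY implementation.

```forth
\ Hypothetical sketch of a two-dimensional Iliffe vector.
3 constant #rows
5 constant #cols

create grid-data  #rows #cols * cells allot    \ the elements themselves
create grid-rows  #rows cells allot            \ one pointer per row

: init-grid ( -- )                             \ fill in the row pointers
    #rows 0 do
        grid-data  i #cols * cells +           \ address of row i
        grid-rows  i cells +  !
    loop ;
init-grid

: grid ( i j -- adr )
    swap cells grid-rows + @                   \ fetch the address of row i
    swap cells + ;                             \ then index column j in that row

42 1 2 grid !
1 2 grid @ .                                   \ prints 42
```

The multiply happens once, when the pointer table is built; each access afterwards is a fetch and an add per dimension.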


*******************************************************************************
*******************************************************************************
*******************************************************************************

At the beginning of this document I spoke disparagingly of the use of CREATE DOES>. This is a very primitive technique that Chuck Moore invented back in the 1970s, but there are much better techniques available now. One of the biggest problems with CREATE DOES> is that it only allows for one action to be associated with the datum. My 1ARRAY word defined a total of six colon words (AAA, ^AAA, LIM-AAA, AAA-SIZE, AAA-DIM and AAA-ZERO). With CREATE DOES> you would only get AAA. 

CREATE DOES> words are also about an order of magnitude slower than colon words. This is primarily due to the fact that the parameters comma'd into memory after the CREATE have to be fetched out of memory at run-time by the code after DOES>. This is a lot of extra effort, especially when there are a lot of parameters (as there are in the higher-dimensioned arrays). By comparison, in my system all of these parameters are literals inside of a colon definition. My system is also generally easier to read because (at compile-time) the parameters are stored in local variables with names. This makes for more readable code than is found in CREATE DOES> words, in which the DOES> code has to fetch the parameters out of memory at run-time, and there is a possibility of the programmer getting confused as to what order the parameters were comma'd into memory.

Consider how the typical novice would write the FIELD word:

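: field ( offset size -- new-offset )
    create
        over ,  +               \ store the offset in the new word's body, then advance the offset
    does>  ( record -- field-adr )
        @ + ;                   \ fetch the stored offset at run-time and add it to the record address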

0
    w field .aaa
    w field .bbb
constant rrr

create sss  rrr allot

: ttt ( record -- aaa+bbb )            
    dup .aaa @  swap .bbb @  + ;

Using SwiftForth, our novice effort at FIELD results in the following machine-code:

see field
4756FF   40DDBF ( CREATE ) CALL         E8BB86F9FF
475704   4 # EBP SUB                    83ED04
475707   EBX 0 [EBP] MOV                895D00
47570A   4 [EBP] EBX MOV                8B5D04
47570D   40828F ( , ) CALL              E87D2BF9FF
475712   0 [EBP] EBX ADD                035D00
475715   4 # EBP ADD                    83C504
475718   40C2CF ( (;CODE) ) CALL        E8B26BF9FF
47571D   4 # EBP SUB                    83ED04          .AAA and .BBB call this address
475720   EBX 0 [EBP] MOV                895D00
475723   EBX POP                        5B              EBX is now the base-adr
475724   0 [EBX] EBX MOV                8B1B
475726   0 [EBP] EBX ADD                035D00
475729   4 # EBP ADD                    83C504
47572C   RET                            C3

see .aaa
47573F   47571D ( field +1E ) CALL      E8D9FFFFFF

see .bbb
47575F   47571D ( field +1E ) CALL      E8B9FFFFFF

see ttt
47579F   4 # EBP SUB                    83ED04
4757A2   EBX 0 [EBP] MOV                895D00
4757A5   47573F ( .aaa ) CALL           E895FFFFFF
4757AA   0 [EBX] EAX MOV                8B03
4757AC   0 [EBP] EBX MOV                8B5D00
4757AF   EAX 0 [EBP] MOV                894500
4757B2   47575F ( .bbb ) CALL           E8A8FFFFFF
4757B7   0 [EBX] EBX MOV                8B1B
4757B9   0 [EBP] EBX ADD                035D00
4757BC   4 # EBP ADD                    83C504
4757BF   RET                            C3

There are 11 instructions in TTT, 1 each in .AAA and .BBB, and 7 in the DOES> part of FIELD (called by both .AAA and .BBB). This results in 11+1+1+7+7 = 27 instructions executed. By comparison, when my version of FIELD is used, TTT looks like this:

see ttt
47577F   4 # EBP SUB                    83ED04
475782   EBX 0 [EBP] MOV                895D00
475785   0 [EBX] EBX MOV                8B1B
475787   0 [EBP] EAX MOV                8B4500
47578A   EBX 0 [EBP] MOV                895D00
47578D   EAX EBX MOV                    8BD8
47578F   4 # EBX ADD                    83C304
475792   0 [EBX] EBX MOV                8B1B
475794   0 [EBP] EBX ADD                035D00
475797   4 # EBP ADD                    83C504
47579A   RET                            C3

Now there are only 11 instructions executed. This is less than half of what the CREATE DOES> version requires. This is for SwiftForth on the Pentium. The speed difference is much greater than the 27/11 ratio implies though. As a rule of thumb, what primarily kills the speed on microprocessors are jumps, calls and returns. On processors with a prefetch-queue (such as the 8088), jumps empty out the prefetch-queue. On modern processors that concurrently execute instructions, jumps prevent the concurrent execution of instructions. This is also why the MACRO: word improves the speed --- because it gets rid of one CALL and one RET instruction. See Michael Abrash's books on 8088 and Pentium assembly-language for a more in-depth discussion of this effect. 

The SwiftForth compiler doesn't do much code-optimization. Because of this, my FIELD is about an order of magnitude faster than a FIELD written using CREATE DOES>. On a more modern compiler that does better code-optimization, the speed improvement will not be as great. In no case, however, will CREATE DOES> generate faster code than my FIELD or my array words, or any other defining word provided in NOVICE.TXT. In order to get good speed on any system, without concerning oneself with how the compiler's code-optimization works under the hood, it is best to use the defining words based upon :NAME and : rather than CREATE DOES>.
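
The idea of putting the parameter into a colon definition as a literal, rather than fetching it at run-time, can be sketched in standard Forth as follows. This is only an illustration and is not the FIELD provided here. LIT-FIELD, .AAA2, .BBB2, RRR2, SSS2 and TTT2 are hypothetical names made up for this example, and 1 CELLS is used in place of W. The sketch assumes that the system allows : to be executed from inside another definition, which is commonly the case but should be checked on your own system.

```forth
\ LIT-FIELD is a hypothetical name, not a word from NOVICE.4TH.
\ Each field becomes an ordinary colon word with its offset compiled in
\ as a literal, so nothing is fetched from memory at run-time.

: lit-field ( offset size "name" -- new-offset )
    over >r  +              \ -- new-offset         R: -- offset
    :                       \ parse the name and start a colon definition
    r> postpone literal     \ compile the offset as a literal
    postpone +              \ compile + to add it to the record address
    postpone ; ;            \ finish the colon definition

0
    1 cells lit-field .aaa2
    1 cells lit-field .bbb2
constant rrr2

create sss2  rrr2 allot

: ttt2 ( record -- aaa+bbb )
    dup .aaa2 @  swap .bbb2 @  + ;
```

Here .AAA2 behaves as if it had been written by hand as a colon word that adds 0, and .BBB2 as one that adds the size of one cell. The shared DOES> code and its run-time fetch are gone. The call to .AAA2 or .BBB2 itself remains unless the compiler inlines it, which this sketch does not attempt.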

CREATE DOES> is grossly lacking in robustness, in that it only allows for one action to be associated with the datum. It also generates slow code. I strongly recommend that the novice ignore CREATE DOES> and use :NAME instead. Also, it would be worthwhile to petition the Forth-200x standards committee to get rid of CREATE DOES> and provide :NAME instead. Unfortunately, the people on the Forth-200x standards committee today are the same people who were on the ANS-Forth standards committee in 1994. Considering what a bad job they made of ANS-Forth, it is likely that Forth-200x will be more of the same. If enough people petition for :NAME however, maybe they will listen. I tried but was considered to be a VCIW and was ignored (VCIW means "Voice Crying In the Wilderness"). Most of the Forth-200x members have been around since the 1970s and they are hanging onto CREATE DOES> in an effort to be faithful to Chuck Moore's original slightly-flawed language of the 1970s. Significantly however, Chuck Moore himself is not wasting his time on the Forth-200x committee.