Valid
	XHTML 1.1! Valid CSS!
Created 2003-10-01   Modified 2009-04-11
Chelton Evans

proj rpn home

Intro
rpn User Manual     - Documentation
TODO
      Critical
      Long Term
The Interpreter and the Stack
Patterns
Pointers
Memory
Data Types
Real Time Type Checking

Source

Files

Makefile
keyboard.cpp
keyboard.h
main.cpp
mathfunc.cpp
mathfunc.h
pathstuff.cpp
pathstuff.h
rawinterpreter.cpp
rawinterpreter.h
rpn.cpp
rpn.h
rpn2.h
rpnfunc.cpp
rpnfunc.h
scopedependentfunctions.cpp
scopedependentfunctions.h

projcompile.txt
unittestsreport.txt

Doxygen

main.cpp
Makefile
addtothedictionary
dssize
dssize2
fbuild
fbuildbase
fbuilduser
fdatainterp
ifthen
ifthenelse
initscopedependentfunctions
innerinput
inputstate
inputstatescope
isrpncomplex
isrpnprogram
isrpnreal
keyboardinterface
mathconstants
outerinput
passtype
pathstuff
pathtoggle
rev
rot
rotn
rpnascii
rpnbase
rpnclear
rpnclearboth
rpnclearvar
rpndup
rpndupn
rpnerase
rpneval
rpnfunction
rpngcd
rpninsert
rpnintegerbin
rpnintegerconvert
rpnintegerdec
rpnintegerhex
rpnintegeroct
rpnmod
rpnpointermake
rpnpop
rpnpopn
rpnprogram
rpnprogramstackstate
rpnpush
rpnpushn
rpnreal
rpnrealconvert
rpnstring
rpnstringconvert
rpnvar
rpnvector
rpnvectormake
rpnvectorpointermake
stateevalquery
stateevalset
stateevalunset
thenelseif
thenif
varpwd
vartree

Intro

Status: Not working.
Still broken after port from a long time ago.

Reverse Polish Notation - implementing an application language. Consider the HP calculators. The motivation comes from using HP calculators which are great to program on (sticking to the basic programming structure, passing information between functions through the stack). However I came across a book called "Write your own programming language using C++" by Noman E. Smith. He argued for an application having its own mini-compiler programming language and I don't know much but it sounded good.

Bringing a command line to an application is an offer I can't refuse. The aim is to implement a stack based programming language in C++.

<TODO>

Critical

Long Term

Data Types

Variable Space

A dequeue will hold the variables. To determine if a string is a variable, search the dequeue from start to finish, and take the first match. This lets the variables be overloaded by putting a new variable at the start of the list. For example implementing temporary variables. Or for a more 3D data structure implement directories and the current directory, the variables are from the directories, home to the current one each added down the list.

Error Messages

Since its a runtime compiler any error will be ignored. Generally the stack state is preserved. Sometimes an error message is printed, but I would prefer a strategy of non-action.

Patterns

The visitor pattern was used to match the functions to the data. For one dimensional functions - that is functions that only need to have one parameter type identified the standard visitor pattern does the job. On closer inspection you will have observed the visitor or data in the base class and the functions derived. This is because the functions are also data, can be put on the stack and evaluated too. So the hierarchy is a natural consequence of having everything evaluated on the stack.

For two dimensional functions such as addition the matching of data type is more complicated. However it is solved by reducing the problem to solving two one-dimensional functions. For example the function rpnadd determines the first arguments type. It then forwards the work to another class which assumes the first argument type. So if its an rpninteger the work is forwarded to addtointeger.

The visitor pattern is being used as an interpreter. It is matching the data and function. "Write your own programming language in C++" page 36 refers to other interpreters - of which the switch threaded code is fast, and the subroutine threaded code is fastest. Probably both these methods are faster than what I am implementing. I don't know too much about compilers, but I hope that by in-lining my virtual function calls the code will be competitive - say about 8 times slower. This is by no means the end of the story. To be honest the way a rpnprogram data type evaluates has no reuse - once a program is evaluated its thrown away. The data structure is very versatile and provides support for local variables, but it seems to me to be a good candidate for a pattern.

Real Time Type Checking

Consider real time type checking vs pattern. Each time the pattern is used two function calls are made, and at least one class has to be implemented for each dimension. Well thats how I started thinking about the issue. However I am not a purest and implementing a class for an if statement is questionable. After some thought I believe that both can be used together. The visitor pattern still has a clear separation between function and data - and can be used for these mappings. Runtime type checking can then be used to validate the stack arguments. The solution still has structure and is versatile.

Compare rpnaddition and rpnmultiply against other binary functions. The visitor pattern uses three classes, but using real time type determination (although built into C++ I didn't use) only one class was required.

Programming Language

All programming constructs will be in reverse polish notation - ie stack based.

For example consider the problem with traditional languages, the FOR loop structure of the HP.
   1 N FOR I ... NEXT
The FOR and NEXT are coupled syntax and the whole has to be read before it can be interpreted. This is ok but a more generic for loop has been written.

{} --- test
{} --- loop body
for

This has the advantage of using one token and can be immediately interpreted. Similarly an if-then-else construct is built.

result of test
{ "1" }
{ "0" }
ifthenelse

The Interpreter and the Stack

The commands have to be interpreted. This is based on the state - either objects are placed on the stack or evaluated on the stack. The user sets the state. @+ Functions such as +,dup derived from rpnfunction are placed in a dictionary. When a string from the input stream is encountered, it is interpreted. If its a primitive data type its put on the stack as such. If its a program the interpreter has to recursively build it.

If the string token isn't a number or program data type its looked up in dictionary hierarchies. If its not found in these its assumed to be a primitive string data type.

The dictionaries are layered. If the string isn't found in one its looked for in the next. The first correct match is the one chosen, so variables can be overloaded. However more common is overloading within a dictionary itself. As it stands a dictionary is implemented as a LIFO structure. Overloading is simply being closer to the start of the stack.

Memory

The memory and evaluation were coupled so that when an object is evaluated it releases its resource. The exception to this is data types that are not programs which don't free themselves but are released by functions. The implications are far reaching - when a function or program is evaluated its destructed and can not be reused.

The reason for implementing it this way was that it was the easiest way to handle evaluation by forwarding the cleanup responsibility to the functions which know when an object has to die.

To be efficient this compiler would have to be rewritten and memory issues better addressed. A fix it strategy is to implement reference counted objects, say on programs would drastically reducing copying in something like the for loop. This kind of optimization would benefit this compiler. A persistence boolean flag in every class would be implemented. When an object is evaluated it would delete itself based on the persistence flag.

Another strategy is to build another compiler that compiles the code much better. I will have to read up. Being independent it could plug in to this framework - where it shares the same language.

Pointers

Pointers arose from the need to give the programmer flexibility when writing algorithms. eg multiply two numbers in the stack, swap two numbers in the stack. The pointer mechanism allows referencing into the stack, not just on the bottom. Thanks to Nigel Stewart for raising the issues about programming on the stack. From the shortcomings of specific functions I implemented pointers - a more generic solution.

Reference counting and pointers were simultaneously implemented. For a robust system it is critical that a null pointer can never be dereferenced (and segfault). As this is to be an application language stability takes priority over efficiency. The application language serves the program, it can never bring it down.

So reference counting lets an object stay alive as long as there exists a reference to it. So if its deleted from the stack and there still exists a pointer to it then the pointer is still pointing to a valid object.

Saving the system state is also of interest - how is a pointer to be saved. I choose to realize the pointer, so the value is written. So pointers are a unique data type in that they always represent something else and are not manipulated by the user. ie the user can create pointers but any operations on them are on what the point to.