@sundialservices, All,
Summary for the Impatient:
This is all background. Please skip it if your Verbose setting is Low.
Agreed. Compilers are awesome tools. Been using them since mid 80s.
I spent more than a little time looking into using an existing Compiler/Interpreter.
Finally decided to write my own Interpreter.
Why, you ask?
The project is (right now) my own somewhat skewed *Hobby.
I am learning from my mistakes.
This is my first Interpreter.
I found Haxe and wanted to give it a try.
I like a lot of what Haxe offers.
Technically the project allows nearly any mix of Infix, Prefix or Postfix notation.
You realize that nearly all programming languages have the concepts of LHS and RHS?
I will also be adding support for Parsing RIGHT to LEFT (Ex: Hebrew, Arabic).
The project has 3 types of Assignment statements:
- What other programming languages use (this is resolved by Parser/Interpreter)
  = Assign to Left the value/expression evaluation of the Right
  x = 42 .
- Assign to Left (similar to = but always deterministic)
  := where you see the Colon : is the side that gets changed
  x := 7 * 6 . Or x from 42 .
- Assign to Right
  45 radians sin into x . Or sin( radians( 45 ) ) =: x .
The 3 different Assignments are there to make learning Computer Programming easier for people coming from a non-English and possibly not deeply technical background.
The from and into words allow a logical way to express relationships at the Natural Language level, but they are aliases for the internal := and =: operators.
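To make the direction rules concrete, here is a minimal sketch in Python (not the project's Haxe code) of how one statement could be split into a target and an expression. The names `ALIASES` and `split_assignment` are hypothetical, tokens are assumed to be separated by spaces, and treating plain = like := is a simplification of this sketch only.

```python
# Hypothetical alias table: the post only says that `from` and `into`
# are aliases for the internal := and =: operators.
ALIASES = {"from": ":=", "into": "=:"}
ASSIGN_OPS = {"=", ":=", "=:"}


def split_assignment(statement):
    """Return (target_name, expression_tokens) for one statement.

    Assumes space-separated tokens and that the trailing terminator
    (the period) has already been removed.
    """
    tokens = [ALIASES.get(tok, tok) for tok in statement.split()]
    positions = [i for i, tok in enumerate(tokens) if tok in ASSIGN_OPS]
    if len(positions) != 1:
        raise ValueError("expected exactly one assignment operator")
    pos = positions[0]
    left, op, right = tokens[:pos], tokens[pos], tokens[pos + 1:]
    if op == "=:":
        # The colon is on the right, so the right side gets changed.
        target, expression = right, left
    else:
        # ":=" always changes the left side; plain "=" is treated the
        # same way here, which is a simplification of this sketch.
        target, expression = left, right
    if len(target) != 1:
        raise ValueError("target must be a single name")
    return target[0], expression
```

With this sketch, `split_assignment("45 radians sin into x")` and `split_assignment("x from 42")` both pick `x` as the target, one reading from the right end and one from the left.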
Also Semantically the ordinary = Assignment can be extended to something like:
x * y = 5 - z .
Conventional programming languages do not handle this, but in the Future the project may allow a more Symbolic approach to Systems of Equations,
like 4 Equations and 4 Unknowns for example.
Some of you may recognize a Forth-like syntax above.
The Interpreter implements a Stack based approach that is more verbose than Forth.
You may note that a simple period is used to end a statement. The characters , ; : may also be used.
The project uses the period in examples because it fits Natural Language syntax better (at least for statements in European languages).
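As a rough illustration of the terminator rule, here is a small Python sketch that cuts a token stream into statements. It assumes, as in the examples above, that each terminator stands alone as its own space-separated token; `TERMINATORS` and `split_statements` are made-up names for this sketch.

```python
# The four statement terminators mentioned in the post.
TERMINATORS = {".", ",", ";", ":"}


def split_statements(source):
    """Split space-separated source text into lists of tokens,
    one list per statement, dropping the terminators."""
    statements, current = [], []
    for token in source.split():
        if token in TERMINATORS:
            if current:
                statements.append(current)
            current = []
        else:
            current.append(token)
    if current:
        # Keep a trailing statement that has no terminator.
        statements.append(current)
    return statements
```

Because the tokens are compared whole, := and =: are not mistaken for the : terminator, and a number like 3.14 is not cut at its decimal point.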
Looked at some BNF and EBNF and PEG tools.
I found 1 tool that was flexible enough, but it was big $ and proprietary.
None of the other tools offered a solution. At best I would need to patch tools together.
At worst I would be Debugging assumptions about LHS and RHS Grammars/Parsers, etc.
Earlier I said I was working on a rough Prototype.
Right now, I do not care about Performance, only Correctness.
If I really cared about performance now, I likely would have finished version 0.1 in C++
and used LLVM or various other compilers for distribution to whatever platform(s), similar to the options kindly allowed by hxcpp.
But I wanted to learn to be dangerous with JavaScript as well.
I have the entire Interpreter running as a Web Worker with a primitive UI in HTML.
No CSS, No JavaScript libraries.
Need more work on the Web side to do the same as Console apps in C++, C#, Java, Python.
(Show of hands, please … does anybody like to juggle characters? … Anybody? …)
Imagine me holding up Both hands.
First full time Programming job was 8086 ASM on original IBM PC using Edlin and MASM and Debug (my humble SKEWED beginnings, so this is Easier!)
Approaches
User/Programmer defines words in simplified form of their own natural language in a Dictionary. I use .toml style as Dictionary Input/Output.
Verbs, Nouns, Operators, Built In Verbs are provided.
Verbs are like Code. Right now most examples are Imperative.
In the future more Declarative is planned.
Nouns are like Data.
Nouns can be declared within a block in a Verb and those act like local variables.
Operators do well known actions. Some use Data, some produce Data, some both.
Built In Verbs do well known actions in a Declarative way. Show or Repeat for example.
Constants like Numbers or Strings or Bools can be sprinkled in for extra flavor.
Abstract/Concrete Syntax Array and not Tree
This allows easy tracking of Reading order, which I think is important both for Users/Programmers and for my own sanity.
I am not sure how to support Reading order well within a Tree representation.
An Array is also (perhaps) less complex than a Tree: I suspect a Tree would need either 3 traversal approaches or 3 different trees to handle the various prefix/infix/postfix combinations.
An Array also maps much closer to ASM than a Tree does. Perhaps output to WASM or LLVM IR sometime later.
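A minimal Python sketch of the Array idea, under my own assumptions about what an entry holds: each entry is a (position, kind, text) tuple kept in reading order, and a postfix statement is run in one left-to-right pass. The `WORDS` table and both function names are hypothetical, not taken from the project.

```python
import math


def to_syntax_array(statement):
    """Build a flat list of (position, kind, text) entries in reading order."""
    entries = []
    for position, text in enumerate(statement.split()):
        try:
            float(text)
            kind = "number"
        except ValueError:
            kind = "word"
        entries.append((position, kind, text))
    return entries


# Hypothetical built-in words, each taking one value and giving one back.
WORDS = {"radians": math.radians, "sin": math.sin}


def run_postfix(entries):
    """Walk the array once, left to right, using a data stack."""
    stack = []
    for _position, kind, text in entries:
        if kind == "number":
            stack.append(float(text))
        else:
            stack.append(WORDS[text](stack.pop()))
    return stack
```

The position stored in each entry is what keeps the Reading order available for error messages or for re-printing the statement, with no tree traversal involved.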
Paradigms
Concatenative
User/Programmer sees a Data stack as part of learning.
Data stack is passed in to a called Verb. Verb returns values by adding to Data stack.
Why?
Because every general purpose CPU and many embedded CPUs have Stack Registers/Segments, I felt that knowing a little about a Data stack is useful.
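The calling convention described above (the Data stack goes in, results come back on the same stack) can be sketched in a few lines of Python. The built-ins `+` and `dup`, the `run` function, and the user Verb `double` are all invented for this illustration.

```python
def add(stack):
    """Pop two values and push their sum."""
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)


def dup(stack):
    """Push a copy of the top value."""
    stack.append(stack[-1])


# Hypothetical built-in words for this sketch.
BUILT_INS = {"+": add, "dup": dup}


def run(words, stack, user_verbs):
    """Run a list of words against one shared data stack."""
    for word in words:
        if word in BUILT_INS:
            BUILT_INS[word](stack)
        elif word in user_verbs:
            # The same stack is passed in to the called Verb, and the
            # Verb returns its values by leaving them on that stack.
            run(user_verbs[word], stack, user_verbs)
        else:
            stack.append(float(word))
    return stack
```

For example, with `user_verbs = {"double": ["dup", "+"]}`, running the words `21 double` leaves a single value, 42, on the stack.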
Call by Syntax
Supported (in progress today) as part of exporting the flexible syntax to more conventional programming languages like Haxe.
Haxe to learn programming, WOW !
Haxe to learn programming in about a dozen programming languages WOW squared!
Declarative
Limited examples now like Show or Repeat
Future direction. Think of this as taking the approach that SQL applies to databases and applying it to programming languages, as a DSL or a Programming Design Language. There are many examples of DSLs making programming easier and less error prone.
Functional
Future direction. Thinking about integration with Haskell or F# or Joy or ?
Programming by Logic
Future direction. Thinking about integration with Prolog or similar.
A lot of AI now started with Prolog and a fair amount still uses Prolog.
Code Generation
Future. Likely do 1) easily and quickly.
Add support to Generate new Verbs in 2 different ways:
1) Change a Noun definition until it looks like a Verb.
   Use a reserved keyword like to_Verb to take the Noun and make a new Verb.
   Run the new Verb as you would run any other.
2) Create 1 or more Verbs from embedded Comments.
   Comments may use a specific format to enable/encourage some extra parsing.
   Extra parsing may be more on the Declarative side than the Imperative.
Two sources for idea 2)… Current Database engine research allows a Declarative description of the Data (like the SQL logical layer), and the research engine then figures out the details of Data Structures and Algorithms. I found this from the Daily Paper site. The other source is my epiphany that the Semantic contents of Comments can act as place holders for a similar approach, not just for Databases (“a well-known and well-studied problem”) but more generally for other programming patterns.
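Approach 1) is still a Future item, so the following Python sketch is only a guess at the general shape: a Noun holding text is promoted into a Verb and then run like any other Verb. Only the keyword name to_Verb comes from the post; the `nouns` and `verbs` dictionaries, the text-only restriction, and the tiny `run` loop are assumptions made for illustration.

```python
def to_verb(name, nouns, verbs):
    """Promote the Noun `name` to a Verb with the same name."""
    value = nouns[name]
    if not isinstance(value, str):
        raise TypeError("only text Nouns can become Verbs in this sketch")
    verbs[name] = value.split()
    del nouns[name]


def run(words, stack, verbs):
    """Run words against a data stack; `*` is the only operator here."""
    for word in words:
        if word in verbs:
            run(verbs[word], stack, verbs)
        elif word == "*":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            stack.append(float(word))
    return stack
```

Starting from `nouns = {"answer": "7 6 *"}` and an empty `verbs` dictionary, calling `to_verb("answer", nouns, verbs)` moves the entry across, and the word `answer` can then be run as a Verb.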
Library Interface
Future. Leaning toward using a .toml file to declare an IDL-style interface.
I like the IDL idea as it implements the IUnknown interface, so that existing Libraries will work and new Libraries may be added with new features without breaking old interfaces.
Also allows runtime Documentation from actual interfaces.
May allow Dynamic Libraries that do not consume Resources when not being used.
Target Platforms
Phones, Smart Phones, Tablets, Laptops, Desktops, etc.
Project is intended to not consume large resources of the platform.
A problem being well known and well studied does NOT mean it is well Implemented.
My current claims are at a Prototype level of expectations: the internal Implementation is not significant. The only thing of significance is Correctness of Results.
My assertion is that it only takes 1 counterexample to show that another way is an effective approach. I have read on the order of 500 papers/articles & several books, on the order of 100+ of them related to the problem/design/solution/implementation domains of the project.
In the mid 90s I wrote a Windows or NetWare multithreaded database engine that was anywhere from 2x to 10x faster than SQL DB on the same HW/OS. It is also easy to fall into the Sorting too late trap. “When Sorting Needs To Be Faster”.
*Hobby now defined as 10+ K lines of Haxe with 3+ K lines of Comments in about a dozen source files, all Haxe except for 1 HTML and 1 JavaScript for UI front end of Web Interpreter.
Having Fun, Hope You Are Also !