Have you ever written an interpreter or compiler for your own toy programming language?
I was bored one afternoon a long time ago and wrote a little C program to emulate an imaginary RISC CPU and invented an instruction set for it. I never got around to writing an assembler, but the disassembler was trivial.
Some years later I wrote a completely crazy stack-based language whose only data type was the string, and used it for drawing pictures with a home-made graphics library.
I also wrote an incredibly simple, but entirely serious, scripting interpreter to write regression test scripts for an API for a device driver I had written.
One of these days I will get around to doing another one, for my own amusement.
What crazy ideas have you had? What have you tried? What would your ideal, or ideally absurd, language be like, and could you implement it?
(Score: 1, Informative) by Anonymous Coward on Thursday March 16, @09:51PM (1 child)
https://en.wikipedia.org/wiki/Yacc [wikipedia.org]
(Score: 0) by Anonymous Coward on Thursday March 16, @10:13PM
Are you drunk too?
(Score: 3, Informative) by DannyB on Thursday March 16, @10:07PM (1 child)
I have played with implementing existing languages. But not inventing any new languages. Three things.
Back in the mid 1980s, using Pascal. When building a report writer for an accounting system with a database, I implemented a small expression language. No statements, just calculations in the form of an expression that evaluates to a value, like what you type into the cell of a spreadsheet. It could reference other values, call a few built-in functions, and use a few operators and constants. The real learning experience here was making serious use of recursion in Pascal, working from the syntax chart of what constituted legal values. (The entire Pascal language was written as this type of chart, and Apple Computer made it into a huge, colorful, wall-sized poster.)
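That kind of syntax chart maps almost one-to-one onto recursive procedures. Here's a minimal sketch of the idea in Python (hypothetical code, not the original Pascal; the grammar and the `evaluate`/`tokenize` names are invented for illustration):

```python
# Minimal recursive-descent expression evaluator: one function per rule
# in the syntax chart (expr -> term -> factor). Purely illustrative.
import re

def tokenize(src):
    # numbers, identifiers, and single-character operators/parens
    return re.findall(r"\d+\.?\d*|[A-Za-z_]\w*|[-+*/()]", src)

def evaluate(src, variables=None):
    toks = tokenize(src)
    pos = 0
    variables = variables or {}

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take():
        nonlocal pos
        tok = toks[pos]
        pos += 1
        return tok

    def expr():                      # expr := term (('+'|'-') term)*
        value = term()
        while peek() in ("+", "-"):
            op = take()
            value = value + term() if op == "+" else value - term()
        return value

    def term():                      # term := factor (('*'|'/') factor)*
        value = factor()
        while peek() in ("*", "/"):
            op = take()
            value = value * factor() if op == "*" else value / factor()
        return value

    def factor():                    # factor := number | name | '(' expr ')'
        tok = take()
        if tok == "(":
            value = expr()
            take()                   # consume ')'
            return value
        if tok[0].isdigit():
            return float(tok)
        return variables[tok]        # reference to another value, like a cell

    return expr()
```

Each chart box becomes a function, and following an arrow in the chart is just a call; that is the "serious use of recursion" in a nutshell.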
In the mid 1990s I was learning both C and C++, considering them as a replacement for Pascal. (But the end result needed to be portable across our customers' platforms and have a GUI. In a roundabout way, this ultimately led to Java, and eventually to building web-based applications.) While learning C, I decided to do something I had wanted to do way back in my college days when I was Dunning-Kruger'ed with BASIC. So I built a BASIC interpreter. Launch the C program, and up comes a console with BASIC. I added plenty of whiz-bang bells and whistles, because back in college, when I thought I could do ANYTHING in BASIC, I had been fascinated with extending it.
In the mid 2000s, I could see that Visual FoxPro was not going to last forever. I wondered about building a compatible implementation in Java. So I experimented. There were two major elements (well, more, counting the GUI, but only two that I focused on initially). I built working code that could read and write the DBF (records), FPT (memo), and CDX (index) files of this "dBase"-like database. I also worked on the language. I built it as an interpreter. I had a significant enough part of the language working to test it. I could open tables, manipulate data, etc. Alas, I was unhappy with the performance. This was back when Java bytecode was interpreted, not JIT-compiled by advanced compilers as it is today. After this adventure I turned my sights to "what if we rebuilt everything as web applications?"
It is not that I wouldn't try creating a toy language. But I would need to have a reason. Maybe a spark of insight where I think I've recognized some opportunity to make programming easier, or less error prone, or to make complex ideas easier to express and represent. Those are what I consider to be true motivations for creating yet another language. An advancement upon the current state of the art. It seems most of the languages now in existence were created from one of those motives. To make things either easier, safer or even simply possible.
How often should I have my memory checked? I used to know but...
(Score: 0) by Anonymous Coward on Thursday March 16, @10:30PM
> Ultimately this eventually led to Java
Undeniably the world's most popular toy programming language ;^o
State stores and dispatch tables have been at the heart of every serious program I worked on - but yo dawg, I built my VM for a VM.
(Score: 2, Interesting) by Anonymous Coward on Thursday March 16, @10:32PM (3 children)
I've been writing an array language on and off for ~2 years now. I'm constantly changing the foundations as I make progress, so it's been slow, but after rewriting from scratch around 20 times in Rust, Go, C, and different assembly langs, I think I'm getting pretty close to a fully fleshed-out design. Immutable objects, mutable context, implicit mapping over arrays unless explicitly specified. RPNish+prefix symbols, but not actually stack-based for now, though I have flip-flopped on this several times and the syntax has changed drastically.
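For readers unfamiliar with the array paradigm, "implicit mapping over arrays" usually means something like recursive broadcasting. A guess at the semantics in Python (the AC's design isn't public, so this sketch is purely illustrative):

```python
# Hypothetical sketch of an "implicit mapping" rule: a binary primitive
# applied to arrays of equal length maps elementwise, a scalar on either
# side is broadcast, and nested arrays recurse. Invented for illustration.
def apply_binary(op, a, b):
    a_is_arr, b_is_arr = isinstance(a, list), isinstance(b, list)
    if a_is_arr and b_is_arr:
        if len(a) != len(b):
            raise ValueError("length mismatch")
        return [apply_binary(op, x, y) for x, y in zip(a, b)]
    if a_is_arr:
        return [apply_binary(op, x, b) for x in a]   # broadcast scalar b
    if b_is_arr:
        return [apply_binary(op, a, y) for y in b]   # broadcast scalar a
    return op(a, b)                                  # plain scalar case
```

Under that rule, `+` applied to `[1, 2, 3]` and `10` yields `[11, 12, 13]` with no explicit loop, which is the selling point of the paradigm.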
(Score: 1, Informative) by Anonymous Coward on Friday March 17, @12:05AM (1 child)
This https://en.wikipedia.org/wiki/Array_programming [wikipedia.org] suggests that there are already quite a few of these. Do you have any specific uses in mind?
(Score: 2, Insightful) by Anonymous Coward on Friday March 17, @01:47AM
At first I was just doing it for fun, but during the process of designing and building I've come to think that the array paradigm has promise for low-level work on modern architectures (but most of the existing array langs are pretty exclusively high level). C and other Algol-descendants don't really seem to map all that well to the actual hardware anymore, and despite 50 years of compiler work look like a pile of mostly-unnecessary shit when disassembled. Mostly just for fun though.
(Score: 3, Interesting) by istartedi on Friday March 17, @11:23PM
Although my toy language is stack-based, if I were starting from scratch I might consider a register based VM. Check out Lua. The register VM allegedly gives it a performance boost. Of course any VM is going to be slower than native code, and I have "self-hosting" via compilation to C as an objective so I'm not that upset about my VM being slower. It's just an idea. Also, your internal "compiler" for the VM seems like it might be a bit more complicated for a register machine but I'm not sure how much harder it is because I've never tried.
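For anyone weighing the two designs, here's a minimal sketch of the difference in Python (toy instruction sets invented for illustration, not Lua's actual bytecode): the stack machine needs more instructions but each is trivial to emit; the register machine names its operands explicitly, which is what complicates the compiler.

```python
# The same expression, 2 + 3 * 4, in a toy stack VM and a toy register VM.

def run_stack(program):
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack[-1]

def run_register(program, nregs=4):
    regs = [0] * nregs
    for op, dst, a, b in program:
        if op == "loadk":
            regs[dst] = a                    # b is unused for loads
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
    return regs[0]

# Stack encoding: operand order is implicit in the stack.
stack_code = [("push", 2), ("push", 3), ("push", 4), ("mul",), ("add",)]
# Register encoding: fewer ops once loaded, but the compiler must
# allocate registers and name them in every instruction.
reg_code = [("loadk", 0, 2, 0), ("loadk", 1, 3, 0), ("loadk", 2, 4, 0),
            ("mul", 1, 1, 2), ("add", 0, 0, 1)]
```

Both return 14; the register allocator is indeed the extra complexity mentioned above, though for an expression compiler a simple "next free register" counter often suffices.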
(Score: 2, Interesting) by khallow on Friday March 17, @05:26AM (8 children)
My ideal language for the purpose I was pursuing is homoiconic [wikipedia.org] (that is, programs are the same as any other data and easily manipulated), extremely simple, and not a Turing tarpit [wikipedia.org].
(Score: 4, Interesting) by Mojibake Tengu on Friday March 17, @07:39AM (7 children)
Instruction sets on von Neumann machines are homoiconic. Data is code and code is data. So assemblers on said machines are homoiconic too.
It's just a design failure of common programming languages that they are not.
Metaprogramming in macroassemblers (slightly advanced ones, like nasm, which closely imitates ancient industrial assemblers) using combinator systems like BCKW is trivially easy.
The edge of 太玄 cannot be defined, for it is beyond every aspect of design
(Score: 1) by khallow on Friday March 17, @05:33PM (6 children)
There's a subtle problem with that. For example, if you wish to apply data to a program that you're manipulating, how do you distinguish between the applied data and the program? The fungibility of program and data complicates that. With combinators I have natural ways to assemble combinators that allow for such distinctions. For example, if I want to run program P on data D, with combinators, I can just multiply them together (combinators form a weird algebra) PD. If I want to switch the role of the two? DP instead.
For the von Neumann model, this can break homoiconicity to some degree because you have to put the code somewhere and the data that the code operates on elsewhere. And thus, the code is treated somewhat differently than the data it operates on.
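The PD/DP idea can be made concrete with a toy head-reducer for the BCKW combinators mentioned upthread. Everything here (the pair encoding, the `normalize` helper) is an invented sketch, not either poster's actual system:

```python
# A toy leftmost (head-position) reducer for the B, C, K, W combinators,
# plus I. Terms are nested pairs: (f, x) means "apply f to x"; bare
# strings are free atoms, so "P" and "D" stand for program and data.

ARITY = {"I": 1, "K": 2, "W": 2, "B": 3, "C": 3}

def spine(term):
    # Unwind the application spine: ((f, x), y) -> ("f", [x, y]).
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]

def build(head, args):
    for a in args:
        head = (head, a)
    return head

def step(term):
    head, args = spine(term)
    if head not in ARITY or len(args) < ARITY[head]:
        return term, False           # atom at the head: normal form
    if head == "I":                  # I x     -> x
        return build(args[0], args[1:]), True
    if head == "K":                  # K x y   -> x
        return build(args[0], args[2:]), True
    if head == "W":                  # W f x   -> f x x
        return build(((args[0], args[1]), args[1]), args[2:]), True
    if head == "B":                  # B f g x -> f (g x)
        return build((args[0], (args[1], args[2])), args[3:]), True
    if head == "C":                  # C f x y -> f y x
        return build(((args[0], args[2]), args[1]), args[3:]), True

def normalize(term, fuel=1000):
    changed = True
    while changed and fuel > 0:
        term, changed = step(term)
        fuel -= 1
    return term
```

Here `K P D` reduces to `P` (the data is discarded), while prefixing with `C I` swaps the roles: `C I P D` reduces to `D` applied to `P`, which is the "DP instead" move described above.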
(Score: 2) by hendrikboom on Friday March 17, @07:51PM (4 children)
With your PD model, presumably the P is in different memory from the D as well. Not a different kind of memory, not memory with different access rights, but different physical memory. If they were in exactly the same memory, then P and D would be equal. You usually don't want a program to be operating on itself as data.
The story goes that when they ran the very first in-memory sort program, they had a bug (not surprising), but they had immense trouble finding it (somewhat more surprising) -- until they discovered that they had set the bounds of the storage containing the data to be sorted wrong, and the program had been sorting its own code -- the same copy it was executing. Of course that did not work.
So with your combinators it's perfectly legal to let P = D and effectively be executing PP or DD. It's just that's usually not what you want.
--hendrik
(Score: 1) by khallow on Friday March 17, @09:49PM (1 child)
But in a natural way - you don't have to specify where the program and data are in the language. The PD product is a single tree structure that is computed - there's no consideration of what parts are P or D after it fires up. It's a mosh pit that goes till it stops. Mojibake mentioned that combinators can be emulated in von Neumann machines and that would be one way to sidestep the problems I mentioned earlier.
(Score: 0) by Anonymous Coward on Saturday March 18, @06:56AM
Of course you're going to have to specify a lot of things that are implied in the notation -- at first. Once you build a VM that handles all those annoying behind-the-scenes operations explicitly, you put the parser on top and handle them implicitly in the syntax. (You could do without a VM, but messing with program memory under an OS is a surefire way to find heisenbugs.)
(Score: 1, Insightful) by Anonymous Coward on Saturday March 18, @12:23AM (1 child)
In many machines, the program instructions are loaded into the lowest memory and data into higher memory addresses. The story as I heard it is that they loaded the data into memory. They were supposed to set the start of the data to the lowest address of the loaded data but instead set it to zero, which is a surprisingly common bug in low-level programming. So the program dutifully sorts away in the memory space, including the program instructions themselves. Eventually Bob's your uncle and you've got a completely scrambled program doing all sorts of "fun" things. You can do similar things today and the results can be quite interesting and fun.
(Score: 2) by hendrikboom on Saturday March 18, @01:28AM
Yes. This actually happened in the early days of computing, as I mentioned elsewhere on this page. The programmers were surprised.
(Score: 3, Interesting) by Mojibake Tengu on Saturday March 18, @10:36AM
At the language level, I'd use some explicit designator operator to syntactically distinguish (or morph) meaning between program and data, of course with proper accessor operators understanding designators intrinsically.
At the machine code level it does not matter; I can put code or data anywhere I want (and I often do mix code and data), with respect only to proper section decorators (such as the executable bit on some architectures and platforms). Any JIT is just conceptually "making data executable".
Note that C compilers use funny, orthodox, rigid sections designed by dogma, strictly separating data and code, but that's not really a limitation of the assembler or the linker.
What is certain: a truly homoiconic high-level language should have its own execution model and linkage rules, but even that is not beyond the common capabilities of current toolchains.
The edge of 太玄 cannot be defined, for it is beyond every aspect of design
(Score: 2) by Freeman on Friday March 17, @01:51PM (2 children)
I've not re-invented the wheel for the sake of re-inventing the wheel. Sure, I'm quite certain I've run into that trap with whatever I was toying with, but it wasn't for the sake of re-inventing the wheel. I'm glad that some people have taken the time to create useful programming languages and the like. I also am likely not good enough at the science part of computers to actually create my own language. I certainly don't have the will to do so.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 3, Interesting) by turgid on Friday March 17, @02:27PM (1 child)
I'm not good enough, or patient enough, to implement a "real" programming language, but it's fun to play with toy languages. It's fun to see what you can do very simply and cheaply.
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 1, Touché) by Anonymous Coward on Saturday March 18, @12:34AM
Don't count yourself out too soon. A number of common programming languages and projects started as toys and Linux started as a toy OS. In fact, I know of a number of toy languages turned DSLs that support huge businesses and projects. All it takes is for a toy to meet feature creep or shared interest and you are well on your way.
(Score: 1) by shrewdsheep on Friday March 17, @02:29PM
To prototype a parser, use Parse::RecDescent. IMO nothing else beats it, not even Raku, which was specifically designed to handle grammar parsing. Otherwise I can only echo the sentiment that in 99 out of 100 cases you will be doing something redundant. For your (hobby) assembly project, Parse::RecDescent should be a precise match; a full assembly parser (for a parse tree only) should not be more than 50 lines. Then you have to implement some games and add your emulator to RetroArch.
(Score: 2) by Tork on Friday March 17, @05:14PM (2 children)
Ummm... I can sort of say yes. I am *not* a programmer, but I am someone who uses scripting languages quite a bit, mainly to automate tasks. (To me the distinction is that a lot of the commands I have memorized are app-specific and not language-specific.) The first one I used, for example, was VBA inside of Excel... I had a boss who gave me a little time to research and learn it on my own, so I wrote functions to do things like sort rows/columns alphabetically. Those functions actually existed already, but it was a way for me to stretch my legs with the language. I was able to demonstrate that I could take Excel data and operate on it with custom code, and that opened a few doors for me.
Since then I've been employed to reduce mouse clicks by automating tedious tasks in various apps. On a few occasions I've created a simplified macro language and an interpreter to run through them. Very very basic stuff, like: "open=C:/PathToMyNewFile.txt", "save=C:/MyNewLocation/PathToMyNewFile.txt". But then I'd add features specific to the app being scripted. If I were talking about GIMP for example I might add a command like "GaussianBlur=32", and my interpreter would apply a 32px blur to the image.
The reason behind approaching it this way is that it makes it pretty easy to write text files with these simple instructions and use them as a form of batch processor. I'm not sure it quite counts in the context you mentioned but it's as close as I can get to contributing to this particular discussion.
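That key=value macro style boils down to one line per command plus a dispatch table. A hypothetical Python sketch (the command names and handlers are invented for illustration; the actual tools described above are app-specific):

```python
# Each script line is "command=argument", looked up in a dispatch table.
def run_macro(script, handlers):
    for lineno, line in enumerate(script.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue                       # allow blank lines and comments
        command, _, argument = line.partition("=")
        try:
            handler = handlers[command]
        except KeyError:
            raise ValueError(f"line {lineno}: unknown command {command!r}")
        handler(argument)

# Example wiring: record the calls instead of driving a real app.
log = []
handlers = {
    "open": lambda path: log.append(("open", path)),
    "save": lambda path: log.append(("save", path)),
    "GaussianBlur": lambda px: log.append(("blur", int(px))),
}
run_macro("open=C:/in.txt\nGaussianBlur=32\nsave=C:/out.txt", handlers)
```

Because the script is plain text, a directory of such files really does act as a batch processor, exactly as described: generate the files, point the interpreter at them, walk away.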
Slashdolt Logic: "25 year old jokes about sharks and lasers are +5, Funny." 💩
(Score: 0) by Anonymous Coward on Friday March 17, @06:06PM (1 child)
I used to make good use of M4 [wikipedia.org] back in the day. You could actually write a compiler using a macro processor.
The best CS students are those who played around and discovered things for themselves before studying formal theories - which only really click when you fully understand the problem. That is to say, don't discount your ability to contribute to discussion because you lack foundation in formal theories.
(Score: 2) by hendrikboom on Friday March 17, @07:59PM
I used m4 to maintain a grocery list during the pandemic. I was at elevated risk from the pandemic, so a very kind friend of my daughter bought me groceries.
I used two m4 macros -- one that included its argument in the output, and one that excluded the argument.
I used the source file as a checklist, setting include or exclude according to my inventory (and the amount of space left in my too-small freezer), and used m4 to create the actual list, which I emailed to her. When she delivered the groceries, she'd let me know how much it cost, and I'd use my bank's mechanisms to email the money to her account.
That wasn't a toy language, though. It was an application.
-- hendrik
(Score: 2) by istartedi on Friday March 17, @06:56PM (3 children)
Unique Name For Which I Can Search.
I still have my RiouxSVN up for it, but I've never gotten it to a point where I wanted to make it public. The biggest idea explored with the language is that you can use postfix, prefix, and infix whenever you like. Hence the name "unfwics" was contrived to make it easy to find when searching if it ever got released (the antithesis of Go), but also as a pun on fixity--it's not prefix, infix, or postfix. It's unfwics. Under the hood it's a simple stack-based untyped language, but I have lambda (yes, you can postfix lambda!) and tail-call detection (though only at runtime, which is one of the stupid things that makes me keep it under wraps). The last problem I started on was making it a typed language. I don't think that's really such a hard problem, but it's hard enough, and I was tired enough of hacking on it, that it became a real de-motivator, right around the time vaccines were about to come out, LOL, because the whole project (as you might imagine) got a boost during lockdown. I've told myself that if I ever get it self-hosting* and typed I'll make the repository public. I'd be proud enough of that.
*I consider being able to translate itself into C "good enough".
(Score: 3, Funny) by hendrikboom on Friday March 17, @08:03PM (2 children)
"unfwics" is an excellent name. I am really tired of searching for documentation of obscure features of the Racket programming language and finding endless pages of links to tennis and badminton equipment.
(Score: 0) by Anonymous Coward on Saturday March 18, @02:12AM (1 child)
racket -ball -tennis -badminton
(Score: 2) by turgid on Saturday March 18, @10:49AM
racket -loud popular music for young people
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 3, Informative) by hendrikboom on Friday March 17, @08:38PM (4 children)
Let's limit it to languages that really ran or were intended for production and mostly worked.
Lisp 1.5 -- way back in the mid 1960's
macros I used in Algol W to get for-style loops for iterating through linked lists (with while loops I had the habit of leaving out the statement that advanced to the next element of the list -- the macro never left that out.)
Algol 68. (without formats and without a complete function library)
Lithp -- a Lisp/scheme-like language implemented using the Stage2 macro processor. It translated Lithp into VAX assembler, relying on STAGE2's insistence that macro parameters always have matching parentheses. I initially concatenated the Lithp code into a single line, since STAGE2 also insisted that macro calls be single lines. I used this language in an experimental program to do static type checking in a language that had dependent types based on constructive type theory.
C++ interpreter. Mostly working when the project was cancelled.
A Forth-like language that did static type checking and garbage collection. The garbage collector was written in the language itself, and could even collect itself while operating. But that was all it did -- not very useful.
An experiment in bootstrapping -- how small a basis does one need to bootstrap a language in its own language and then, in successive stages, expand it into usefulness? The basis consisted of 115 lines of C (including a long comment that contained the grammar of the language it translated). The grammar it translated to produce most of this code was 15 lines of its own language. After nine stages of bootstrapping, I ended up with a tool that did recursive-descent parsing with backtracking, tree transformations on the parse tree, and finally recursive writing of that tree according to very terse programmer-provided templates.
(Score: 2) by turgid on Saturday March 18, @02:16PM (1 child)
The C++ interpreter sounds intriguing. That must have been very difficult! You've worked on some very interesting projects, I see, in times where coding was much more difficult.
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 0) by Anonymous Coward on Tuesday March 21, @04:48AM
C and C++ aren't quite as bad to implement at a basic level as most people think. The massive amounts of undefined, unspecified, and implementation-defined behavior let you do basically whatever you want in certain situations. According to a colleague of mine who has worked on a number of implementations, the only real problem comes in when people insist you follow the behavior of a different compiler, as if that were what the standard says the expected behavior is. The reason for that is that most people don't actually know the complete bounds of what you can and cannot do in C and C++. My favorite example: producing no output at all is a perfectly valid result of the following program, since whether the last line of output requires a terminating newline is implementation-defined. If you filed a bug on a compiler bug tracker, they could close the ticket WONTFIX-NOTABUG and still be fully compliant with the C specification.
#include <stdio.h>
int main()
{
printf("Hello World");
return 0;
}
(Score: 2) by istartedi on Monday March 20, @01:20AM (1 child)
That last one is the most interesting to me. I love the idea that an engineer is the "dual" of a technician. An engineer takes a few simple rules and designs complexity that gets built. A technician looks at the complexity that got built and tries to find the one simple thing (usually) that went wrong with it. Having done both in the realm of software, I think technicians are under-appreciated and underpaid. Rightly or wrongly, there's the idea that technicians are more easily trained and supplied. I don't know if there's any science to back that up though. I just know I wanted to get out of the call center ASAP.
OTOH, engineering is generally regarded as practical vs. science which is more inclined to the pursuit of knowledge for its own sake. I'd place that bootstrapping project somewhere in between science and engineering. Did you ever use the resulting language for anything else, or did it stop there? If that was the end of it, did something make it unappealing for use beyond the experiment, or were you just boot-strapping something that was like a pre-existing language?
(Score: 2) by hendrikboom on Monday March 20, @05:47PM
It never got used for anything outside of its own implementation.
It lacked a concept of data types, which could have provided useful sanity checks on the code.
It was also difficult to debug sets of transformations in the code transformation subsystems, because there was no practical distinction between a coding error and a rule that happened not to apply in a specific circumstance. Tracking these down was an exercise in frustration.
I may revisit this sometime, but for practical use on the scale of computers we have nowadays, multi-language systems like Racket seem easier to use.
(Score: 2) by krishnoid on Friday March 17, @10:43PM (4 children)
I'd think it would be interesting to create/adapt a few restricted domain-specific languages targeted at specific infrastructural uses. Erlang (for phone switches) is one, and it was adapted to create the first Jabber server. In any case, it would be helpful to understand the feature tradeoff for a specific (e.g., infrastructural) domain.
(Score: 1, Interesting) by Anonymous Coward on Saturday March 18, @02:53AM
Erlang is a great example of what I mentioned in my reply to turgid about toys vs "real" languages. It started out as an experimental toy, turned into a DSL, and then became a full-fledged general-purpose language. It met feature creep, then shared interest, and grew from there. The same thing happened to PHP, Perl, shell, R, and a number of other languages.
(Score: 1, Interesting) by Anonymous Coward on Saturday March 18, @09:16PM (1 child)
Further to the AC above, the first I heard of Erlang was when it was used for Wings3d back in 2001. A highly parallel DSL developed for packet switching used for a desktop subdivision modelling application... Is that considered a success?
(Score: 1, Informative) by Anonymous Coward on Sunday March 19, @11:23PM
Erlang is much more than a DSL these days. It is a fully-featured, general-purpose language. One of the major limitations on its popularity is that its big users are using it on servers, not clients. There are some massive deployments of Erlang out there working relatively silently in the background. The language itself, its VM, and its environment are great. I could go on at length about how great everything Erlang (and declarative languages in general) is, but I don't want to derail the thread.
(Score: 3, Interesting) by istartedi on Monday March 20, @01:23AM
If anyone here hasn't read the story of how Erlang got developed, seek it out. Without spoiling too much, it's an inspiring story of one developer doing "the right thing" and management eventually catching on.
(Score: 3, Interesting) by Rich on Thursday March 23, @09:45PM
I did a few interpreters for actual business, most notably a Forth dialect that was used to script animation for a "Garfield" screen saver in System 7 and Windows 3.1 days. That was before there was Flash, and when screensavers were a big thing. And since software came in boxes back then, the product shipped in an actual aluminium take-away lasagna bowl. I also did an interpreter for 68HC11 machine code that was used in a device simulation that had over a dozen of the things, and a dynamic recompiler for Microblaze-to-x86 (before QEMU had it) in a similar simulation that required an order of magnitude more performance.
However, I had a toy project that never got finished: a modern take on a Pascal dialect. Or kind of what was considered "modern" in the day: it had a type system with co- and contravariant generics, similar to what Scala came up with shortly after. I had a syntax defined with Coco/R that built an abstract tree, but it sadly never got to the point where anything was executable. I have a cheese-grater G5 here, back in its original box, that should still have the code on the drive, but I've got so much other stuff to sort out that it might never get priority.