June 16, 2025
This demo was hard. Initially, when I'd implemented the checkerboard, I'd hinted that I didn't want to make a chess implementation. Knowing the constraints of Mini-8, I'm still not planning on it. However, I did want to try my hand at a simpler game, Tic-Tac-Toe. Tic-Tac-Toe is, in theory, an incredibly simple game to implement. In fact, I'd bet I could make an implementation in under 20 lines of Python.
However, when working at such a low level, trivial things like checking whether three values are equal in a reasonably compact way become tricky. In the end, my solution was to start with a full byte, 0b1111_1111, in a register. Then, I'd AND that byte with the state of the board at wherever I wanted to check. In my implementation, the board is stored in RAM, with 0 being empty, 1 being an X, and 2 being an O. This means that ANDing with an empty cell would set the whole thing to 0, an X would leave only the last bit set, and an O would leave only the second-to-last bit set. Then, I can repeat this process across the row or column. In effect, I'm sequentially masking bits. If all three are equal, then the bit that's still set identifies the player who has won. If not, the whole thing will be 0. This is really fast to do in my ISA, as I can simply AND together the accumulator register and the RAMDATA value, then increment and go on to the next cell.
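The masking trick can be sketched in Python. This is only an illustration of the idea, not the Mini-8 assembly itself; the names EMPTY, X, O, and line_winner are made up for the example.

```python
# Cell encoding from the post: 0 = empty, 1 = X, 2 = O.
EMPTY, X, O = 0, 1, 2

def line_winner(board, cells):
    """AND a full byte with each cell in a line.

    Returns X or O if all three cells match, otherwise 0.
    """
    acc = 0b1111_1111          # start with every bit set
    for index in cells:
        acc &= board[index]    # each cell masks off the bits it lacks
    return acc

board = [X, X, X,
         O, O, EMPTY,
         X, O, X]

print(line_winner(board, (0, 1, 2)))  # top row, all X
print(line_winner(board, (3, 4, 5)))  # contains an empty cell
print(line_winner(board, (6, 7, 8)))  # mixed X and O
```

Because X and O occupy different bits, a mixed line always collapses to 0, the same value an empty cell produces.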
People who have looked into my ISA might have thought of a problem: I don't have any way to get input from the user. Well, thanks to the magic of it being my own project, I can just add an RFT (read from terminal) instruction. I chose to replace the JRE instruction, as I'd never used it and frankly can't see much of a use for it in my ISA now that I've done more programming in it.
Regardless, the basic game loop is fairly simple.
1. Sit in a loop, waiting for the user to press a key.
2. When we get a key back, check if the decimal value is strictly less than 9. RFT returns 0xFF if the value was invalid for the given format and 0xFE if there were no bytes left to read in the internal buffer. Since both sentinels are larger than any cell index, a single JGT handles the invalid cases (here, 9 is the only invalid digit) as well as the check for whether there's anything to read.
3. Check if the cell that was input is empty. If it's not, print that it's taken, then jump back up to the input loop. Otherwise, place the mark in the cell.
4. Check for winners using the algorithm explained above. Because there's only 2 cases and they're weird to program in, I check the diagonals separately.
5. Check for draws. We can simply loop through each cell and check if it's zero (empty). If any cell is, we can't have tied. If we get to the end of the loop and there are no empty squares left, it must be a draw. If a player won on the last move, that would have been detected in step 4.
6. Loop back up.
There's a bit more bookkeeping to handle turns and whatnot, but that's the general gist of how I implemented Tic-Tac-Toe.
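The loop above can be sketched in Python. This is a rough model under a few assumptions: the keys argument stands in for successive RFT results, the function names are invented, and all eight lines are checked from one table here, whereas the real program handles the diagonals separately.

```python
EMPTY, X, O = 0, 1, 2

# Rows, columns, then the two diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return X or O if any line is complete, otherwise EMPTY."""
    for line in LINES:
        acc = 0xFF
        for index in line:
            acc &= board[index]
        if acc:
            return acc
    return EMPTY

def play(keys):
    """Run the game over a sequence of simulated RFT results."""
    board = [EMPTY] * 9
    player = X
    for key in keys:
        if key > 8:                 # one comparison rejects 9, 0xFF and 0xFE
            continue
        if board[key] != EMPTY:     # cell taken: ask again, same player
            continue
        board[key] = player
        won = winner(board)
        if won:
            return won
        if EMPTY not in board:      # no empty cells and no winner
            return "draw"
        player = O if player == X else X
    return None                     # input ran out before the game ended
```

For example, play([0, 3, 1, 4, 2]) ends with X completing the top row.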
I'd had this idea for a while, but actually implementing it seemed nigh impossible: Adding an actual editor to the web VM.
The editor I had before was, to put it simply, not good. Under the hood, it was simply a div with the contenteditable attribute on. This might work for a bit, but it would quickly become unmanageable to extend if I wanted syntax highlighting or similar extra features.
I knew I had to get a real editor, but how does one simply acquire a better editor? Well, I looked around a bit, and it seemed like the best option for me was Monaco. Monaco is interesting, as it's basically just the editor view out of VSCode in library form. I found it quite easy to work with and progressively extend, which was nice. I didn't feel like I had to go all-in at once with every last feature of my language. I was able to add syntax highlighting, then PC highlighting, then completion, inlay, and hover info.
Overall, while it was not trivial to implement, it came together much cleaner, easier, and faster than if I'd tried to continue with my contenteditable div. Plus, Monaco looks and feels like VSCode, which is nice.
I was also able to add editable registers, which may be useful for debugging or experimenting.
These past two and a half hours were not as productive as I'd have liked, but in the end, I still achieved what I wanted. First, I attempted to add some code that would print the actual sqrt of a value (see last post for inspiration). However, that would be quite challenging to do on such a limited CPU. My second idea was similar, but printing PI instead. For similar reasons, this was also extremely challenging. Finally, I had the idea of making a checkerboard. While I have no current plans to implement chess (yet), it's still a cool demo. However, for the checkerboard, I wanted to have actual Unicode box drawing characters for the filled squares. At first, I assumed that it would just be a matter of removing the limits on character values and letting the raw byte values stack up in the text box. However, this doesn't work. JS internally stores strings as UTF-16, so a lone byte such as 0xE2, which is only the first byte of a multi-byte UTF-8 sequence, doesn't produce the intended character on its own. The end solution was keeping a buffer of bytes and trying to decode it every time it's written to. If it successfully decodes, it's written out to the output. If not, it'll wait for the next byte, then check again.
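The buffering approach can be sketched in Python (the real version lives in the web VM's JS). The class and method names here are hypothetical.

```python
class Utf8Terminal:
    """Collects bytes written one at a time and emits text once they decode."""

    def __init__(self):
        self.pending = bytearray()  # bytes that have not decoded yet
        self.output = ""            # text shown in the output window

    def write_byte(self, value):
        self.pending.append(value & 0xFF)
        try:
            text = self.pending.decode("utf-8")
        except UnicodeDecodeError:
            return                  # incomplete sequence: wait for more bytes
        self.output += text
        self.pending.clear()

term = Utf8Terminal()
for byte in (0xE2, 0x96, 0x88):     # UTF-8 encoding of U+2588 FULL BLOCK
    term.write_byte(byte)
print(term.output)
```

One caveat of this simple form is that a byte that can never start a valid sequence would stall the buffer forever, so a fuller version would probably want to cap the buffer length or flush on failure.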
Another feature I got fully working is define constants. Previously, they could only be static numbers. Now, they can also be registers, allowing for the creation of named variables. I also just generally tidied up their implementation a bit.
I've also added an ASCII version, in case the UTF-8 one causes issues on other browsers or platforms.
As far as the actual coding of the checkerboard goes, it's nothing special. I keep track of a row and a column, add them together, and modulo by 2, or in my case, AND with 1, then jump accordingly.
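As a quick Python illustration of that row-plus-column parity test (the function name and the default characters are just for the example):

```python
def checkerboard(rows=8, cols=8, filled="\u2588", empty=" "):
    """Build a checkerboard where a square is filled when (row + col) is odd."""
    lines = []
    for row in range(rows):
        line = ""
        for col in range(cols):
            # AND with 1 is the same as modulo 2 for non-negative integers.
            line += filled if (row + col) & 1 else empty
        lines.append(line)
    return "\n".join(lines)

print(checkerboard())
```

Passing plain characters, such as filled="#", gives the ASCII variant.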
This version also comes with a DX upgrade: actual error handling. Previously, if it ran into an error while assembling, the machine code wouldn't update and the text wouldn't be processed, but it would be up to the user to open the console to check the error. Now, it's output to the output window.
More demos!
After implementing syntax highlighting in my previous devlog, I found myself actually wanting to use and write code for my computer. So, I added 2 new demos, multiplication and primes.
Primes was tricky to conceptualize at first, especially with limited registers, but once I thought about it, it wasn't hard to fit it together with my preexisting modulo and decprint subroutines to find composites and print out numbers. One difficulty was figuring out the upper bound I should check for. The correct upper bound is sqrt(N) (look it up yourself), but I wasn't fitting a sqrt operation into the code, at least not without ruining the benefits it would provide. My next thought was N/2. After all, if we have M>N/2, multiplying M by 2, the smallest checked value, will result in 2M>N, which means M couldn't possibly be a factor. That does work, but it does a lot of extra work, especially on larger numbers. Through some discussion with ChatGPT, it suggested clamping the maximum checked value to 16. Why does this work? Well, the largest number I test is 255. sqrt(255), which is the actually correct upper bound for that, is around 15.97. So, if I clamp everything to checking all the numbers up to 16, I can save a ton of time on larger numbers. So, my end equation for the maximum value checked is min(N/2, 16). This effectively means that checks on prime numbers greater than 32 (16*2) all do the same number of loops, no matter how large they are. While not optimal, this trims off extra calculation on the large values, speeding up the program.
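Here's a small Python sketch of trial division with that clamped bound. It assumes integer division for N/2 and an inclusive upper limit, which may not match the assembly exactly.

```python
def is_prime(n):
    """Trial division for 8-bit values using the min(N/2, 16) bound."""
    if n < 2:
        return False
    limit = min(n // 2, 16)           # 16 > sqrt(255), so this is enough
    for divisor in range(2, limit + 1):
        if n % divisor == 0:
            return False
    return True

primes = [n for n in range(256) if is_prime(n)]
print(primes[:10])
```

For 2 and 3 the limit is 1, so the loop body never runs and they fall through as prime.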
Multiplication, by comparison, was quite easy. It was relatively easy to implement unsigned integer multiplication as repeated addition. It was essentially an inversion of my modulo subroutine to add instead of subtract. I also already had a decprint subroutine for printing an integer in decimal, so that was taken care of. All that was really left for me was to implement two nested loops, multiply the 'i' and 'j' values, pass them to decprint, then handle a bit of formatting. There is one thing that I think is important to note about the multiplication demo. The last element, 256, is not really calculated. If we take 0x10 (16) and multiply it by 0x10 (16), we get 0x100. But, as it's truncated to 8 bits, 0x100 is really 0x00. In other words, 16*16=0 (mod 256). So, I do have a manual case to detect if it's 0 (256), and print out the correct value. Every other number is calculated at runtime and printed with the decprint algorithm.
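The wraparound is easy to see in a Python sketch of multiplication by repeated addition. The names mul8 and table_entry are invented, and the sketch assumes the table runs from 1 to 16 on both axes, so the only product that wraps to 0 is 16*16.

```python
def mul8(a, b):
    """Multiply by repeated addition, truncating to 8 bits like the CPU does."""
    result = 0
    for _ in range(b):
        result = (result + a) & 0xFF
    return result

def table_entry(i, j):
    """Text for one cell of the table, with the manual case for 256."""
    product = mul8(i, j)
    if product == 0:          # only 16*16 reaches this for i, j in 1..16
        return "256"
    return str(product)

print(mul8(16, 16))           # 0x100 truncated to 8 bits
print(table_entry(16, 16))
```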
While making new demos, I kept staring at the wall of gray text in my editor, feeling like I was missing something. That something was syntax highlighting, which I've now implemented. As of now, there's only a vscode theme, but considering that a majority of developers use VSCode, hopefully that's not a major limitation. Honestly, the hardest part about making a syntax highlighter was deciding what syntax part to assign each bit of my language. Most languages are pretty similar in syntactic representations, with only differences in symbols. However, assembly languages don't have the same concepts of functions, classes, or even variables. So it was a bit of a challenge to fit my assembly language into VSCode's highlighting format, but I made it work in the end, and I think it greatly helps the readability of my code.
You can see a before and after below using the default dark theme in VSCode.
While I'd hoped this would have been easier to implement, adding all the minutiae that fit together into a hopefully seamless experience was harder than expected. In the end, I was able to add some demo programs to the Mini-8 website and improve its overall UX. Of course, the demos can also be compiled with the Python assembler and run with the VM.
To run FizzBuzz, for example, click on the Select Demo button, then select FizzBuzz. Then, click the Assemble button to compile the assembly into raw machine code. If you've run the processor before, you can click the Reset VM button to clear the VM's state, leaving your programs intact. Then, to run it, click the Run button. You can set how fast instructions will operate with the speed selector. This represents how many milliseconds the program will wait between instructions. To get a good picture of how the machine works, you can also press the Step button to step instruction by instruction and see the registers update in real time.
Mini-8 now has a fully-featured web version, including an assembler and disassembler. You can mess around with it here. An online version of the ISA doc is also included. Most of this was a fairly simple direct translation, assisted by ChatGPT as my JS skills are a bit rusty. Overall, I'm fairly happy with the current state it is in. I might add a few new features or refactor some bits of it, but it's getting close to being ready to ship.
Mini-8 is a tiny 8-bit RISC-style virtual CPU architecture. I designed it after playing Turing Complete, an excellent game that has you building up a simple yet Turing-complete machine over the course of its levels. While Mini-8 is inspired by LEG, the final architecture in TC, it has several major architectural and ideological changes from LEG.
The general idea is that each instruction is 4 bytes: 1 byte opcode and 3 bytes for operands (OP1, OP2, DEST).
Bits in the opcode specify whether operands are immediate values or register references; DEST is always a register, while OP1 and OP2 can be either, depending on the immediate mode bits.
The actual instructions take up the lower 5 bits of the opcode. Bits 4 and 3 describe the class of the opcode. For example, 00 are ALU operations and 01 are CONDitional operations. Bits 2-0 are the actual operation. For example, 000 in the ALU class is AND.
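To make the layout concrete, here's a hedged Python sketch of splitting one instruction into fields. It assumes the bytes are ordered opcode, OP1, OP2, DEST, and it exposes the top three opcode bits only as an uninterpreted mode_bits value, since how the immediate mode bits are assigned isn't spelled out here. The names Decoded and decode are made up.

```python
from typing import NamedTuple

class Decoded(NamedTuple):
    mode_bits: int   # upper 3 opcode bits, left uninterpreted (assumption)
    op_class: int    # bits 4 and 3: 0b00 = ALU, 0b01 = COND
    operation: int   # bits 2-0: e.g. 0b000 in the ALU class is AND
    op1: int
    op2: int
    dest: int

def decode(instruction: bytes) -> Decoded:
    """Split a 4-byte instruction into its opcode fields and operands."""
    if len(instruction) != 4:
        raise ValueError("a Mini-8 instruction is exactly 4 bytes")
    opcode, op1, op2, dest = instruction
    return Decoded(
        mode_bits=opcode >> 5,
        op_class=(opcode >> 3) & 0b11,
        operation=opcode & 0b111,
        op1=op1,
        op2=op2,
        dest=dest,
    )

print(decode(bytes([0b000_00_000, 1, 2, 3])))
```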
This was one of the first fully working programs run on Mini-8. 0x37 (55) is the 10th Fibonacci number, and 0xC8 (200) is added to that to get 255. Then, the obligatory hello world is written.
Mini-8 is a tiny 8-bit virtual CPU architecture.
After quite a few rewrites and refactors, I was able to get an arbitrary number of arbitrary triangles rendering correctly (or, more correctly with the addition of a depth buffer), which is a definite improvement.
This project is based on a few things; without them, I could not have even started. Firstly, Sebastian Lague's video (https://www.youtube.com/watch?v=yyJ-hdISgnw) on software rasterizing. Secondly, Taichi. The ability to write GPU-accelerated code in Python, albeit with some restrictions, is incredibly useful.
In this devlog, I was able to get some triangles rendering in 3D, with a simple orthographic projection. My planned next steps are to implement perspective projection, then Perlin noise and marching cubes.
MCTaichi is a Taichi implementation of Marching Cubes, an algorithm for converting a density field into triangles for further rendering.
Well, I finally did it! Cyberfire 3D works up to my standards. It's not as interactive as 2D, but both were originally intended as strictly visual demos, so I'm at least glad one of them has interactivity. While there are still a few rough spots, especially around the bottom of the flame, they are not all that noticeable. The camera can be controlled like most other 3D software. Left click and drag to rotate, right click and drag to pan, and mouse wheel to scroll.
Another thing that I've added is the ability to change the number of passes the renderer does per frame. This increases the visual quality, at the cost of decreased performance. I find the sweet spot (for my setup, at least) is around 2-4 passes, YMMV.
Something that I haven't talked about before is the launcher. With the addition of 3D, I knew I wanted to have some way for the end user to toggle between the 2D and 3D versions. I added it earlier, but did not think it was noteworthy to include. Now, it also handles setting the height, width, and, for the 3d version, depth of the fire matrix. It's just a simple PySide6 application that modifies env variables to be able to correctly launch the other applications from there.
That's about it. While I might keep updating it if I get new ideas, this project is in a state where I'm happy enough with it.
I was able to squash quite a few bugs in the renderer. Flipping the image so that the fire rendered the right way was easy enough. Figuring out what magic numbers to flip in order to get the camera controls working was not, however. I still haven't fully gotten color to work yet, but it's... going. The empty voxels are now transparent, as are the low-intensity fire voxels to some degree. The cube is now also rendering on two sides instead of only one, though some other issues have cropped up.
Namely, the sides of the cube that don't render correctly have an odd pattern of mostly red pixels throughout. In addition, the fire has decided to split into red, green, and blue sections in a remarkably distinct pattern. There is also an odd hotspot on one of the top edges of the cube. The bottom is a completely noisy mess, as well. So, overall, progress, but not completion.
Once I'd mostly completed the 2D version of Cyberfire, the natural question for me was: what about 3D?
The code wasn't particularly specific to 2D implementations, and I could easily adjust the logic. The hard part was going to be the rendering, though, especially keeping it performant.
My initial approach was to treat the heat as a sort of density and use the Marching Cubes (https://en.wikipedia.org/wiki/Marching_cubes) algorithm, which turns a 3D grid of scalar density values into a triangle mesh. In this case, I was going to use it to turn heat into a triangle mesh, but the idea remains the same. However, while implementing Marching Cubes, I kept running into issues related to the generation and projection of triangles. Once I'd spent enough time staring at the code, I realized that trying to fix it would be more trouble than it was worth. Then, while browsing through the Taichi documentation, I realized that Taichi provides its own voxel renderer, for use in a 100 line challenge. While I certainly wasn't going to be fitting any of my code in 100 lines, the renderer itself was still useful. After a lot of issues, I was finally able to get the fire rendering in 3D. However, there are still a lot more issues to tackle before the code is done. First, the fire is rendering upside-down. Next, it's not using the correct color. The empty voxels are still being rendered, instead of being discarded and transparent. Finally, the cube of fire is only being rendered on one side.
All of these issues mean that the 3d renderer is still not complete, though it is getting closer to it.
Well, this is the first devlog (8 hours in), and it very well may be the only devlog I post, but I'd still like to talk a bit about the process and the decisions that I made along the way.
Why this project? Well, I was looking around my unfinished projects directory when I came across this project from a couple years ago. I wanted to give it a shot at finishing it, or at least polishing it up a bit.
The general idea of how this program works is a bit counterintuitive. Most fire simulations represent heat as something that rises, so pixels tend to set the pixels above them. This simulation does the reverse. Each pixel looks at the pixels below it in order to determine its heat value. This also has the unintended (but not unwelcome) side effect of making the bottom row of pixels always keep its heat value, as they do not have pixels below them to look up their new heat values, so the ground is flammable. The X coordinate is also warped by a random amount, plus some scrolling Perlin noise for added variety with coherence. The sparks at the tops of the flames are emergent behavior; there is no part of the code that intentionally creates those.
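A minimal plain-Python sketch of that pull-from-below update, not the actual Taichi kernel. The decay term, the wrap-around at the edges, and every name here are assumptions for illustration, and the scrolling Perlin noise is left out.

```python
import random

def fire_step(heat, max_shift=1, decay=1, rng=random):
    """One update where every pixel samples a pixel from the row below it.

    heat is a list of rows; row 0 is the top, the last row is the ground.
    """
    height, width = len(heat), len(heat[0])
    new = [row[:] for row in heat]
    for y in range(height - 1):          # the bottom row is never rewritten
        for x in range(width):
            # Warp the X coordinate by a random amount (wraps at the edges).
            source_x = (x + rng.randint(-max_shift, max_shift)) % width
            # Cooling on the way up is an assumed detail, as in DOOM-style fire.
            new[y][x] = max(heat[y + 1][source_x] - decay, 0)
    return new

grid = [[0] * 8 for _ in range(5)] + [[36] * 8]   # hot ground row
for _ in range(10):
    grid = fire_step(grid)
print(grid[-1])                                    # ground keeps its heat
```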
Why Python? Well, it's what I know best. Additionally, I'd recently discovered Taichi, a domain-specific, high-performance, parallel language that is embedded within, and uses the semantics of, Python. Taichi allows me to run the fire simulation at a solid 60 FPS, and I've gotten it to run at 120 FPS at 1440x960 before being bottlenecked. At 640x480, it runs at a blazing (ha) fast 500 FPS. None of these are precise benchmarks, and YMMV.
I use PySide6 for the GUI, as my machine uses Plasma for the Desktop Environment, so it just made sense to write something for Qt. Also, I know the library reasonably well.
While this project is not intended to be a realistic fire simulation, it does have an excellent stylized fire look to it.
BadCodec, also known as a method for perceptually warped, entropy-efficient remapping of a continuous signal into a compressible, visually structured format (AMPWEERCSCVSF), is a project that is able to convert audio, such as music, into a picture.
Cyberfire is a real-time stylized fire visualization, inspired by DOOM's fire animation, but spiced up with modern techniques.