Overview

This documentation does not reflect the current state of the implementation.

Words

Wherever possible we try to stick to keeping everything as a 64 bit value. Throughout the specifications of the VM, a word should be interpreted to mean a 64 bit value. Similarly a half word (or hword) should be interpreted to mean a 32 bit value.

Instructions

See Is.md

Registers

Each core has a bank of 8 general purpose registers, as well as three internal registers. The general purpose registers are called a, b, c, d, e, f, x, and y. The internal registers are called l, u, and n.

l: exec pointer
u: flags and stuff
n: stack pointer

Privilege

The VM has a few different privilege levels that can be used to isolate processes.

Memory

Memory is word addressable, and uses a word as an address. This gives a theoretical maximum memory space of 2^64 words. At 8 bytes per word that is 2^67 bytes, or 128 exabytes of data, which is large.

2^10 * 2^10 * 2^10 * 2^10 * 2^10 * 2^10 * 2^4 * 2^3
^ K    ^ M    ^ G    ^ T    ^ P    ^ E    ^ 16  ^ 8 (64 bits -> 8 bytes)

Architecture

Registers

x   -> 1000       (4-bit word encoding)
x   -> 1110_1000  (8-bit full encoding)
xh0 -> 1100_1000
xh1 -> 1101_1000
xq0 -> 1000_1000
xq3 -> 1011_1000
xb0 -> 0000_1000
xb7 -> 0111_1000

16 registers                        16
16 registers * 2 half indexes    +  32 =  48
16 registers * 4 quarter indexes +  64 = 112
16 registers * 8 byte indexes    + 128 = 240

Could theoretically be coded as 8 bits... is saving the 1 bit worth it? 🤔

00000000 .. 01111111 -> pb0 .. fb7 (0rrrriii)
10000000 .. 10111111 -> pq0 .. fq3 (10rrrrii)
11000000 .. 11011111 -> ph0 .. fh1 (110rrrri)
11100000 .. 11101111 -> p   .. f   (1110rrrr)

...or a slightly easier to parse version...

00000000 .. 01111111 -> pb0 .. fb7 (0iiirrrr)
10000000 .. 10111111 -> pq0 .. fq3 (10iirrrr)
11000000 .. 11011111 -> ph0 .. fh1 (110irrrr)
11100000 .. 11101111 -> p   .. f   (1110rrrr)
11110000 ..          -> Impossible (used for decoding)
11110001 .. 11111011 -> undefined behavior
11111100 .. 11111110 -> fwi        (full word immediate)
11111111             -> None       (rarely useful, but worth having)

...which is nice because the physical register is always in the same spot! Oh, and the index is just the remainder of dividing the highest 4 bits by the number of register indexes.
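As a rough illustration of the "easier to parse" layout, the sketch below splits an 8-bit register code into its physical register, size class, and index using the remainder trick described above. The function name `decode_register` and the size labels are made up for this example and are not part of the VM.

```python
def decode_register(code):
    """Return (physical_register, size, index) for an 8-bit register code.

    Hypothetical helper: follows the 0iiirrrr / 10iirrrr / 110irrrr / 1110rrrr
    layout, where the low 4 bits are always the physical register.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("register code must fit in 8 bits")
    high, reg = code >> 4, code & 0xF
    if high < 0b1000:      # 0iiirrrr: byte slice, 8 indexes
        size, count = "byte", 8
    elif high < 0b1100:    # 10iirrrr: quarter word, 4 indexes
        size, count = "qword", 4
    elif high < 0b1110:    # 110irrrr: half word, 2 indexes
        size, count = "hword", 2
    elif high == 0b1110:   # 1110rrrr: full word, single index
        size, count = "word", 1
    else:                  # 1111xxxx is not a register
        raise ValueError("1111xxxx does not name a register")
    # The index is the remainder of the high nibble by the index count.
    return reg, size, high % count
```

For instance, `decode_register(0b1101_1000)` yields register `0b1000` as a half word with index 1, which lines up with the xh1 example earlier.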

Integers

Integers are encoded in the usual way. Unsigned integers count up starting from zero, and signed integers are stored in two's complement fashion.

Floats

Floating point numbers are stored using the typical IEEE 754 standard.

Words

byte (8 bits)
qword (16 bits)
hword (32 bits)
word (64 bits)

Words are just generic bits of data, without any preassigned meaning. They can be whatever you want! Might be text, might represent a color, or something more!

Immediates

Word immediate

|--------|--------|--------|--------|--------|--------|--------|--------|
 i....... ........ ........ ........ ........ ........ ........ ........

Hword immediate

|--------|--------|--------|--------|
 i....... ........ ........ ..r.....

|--------|--------|--------|--------|
 i....... ........ ........ ........

|--------|--------|--------|--------|
 00000000 00000000 i....... ........

|--------|--------|--------|--------|
 00000000 00000000 00000000 i.......

i is the binary content of the immediate value
r is an unsigned 6-bit value that indicates how far i should be left-rotated to form the true value of the encoded immediate (only applicable to hword immediates which are encoding a word value).

The general logic behind this encoding is...

If the size of the word being stored is less than or equal to the available size for encoding the immediate, it should be stored exactly, aligned to the least significant digits, and padded with zeroes in the unused most significant digits, including cases where the value is a signed integer.
If the size of the word being stored is greater than the available size for encoding the immediate, then the n least significant bits should be used to store an unsigned integer indicating how many bits to the left the partial value should be rotated, where s is the size of the word in bits and n is ⌈log2(s)⌉. The remaining available bits should be used to store the exact binary data of the partial value.

A few examples...

An i8 with a value of -1 would be encoded as 00000000 00000000 00000000 11111111
An i64 or u64 with a value of 1 could be encoded as 00000000 00000000 00000000 01000000
An i64 or u64 with a value of 0xf0 could be encoded as 00000000 00000000 00000011 11000100
A u64 with a value of 0x8000000000000001 could be encoded as 00000000 00000000 00000000 11111111
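The examples above can be checked with a small sketch of the rotated hword immediate for 64-bit values, assuming the upper 26 bits hold i and the lower 6 bits hold r. The helper names are invented here, and the encoder simply returns the first rotation that fits, so it may pick a different (but equivalent) encoding than the ones listed.

```python
MASK64 = (1 << 64) - 1

def rotl64(value, amount):
    """Rotate a 64-bit value left by `amount` bits."""
    amount %= 64
    return ((value << amount) | (value >> (64 - amount))) & MASK64

def decode_hword_immediate(encoded):
    """Upper 26 bits are the payload i, lower 6 bits are the rotation r."""
    payload = encoded >> 6
    rotation = encoded & 0x3F
    return rotl64(payload, rotation)

def encode_hword_immediate(value):
    """Return a 32-bit encoding, or None if no rotation makes the value fit."""
    for rotation in range(64):
        # Rotating right by `rotation` undoes the left rotation done on decode.
        payload = rotl64(value, 64 - rotation)
        if payload < (1 << 26):
            return (payload << 6) | rotation
    return None
```

A value that cannot be squeezed into 26 contiguous bits (for example, all 64 bits set) makes the encoder return `None`, which is where a full word immediate would be needed instead.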

Caveats

f64 values cannot be stored in an hword immediate, and must be addressed as a fwi.

In general the structure of the impl names is {opname}_{datatype}_{operands}

datatype is usually one of...
- i8
- u8
- i16
- u16
- f32
- i32
- u32
- f64
- i64
- u64
- b (byte, 8 bits)
- q (quarter-word, 16 bits)
- h (half-word, 32 bits)
- w (word, 64 bits)
each character of operands is usually one of...
- i (immediate value, encoded in the instruction)
- r (register)
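To make the naming scheme concrete, here is a small sketch that builds and splits impl names. The helper names are hypothetical, and the type order is copied from the `add` list in the next section.

```python
# Numeric data types, in the order they appear in the `add` list.
NUMERIC_TYPES = ["i8", "u8", "i16", "u16", "f32", "i32", "u32", "f64", "i64", "u64"]

def impl_name(opname, datatype, operands):
    """Join the three parts as {opname}_{datatype}_{operands}."""
    return f"{opname}_{datatype}_{operands}"

def split_impl_name(name):
    """Split a three-part impl name; operands become one entry per character."""
    opname, datatype, operands = name.split("_")
    return opname, datatype, tuple(operands)

# Every numeric type paired with the two operand shapes used by `add`.
add_impls = [
    impl_name("add", datatype, shape)
    for datatype in NUMERIC_TYPES
    for shape in ("rir", "rrr")
]
```

Names without three parts, such as `slp`, are outside what this sketch handles.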

Instructions

`add`

add_i8_rir
add_i8_rrr
add_u8_rir
add_u8_rrr
add_i16_rir
add_i16_rrr
add_u16_rir
add_u16_rrr
add_f32_rir
add_f32_rrr
add_i32_rir
add_i32_rrr
add_u32_rir
add_u32_rrr
add_f64_rir
add_f64_rrr
add_i64_rir
add_i64_rrr
add_u64_rir
add_u64_rrr

`and`

and_b_rrr
and_q_rrr
and_h_rrr
and_w_rrr

`br`

br_x_i

`cmp`

cmp_b_r
cmp_b_rr
cmp_q_r
cmp_q_rr
cmp_h_r
cmp_h_rr
cmp_w_r
cmp_w_rr

`div`

div_i8_rirr
div_i8_rrrr
div_u8_rirr
div_u8_rrrr
div_i16_rirr
div_i16_rrrr
div_u16_rirr
div_u16_rrrr
div_f32_rir
div_f32_rrr
div_i32_rirr
div_i32_rrrr
div_u32_rirr
div_u32_rrrr
div_f64_rir
div_f64_rrr
div_i64_rirr
div_i64_rrrr
div_u64_rirr
div_u64_rrrr

`exp`

exp_i8_rir
exp_i8_rrr
exp_u8_rir
exp_u8_rrr
exp_i16_rir
exp_i16_rrr
exp_u16_rir
exp_u16_rrr
exp_f32_rir
exp_f32_rrr
exp_i32_rir
exp_i32_rrr
exp_u32_rir
exp_u32_rrr
exp_f64_rir
exp_f64_rrr
exp_i64_rir
exp_i64_rrr
exp_u64_rir
exp_u64_rrr

`halt`

halt_x_x

`mul`

mul_i8_rir
mul_i8_rrr
mul_u8_rir
mul_u8_rrr
mul_i16_rir
mul_i16_rrr
mul_u16_rir
mul_u16_rrr
mul_f32_rir
mul_f32_rrr
mul_i32_rir
mul_i32_rrr
mul_u32_rir
mul_u32_rrr
mul_f64_rir
mul_f64_rrr
mul_i64_rir
mul_i64_rrr
mul_u64_rir
mul_u64_rrr

`mv`

mv_b_rr
mv_q_rr
mv_h_rr
mv_w_rr

`not`

not_b_rr
not_q_rr
not_h_rr
not_w_rr

`or`

or_b_rrr
or_q_rrr
or_h_rrr
or_w_rrr

`pop`

pop_w_r

`push`

push_w_r

`put`

put_b_i
put_b_r
put_w_r

`rotl`

rotl_b_ri
rotl_b_rr
rotl_q_ri
rotl_q_rr
rotl_h_ri
rotl_h_rr
rotl_w_ri
rotl_w_rr

`rotr`

rotr_b_ri
rotr_b_rr
rotr_q_ri
rotr_q_rr
rotr_h_ri
rotr_h_rr
rotr_w_ri
rotr_w_rr

`shiftl`

shiftl_b_ri
shiftl_b_rr
shiftl_q_ri
shiftl_q_rr
shiftl_h_ri
shiftl_h_rr
shiftl_w_ri
shiftl_w_rr

`shiftr`

shiftr_b_ri
shiftr_b_rr
shiftr_q_ri
shiftr_q_rr
shiftr_h_ri
shiftr_h_rr
shiftr_w_ri
shiftr_w_rr

`slp`

slp

`sl`

sl_b_rir
sl_b_rrr
sl_q_rir
sl_q_rrr
sl_h_rir
sl_h_rrr
sl_w_rir
sl_w_rrr

`sr`

sr_b_rir
sr_b_rrr
sr_q_rir
sr_q_rrr
sr_h_rir
sr_h_rrr
sr_w_rir
sr_w_rrr

`ser`

ser_b_rir
ser_b_rrr
ser_q_rir
ser_q_rrr
ser_h_rir
ser_h_rrr
ser_w_rir
ser_w_rrr

`sub`

sub_i8_rir
sub_i8_rrr
sub_u8_rir
sub_u8_rrr
sub_i16_rir
sub_i16_rrr
sub_u16_rir
sub_u16_rrr
sub_f32_rir
sub_f32_rrr
sub_i32_rir
sub_i32_rrr
sub_u32_rir
sub_u32_rrr
sub_f64_rir
sub_f64_rrr
sub_i64_rir
sub_i64_rrr
sub_u64_rir
sub_u64_rrr

`xor`

xor_b_r
xor_b_rrr
xor_q_r
xor_q_rrr
xor_h_r
xor_h_rrr
xor_w_r
xor_w_rrr

Prologue

Curly braces ({ and }) denote a fixed set of options to choose from, usually separated by commas
- {a,b} would mean: in this location, must either be "a" or be "b"
- {a} would mean: in this location, must be "a"
numeric represents i8, u8, i16, u16, f32, i32, u32, f64, i64, and u64
- often used as {numeric} to mean: in this location, must be one of the numeric data types
- an f, an i, or a u can also be used in place of numeric in places where the data size can be inferred (usually from the operands of an instruction)
integer represents i8, u8, i16, u16, i32, u32, i64, and u64
float represents f32 or f64

`add.{numeric} $r1 #i`

`add.{numeric} $r1 #i $rr`

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000000s rr...... r1...... i....... ........ ........ ........

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000000s rr...... 11110000 r1...... 00000000 00000000 00000000
 i....... ........ ........ ........ ........ ........ ........ ........

s indicates whether the operands are signed (1) or not (0)
r1 The first operand register
i The inlined immediate value
rr The register to store the result in — Defaults to r1 if omitted

Adds together the values from r1 and i, stores the result in rr.

`add.{numeric} $r1 $r2`

`add.{numeric} $r1 $r2 $rr`

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000001s r1...... r2...... rr...... 00000000 00000000 00000000

s indicates whether the operands are signed (1) or not (0)
r1 The first operand register
r2 The second operand register
rr The register to store the result in — Defaults to r1 if omitted

Adds together the values from r1 and r2, stores the result in rr.
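A minimal sketch of packing the register-register form into a single word follows. It assumes each register field is a full byte using the 1110rrrr full-word encoding, and borrows the 4-bit register numbers from the register table under "Binary representation"; both are assumptions, since this document mixes several draft encodings. `encode_add_rrr` is a made-up name.

```python
# 4-bit register numbers (assumed, taken from the binary representation table).
REGISTER_BITS = {
    "l": 0b0000, "u": 0b0001, "n": 0b0010,
    "x": 0b1000, "y": 0b1001, "a": 0b1010, "b": 0b1011,
    "c": 0b1100, "d": 0b1101, "e": 0b1110, "f": 0b1111,
}
FULL_WORD_PREFIX = 0b1110  # 1110rrrr selects the whole register

def encode_add_rrr(r1, r2, rr=None, signed=False):
    """Pack `add $r1 $r2 $rr` into one 64-bit word."""
    rr = r1 if rr is None else rr           # rr defaults to r1 if omitted
    opcode = 0b0000_0000_0000_0010 | int(signed)
    fields = [(FULL_WORD_PREFIX << 4) | REGISTER_BITS[name] for name in (r1, r2, rr)]
    word = opcode << 48                     # two most significant bytes
    word |= fields[0] << 40                 # r1
    word |= fields[1] << 32                 # r2
    word |= fields[2] << 24                 # rr; the low three bytes stay zero
    return word
```

Under these assumptions, an unsigned add of a and b into c packs to `0x0002EAEBEC000000`.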

`div.{float} $r1 #i`

`div.{float} $r1 #i $rr`

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000000s r1...... i....... ........ ........ ........ rr......

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000000s r1...... 11111110 rr...... 00000000 00000000 11110000
 i....... ........ ........ ........ ........ ........ ........ ........

s indicates whether the operands are signed (1) or not (0)
r1 The first operand register
i The inlined immediate value
rr The register to store the result in — Defaults to r1 if omitted

Divides the value from r1 by the value from i, stores the result in rr.

`div.{float} $r1 $r2`

`div.{float} $r1 $r2 $rr`

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000000s r1...... r2...... rr...... 00000000 00000000 00000000

s indicates whether the operands are signed (1) or not (0)
r1 The first operand register
r2 The second operand register
rr The register to store the result in — Defaults to r1 if omitted

Divides the value from r1 by the value from r2, stores the result in rr.

`div.{integer} $r1 #i $rr -`

`div.{integer} $r1 #i - $rm`

`div.{integer} $r1 #i $rr $rm`

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 000000ms r1...... i....... ........ ........ ........ rr......

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000000s r1...... 11111110 rr...... rm...... 00000000 11110000
 i....... ........ ........ ........ ........ ........ ........ ........

Divides the value from r1 by the value from i, stores the quotient in rr and the remainder (otherwise called the modulus) in rm.

`div.{integer} $r1 $r2`

`div.{integer} $r1 $r2 $rr -`

`div.{integer} $r1 $r2 - $rm`

`div.{integer} $r1 $r2 $rr $rm`

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 0000001s r1...... r2...... rr...... rm...... 00000000 00000000

s indicates whether the operands are signed (1) or not (0)
r1 The first operand register
r2 The second operand register
rr The result storage register for the quotient — Defaults to r1 if omitted
rm The result storage register for the remainder — Defaults to r2 if omitted

Divides the value from r1 by the value from r2, stores the quotient in rr and the remainder (otherwise called the modulus) in rm.

General design

Many instructions have variants (denoted by a . and the variant name). As an example, the typical signed integer addition instruction is written as add.i64. The instruction is add and the variant is i64, meaning it is intended to operate on signed 64 bit values. Some variants can be deduced from the register types used as operands, but others, such as the choice between signed and unsigned arithmetic, must be written explicitly.

Syntax

The descriptions given make use of a variant of Lunar Assembly (LA) to show the options of how each instruction might be written in assembly by an engineer (and in turn assembled to Lunar Machine-code (LM)).

{a,b,c,...} means any one of the comma-separated values

add.{i64,u64} -> add.i64 | add.u64
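The brace notation can be expanded mechanically. The sketch below is only an illustration of the notation (the function name is invented, and nested braces are not handled).

```python
import itertools
import re

def expand_options(pattern):
    """Expand every {a,b,...} group in `pattern` into all concrete spellings."""
    # re.split with a capture group alternates literal text and option groups.
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [
        part.split(",") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]
```

With this, `expand_options("add.{i64,u64}")` gives `["add.i64", "add.u64"]`.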

:_ means any one of the general purpose registers (a, b, c, d, e, f, x, y)

:! means any one of the internal registers (l, u, n)

:a means the a register, :b means the b register, etc. :l means the l internal register, :u means the u internal register, etc.

# means any number (in any of the supported forms)

Decimal numbers can be used unprefixed
Hex numbers should use the 0x prefix
Binary numbers should use the 0b prefix
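A possible way for an assembler to read these three number forms is sketched below. `parse_number` is a hypothetical helper, and details such as negative literals or digit separators are not specified here.

```python
def parse_number(text):
    """Parse a decimal, 0x-prefixed hex, or 0b-prefixed binary literal."""
    if text.startswith("0x"):
        return int(text[2:], 16)
    if text.startswith("0b"):
        return int(text[2:], 2)
    return int(text, 10)
```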

Binary representation

Opcode

Lun is designed to have a fairly simple instruction set and, most importantly, to be relatively easy for a human to understand, since it's mostly just for fun. As such, the opcode refers to the most significant two bytes of the first word of an instruction. Variants should be determined by the next four most significant bits (or in some cases, the two most significant bits of the opcode, which determine instruction size).

Registers

A register is always represented by 4 bits.

0000 l
0001 u
0010 n
------
1000 x
1001 y
1010 a
1011 b
1100 c
1101 d
1110 e
1111 f

It is worthwhile to note 2 things:

The binary representation of registers a-f corresponds with their representation in hexadecimal, i.e. a in hexadecimal is 1010 in binary, which is the binary value that represents the a register.
All of the general purpose registers are represented with a most-significant bit of 1. This also means that all of the internal registers are represented with a most-significant bit of 0.
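Both observations are easy to check in code. The table below is copied from the listing above; the dictionary and helper names are only for illustration.

```python
# 4-bit register codes, as listed in the table.
REGISTER_CODES = {
    "l": 0b0000, "u": 0b0001, "n": 0b0010,
    "x": 0b1000, "y": 0b1001, "a": 0b1010, "b": 0b1011,
    "c": 0b1100, "d": 0b1101, "e": 0b1110, "f": 0b1111,
}

def is_general_purpose(code):
    """General purpose registers have the most significant of the 4 bits set."""
    return bool(code & 0b1000)

# Registers a-f share their code with the hex digit of the same name.
hex_named = {name: int(name, 16) for name in "abcdef"}
```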

Instruction size

Instructions are all stored as words. Some instructions can be stored across multiple words, up to 5. The number of words in an instruction is encoded as part of the opcode, which is always contained in the first word.

00 - 1 word
01 - 2 words
10 - 4 words
11 - 5 words
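A decoder could read the length straight from the first word. The sketch below assumes the size bits are the two most significant bits of that word, as described under "Opcode"; the names are hypothetical.

```python
# Size bits -> number of words in the instruction.
WORDS_BY_SIZE_BITS = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: 5}

def instruction_length(first_word):
    """Return how many 64-bit words the instruction occupies."""
    return WORDS_BY_SIZE_BITS[(first_word >> 62) & 0b11]
```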

Arithmetic

clr|xor :_

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 00000100 r.......{00000000 00000000 00000000 00000000 00000000}

: r...  The register to clear

Shorthand for xor :_ :_ :_ (useful for zeroing a register)

xor :_ :_ :_

|--------|--------|--------|--------|--------|--------|--------|--------|
 00000000 00000101 r1...... r2...... rr......{00000000 00000000 00000000}

: r1..  The first operand register
: r2..  The second operand register
: rr..  The result storage register

Performs an exclusive or of the values from the first and second given registers and stores the result in the third register.

Memory

i.{at,by} :_ #
i.at :_ :_
i.by :_ [label]

|--------|--------|--------|--------|--------|--------|--------|--------|
 00100000 00000000 10000000 ri..0000 dddddddd dddddddd dddddddd dddddddd

: ri.. The register to inload from memory
: d    The offset from the address of the word containing this instruction

|--------|--------|--------|--------|--------|--------|--------|--------|
 00100000 00000000 11000000 ri..r@.. 00000000 00000000 00000000 00000000

: ri.. The register to inload from memory
: r@.. The register containing the address in memory to read from

|--------|--------|--------|--------|--------|--------|--------|--------|
 01100000 00000000 10000000 ri..0000 00000000 00000000 00000000 00000000
 @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@

: ri.. The register to inload from memory
: @    The address in memory to read from

Reads a value from an address in memory into the specified register
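The three layouts above can be told apart from the first word alone. The following is a tentative decoder sketch: the labels "by-offset", "at-register", and "at-address" are invented for this example, the mapping of layouts to assembly forms is an assumption, and the offset is returned as a raw 32-bit field because its signedness is not stated.

```python
def decode_inload(words):
    """Classify an inload instruction given its word(s) as a list of ints."""
    first = words[0]
    size_bits = first >> 62            # 00 = one word, 01 = two words
    mode = (first >> 40) & 0xFF        # third byte: 10000000 or 11000000
    target = (first >> 36) & 0xF       # ri: register to load into
    if mode == 0b1100_0000:
        # Address comes from a register (r@ is the low nibble of byte 3).
        return ("at-register", target, (first >> 32) & 0xF)
    if size_bits == 0b01:
        # Address is the whole second word.
        return ("at-address", target, words[1])
    # Otherwise the low 32 bits hold the offset d.
    return ("by-offset", target, first & 0xFFFF_FFFF)
```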

o.{at,by} :_ #
o.at :_ :_
o.by :_ [label]

|--------|--------|--------|--------|--------|--------|--------|--------|
 00010000 00000000 10000000 ro..0000 dddddddd dddddddd dddddddd dddddddd

: ro.. The register to outload to memory
: d    The offset from the address of the word containing this instruction

|--------|--------|--------|--------|--------|--------|--------|--------|
 00010000 00000000 11000000 ro..r@.. 00000000 00000000 00000000 00000000

: ro.. The register to outload to memory
: r@.. The register containing the address in memory to write to

|--------|--------|--------|--------|--------|--------|--------|--------|
 01010000 00000000 10000000 ro..0000 00000000 00000000 00000000 00000000
 @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@ @@@@@@@@

: ro.. The register to outload to memory
: @    The address in memory to write to

Writes a value from the specified register into memory at an address

Registers

One of the registers could be used to store the thread/core number? Timers?
Instructions to push/pop entire register set states on and off of the stack

Hypervisor

Enable a "hypervisor" to schedule bounded work on cores (i.e. must return eventually)
Limit its access to certain parts of memory

Threading

Enable a single VM to have multiple "cores"

Cores

In addition to a register bank, I think each core should have some working memory? Perhaps like, 4kb?

Bus

What would a minimal "bus" implementation look like?

Sys

What should the Lun binary/executable format look like?

Instructions

Should instructions have condition predicates built in (like ARM)?
Instructions should be as wide reaching as possible. If something can already be done with a single instruction, don't make a more specific one. Sugar and shorthands can be handled at assembly time, so there's no need to make the decoding and runtime logic more complex than it needs to be by duplicating functionality.