Oración
Oración | |
---|---|
Created by | Alexander Nicholi |
Written in | ANSI C |
OSes | Sirius DOS, A* |
ISAs | i286 |
Licence | ASL 1.1 |
Oración is an assembler that originally targets the i286 and ARMv4T instruction sets. It leverages Gordian to perform symbol resolution, and its language is the target for higher-level language compilers, including both FCC and Sirius C*.
Language overview
Unlike most assembler dialects, Oración is not line-oriented at all (in fact, \r and \n are semantically equivalent to all other whitespace characters). It uses semicolons as statement terminators, and bracing in place of colon-delimited labels. This was decided upon out of kindness to 80-character terminals (which, believe it or not, are a huge cornerstone of accessibility even into the modern day). Directives begin with either a dot . or an exclamation mark ! depending on whether they are emitting, and opcodes are normal identifiers which have their arguments follow them like in other assembler dialects (no parentheses, only whitespace between the opcode and the first argument, and further arguments comma-separated). Only ANSI C style block comments /* */ are permitted. Finally, it is a strictly ASCII-only grammar, albeit with a doc comment exception.
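As an illustration of what a non-line-oriented grammar means in practice, the following Python sketch splits source text into statements. It is not taken from the Oración sources: the function name `split_statements` and the sample operands are hypothetical, and the sketch deliberately ignores braces and string literals, which a real tokeniser would have to handle.

```python
import re

def split_statements(source: str) -> list[str]:
    """Split source text into semicolon-terminated statements (sketch only)."""
    # Block comments are the only comment form, so drop them first.
    without_comments = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    statements = []
    for chunk in without_comments.split(";"):
        # Every whitespace run, including \r and \n, collapses to one space.
        normalised = " ".join(chunk.split())
        if normalised:
            statements.append(normalised)
    return statements

# Hypothetical sample input: the same two statements laid out two ways.
one_line = "mov ax, 1; add ax, 2;"
many_lines = "mov ax,\r\n  1 ; /* note */\nadd\n ax, 2 ;"
```

Both layouts yield the same statement list, because line breaks carry no meaning beyond that of any other whitespace.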
Directives
Oración provides many directives supplementing the main task of machine code assembly. There are two categories of directives: logical directives, prefixed with a dot ., and emitting directives, prefixed with an exclamation mark !. They include: .abort, !align, .define, .elif, .elifdef, .elifndef, .else, !globl, .if, .ifdef, .ifndef, .include, !isa, .log, !raw, !rawr, and .undef. Logical directives are evaluated and resolved before any emitting directives, and are evaluated sequentially, so that the order of directives such as .define, .undef and the various messaging directives is significant. Once those are resolved, the emitting directives are resolved together with the actual opcodes to create the final machine code.
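The ordering described above can be pictured as a first pass that walks the statements in sequence, resolves the logical directives, and defers everything else. The Python sketch below is only a model of that idea under stated assumptions: statements are represented as pre-parsed tuples, `run_logical_pass` is a hypothetical name, and the snapshot of definitions stored with each log entry exists purely to make the sequential behaviour visible.

```python
def run_logical_pass(statements):
    """Resolve logical directives in order; return (log, deferred)."""
    defines = {}
    log = []
    deferred = []  # emitting directives and opcodes, kept for the second pass
    for name, *args in statements:
        if name == ".define":
            defines[args[0]] = args[1]
        elif name == ".undef":
            defines.pop(args[0], None)  # an undefined name is ignored
        elif name == ".log":
            # Illustrative only: record the message with the definitions
            # that are visible at this point in the sequence.
            log.append((args[0], dict(defines)))
        elif name.startswith("."):
            raise ValueError(f"logical directive not modelled here: {name}")
        else:
            deferred.append((name, *args))
    return log, deferred

# Hypothetical pre-parsed program.
program = [
    (".define", "WIDTH", 16),
    (".log", "before"),
    ("!raw", 16, 1),
    (".undef", "WIDTH"),
    (".log", "after"),
    ("!align", 4),
]
log, deferred = run_logical_pass(program)
```

In this model the first log entry sees `WIDTH` and the second does not, while `!raw` and `!align` are left untouched for the later emitting pass.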
.abort
FORM:
.abort | ; |
Aborts the assembly process. The .log directive can be used immediately preceding this to give an error message.
!align
FORM:
!align | number of octets (uintn) | [ , fill octet (uint8) ] | ; |
Aligns the assembler output to a given number of octets. If fill octet is omitted, it defaults to zero. This is behaviourally equivalent to GNU as's .balign directive.
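The padding arithmetic behind this kind of alignment can be sketched in Python as follows. The function name `align` and the use of a `bytearray` as the output buffer are assumptions made for the example, not part of Oración itself.

```python
def align(output: bytearray, boundary: int, fill: int = 0) -> None:
    """Pad output in place so its length is a multiple of boundary."""
    if boundary <= 0:
        raise ValueError("boundary must be a positive number of octets")
    # Octets needed to reach the next multiple of boundary (0 if aligned).
    padding = -len(output) % boundary
    output.extend(bytes([fill]) * padding)

buf = bytearray(b"\x01\x02\x03\x04\x05")
align(buf, 4)            # five octets are padded up to eight with zeroes
align(buf, 4, 0x90)      # already aligned, so nothing is appended

short = bytearray(b"\xAA")
align(short, 4, 0x90)    # an explicit fill octet is used for the padding
```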
.define
FORM:
.define | name (ident) | , value (constexpr) | ; |
Defines name as the given value, which must be a constant expression.
.elif
FORM:
.elif | expression (constexpr) | { } |
Conditional testing a constant expression. If truthy, the contents of the block that follows are evaluated for emission. This directive must follow the block of another .elif, .elifdef, .elifndef, .if, .ifdef or .ifndef directive.
.elifdef
FORM:
.elifdef | symbol (ident) | { } |
Conditional testing for whether a symbol has been defined with .define. If truthy, the contents of the block that follows are evaluated for emission. This directive must follow the block of another .elif, .elifdef, .elifndef, .if, .ifdef or .ifndef directive.
.elifndef
FORM:
.elifndef | symbol (ident) | { } |
Conditional testing for whether a symbol has not been defined with .define. This is the inverse of the .elifdef directive. This directive must follow the block of another .elif, .elifdef, .elifndef, .if, .ifdef or .ifndef directive.
.else
FORM:
.else | { } |
Provides a block that is evaluated for emission when the preceding conditional directive was not truthy. This directive must follow the block of another .elif, .elifdef, .elifndef, .if, .ifdef or .ifndef directive.
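How a chain of these conditional directives selects a block can be modelled in a few lines of Python. The sketch assumes the conventional reading in which the first truthy branch wins and .else acts as a fallback; the function name `select_branch`, the tuple representation of the chain, and the symbol names are all hypothetical.

```python
def select_branch(chain, defines):
    """Return the body of the first branch whose test passes, else None."""
    for kind, test, body in chain:
        if kind in (".if", ".elif"):
            taken = bool(test)            # test is an evaluated constexpr
        elif kind in (".ifdef", ".elifdef"):
            taken = test in defines       # test is a symbol name
        elif kind in (".ifndef", ".elifndef"):
            taken = test not in defines
        elif kind == ".else":
            taken = True                  # fallback when nothing matched
        else:
            raise ValueError(f"not a conditional directive: {kind}")
        if taken:
            return body
    return None

# Hypothetical chain; each body is a placeholder string.
chain = [
    (".ifdef", "DEBUG", "debug body"),
    (".elif", 0, "never taken"),
    (".elifndef", "RELEASE", "not-release body"),
    (".else", None, "fallback body"),
]
```

With `DEBUG` defined the first block is selected; with nothing defined the .elifndef block is selected; with only `RELEASE` defined the chain falls through to .else.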
!globl
FORM:
!globl | symbol (ident) | [ , symbol (ident) , ... ] | ; |
Marks one or more symbols as globals for emission. A symbol defined here becomes externally visible to other assembly units; a symbol not defined here is taken to be externally visible and defined in another assembly unit.
.if
FORM:
.if | expression (constexpr) | { } |
Conditional testing a constant expression. If truthy, the contents of the block that follows are evaluated for emission.
.ifdef
FORM:
.ifdef | symbol (ident) | { } |
Conditional testing for whether a symbol has been defined with .define. If truthy, the contents of the block that follows are evaluated for emission.
.ifndef
FORM:
.ifndef | symbol (ident) | { } |
Conditional testing for whether a symbol has not been defined with .define. This is the inverse of the .ifdef directive.
.include
FORM:
.include | filepath (stringlit) | ; |
Physically includes another assembly unit by file path, as if its contents were transcluded in-place at the position of the directive.
!isa
FORM:
!isa | isa (stringlit) | [ , isa (stringlit) , ... ] | ; |
Oración uses an !isa directive to specify what opcodes are permitted in a given file. At the base level, the names given refer to lists of opcode names. It is an error to have any overlap in the names provided by each opcode group. This approach promotes composability of ISA specification in assembly code, such that a base ISA can be given along with a list of additions (such as later revisions or extensions) that it may use.
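The composition rule can be illustrated with a short Python sketch that merges named opcode groups and rejects any overlap. The group names and opcode lists below are placeholders invented for the example, and `compose_isa` is a hypothetical helper rather than anything from the Oración sources.

```python
def compose_isa(groups, requested):
    """Merge the requested opcode groups, rejecting overlapping names."""
    provider = {}  # opcode name -> group that provided it
    for group_name in requested:
        for opcode in groups[group_name]:
            if opcode in provider:
                raise ValueError(
                    f"opcode {opcode!r} is provided by both "
                    f"{provider[opcode]!r} and {group_name!r}"
                )
            provider[opcode] = group_name
    return set(provider)

# Placeholder groups: a base list, one extension, and one that clashes.
groups = {
    "base": ["mov", "add"],
    "ext": ["mul"],
    "clash": ["add"],
}
permitted = compose_isa(groups, ["base", "ext"])
```

Requesting `"base"` together with `"clash"` raises an error here, since both groups provide `add`.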
.log
FORM:
.log | message (stringlit) | [ , message (stringlit) , ... ] | ; |
Prints a diagnostic message given as a list of string literals.
!raw
FORM:
!raw | bitsize (uintn) | , data (intn) | [ , data (intn) , ... ] | ; |
Emit raw data in the assembly process. The first argument specifies the size of each data item, denominated in bits. One or more data items may be given, and they are emitted sequentially in the order specified.
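A rough Python model of this kind of emission is given below. Several details are assumptions that the text above does not settle: the sketch only accepts bit sizes that are multiples of 8, stores each item little-endian (the i286 convention), and wraps out-of-range or negative values to the given width. The name `emit_raw` is hypothetical.

```python
def emit_raw(bitsize: int, *data: int) -> bytes:
    """Emit each data item as bitsize bits, in the order given (sketch)."""
    if bitsize <= 0 or bitsize % 8 != 0:
        raise ValueError("this sketch only handles whole-octet sizes")
    if not data:
        raise ValueError("at least one data item is required")
    width = bitsize // 8
    mask = (1 << bitsize) - 1
    out = bytearray()
    for item in data:
        # Assumption: masking wraps negatives to two's complement and
        # truncates oversized values; little-endian order is also assumed.
        out += (item & mask).to_bytes(width, "little")
    return bytes(out)

words = emit_raw(16, 0x1234, -1)
octets = emit_raw(8, 1, 2, 3)
```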
!rawr
FORM:
!rawr | bitsize (uintn) | , repeatcount (uintn) | , data (intn) | [ , data (intn) , ... ] | ; |
Emit raw data in the assembly process, repeating the given data a specified number of times. The first argument specifies the size of each data item, denominated in bits. The second argument specifies how many times the sequence shall be emitted. One or more data items may be given, and they are emitted sequentially in the order specified; the whole ordered sequence is then repeated the number of times given.
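The distinction that matters here is that the sequence repeats as a whole, rather than each item repeating individually. The Python sketch below shows that ordering; as assumptions of the example only, it accepts whole-octet bit sizes, stores items little-endian, and wraps values to the given width. The name `emit_rawr` is hypothetical.

```python
def emit_rawr(bitsize: int, repeat: int, *data: int) -> bytes:
    """Emit the whole data sequence repeat times (sketch)."""
    if bitsize <= 0 or bitsize % 8 != 0:
        raise ValueError("this sketch only handles whole-octet sizes")
    if repeat < 0 or not data:
        raise ValueError("need a non-negative count and at least one item")
    width = bitsize // 8
    mask = (1 << bitsize) - 1
    # Build one copy of the ordered sequence (little-endian is assumed),
    # then repeat that copy as a unit.
    sequence = b"".join(
        (item & mask).to_bytes(width, "little") for item in data
    )
    return sequence * repeat

pattern = emit_rawr(8, 3, 0xAB, 0xCD)
```

The result interleaves the items as AB CD AB CD AB CD, not AB AB AB CD CD CD.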
.undef
FORM:
.undef | name (ident) | ; |
Undefines a definition previously created with .define, by name. If the name given was not defined, this directive is ignored.
Constant expressions
A constant expression, or constexpr, is an arbitrary expression comprised of primitives that are either arbitrary-length integers or other symbols defined non-circularly by a .define directive, joined together with any of the following operators: addition +, subtraction -, multiplication *, division /, modulus %, exponent **, logical left shift <<, logical right shift >>, bitwise (long-circuit) AND &, bitwise (long-circuit) OR |, bitwise (long-circuit) XOR ^, bitwise (long-circuit) NOT ~, logical (short-circuit) AND &&, logical (short-circuit) OR ||, logical (short-circuit) XOR ergo inequality !=, logical (short-circuit) NOT !, logical equality ==, and the ternary conditional ? :.
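To make the non-circularity requirement and the short-circuit operators concrete, here is a small Python evaluator over pre-parsed expression trees. It covers only a subset of the operators listed above, and its representation is an assumption of the sketch: an expression is an integer, a symbol name, or a tuple of an operator followed by its operands. The name `evaluate` is hypothetical.

```python
import operator

# Subset of the binary operators; Python integers are arbitrary-length.
BINARY = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "&": operator.and_, "|": operator.or_, "^": operator.xor,
    "==": operator.eq, "!=": operator.ne,
}

def evaluate(expr, defines, active=()):
    """Evaluate an expression tree, rejecting circular definitions."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):  # a symbol created by .define
        if expr in active:
            raise ValueError("circular definition: " + expr)
        if expr not in defines:
            raise KeyError(expr)
        return evaluate(defines[expr], defines, active + (expr,))
    op, *args = expr
    ev = lambda e: evaluate(e, defines, active)
    if op == "&&":   # right operand is skipped when the left is falsy
        return int(bool(ev(args[0])) and bool(ev(args[1])))
    if op == "||":   # right operand is skipped when the left is truthy
        return int(bool(ev(args[0])) or bool(ev(args[1])))
    if op == "?:":   # only the selected arm is evaluated
        return ev(args[1]) if ev(args[0]) else ev(args[2])
    if op in BINARY:
        return int(BINARY[op](ev(args[0]), ev(args[1])))
    raise ValueError(f"operator not modelled in this sketch: {op}")

# Hypothetical definitions, including a deliberately circular pair.
defines = {"A": 6, "B": ("*", "A", 7), "LOOP1": "LOOP2", "LOOP2": "LOOP1"}
```

In this model `B` resolves through `A`, a short-circuited operand is never evaluated, and referring to `LOOP1` directly is reported as an error.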
Opcode structure
Oración defines a general structure for describing an opcode:
struct instr[sz] {
    struct instr_param * params[sz];
    ptri param_sz;
};

enum instr_param_type {
    INSTR_PARAM_TYPE_LIT,
    INSTR_PARAM_TYPE_REG,
    INSTR_PARAM_TYPE_IMM
};

struct instr_param[sz] {
    bit occupy[sz];
    enum instr_param_type type;
    u8 index;
};