RFC format
Latest revision as of 16:08, 3 April 2025
The RFC format is a renderer-agnostic structured document format built using ASCII C0 control codes as a superset of the TTY text format. It uses other C0 control codes not employed by printers (and thereby not known to TTY) to create rich document structure otherwise attained with advanced systems such as Troff and LaTeX. The format does this instead of providing a meta syntax like Markdown, RTF or HTML so that it can satisfy its design constraint of being legible as TTY formatted ASCII text if all superset control codes are blindly (as in, without any parsing knowledge required) stripped out.
The recommended file extension for these documents is .rfc. The magic number all RFC files should begin with is, in hexadecimal, 06 15 06 15, representing two interleaved pairs of ASCII ACK and NAK C0 control codes. The RFC format reserves all C0 control codes not used by the TTY format except for NUL (0x00), BEL (0x07), SUB (0x1A) and ESC (0x1B).
The RFC format is so named because it is designed to mimic the production of the IETF's RFC XML. It is not technically related to any of the IETF's tools for producing or validating RFCs.
Embed encoding
Sixteen extraneous ASCII C0 control codes are hijacked as a binary encoding medium so that each 7-bit ASCII character provides 4 bits of arbitrary binary data:
| Character name | Code | Enc. | Dec. |
|---|---|---|---|
| Data Link Escape | DLE | 0x10 | 0x0 |
| Device Control 1 | DC1 | 0x11 | 0x1 |
| Device Control 2 | DC2 | 0x12 | 0x2 |
| Device Control 3 | DC3 | 0x13 | 0x3 |
| Device Control 4 | DC4 | 0x14 | 0x4 |
| Enquiry | ENQ | 0x05 | 0x5 |
| Synchronous Idle | SYN | 0x16 | 0x6 |
| End of Transmission Block | ETB | 0x17 | 0x7 |
| Cancel | CAN | 0x18 | 0x8 |
| End of Medium | EM | 0x19 | 0x9 |
| End of Text | ETX | 0x03 | 0xA |
| End of Transmission | EOT | 0x04 | 0xB |
| File Separator | FS | 0x1C | 0xC |
| Group Separator | GS | 0x1D | 0xD |
| Record Separator | RS | 0x1E | 0xE |
| Unit Separator | US | 0x1F | 0xF |
This 4-bit medium then carries a variable-length integer encoding with an MSB sentinel: each control character contributes 3 bits of meaningful information, while the high (fourth) bit indicates whether the control character immediately following it should be collated with it into a single number. As with most variable-width binary encodings, an improperly encoded control sequence is invalid and its meaning undefined: a sequence must consist of zero or more of the above sixteen control characters with the high bit set, in direct succession, followed by exactly one such control character with the high bit clear.
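The decoding described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names `NIBBLE_OF` and `decode_number` are hypothetical, and the order in which 3-bit groups are collated (most significant group first) is an assumption, since the text does not state it.

```python
# Lookup table taken from the embed-encoding table:
# control code byte -> 4-bit value.
NIBBLE_OF = {
    0x10: 0x0, 0x11: 0x1, 0x12: 0x2, 0x13: 0x3,
    0x14: 0x4, 0x05: 0x5, 0x16: 0x6, 0x17: 0x7,
    0x18: 0x8, 0x19: 0x9, 0x03: 0xA, 0x04: 0xB,
    0x1C: 0xC, 0x1D: 0xD, 0x1E: 0xE, 0x1F: 0xF,
}


def decode_number(data, pos=0):
    """Decode one variable-length number starting at data[pos].

    Returns (value, next_pos).
    ASSUMPTION: 3-bit groups are stored most significant first.
    """
    value = 0
    while True:
        if pos >= len(data) or data[pos] not in NIBBLE_OF:
            raise ValueError("truncated or malformed control sequence")
        nibble = NIBBLE_OF[data[pos]]
        pos += 1
        value = (value << 3) | (nibble & 0b0111)  # low 3 bits carry data
        if not nibble & 0b1000:                   # high bit clear: last group
            return value, pos
```

Under that ordering assumption, the two bytes `03 10` (ETX, DLE) decode to octal 0020, and a lone `11` (DC1) decodes to 1.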
Rich format
The rich format is paragraph-oriented and does not recognise any semantic distinction between whitespace characters; Line Feed 0x0A, Carriage Return 0x0D and Space 0x20 are all equivalent to one another. Within a paragraph, the parser collapses any run of such characters into a single space and joins source text spanning many physical lines into one logical line, which is then rendered in monospace again at 72 characters per line.
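A rough sketch of this reflow step, assuming the paragraph has already been separated from its neighbours and stripped of control codes. The function name `reflow` is hypothetical, and Python's greedy `textwrap.fill` merely stands in for whatever line-breaking rule a real renderer would use (it may also break at hyphens, which the format does not specify).

```python
import re
import textwrap


def reflow(paragraph, width=72):
    """Collapse LF, CR and space runs into single spaces, then rewrap."""
    logical_line = re.sub(r"[\n\r ]+", " ", paragraph).strip()
    return textwrap.fill(logical_line, width=width)
```

For example, `reflow("a\r\nb   c\n")` yields the single logical line `a b c`.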
Header for metadata
If an RFC format parser encounters an ASCII Start of Heading SOH before seeing any visible characters (including spacing) in the stream, it will parse it as a heading metadata block. Fields are simple ASCII text separated by CRLF pairs, making them dumb-printable:
- Full document title
- Author name
- Author address
- Author telephone
- Author e-mail
- Copyright date
- Copyright assignment
- Licence
A field is omitted by leaving it empty. Fields are parsed in order, so a heading metadata block should always contain exactly seven CRLF pairs. The block is terminated by an ASCII Start of Text STX character, immediately after which the normal document text, with whatever formatting commands it bears, runs until EOF. The header as a whole is optional.
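The header layout lends itself to a very small parser. The sketch below is illustrative only: `parse_header` and the dictionary keys in `FIELD_NAMES` are hypothetical names, the input is assumed to begin where SOH may appear (any magic number already consumed by the caller), and rejecting a block with the wrong number of CRLF pairs is a design choice of this sketch rather than something the format mandates.

```python
SOH, STX = 0x01, 0x02

# Field order follows the list above; the key names are illustrative.
FIELD_NAMES = (
    "title", "author_name", "author_address", "author_telephone",
    "author_email", "copyright_date", "copyright_assignment", "licence",
)


def parse_header(data):
    """Return (fields, body); fields is None when no header is present."""
    if not data or data[0] != SOH:
        return None, data
    end = data.find(bytes([STX]))
    if end == -1:
        raise ValueError("metadata block is not terminated by STX")
    values = data[1:end].split(b"\r\n")
    if len(values) != len(FIELD_NAMES):
        raise ValueError("expected exactly seven CRLF pairs")
    fields = {
        name: value.decode("ascii")
        for name, value in zip(FIELD_NAMES, values)
    }
    return fields, data[end + 1:]
```

Empty fields come back as empty strings, so a caller can distinguish "no header at all" (`None`) from "header present, field omitted".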
General formatting
A number encoded with the embed encoding is, once decoded and in memory, interpreted as a formatting command. Parsing at this point should be forgiving; for example, surplus closing commands should be ignored. Here is a table of commands with their numbers in octal:
| Command description | Number |
|---|---|
| No-op | 0000 |
| Begin heading, level 1 | 0001 |
| Begin heading, level 2 | 0002 |
| Begin heading, level 3 | 0003 |
| Begin heading, level 4 | 0004 |
| Begin heading, level 5 | 0005 |
| Begin heading, level 6 | 0006 |
| End heading (relative) | 0007 |
| Begin code block | 0010 |
| End code block | 0011 |
| Begin block quote | 0012 |
| End block quote | 0013 |
| Begin block quote credit | 0014 |
| End block quote credit | 0015 |
| Begin hard shadow | 0016 |
| End hard shadow | 0017 |
| Begin hyperlink display text | 0020 |
| End hyperlink display text | 0021 |
| Begin hyperlink shadow bridge | 0022 |
| End hyperlink shadow bridge | 0023 |
| Begin hyperlink URI | 0024 |
| End hyperlink URI | 0025 |
| Begin centred text | 0026 |
| Begin right-aligned text | 0027 |
| Begin justified text | 0030 |
| Revert to left-aligned text | 0031 |
| Render table of contents | 0032 |
| Begin internal link target ident | 0033 |
| End internal link target ident | 0034 |
| Begin internal link display text | 0035 |
| End internal link display text | 0036 |
| Begin internal link shadow bridge | 0037 |
| End internal link shadow bridge | 0040 |
| Begin internal link reference ident | 0041 |
| End internal link reference ident | 0042 |
| Paragraph break | 0043 |
| Line break | 0044 |
| Begin all capitals | 0045 |
| End all capitals | 0046 |
| Whole paragraph indent | 0047 |
| Page break to even | 0050 |
| Page break to odd | 0051 |
| 0052–0077 are undefined | |
| Figure 5 lines, ¼ width | 0100 |
| Figure 10 lines, ¼ width | 0101 |
| Figure 15 lines, ¼ width | 0102 |
| Figure 20 lines, ¼ width | 0103 |
| Figure 25 lines, ¼ width | 0104 |
| Figure 30 lines, ¼ width | 0105 |
| Figure 40 lines, ¼ width | 0106 |
| Figure 50 lines, ¼ width | 0107 |
| Figure 5 lines, ½ width | 0110 |
| Figure 10 lines, ½ width | 0111 |
| Figure 15 lines, ½ width | 0112 |
| Figure 20 lines, ½ width | 0113 |
| Figure 25 lines, ½ width | 0114 |
| Figure 30 lines, ½ width | 0115 |
| Figure 40 lines, ½ width | 0116 |
| Figure 50 lines, ½ width | 0117 |
| Figure 5 lines, ¾ width | 0120 |
| Figure 10 lines, ¾ width | 0121 |
| Figure 15 lines, ¾ width | 0122 |
| Figure 20 lines, ¾ width | 0123 |
| Figure 25 lines, ¾ width | 0124 |
| Figure 30 lines, ¾ width | 0125 |
| Figure 40 lines, ¾ width | 0126 |
| Figure 50 lines, ¾ width | 0127 |
| Figure 5 lines, full width | 0130 |
| Figure 10 lines, full width | 0131 |
| Figure 15 lines, full width | 0132 |
| Figure 20 lines, full width | 0133 |
| Figure 25 lines, full width | 0134 |
| Figure 30 lines, full width | 0135 |
| Figure 40 lines, full width | 0136 |
| Figure 50 lines, full width | 0137 |
Shadows
The RFC format employs a simple rendering mechanism called shadows. A shadow hides the text enclosed between its beginning and end marks from rich renderings of the document; this basic form is called a hard shadow. A softer variant, the shadow bridge, links the display text of a hyperlink or internal link to its target, providing a semantic connection while hiding the punctuation that plain text needs for legibility, such as spacing and parentheses. Shadows thus provide a concise and simple way to hide plain text boilerplate from rich renderings of RFC documents while keeping it in place, so that dumb formatting strippers produce plain text renditions as the authors intended without any high-level reconstruction.
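To make the plain-text side of this concrete, here is a minimal sketch of a dumb formatting stripper together with a hypothetical marked-up hyperlink. The helper names `strip_commands` and `command` are illustrative, the encoder assumes the most significant 3-bit group comes first, and the particular arrangement of bridge and hard shadow around the URI is only one plausible way an author might mark up a link.

```python
# The sixteen embed codes, indexed by the 4-bit value they represent.
EMBED_CODES = bytes([
    0x10, 0x11, 0x12, 0x13, 0x14, 0x05, 0x16, 0x17,
    0x18, 0x19, 0x03, 0x04, 0x1C, 0x1D, 0x1E, 0x1F,
])


def strip_commands(body):
    """Blindly drop every embed control code, leaving the plain text."""
    return body.translate(None, EMBED_CODES)


def command(number):
    """Encode a command number (ASSUMES most significant group first)."""
    groups = []
    while True:
        groups.append(number & 0b111)
        number >>= 3
        if number == 0:
            break
    groups.reverse()
    out = bytearray()
    for i, group in enumerate(groups):
        cont = 0b1000 if i < len(groups) - 1 else 0  # set on all but last
        out.append(EMBED_CODES[group | cont])
    return bytes(out)


# Hypothetical link: display text, bridge, URI, then a hard shadow
# around the closing parenthesis.
link = (
    b"See " + command(0o20) + b"the spec" + command(0o21)
    + command(0o22) + b" (" + command(0o23)
    + command(0o24) + b"https://example.org/" + command(0o25)
    + command(0o16) + b")" + command(0o17) + b"."
)
print(strip_commands(link).decode("ascii"))
# prints: See the spec (https://example.org/).
```

The stripper needs no knowledge of what any command means, which is the point of the design: a rich renderer would hide the bridged and shadowed punctuation, while the stripped output keeps it for legibility.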
Figures
The RFC format does not provide any direct means for embedding figure data (beyond the aforementioned ASCII tables, which are not true 'figures' anyway). However, it does provide a comprehensive command set for reserving space for figures within the rigid and portable geometry of RFC documents. True figures can then be placed into this 'saved space' during rendering or, if that is not practical, the space can be left empty in a visually acceptable way. Commands 0100–0137 are provisioned for this purpose and form a simple matrix of figure sizes: the height may be 5, 10, 15, 20, 25, 30, 40 or 50 lines, and the width may be ¼ (17 columns), ½ (34 columns), ¾ (51 columns) or full (68 columns). Figures are always centred; ¼ and ¾ width figures leave an odd number of columns over, and the remainder column goes on the right. The top of a figure is the line on which the figure command first appears. Rendering a figure involves an implicit carriage return before centring and, as usual, must respect the overwriting behaviour of the TTY format, which makes it possible to overlay text on top of figures directly in the document. Document authors should navigate around an inserted figure as necessary according to its prescribed size.
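Because the figure commands form a regular matrix, the size can be read straight off the command number: in the command table, the low octal digit selects the height and the next digit selects the width. A small sketch, with `figure_size` as a hypothetical helper name:

```python
FIGURE_HEIGHTS = (5, 10, 15, 20, 25, 30, 40, 50)  # in lines
FIGURE_WIDTHS = ("1/4", "1/2", "3/4", "full")


def figure_size(cmd):
    """Map a figure command (octal 0100 to 0137) to (lines, width label)."""
    if not 0o100 <= cmd <= 0o137:
        raise ValueError("not a figure command")
    offset = cmd - 0o100
    return FIGURE_HEIGHTS[offset & 0b111], FIGURE_WIDTHS[offset >> 3]
```

For instance, `figure_size(0o126)` gives `(40, "3/4")`, matching the "Figure 40 lines, ¾ width" row of the command table.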
Rendering
Regardless of whether the target medium is print or digital, the RFC format is still meant to be a monospaced, paper-friendly medium that never exceeds 72 columns in width. Since RFC is page-aware, unlike TTY, it sets a maximum page height of 55 rows – this leaves 2 header rows with 1 spacer row and 1 footer row with 1 spacer row for 50 rows of content per page.