Transliterating the alphabet into hexadecimal

author Gavin Matthews
date 1997-04-11
index terms pair: hexadecimal; transliterating
organization Harlequin
revision //info.ravenbrook.com/project/mps/save-errno-win32/design/guide.hex.trans.txt#1
status incomplete documentation
tag guide.hex.trans

Introduction

.scope: This document explains how to represent the alphabet as hexadecimal digits.

.readership: This document is intended for anyone devising arbitrary constants which may appear in hex-dumps.

.sources: This transliteration was supplied by Richard Kistruck [RHSK-1997-04-07] based on magic number encodings for object signatures used by Richard Brooksby [RB-1996-02-12], the existence of which was inspired by the structure marking used in the Multics operating system [THVV-1995].

Transliteration

.forward: The chosen transliteration is as follows:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
ABCDEF9811C7340BC6520F3812
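
.forward.sketch: As a minimal sketch, the table in .forward can be held as a 26-character string indexed by letter. The names hexTransTable, hexTransChar, and hexTransWord are hypothetical and are not taken from the MPS sources. The code assumes an ASCII-compatible character set.

```c
#include <stddef.h>

/* .forward table: entry i is the hex digit for the letter 'A' + i. */
static const char hexTransTable[] = "ABCDEF9811C7340BC6520F3812";

/* Hypothetical helper: transliterate one letter (either case).
 * Returns '\0' for anything that is not an ASCII letter. */
static char hexTransChar(char c)
{
  if (c >= 'a' && c <= 'z')
    c = (char)(c - 'a' + 'A');
  if (c < 'A' || c > 'Z')
    return '\0';
  return hexTransTable[c - 'A'];
}

/* Hypothetical helper: transliterate word into out, which must have
 * room for strlen(word) + 1 chars. Returns 1 on success, or 0 if
 * word contains a character that is not a letter. */
static int hexTransWord(char *out, const char *word)
{
  size_t i;
  for (i = 0; word[i] != '\0'; ++i) {
    char d = hexTransChar(word[i]);
    if (d == '\0')
      return 0;
    out[i] = d;
  }
  out[i] = '\0';
  return 1;
}
```

Under these assumptions, hexTransWord(buf, "SIGSPACE") writes "5195BACE" into buf, which matches the digits of the constant shown in .comment.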

.backward: The backwards transliteration is as follows:

0 OU
1 IJY
2 TZ
3 MW
4 N
5 S
6 R
7 L
8 HX
9 G
A A
B BP
C CKQ
D D
E E
F FV
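
.backward.sketch: When reading a hex-dump, the table in .backward gives only a set of candidate letters for each digit. The following sketch, with the hypothetical names hexBackTable and hexBackCandidates, returns that set and leaves the choice of letter to the reader. It assumes an ASCII-compatible character set.

```c
#include <stddef.h>

/* .backward table: entry v lists the letters that map to hex digit v. */
static const char *const hexBackTable[16] = {
  "OU", "IJY", "TZ", "MW", "N", "S", "R", "L",
  "HX", "G", "A", "BP", "CKQ", "D", "E", "FV"
};

/* Hypothetical helper: return the candidate letters for one hex digit
 * character (either case), or NULL if c is not a hex digit. */
static const char *hexBackCandidates(char c)
{
  if (c >= '0' && c <= '9')
    return hexBackTable[c - '0'];
  if (c >= 'A' && c <= 'F')
    return hexBackTable[10 + (c - 'A')];
  if (c >= 'a' && c <= 'f')
    return hexBackTable[10 + (c - 'a')];
  return NULL;
}
```

Applied digit by digit to 5195BACE, this gives S, IJY, G, S, BP, A, CKQ, E. Context is needed to pick out the intended reading.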

.pad: If padding is required (to fill out a hex constant to its full length), you should use 9s, because G is rare and can usually be inferred from context.

.punc: There is no formal scheme for spaces or punctuation. It is suggested that you use 9 (as in .pad).
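
.pad.sketch: The following sketch combines .forward, .pad, and .punc to build an 8-digit constant from a short piece of text. The name hexTransSig is hypothetical. Two choices here are assumptions, not rules from this document: padding is placed on the right, and any characters after the eighth are ignored. The code also assumes an ASCII-compatible character set.

```c
/* Hypothetical helper: build a 32-bit constant from up to 8 characters
 * of text. Letters follow .forward; spaces and punctuation become 9
 * (.punc); unused trailing digits are filled with 9 (.pad, with
 * right-hand padding as an assumption). */
static unsigned long hexTransSig(const char *text)
{
  static const char table[] = "ABCDEF9811C7340BC6520F3812";
  unsigned long sig = 0;
  const char *p = text;
  int i;

  for (i = 0; i < 8; ++i) {
    unsigned long digit = 0x9;  /* padding or punctuation */
    if (*p != '\0') {
      char c = *p++;
      if (c >= 'a' && c <= 'z')
        c = (char)(c - 'a' + 'A');
      if (c >= 'A' && c <= 'Z') {
        char h = table[c - 'A'];
        digit = (h <= '9') ? (unsigned long)(h - '0')
                           : (unsigned long)(h - 'A' + 10);
      }
    }
    sig = (sig << 4) | digit;
  }
  return sig;
}
```

Under these assumptions, hexTransSig("POOL") returns 0xB0079999 and hexTransSig("A B") returns 0xA9B99999.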

Justification

.letters: The letters assigned to the hexadecimal letters (A-F) are all chosen by similarity of sound: B and P sound similar, as do F and V, and C, K, and Q can all sound similar.

.numbers: The letters assigned to the digits (0-9) are all chosen by similarity of shape (but see .trans.t). Nevertheless, 1=IJY retains some similarity of sound.

.trans.t: T is an exception to .numbers, but it is such a common letter that it deserves one.

Notes

.change: This transliteration differs from the old transliteration used for signatures (see design.mps.sig), as follows: J:6->1; L:1->7; N:9->4; R:4->6; W:8->3; X:5->8; Y:E->1.

.problem.mw: There is a known problem that M and W are both common, map to the same digit (3), and are hard to distinguish in context.

.find.c: It is possible to find all 8-digit hexadecimal constants and how many times they're used in C files, using the following Perl script:

perl5 -n -e '
BEGIN { %C = (); }  # maps each constant to the files it appears in
if (/0x([0-9A-Fa-f]{8})/) { push(@{$C{$1}}, $ARGV); }  # one entry per matching line
END { foreach $H (sort(keys(%C))) {
  printf "%3d %s %s\n", scalar(@{$C{$H}}), $H, join(", ", @{$C{$H}}); } }
' *.c *.h

.comment: It is a good idea to add a comment to any constant declaration indicating the English version and which letters were selected (by capitalisation), e.g.:

#define SpaceSig        ((Sig)0x5195BACE) /* SIGnature SPACE */
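
.comment.sketch: The convention in .comment can be checked mechanically. The following sketch, with the hypothetical name hexTransCommentMatches, transliterates the capital letters of the comment text by .forward and compares them with the digits of the constant. It assumes an unpadded constant with exactly eight capital letters in the comment, and an ASCII-compatible character set.

```c
/* Hypothetical check: do the capital letters in comment transliterate
 * (by .forward) to the 8 hex digits of sig? Lower-case letters and
 * other characters in the comment are skipped. */
static int hexTransCommentMatches(unsigned long sig, const char *comment)
{
  static const char table[] = "ABCDEF9811C7340BC6520F3812";
  static const char hexDigits[] = "0123456789ABCDEF";
  int shift = 28;  /* bit position of the next digit, most significant first */
  const char *p;

  for (p = comment; *p != '\0'; ++p) {
    if (*p >= 'A' && *p <= 'Z') {
      if (shift < 0)
        return 0;  /* more than 8 capital letters */
      if (table[*p - 'A'] != hexDigits[(sig >> shift) & 0xF])
        return 0;  /* letter does not match this digit */
      shift -= 4;
    }
  }
  return shift < 0;  /* true only if all 8 digits were matched */
}
```

Under these assumptions, hexTransCommentMatches(0x5195BACE, "SIGnature SPACE") returns 1.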

References

[RB-1996-02-12] "Signature magic numbers" (e-mail message); Richard Brooksby; Harlequin; 1996-12-02 12:05:30Z.
[RHSK-1997-04-07] "Alpha-to-Hex v1.0 beta"; Richard Kistruck; Ravenbrook; 1997-04-07 14:42:02+0100; <https://info.ravenbrook.com/project/mps/mail/1997/04/07/13-44/0.txt>.
[THVV-1995] "Structure Marking"; Tom Van Vleck; multicians.org; <http://www.multicians.org/thvv/marking.html>.

Document History

2013-05-10 RB Converted to reStructuredText and imported to MPS design.