.. highlight:: none

.. index::
   pair: hexadecimal; transliterating

.. _design-guide.hex.trans:

Transliterating the alphabet into hexadecimal
=============================================

.. mps:prefix:: guide.hex.trans

Introduction
------------

:mps:tag:`scope` This document explains how to represent the alphabet as
hexadecimal digits.

:mps:tag:`readership` This document is intended for anyone devising
arbitrary constants which may appear in hex-dumps.

:mps:tag:`sources` This transliteration was supplied by Richard Kistruck
[RHSK-1997-04-07]_ based on magic number encodings for object signatures
used by Richard Brooksby [RB-1996-02-12]_, the existence of which was
inspired by the structure marking used in the Multics operating system
[THVV-1995]_.

Transliteration
---------------

:mps:tag:`forward` The chosen transliteration is as follows::

    ABCDEFGHIJKLMNOPQRSTUVWXYZ
    ABCDEF9811C7340BC6520F3812
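
As a minimal sketch, the forward table can be held in C as a 26-character
lookup string indexed by letter. The names ``hexTransTable`` and
``hexTransLetter`` are hypothetical (they are not taken from the MPS
sources), and the code assumes an ASCII-compatible character set in which
the letters A to Z are contiguous.

.. code-block:: c

    #include <ctype.h>

    /* Forward table from the text: index 0 is 'A', index 25 is 'Z'. */
    static const char hexTransTable[] = "ABCDEF9811C7340BC6520F3812";

    /* Return the hexadecimal digit (as a character) for a letter of
     * either case, or '\0' if c is not a letter.  Hypothetical helper;
     * assumes ASCII. */
    static char hexTransLetter(char c)
    {
        int u = toupper((unsigned char)c);
        if (u < 'A' || u > 'Z')
            return '\0';
        return hexTransTable[u - 'A'];
    }

Under these assumptions ``hexTransLetter('G')`` returns ``'9'`` and
``hexTransLetter('s')`` returns ``'5'``.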

:mps:tag:`backward` The backwards transliteration is as follows::

    0 OU
    1 IJY
    2 TZ
    3 MW
    4 N
    5 S
    6 R
    7 L
    8 HX
    9 G
    A A
    B BP
    C CKQ
    D D
    E E
    F FV
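
When reading a hex-dump it can help to look up the candidate letters for
each digit. The following sketch encodes the backward table as an array
of strings. The names ``hexTransCandidates`` and ``hexTransLetters`` are
hypothetical, and the code assumes ASCII.

.. code-block:: c

    #include <ctype.h>
    #include <stddef.h>

    /* Backward table from the text, indexed by digit value 0..15. */
    static const char *const hexTransCandidates[16] = {
        "OU", "IJY", "TZ", "MW", "N", "S", "R", "L",
        "HX", "G", "A", "BP", "CKQ", "D", "E", "FV"
    };

    /* Return the letters that transliterate to the given hexadecimal
     * digit character, or NULL if it is not a hexadecimal digit.
     * Hypothetical helper; assumes ASCII. */
    static const char *hexTransLetters(char digit)
    {
        int d = toupper((unsigned char)digit);
        if (d >= '0' && d <= '9')
            return hexTransCandidates[d - '0'];
        if (d >= 'A' && d <= 'F')
            return hexTransCandidates[10 + (d - 'A')];
        return NULL;
    }

The caller still has to choose among the candidates from context, which
is why digits with several common letters are harder to read back.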

:mps:tag:`pad` If padding is required to fill out a hexadecimal constant
to its full length, use 9s, because G is rare and can usually be inferred
from context.

:mps:tag:`punc` There is no formal scheme for spaces or punctuation. It is
suggested that you use 9, as for padding (see :mps:ref:`.pad`).
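
The padding and punctuation rules can be sketched as a function that
builds a 32-bit constant from a word. This is an illustration under
stated assumptions, not MPS code: the name ``hexTransWord`` is
hypothetical, the character set is assumed to be ASCII, and padding is
assumed to go at the right-hand end, which the text above does not
specify.

.. code-block:: c

    #include <ctype.h>

    /* Forward table from the text: index 0 is 'A', index 25 is 'Z'. */
    static const char hexTransTable[] = "ABCDEF9811C7340BC6520F3812";

    /* Transliterate up to eight characters of word into a 32-bit value.
     * Letters use the forward table.  Spaces, punctuation, and unused
     * trailing positions all become 9.  Characters after the eighth are
     * ignored.  Hypothetical helper; assumes ASCII and right-padding. */
    static unsigned long hexTransWord(const char *word)
    {
        unsigned long value = 0;
        int i;

        for (i = 0; i < 8; ++i) {
            unsigned digit = 9;  /* pad digit unless a letter is found */
            if (*word != '\0') {
                int u = toupper((unsigned char)*word);
                ++word;
                if (u >= 'A' && u <= 'Z') {
                    char h = hexTransTable[u - 'A'];
                    digit = (h <= '9') ? (unsigned)(h - '0')
                                       : (unsigned)(h - 'A' + 10);
                }
            }
            value = (value << 4) | digit;
        }
        return value;
    }

Under these assumptions ``hexTransWord("SIGSPACE")`` gives ``0x5195BACE``
and ``hexTransWord("SIG")`` gives ``0x51999999``.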

Justification
-------------

:mps:tag:`letters` The mappings onto the hexadecimal letters (A-F) are all
chosen by similarity of sound. B and P sound similar, as do F and V, and
C, K, and Q can all sound similar.

:mps:tag:`numbers` The mappings onto the digits (0-9) are all chosen by
similarity of shape (but see :mps:ref:`.trans.t`). Nevertheless, 1=IJY
retains some similarity of sound.

:mps:tag:`trans.t` T is an exception to :mps:ref:`.numbers`, but it is such
a common letter that it deserves one.

Notes
-----

:mps:tag:`change` This transliteration differs from the old transliteration
used for signatures (see design.mps.sig_) in the following letters, shown
as letter:old->new: J:6->1; L:1->7; N:9->4; R:4->6; W:8->3; X:5->8;
Y:E->1.

.. _design.mps.sig: sig.html

:mps:tag:`problem.mw` There is a known problem that M and W are both common,
map to the same digit (3), and are hard to distinguish in context.

:mps:tag:`find.c` The following Perl one-liner lists every 8-digit
hexadecimal constant prefixed with 0x in the C sources and headers,
together with the number of lines it appears on and the file containing
each occurrence. Only the first constant on each line is counted::

    perl -n -e 'BEGIN { %C = (); }
        if (/0x([0-9A-Fa-f]{8})/) {
            $C{$1} = [] if !defined($C{$1});
            push(@{$C{$1}}, $ARGV); }
        END { foreach $H (sort(keys(%C))) {
            printf "%3d %s %s\n", scalar(@{$C{$H}}), $H, join(", ", @{$C{$H}}); } }' \
        *.c *.h

:mps:tag:`comment` It is a good idea to add a comment to any constant
declaration indicating the English version and which letters were
selected (by capitalisation), e.g.::

    #define SpaceSig ((Sig)0x5195BACE) /* SIGnature SPACE */
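
Such a comment also makes the constant checkable by machine. The
following is a sketch only: ``hexTransCommentMatches`` is a hypothetical
helper (not part of the MPS), it assumes ASCII, and it assumes that
unused positions are padded on the right with 9 as in :mps:ref:`.pad`.
It collects the capital letters of the comment text, transliterates
them, and compares the result with the constant.

.. code-block:: c

    /* Forward table from the text: index 0 is 'A', index 25 is 'Z'. */
    static const char hexTransTable[] = "ABCDEF9811C7340BC6520F3812";

    /* Return 1 if the capital letters in comment transliterate to the
     * 32-bit constant, padding on the right with 9, else 0.  Capital
     * letters after the eighth are ignored.  Hypothetical helper;
     * assumes ASCII. */
    static int hexTransCommentMatches(unsigned long constant,
                                      const char *comment)
    {
        unsigned long value = 0;
        int digits = 0;
        const char *p;

        for (p = comment; *p != '\0' && digits < 8; ++p) {
            if (*p >= 'A' && *p <= 'Z') {
                char h = hexTransTable[*p - 'A'];
                unsigned digit = (h <= '9') ? (unsigned)(h - '0')
                                            : (unsigned)(h - 'A' + 10);
                value = (value << 4) | digit;
                ++digits;
            }
        }
        for (; digits < 8; ++digits)
            value = (value << 4) | 9u;  /* pad with 9 */
        return value == (constant & 0xFFFFFFFFUL);
    }

For the example above, ``hexTransCommentMatches(0x5195BACE, "SIGnature
SPACE")`` returns 1, because the capitals spell SIGSPACE.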

References
----------

.. [RB-1996-02-12]
   "Signature magic numbers" (e-mail message);
   `Richard Brooksby`_;
   Harlequin;
   1996-12-02 12:05:30Z.

.. _`Richard Brooksby`: mailto:rb@ravenbrook.com

.. [RHSK-1997-04-07]
   "Alpha-to-Hex v1.0 beta";
   Richard Kistruck;
   Ravenbrook;
   1997-04-07 14:42:02+0100;
   <https://info.ravenbrook.com/project/mps/mail/1997/04/07/13-44/0.txt>.

.. [THVV-1995]
   "Structure Marking";
   Tom Van Vleck;
   multicians.org_;
   <http://www.multicians.org/thvv/marking.html>.

.. _multicians.org: http://www.multicians.org/