.. highlight:: none

.. index::
   pair: hexadecimal; transliterating

.. _design-guide.hex.trans:


Transliterating the alphabet into hexadecimal
=============================================

.. mps:prefix:: guide.hex.trans


Introduction
------------

:mps:tag:`scope` This document explains how to represent the alphabet
as hexadecimal digits.

:mps:tag:`readership` This document is intended for anyone devising
arbitrary constants which may appear in hex-dumps.

:mps:tag:`sources` This transliteration was supplied by Richard
Kistruck [RHSK-1997-04-07]_ based on magic number encodings for object
signatures used by Richard Brooksby [RB-1996-02-12]_, the existence of
which was inspired by the structure marking used in the Multics
operating system [THVV-1995]_.


Transliteration
---------------

:mps:tag:`forward` The chosen transliteration is as follows::

    ABCDEFGHIJKLMNOPQRSTUVWXYZ
    ABCDEF9811C7340BC6520F3812

:mps:tag:`backward` The backwards transliteration is as follows::

    0 OU
    1 IJY
    2 TZ
    3 MW
    4 N
    5 S
    6 R
    7 L
    8 HX
    9 G
    A A
    B BP
    C CKQ
    D D
    E E
    F FV

:mps:tag:`pad` If padding is required (to fill a hex constant length),
you should use 9's, because G is rare and can usually be inferred from
context.

:mps:tag:`punc` There is no formal scheme for spaces or punctuation.
It is suggested that you use 9 (as in :mps:ref:`.pad`).


Justification
-------------

:mps:tag:`letters` The hexadecimal letters (A-F) are all formed by
similarity of sound. B and P sound similar, as do F and V, and C, K,
and Q can all sound similar.

:mps:tag:`numbers` The numbers (0-9) are all formed by similarity of
shape (but see :mps:ref:`.trans.t`). Nevertheless, 1=IJY retains some
similarity of sound.

:mps:tag:`trans.t` T is an exception to :mps:ref:`.numbers`, but is
such a common letter that it deserves it.


Notes
-----

:mps:tag:`change` This transliteration differs from the old
transliteration used for signatures (see design.mps.sig_), as follows:
J:6->1; L:1->7; N:9->4; R:4->6; W:8->3; X:5->8; Y:E->1.

.. _design.mps.sig: sig.html

:mps:tag:`problem.mw` There is a known problem that M and W are both
common, map to the same digit (3), and are hard to distinguish in
context.

:mps:tag:`find.c` It is possible to find all 8-digit hexadecimal
constants, and how many times they are used in C files, using the
following Perl script::

    perl5 -n -e 'BEGIN { %C=(); }
    if(/0x([0-9A-Fa-f]{8})/) {
      $C{$1} = +[] if(!defined($C{$1}));
      push(@{$C{$1}}, $ARGV);
    }
    END {
      foreach $H (sort(keys(%C))) {
        printf "%3d %s %s\n", scalar(@{$C{$H}}), $H, join(", ", @{$C{$H}});
      }
    }' *.c *.h

:mps:tag:`comment` It is a good idea to add a comment to any constant
declaration indicating the English version and which letters were
selected (by capitalisation), e.g.::

    #define SpaceSig ((Sig)0x5195BACE) /* SIGnature SPACE */


References
----------

.. [RB-1996-02-12]
   "Signature magic numbers" (e-mail message);
   `Richard Brooksby`_;
   Harlequin;
   1996-12-02 12:05:30Z.

.. _`Richard Brooksby`: mailto:rb@ravenbrook.com

.. [RHSK-1997-04-07]
   "Alpha-to-Hex v1.0 beta";
   Richard Kistruck;
   Ravenbrook;
   1997-04-07 14:42:02+0100.

.. [THVV-1995]
   "Structure Marking";
   Tom Van Vleck;
   multicians.org_.

.. _multicians.org: http://www.multicians.org/
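
As an illustration of the forward transliteration, the padding rule, and the suggestion for spaces and punctuation described under "Transliteration", the following is a minimal sketch in C. The names ``hexForLetter`` and ``hexTransliterate`` are hypothetical and are not part of the MPS. The sketch assumes an ASCII-compatible character set in which the letters A to Z are contiguous.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Forward table copied from the "forward" paragraph: index 0 holds
 * the digit for 'A' and index 25 the digit for 'Z'.
 * (Hypothetical name, for illustration only.) */
static const char hexForLetter[] = "ABCDEF9811C7340BC6520F3812";

/* Write exactly `digits` hex digits for `word` into `out`, followed
 * by a terminating NUL, so `out` needs room for digits + 1 chars.
 * Letters are looked up in the table regardless of case. Any other
 * character (space, punctuation) becomes '9', as the "punc"
 * paragraph suggests. Words shorter than `digits` are padded with
 * '9' (the "pad" paragraph); longer words are truncated.
 * Assumes 'A'..'Z' are contiguous, as in ASCII. */
static void hexTransliterate(const char *word, char *out, size_t digits)
{
    size_t i = 0;

    assert(word != NULL && out != NULL);

    for (; i < digits && word[i] != '\0'; ++i) {
        int u = toupper((unsigned char)word[i]);
        out[i] = (u >= 'A' && u <= 'Z') ? hexForLetter[u - 'A'] : '9';
    }
    memset(out + i, '9', digits - i);   /* pad the remainder with 9s */
    out[digits] = '\0';
}
```

For instance, calling ``hexTransliterate("SIGSPACE", buf, 8)`` with a nine-character buffer yields ``5195BACE``, the value used for ``SpaceSig`` in the comment example above.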