L J Robertson, ZL2LJR.
Sept 2010
Morse code is the only "two-way, human-readable digital communication", and is increasingly recognised as an 'assistive technology' in addition to its well-publicised capability for 'getting through'. To increase the utility of Morse code as a 'human-machine interface technology', one specific feature is lacking: a facility to allow extensions to Morse. This note proposes that only two additional Morse characters, representing "<" and ">", are sufficient to add very large functionality, with unlimited extensibility, to Morse code. If these symbols can be represented, data transmitted via Morse code can be structured (for code extension, machine interface, data definition, or presentational purposes) using SGML/XML tags. This paper sets out the proposal, with justifications, and presents a specific request for recognition of two additional Morse characters.


Others have pointed out that Morse is a "human-readable digital communication format". I believe that this issue is far more important than is commonly realised.

While codes such as DTMF could be considered as bridges between the analogue and digital worlds, they are quite specialised and of limited usefulness as bridges: it is barely practical for a person to generate and decode DTMF tone-combinations directly. It is true that modes such as HellSchreiber (FeldHell) allow digitally transmitted signals to be read by humans with minimal equipment, but the direct generation and reading of HellSchreiber characters is still not practical for the unaided human. Morse can therefore be considered as the only general-purpose, two-way digital communication method that can be both generated and read directly by human beings.

I would submit then, that it is particularly important to ensure that Morse has provision for the broadest possible application, and has the capability to maximise the 'assistive' technology function for which it is recognised.

Apart from basic punctuation, there is currently no provision for structuring the data transmitted by Morse code, nor for extending the data set. Morse is therefore currently limited to transmitting raw data between human operators.

This submission shows that by adding two 5-element codes to the Morse alphabet, not only is the code itself made infinitely extensible (e.g. it can unambiguously represent any character set), but it also becomes possible for Morse to form a direct two-way bridge between an unassisted human operator and an ICT application.

Extensions and changes to Morse code can be approved by the IARU, which will consider submissions from its member bodies (NZART in New Zealand). I would therefore request that NZART consider this document, and advise whether they would be prepared to pass the attached recommendations to the IARU.


2.1 Lack of alternative human machine interfaces.
At present the "digital world" is accessible to humans ONLY via ICT equipment; automated speech recognition is unreliable and, at best, rarely available. Amateur and other radio users have utilised DTMF and CTCSS approaches to enable limited-capability "one analogue signal = one command" types of control for digital equipment, but these approaches are not general-purpose, nor are they widely standardised, and they are not practically accessible to an unaided human operator.

2.2 Lack of extension facilities in international Morse alphabet.
Extensions to Morse code have been significant administrative exercises, as shown by the recent addition of the "Commat" (@) symbol. A small number of language-sets have been adopted (e.g. Japanese), but these have introduced some ambiguities in the meaning of Morse elements. At present Morse code possesses no inherent capability for extension; additional characters must be associated with new codes or combinations (e.g. the recent "Commat"), introduced on an ad-hoc basis and generally at the cost of increased Morse character length. This process is laborious, and it is difficult to achieve the necessary breadth of acceptance.

If Morse codes for "<" and ">" can be agreed upon, then user-defined schemas can be constructed to allow a small and finite set of Morse codes to be unambiguously reassigned to any language or symbol-set. One could, for example, assign the Morse characters a...z to the pictographic symbols of language_ZZ, and these could be transmitted unambiguously using existing Morse codes via a string such as <language_ZZ> a d e g c t d </language_ZZ>.

Such an approach also allows the present ambiguities between different language-assignations of Morse codes to be avoided. For example, Japanese Morse assigns a Japanese character to the Dit Dah Dit Dit Dit code, which has another assigned usage in International Morse, and assigns another Japanese character to the Dah Dah Dit Dit Dah code, which is used in World Morse.
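The reassignment of a...z to the symbols of language_ZZ described above can be sketched in Python. This is only an illustration: the tag name language_ZZ comes from the text, while the SCHEMAS table, the placeholder symbol names (ZZ-1, ZZ-2, ...) and the function decode_tagged are assumptions made for the example.

```python
import re

# Hypothetical schema: Morse letters a..z reassigned to symbols of "language_ZZ".
# The symbol names are placeholders, not a real character set.
SCHEMAS = {
    "language_ZZ": {
        letter: f"ZZ-{i + 1}"
        for i, letter in enumerate("abcdefghijklmnopqrstuvwxyz")
    },
}

# A start tag, the enclosed text, and the matching end tag.
TAGGED = re.compile(r"<(\w+)>(.*?)</\1>", re.S)

def decode_tagged(text):
    """Replace each tagged span with the symbols its schema assigns."""
    def remap(match):
        schema = SCHEMAS.get(match.group(1))
        if schema is None:
            return match.group(0)  # unknown tag: leave the span untouched
        return " ".join(schema.get(ch, ch) for ch in match.group(2).split())
    return TAGGED.sub(remap, text)

print(decode_tagged("cq <language_ZZ> a d e g c t d </language_ZZ> k"))
# prints: cq ZZ-1 ZZ-4 ZZ-5 ZZ-7 ZZ-3 ZZ-20 ZZ-4 k
```

Text outside the tags keeps its ordinary meaning, so the same 26 letter codes serve both roles without ambiguity.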

2.3 Data structuring - Capability and value
2.3.1 Capacity for extension is valuable and needed
At present, the international Morse alphabet codifies a stream of characters; it contains no capacity for some characters/strings to be given any context, or have any meaning other than that of the raw character presented. A sub-string of Morse characters cannot therefore be distinguished as "an instruction" or as "a message address", or as "a checksum on a message" or "a name in language xxxx".

2.3.2 Capacity for data structuring does not currently exist
Not only does the international Morse alphabet currently lack any facility for structuring data, but all of the symbols for which Morse representations exist are commonly used within 'normal' text. Therefore, to use any of the existing Morse character codes (or combinations of them) to indicate data structure would invite ambiguity and confusion.

2.4 Data structuring requirements
2.4.1 Minimum requirements
If alternative meanings are to be assigned to an arbitrary substring, or meta-data is to be added to such a substring, then a means of identifying the start and end of the defined substring is required.

The issue of minimum requirements for data structuring has been explored elsewhere. If alternative meanings for individual characters are to be considered, a single pre-pended "escape" character (which applies to the single character following the "escape") will allow exactly as many 'structures' as the original number of characters in the code-set. This approach is necessarily limiting; by contrast, if arbitrary-length "tags" can be used, there is no practical limit to the extensibility of the code.

2.4.2 Options for identifying arbitrary-length data strings as "tags"
(a) Select a concatenation of current Morse characters, and assign a special meaning to it. For example, it would be possible to assign "T A G S" and "T A G E" sequences (without the normal inter-character spacing) to represent "Tag Start" and "Tag End". Assuming that the sequence is fairly explicit, this approach is usable and does not require any additional character codes, but unless a simple letter combination is identified, the approach imposes a considerable overhead of transmission and at least a theoretical possibility of ambiguity.
(b) Select a single "Shift" Morse code/sequence, and use this in the same way as the common keyboard "Caps Lock", to toggle between 'normal' and 'shifted' modes. This approach requires only one new character, but relies upon that single character to "toggle" both the start and the end of the defined sequence, which introduces a possibility of ambiguity.
(c) Codify a 'start tag' and an 'end tag' character. This approach requires two characters, but no more. It avoids ambiguity and allows remote testing of “well-formed-ness”. This is the approach used in SGML, XML and the many subsets of these markup languages.
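The remote testing of well-formedness that distinct start and end delimiters make possible can be sketched as follows. This is a minimal illustration, not a full SGML/XML parser: the function name is_well_formed is an assumption, and attributes and self-closing tags are deliberately out of scope.

```python
import re

# A tag is "<name>" or "</name>"; group 1 is "/" for an end tag.
TOKEN = re.compile(r"<(/?)([^<>]*)>")

def is_well_formed(message):
    """Return True if every start tag has a matching, properly nested end tag."""
    stack = []
    pos = 0
    for m in TOKEN.finditer(message):
        gap = message[pos:m.start()]
        if "<" in gap or ">" in gap:
            return False  # stray delimiter outside any tag
        pos = m.end()
        closing, name = m.group(1), m.group(2).strip()
        if not name:
            return False  # empty tag name
        if closing:
            if not stack or stack.pop() != name:
                return False  # end tag does not match the open tag
        else:
            stack.append(name)
    tail = message[pos:]
    if "<" in tail or ">" in tail:
        return False  # unterminated or stray delimiter at the end
    return not stack  # any tag still open means the message is incomplete

print(is_well_formed("<to>zl2ljr</to> hello"))  # True
print(is_well_formed("<to>zl2ljr hello"))       # False: end tag never received
```

A receiver using a single toggle character has no equivalent check: if one toggle is lost, the remaining text is simply read in the wrong mode.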

2.4.3 Compatibility issues
The XML approach to data structuring has become increasingly popular because of the flexibility, extensibility and readability offered. Not only is the XML approach used for structuring data, it is also used as a means of mixing data and scripting language instructions (in languages such as PHP), and for adding presentational features (HTML).
It would be difficult indeed to modify Morse code so that “any valid XML document could be represented in Morse code”, since the W3C XML specification states that the “legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646”.
It is, however, ONLY necessary to have the "<", ">", "/", "(", and ")" characters in order to construct a well-formed XML document; of these, only the "<" and ">" characters are not currently represented in Morse code.
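As a rough check of this point, the sketch below parses a small tagged message with Python's standard xml.etree.ElementTree module and lists which of its characters have no Morse representation today. The message, its tag names (MSG, TO, CMD, SUM) and the variable names are hypothetical; MORSE_TODAY covers only letters, digits, "/", "(" and ")", not the full Morse character set.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: tag names and field values are illustrative only.
received = "<MSG><TO>ZL2LJR</TO><CMD>STATUS</CMD><SUM>5</SUM></MSG>"

# fromstring raises xml.etree.ElementTree.ParseError if the text is not well-formed.
root = ET.fromstring(received)
fields = {child.tag: child.text for child in root}
print(fields)  # {'TO': 'ZL2LJR', 'CMD': 'STATUS', 'SUM': '5'}

# Characters assumed here to have Morse codes already (a subset, for the sketch).
MORSE_TODAY = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789/()")
needed = set(received) - MORSE_TODAY
print(sorted(needed))  # ['<', '>']
```

The structured fields (an address, an instruction, a checksum) are recovered by an ordinary XML parser, and the only characters the message needs beyond existing Morse are the two proposed ones.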


3.1 Selection of approach
I wish to suggest that two Morse characters be designated to represent the "<" and ">" characters.

3.2 Selection of Morse characters
Reviewing the Dichotomic table representation of current international Morse code, it can be seen that there are a small number of unassigned 5-element codes, and a larger number of unassigned 6-element combinations.

A review of the usage of 5-element characters for Morse code representation of non-English languages is shown in the following table.

From this table it is clear that there are a number of 5-element characters that are unused except for the Japanese character set.

It would be possible to go to a 6-element character, however two important points are noted: firstly there are already many cases in which 1-to-5-element characters are assigned different meanings in different language-sets, and secondly International Morse already has a “Pro-Sign” (Dah Dit Dit Dah Dah Dah) for “change to Wabun code”, and Japanese Morse code already contains a character (Dit Dit Dit Dah Dit) that signifies “change to world standard Morse” - so there is no need for ambiguity over whether Japanese or non-Japanese character-sets are in use.

For these two reasons it is proposed that two of the presently unassigned 5-element characters be assigned to the characters "<" and ">".

Unassigned 5-element chars  International  World_std  Russian  Greek  Hebrew  Arabic  Japanese  Korean  US Navy
* * - * -                   No             No         No       No     No      No      Yes       No
* - * * *                   Yes            No         No       No     No      No      Yes       No      No
* - * - -                   No             No         No       No     No      No      Yes       No      No
- * * - -                   No             No         No       No     No      No      Yes       No      No
- * - * -                   Yes            No         No       No     No      No      Yes       No      No
- * - - -                   No             No         No       No     No      No      Yes       No      No
- - * * -                   No             Yes        No       No     No      No      Yes       No      No
- - - * -                   No             No         No       No     No      No      Yes       No      No
Note: International Morse code designates "Dah Dit Dah Dit Dah" as the "Start signal", and "Dit Dah Dit Dit Dit" as the "Wait" signal.

It is desirable to have characters that are similar, so it is proposed that the pair "Dit Dah Dit Dah Dah" and "Dah Dit Dit Dah Dah" be selected to represent the proposed “<” and “>” characters.
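A minimal encoder using the proposed pair is sketched below. The letter, digit and "/" codes are standard International Morse; the names MORSE, PROPOSED, TABLE and encode are illustrative, and the collision check only covers the subset of codes listed in the sketch.

```python
# Subset of International Morse: letters, digits and "/".
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
    "/": "-..-.",
}

# The two codes proposed in this section.
PROPOSED = {"<": ".-.--", ">": "-..--"}

# The proposed codes must not collide with any code already in the subset.
assert not set(PROPOSED.values()) & set(MORSE.values())

TABLE = {**MORSE, **PROPOSED}

def encode(text):
    """Encode text: one space between characters, three between words."""
    words = text.upper().split()
    return "   ".join(" ".join(TABLE[ch] for ch in word) for word in words)

print(encode("<to>ZL2LJR</to>"))
```

The two proposed codes differ only in their first two elements, which is the similarity the selection above aims for.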


4.1 What is achieved?
Morse's capability as a “human readable and writable digital code” is considerably enhanced. A functional method is provided for 'bridging' between an unassisted human operator, and ICT applications.

Morse’s capability as an 'assistive technology' is greatly enhanced.

By designating codes to enable an extensible mark-up capability, arbitrary further extension of Morse code is provided for, without requiring any additional Morse characters.

4.2 Are key features of Morse retained?
The essential qualities of International Morse (the ability to be created and read directly by a person, small bandwidth, good readability under poor transmission conditions, etc.) are retained.

Unlimited capability for character representation is added without ever needing to resort to Morse characters longer than 5 elements.

4.3 Recommendation.
That the International Morse Code be extended by designating "Dit Dah Dit Dah Dah" as the character "<", and "Dah Dit Dit Dah Dah" as the character ">".


General – Morse code. Seen online in March 2009.

Thomas Wayne King, Modern Morse Code in Rehabilitation and Education: New Applications in Assistive Technology (paperback). Seen online in March 2009.

Morse as "Human readable digital code". Seen online in March 2009.

XML data model. Seen online in March 2009. A more descriptive treatment was also seen online in March 2009.

Copyright (C) Lindsay J Robertson, 2011