1. What Is a Control Character?
In computing, a control character or non-printing character is a code point (a number) in a character set that does not represent a written symbol. Control characters are used to mark where a sequence ends, to signal the end of a line of text (EOL), to signal the receipt of a message, and more. They have no graphical representation, but each has a symbolic name, such as NUL for null or SOH for start of heading.
In ASCII, the control characters are standardized. They occupy code points 0 to 31 and 127. They were introduced to help interpret data correctly, control peripheral devices, and manage data transmission. Their functionality is essential in maintaining the smooth flow of data and enabling effective communication between devices.
2. The Historical Journey of ASCII Control Characters
The earliest seeds of ASCII control characters can be traced back to Baudot code, a character set used in telegraphy as early as the late 19th century. The Baudot code introduced control characters like NUL and DEL. NUL was used as a filler character and DEL was used to indicate an error or erase the previous character.
The importance of control characters was further emphasized in the Murray code, a variant of Baudot code developed by Donald Murray in the early 20th century. Murray introduced the concepts of the Carriage Return (CR) and Line Feed (LF), which were control functions needed for the operation of his newly invented typewriter-like machines. These characters allowed precise control over the position of the print head, marking a significant step forward in the usability of automated typing systems.
Building on these foundations, the creators of ASCII included a full range of control characters to manage data flow and control peripherals. Early teletype machines, for example, used CR and LF for moving the print head and advancing the paper. Other control characters like SOH, STX, and EOT were used to delimit messages and signal the end of transmissions. ACK and NAK provided a mechanism for simple error detection and recovery.
Over time, as technology advanced and communication protocols evolved, some ASCII control characters became less commonly used or were replaced by more complex mechanisms. However, many still play essential roles in modern computing and communication systems. ASCII control characters serve as a testament to the early history of computing, the innovation of pioneers in the field, and the enduring principles that underpin digital communication.
3. Grouping of ASCII Control Characters
The ASCII control characters can be grouped into six categories, each with unique functionalities:
- Communication control characters: This group primarily regulates and supervises data flow during transmission. They ensure that the information exchange between devices or systems is accurate and systematic, providing crucial functionalities such as marking the beginning and end of a message or data block, soliciting and acknowledging responses, and managing special transmission instructions.
- Formatting characters: These characters are fundamental for the presentation of text, whether it's on a screen or a printout. They contribute to maintaining the visual structure and layout of the text, governing actions like moving to a new line, tabulation, returning to the start of the line, and controlling space between characters.
- Device control characters: These characters are designed to manage peripheral devices directly. Although less frequently used in contemporary systems due to the advent of higher-level device control protocols, they can be crucial in some specialized or legacy environments where direct device control is required.
- Information separators: These characters facilitate the organization of data by marking boundaries within a data stream. They function as delimiters to separate and group pieces of information, making the data more manageable and readable, especially in complex data structures.
- Error control characters: This group of characters plays an instrumental role in error detection and management in data communication. They are designed to signal error conditions, cancel erroneous transmissions, and manage substitutions for incorrect or invalid data.
- Special characters: This group encompasses characters with unique purposes, including padding data, initiating special sequences, or erasing data. These characters often serve as useful tools in handling exceptional or specific situations in data processing or communication.
4. The ASCII Control Characters
The ASCII control characters form an integral part of the ASCII character set and are used primarily to control peripherals and manage data streams. These non-printing characters belong to the original ASCII character set and occupy code points 0 to 31 and 127. Their significance lies not in what they display but in the control functions they perform, which include formatting text, controlling devices, signaling error conditions, and delimiting data structures. As described above, they fall into six categories: communication control characters, formatting characters, device control characters, information separators, error control characters, and special characters. The rest of this section describes each control character and the group it belongs to.
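As a rough illustration of the range and grouping described above, the following Python sketch maps a code point to its mnemonic and to the category this article places it in. The names `MNEMONICS`, `GROUPS`, `is_control`, and `describe` are made up for this example, and other references may group the characters differently.

```python
# Mnemonics for ASCII 0-31, in code-point order. DEL (127) is handled separately.
MNEMONICS = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]

# Grouping as used in this article (an editorial choice, not part of the standard).
GROUPS = {
    "communication control": {1, 2, 3, 4, 5, 6, 16, 21, 22, 23, 25},
    "formatting": {8, 9, 10, 11, 12, 13},
    "device control": {14, 15, 17, 18, 19, 20},
    "information separator": {28, 29, 30, 31},
    "error control": {7, 24, 26},
    "special": {0, 27, 127},
}


def is_control(code: int) -> bool:
    """True for the ASCII control range: 0-31 and 127."""
    return 0 <= code <= 31 or code == 127


def describe(code: int) -> tuple[str, str]:
    """Return (mnemonic, group) for an ASCII control code point."""
    if not is_control(code):
        raise ValueError(f"{code} is not an ASCII control character")
    name = "DEL" if code == 127 else MNEMONICS[code]
    group = next(g for g, codes in GROUPS.items() if code in codes)
    return name, group
```

For example, `describe(ord("\n"))` yields the mnemonic LF and the formatting group, while `is_control(ord("A"))` is false.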
Communication control characters
- Start of Heading (SOH) - ASCII 1: This character marks the start of a message heading. It's primarily used in communication protocols to signal the beginning of a metadata header in a message block.
- Start of Text (STX) - ASCII 2: STX signals the start of the actual data content in a message block, following the header. Its primary purpose is to mark the boundary between metadata and data.
- End of Text (ETX) - ASCII 3: This character marks the end of a block of text within a message, signaling that the data following it should be treated differently or ignored by the receiving system.
- End of Transmission (EOT) - ASCII 4: EOT is traditionally used to signify the end of a transmission, implying that no more data will follow. It is useful in many communication protocols to ensure that the receiving end knows when to stop waiting for more data.
- Enquiry (ENQ) - ASCII 5: This character is sent by a host system to solicit a response from a remote station, such as its status or other specific information.
- Acknowledge (ACK) - ASCII 6: ACK is sent by a receiver to acknowledge that it has successfully received a transmission. This forms a critical part of many communication protocols to ensure reliable data transmission.
- Data Link Escape (DLE) - ASCII 16: This character signals that the following characters are part of a control sequence, rather than normal text.
- Negative Acknowledge (NAK) - ASCII 21: NAK signals that a transmission error has been detected, usually prompting a retransmission.
- Synchronous Idle (SYN) - ASCII 22: SYN is used to maintain synchronization between sender and receiver, even when no data is being sent.
- End of Transmission Block (ETB) - ASCII 23: This character signals the end of a block of data in systems that transmit data in block mode.
- End of Medium (EM) - ASCII 25: EM marks the end of the used portion of a data storage medium, or the end of the used area of a data block.
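To make the delimiting roles of SOH, STX, ETX, and EOT concrete, here is a minimal Python sketch of a message frame. The layout (SOH, header, STX, body, ETX, EOT) and the function names `frame` and `unframe` are assumptions made for illustration and do not reproduce any specific real protocol. Real protocols typically use DLE-based escaping to carry delimiter bytes in the payload; this sketch simply rejects such payloads.

```python
SOH, STX, ETX, EOT = b"\x01", b"\x02", b"\x03", b"\x04"


def frame(header: bytes, body: bytes) -> bytes:
    """Wrap a header and body in a hypothetical SOH/STX/ETX/EOT frame."""
    for part in (header, body):
        # No escaping in this sketch: delimiter bytes in the payload are rejected.
        if any(delim in part for delim in (SOH, STX, ETX, EOT)):
            raise ValueError("payload contains a framing character")
    return SOH + header + STX + body + ETX + EOT


def unframe(message: bytes) -> tuple[bytes, bytes]:
    """Split a frame produced by frame() back into (header, body)."""
    if not (message.startswith(SOH) and message.endswith(ETX + EOT)):
        raise ValueError("not a framed message")
    # Drop the leading SOH and the trailing ETX + EOT, then split at STX.
    header, sep, body = message[1:-2].partition(STX)
    if not sep:
        raise ValueError("missing STX")
    return header, body
```

A receiver that gets a malformed frame could reply with NAK (0x15) to request retransmission, and with ACK (0x06) otherwise.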
Formatting characters
- Backspace (BS) - ASCII 8: This character moves the cursor one position to the left, so that the last entered character can be deleted or overwritten with a new one.
- Horizontal Tab (HT) - ASCII 9: HT moves the cursor right to the next tab stop (commonly every 8 columns), which helps align text in columns.
- Line Feed (LF) - ASCII 10: LF moves the cursor down to the next line. On Unix-like systems, LF alone serves as the newline character that breaks lines of text.
- Vertical Tab (VT) - ASCII 11: VT works similarly to HT, but moves the cursor vertically to the next preset line instead.
- Form Feed (FF) - ASCII 12: FF causes the printer to eject the current page and continue printing at the top of a new one. On screen displays, it often clears the screen.
- Carriage Return (CR) - ASCII 13: CR moves the cursor to the beginning of the current line, without advancing to the next line. On some systems (such as Windows), it is paired with LF to mark the end of a line.
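The different line-ending conventions built from CR and LF are a common source of trouble when text moves between systems. The sketch below, with the hypothetical helper name `normalize_newlines`, converts CR LF pairs and lone CRs to a single LF, and uses Python's built-in `str.expandtabs` to show how HT advances to the next tab stop.

```python
def normalize_newlines(text: str) -> str:
    """Convert CR LF pairs and lone CR characters to a single LF."""
    # Replace the CR LF pair first so it does not become two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def show_tab_stops(text: str, width: int = 8) -> str:
    """Replace each HT with the spaces needed to reach the next tab stop."""
    return text.expandtabs(width)
```

With the default width of 8, `show_tab_stops("a\tb")` pads with seven spaces so that `b` starts at column 8 (counting from 0).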
Device control characters
- Device Control One (DC1, XON) - ASCII 17, Device Control Two (DC2) - ASCII 18, Device Control Three (DC3, XOFF) - ASCII 19, Device Control Four (DC4) - ASCII 20: These characters were designed for controlling devices like printers and magnetic tape machines. DC1 and DC3, also known as XON and XOFF, are still used in software flow control.
- Shift Out (SO) - ASCII 14, Shift In (SI) - ASCII 15: These characters are used to switch between different character sets, allowing the use of more symbols than would otherwise be available.
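Software flow control with XON and XOFF can be sketched as follows: the receiver sends XOFF (DC3) when it cannot accept more data and XON (DC1) when it is ready again. The `Sender` class below is a simplified, hypothetical model that runs in memory. It is not the API of any serial library.

```python
XON, XOFF = 0x11, 0x13  # DC1 and DC3


class Sender:
    """Hypothetical sender that honors XON/XOFF bytes from the receiver."""

    def __init__(self) -> None:
        self.paused = False

    def on_receive(self, byte: int) -> None:
        """Update the pause state from a byte received on the return channel."""
        if byte == XOFF:
            self.paused = True
        elif byte == XON:
            self.paused = False

    def send(self, data: bytes, line: bytearray) -> int:
        """Append data to the line unless paused. Return the bytes sent."""
        if self.paused:
            return 0
        line.extend(data)
        return len(data)
```

In this model, a caller that gets 0 back from `send` would keep the data and retry after the next XON arrives.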
Information separators
- File Separator (FS) - ASCII 28, Group Separator (GS) - ASCII 29, Record Separator (RS) - ASCII 30, Unit Separator (US) - ASCII 31: These characters are used to structure and organize data within a message or a file, by marking boundaries and hierarchies among data elements.
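As an example of the separators in use, the sketch below stores a small table as text, with US between fields and RS between records. Because these characters rarely occur in ordinary text, fields may contain commas or quotes without any escaping. The function names `encode` and `decode` are illustrative only, and the sketch assumes the fields themselves contain no RS or US characters.

```python
RS, US = "\x1e", "\x1f"  # Record Separator, Unit Separator


def encode(records: list[list[str]]) -> str:
    """Join fields with US and records with RS."""
    return RS.join(US.join(fields) for fields in records)


def decode(data: str) -> list[list[str]]:
    """Inverse of encode(). An empty string decodes to one empty field."""
    return [record.split(US) for record in data.split(RS)]
```

GS and FS could be layered on top in the same way to delimit groups of records and whole files.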
Error control characters
- Bell, Alert (BEL) - ASCII 7: When this character is processed, it triggers an alert, usually in the form of a bell or beep sound. It's used in interfaces to attract user attention.
- Cancel (CAN) - ASCII 24: This character is used to signal that the current operation should be cancelled or aborted.
- Substitute (SUB) - ASCII 26: This character is used to replace a character that has been found to be invalid or in error.
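One way to picture the role of SUB is a filter that swaps every byte it cannot represent for SUB instead of dropping it, so the receiver can see that something was lost. The helper below is a hypothetical sketch that treats any byte outside the 7-bit ASCII range as invalid.

```python
SUB = 0x1A


def ascii_only(data: bytes) -> bytes:
    """Replace every byte outside the 7-bit ASCII range with SUB."""
    return bytes(b if b < 0x80 else SUB for b in data)
```

Since the replacement is one byte for one byte, the output keeps the length of the input.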
Special characters
- Null (NUL) - ASCII 0: The Null character was initially used to fill or pad unused or meaningless data spaces. This padding ensured data blocks remained at a fixed size for standardized processing. NUL also marks the end of a string in some programming languages, such as C.
- Escape (ESC) - ASCII 27: This character starts an escape sequence, which allows a set of additional control functions to be accessed.
- Delete (DEL) - ASCII 127: This character was historically used to erase a character on a paper tape. In modern systems, it's often used to delete the character at the current cursor position in text fields.
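Two of these characters are easy to demonstrate in a few lines of Python. The first function reads a NUL-terminated string from a byte buffer, as C string functions do. The second wraps text in an escape sequence that starts with ESC. The sequences used here (ESC [ 1 m for bold and ESC [ 0 m for reset) come from the ANSI/ECMA-48 terminal conventions, and the sketch assumes a terminal that supports them. Both function names are made up for this example.

```python
ESC = "\x1b"


def c_string(buffer: bytes) -> bytes:
    """Return the bytes before the first NUL, or the whole buffer if none."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


def bold(text: str) -> str:
    """Wrap text in the escape sequences for bold on and attribute reset."""
    return f"{ESC}[1m{text}{ESC}[0m"
```

Printing `bold("warning")` on a compatible terminal shows the word in bold. On other displays the raw escape sequences may appear instead.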
5. Conclusion
In conclusion, the ASCII control characters are an essential component of the ASCII encoding standard, laying the foundation for precise device control and intricate data stream management. As we have seen, these characters have a rich history, reflecting the needs and constraints of early computing and telecommunications systems. Despite technological advancements and the evolution of communication protocols, the roles and functions of these characters have remained relevant, underscoring their significance.
ASCII control characters form an essential aspect of the ASCII encoding standard, enabling device control and stream manipulation. Understanding these characters can pave the way to better comprehend computer communication systems, thereby facilitating more effective and efficient utilization of these systems.