Cover V13, i10
oct2004.tar

Trapping Special Characters in the Korn Shell

Ed Schaefer and John Spurgeon

Have you ever needed to know, from within a shell script, whether a user pressed the up arrow key or some other key that produces unprintable characters? The Korn shell provides no command for detecting whether a user has pressed a special key (an arrow key, a function key, or a control key sequence). With a little programming, and by setting certain terminal driver options with the stty command, you can detect these keys. In this column, we'll:

  • Discuss special key composition and how to use the stty command in a shell script to break down the composition.
  • Present shell function GetKey, which determines whether a special key has been pressed (the unprintable control characters are embedded in the script).
  • Present shell function NewGetKey (a rewrite of GetKey), which does not use embedded, unprintable control characters, thus improving script portability.
  • Conclude by presenting a "C" utility -- keycode -- that, with the press of a key, displays the key and its value in decimal, octal, and hexadecimal.

Special Keys Composition

A control key combination such as CTRL-p is self-explanatory -- while holding down the control key, simply press the "p" key, and the terminal sends a single control character. The arrow keys and the function keys are more complex: each sends an escape sequence. When an arrow or function key is pressed, the shell script actually receives a sequence of three characters.

For the arrow keys:

1. The ESC (escape) character
2. The left bracket ([) character
3. A letter that depends on which arrow key is pressed:

    a. The up key produces the letter A.
    b. The down key produces the letter B.
    c. The right key produces the letter C.
    d. The left key produces the letter D.

For the four function keys:

1. The ESC (escape) character
2. The upper-case letter O
3. A letter that depends on which function key is pressed:

    a. The F1 function key produces the letter P.
    b. The F2 function key produces the letter Q.
    c. The F3 function key produces the letter R.
    d. The F4 function key produces the letter S.
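
To make the composition of these sequences concrete, here is a minimal sketch that prints the octal value of each character in a string. The function name show_bytes is hypothetical and does not appear in the listings; the sketch assumes a POSIX printf and od, and it builds the sequences with printf rather than reading them from the keyboard.

```shell
# Hypothetical helper (not from the article's listings): print the octal
# value of every byte in the string passed as $1, separated by spaces.
show_bytes() {
    # Word splitting on od's output is intentional: it discards od's padding.
    set -- $(printf '%s' "$1" | od -An -t o1)
    echo "$*"
}

show_bytes "$(printf '\033[A')"   # up arrow: prints 033 133 101
show_bytes "$(printf '\033OP')"   # F1 key:   prints 033 117 120
```

The first value in each line, 033, is the ESC character; the remaining two are the printable characters listed above.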

The terminal driver, which is part of the Unix kernel, controls input between the terminal and a shell script. By default, the driver operates in canonical (line-based) mode. That is, input is collected, and nothing is delivered to the script until the user presses the carriage return. To parse the arrow and function keys, we must use the stty command to set the driver to "raw" mode so that we can inspect each sequence one character at a time.

Using the stty and dd Commands

On our Solaris 7 system, the following command:

stty -icanon -echo min 1 time 0 -isig

turns off canonical mode, turns off echoing, makes a read return as soon as one character is available (min 1), sets no timeout between characters (time 0), and turns off the special meaning of the INTR, QUIT, SWTCH, and SUSP characters.

Once raw mode is on, the dd (convert and copy a file) command retrieves one character from standard input:

readchar=$(dd bs=1 count=1 2>/dev/null)

This is a common method for returning one character to the shell. (See "Returning a Single Character in a Unix Shell Script": http://www.samag.com/documents/s=1205/sam9704f/9704f.htm).
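
The two commands can be combined along the following lines. This is only a sketch: the names read_char and get_one_key are hypothetical, and the check for a terminal on standard input is our assumption, added so that the function also works when input is piped.

```shell
# Read exactly one byte from standard input and write it to standard output.
read_char() {
    dd bs=1 count=1 2>/dev/null
}

# Hypothetical wrapper: get_one_key is an illustrative name.
get_one_key() {
    if [ -t 0 ]; then
        # Interactive: save the settings, switch modes, read, then restore.
        saved=$(stty -g)
        stty -icanon -echo min 1 time 0 -isig
        key=$(read_char)
        stty "$saved"
    else
        # Piped or redirected input needs no terminal setup.
        key=$(read_char)
    fi
    printf '%s\n' "$key"
}

printf 'xyz' | get_one_key    # prints x
```

Note that command substitution strips trailing newlines, so a newline read this way comes back as an empty string.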

The GetKey Function

Using the stty/dd command combination, the GetKey shell function (Listing 1) returns a string identifying the special character. Pressing any normal key returns just that key's value.

The original GetKey function is from Heiner Steven's collection of Unix scripts located at Shelldorado.com. Heiner's script supported only the arrow keys; we added the control keys and the function keys.

Consider the high points of this script:

1. GetKey saves the original terminal settings with this command:

typeset oldstty="$(stty -g)"

and restores the original settings with this command:

stty $oldstty

2. When GetKey identifies a special character, it returns a character string representing that character (e.g., the TAB key returns the string "TAB"). To keep the script short, only the arrow keys, function keys, CTRL-p, CTRL-d, TAB, and DELETE key are supported. The other control keys can easily be added to the script.

3. When the ESC character is identified, the script assumes that either an arrow key or a function key was pressed, and reads the next two characters. Consequently, the script can't detect a lone escape key, and the user can emulate entering an arrow or function key by pressing the correct three characters.

With GetKey in raw mode, the second character will be a "[" for an arrow key or an "O" for a function key. Finally, GetKey obtains the third character and returns the proper arrow or function key string.

4. Note that the control characters and the escape character appear in the script as the actual (literal) characters. In the vi editor, embed a literal control character by pressing CTRL-v and then pressing the required key.
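
The flow described in point 3 can be sketched as follows. This is not Listing 1: the function name get_key_sketch and the returned strings (UP, F1, and so on) are assumptions, the terminal setup is omitted so that the sketch can be driven from a pipe, and the control characters are built with printf instead of being embedded literally.

```shell
ESC=$(printf '\033')    # assumption: generated rather than embedded literally
TAB=$(printf '\011')

# Read exactly one byte from standard input.
read_char() { dd bs=1 count=1 2>/dev/null; }

# Hypothetical sketch of the decision flow; key names are illustrative.
get_key_sketch() {
    first=$(read_char)
    case "$first" in
        "$ESC")
            # Assume an arrow or function key: consume two more characters.
            second=$(read_char)
            third=$(read_char)
            case "$second$third" in
                "[A") echo UP ;;
                "[B") echo DOWN ;;
                "[C") echo RIGHT ;;
                "[D") echo LEFT ;;
                OP) echo F1 ;;
                OQ) echo F2 ;;
                OR) echo F3 ;;
                OS) echo F4 ;;
                *) echo UNKNOWN ;;
            esac ;;
        "$TAB") echo TAB ;;
        *) printf '%s\n' "$first" ;;    # a normal key returns itself
    esac
}

printf '\033[A' | get_key_sketch    # prints UP
printf 'q' | get_key_sketch         # prints q
```

Because two more characters are always consumed after ESC, the sketch shares the limitation noted above: a lone escape key cannot be detected.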

The NewGetKey Function

Embedding the actual control characters in a script is problematic. As long as we use the scripts in the tarball, there's no problem. However, unprintable characters are hard to see when the script is read as text in an article, which may lead to confusion, and cutting and pasting the text may cause portability problems. We need a different method to represent control characters.

Each character in the character set can be identified by its octal value. An ASCII table shows that the escape character has the octal value 033. Perhaps within the script, we could change the case statement to this:

   case "$readchar" in
       '\033') # escape sequence
       .
       .

Unfortunately, the shell doesn't interpret an octal escape in a case pattern, but external Unix tools such as tr do. The following statement is true if the "myvar" variable contains the escape character:

if [[ -z $(echo "$myvar"|tr -d '\033') ]]
then
   echo "myvar is the escape key"
fi

The variable is echoed to the tr (translate) command, which deletes the escape character. If the variable contains just the escape character, nothing remains, and the null check (-z) returns true.
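
The same test can be wrapped in a small function. In this sketch, the name is_char is hypothetical, printf replaces echo so that the shell's echo does not alter the data, and the extra non-empty check is our addition: without it, an empty variable would also pass the -z test.

```shell
# Hypothetical helper: succeed when $1 is non-empty and consists only of the
# character whose octal escape is given in $2 (for example '\033').
is_char() {
    [ -n "$1" ] && [ -z "$(printf '%s' "$1" | tr -d "$2")" ]
}

myvar=$(printf '\033')
if is_char "$myvar" '\033'
then
   echo "myvar is the escape key"
fi
```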

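The double quotes surrounding the "myvar" variable are important. Without them, this statement cannot identify a TAB reliably, because it treats every whitespace character alike:

if [[ -z $(echo $myvar|tr -d '\011') ]]
then # without the quotes, this is also true when myvar is a space
   echo "myvar is a TAB key"
fi

The TAB character (CTRL-i) is whitespace, as are the space and the newline. Since the shell uses whitespace as an argument delimiter, an unquoted myvar that holds only whitespace disappears during word splitting and is never passed to the echo command as an argument. The tr command receives nothing but echo's newline, so the test succeeds no matter which whitespace character myvar contains.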
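
The effect of the quotes can be seen by counting the arguments that a command receives. This short sketch assumes the default IFS setting; count_args is a hypothetical helper.

```shell
# Hypothetical helper: report how many arguments were passed.
count_args() {
    echo $#
}

myvar=$(printf '\011')    # a single TAB character

count_args $myvar         # unquoted: the TAB is split away, prints 0
count_args "$myvar"       # quoted: the TAB survives as one argument, prints 1
```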

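In addition to tr, the sed command may also be used. Here, printf generates the TAB character, because not every shell's echo interprets octal escapes:

if [[ -z $(echo "$myvar"|sed 's/'"$(printf '\011')"'//') ]]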
then
   echo "myvar is a TAB key"
fi

The NewGetKey function (Listing 2), a rewrite of GetKey, eliminates the need for the embedded control keys. Using the tr command as previously described, the special_char_str function evaluates the passed argument and, if it is a special character, returns a string identifying that character.
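
A function of this kind might look like the sketch below. It is not the code from Listing 2: the table of octal values and names, the returned strings, and the handling of an empty argument are all assumptions made for illustration.

```shell
# Sketch only (not Listing 2): map a single control character to a name by
# deleting each candidate with tr and checking whether anything is left.
special_char_str() {
    # Assumption: an empty argument is passed back unchanged.
    if [ -z "$1" ]; then
        echo
        return
    fi
    for entry in '\033:ESC' '\011:TAB' '\020:CTRL-P' '\004:CTRL-D' '\177:DELETE'
    do
        octal=${entry%%:*}    # text before the colon, e.g. \033
        name=${entry#*:}      # text after the colon, e.g. ESC
        if [ -z "$(printf '%s' "$1" | tr -d "$octal")" ]; then
            echo "$name"
            return
        fi
    done
    printf '%s\n' "$1"        # not special: pass the character back as-is
}

special_char_str "$(printf '\011')"    # prints TAB
special_char_str x                     # prints x
```

Keeping the octal values in one table means that supporting another control key takes a single new entry.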

The Keycode "C" Utility

The keycode "C" utility (Listing 3) is a handy replacement for an ASCII table. Pressing a key displays the key and its value in decimal, octal, and hexadecimal. For example, execute the program and press the "d" key:

Enter keys to display in dec, octal, and hex.  press control-d to quit
keycode>       d     100 0144  0x64 

Much as the stty command does for a shell script, the "C" ioctl function places the terminal driver in raw mode. This example shows the output when the up arrow key is pressed:

keycode> <ESC>        27 033   0x1b
keycode>       [      91 0133  0x5b
keycode>       A      65 0101  0x41
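
For readers without a compiler at hand, the conversion that keycode performs can be approximated in the shell. This sketch is not Listing 3 and does not read the keyboard; the function name keycode_line is hypothetical, and it relies on the POSIX printf convention that an argument beginning with a single quote yields the numeric value of the character that follows.

```shell
# Hypothetical shell approximation of keycode's conversion (not Listing 3):
# print the decimal, octal, and hexadecimal value of the character in $1.
keycode_line() {
    code=$(printf '%d' "'$1")    # leading quote: numeric value of the character
    printf '%d 0%o 0x%x\n' "$code" "$code" "$code"
}

keycode_line d                      # prints 100 0144 0x64
keycode_line "$(printf '\033')"     # prints 27 033 0x1b
```
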

Conclusion

This column has shown how to control the terminal driver and interpret special keys, such as the arrow keys, that produce unprintable characters. The crux of this topic is setting terminal options with the stty command. As far as portability is concerned, stty is one of the more persnickety commands among the Unix variants.

In addition to Solaris 7, these two scripts and one "C" utility (Listing 3) also execute under the Bash shell using Red Hat Linux 7.1. Please let us know if you have problems executing them on any other systems.

References

Schaefer, Ed. "Returning a Single Character in a Unix Shell Script", Sys Admin, April 1997: http://www.samag.com/documents/s=1205/sam9704f/9704f.htm

Steven, Heiner. Heiner's Shelldorado -- Unix shell scripting resource: http://www.shelldorado.com.

Bolsky, Morris, and David Korn. The New KornShell Command and Programming Language, 1995. Upper Saddle River, NJ: Prentice Hall PTR.

Ed Schaefer is a frequent contributor to Sys Admin. He is a software developer and DBA for Intel's Factory Integrated Information Systems, FIIS, in Aloha, Oregon. Ed also hosts the monthly Shell Corner column on UnixReview.com. He can be reached at: shellcorner@comcast.net.

John Spurgeon is a software developer and systems administrator for Intel's Factory Integrated Information Systems, FIIS, in Aloha, Oregon. Outside of work, he enjoys turfgrass management, triathlons, and spending time with his family.