ASCII characters with codes 0 to 31 are control characters (as is DEL, code 127). When sent to a terminal, they trigger special actions instead of being displayed. For instance, \a (BEL, 0x7) rings the terminal's bell, \b (BS, 0x8) moves the cursor one column backward, \n (LF, 0xa) moves the cursor one row down, \t (TAB, 0x9) moves the cursor to the next tab stop, and \r (CR, 0xd) moves the cursor to the first column.
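As a quick check of the codes listed above, here is a minimal sketch that dumps the bytes produced by those escape sequences. The hex_bytes helper is a name made up for this illustration; it only assumes a POSIX shell and od.

```shell
# hex_bytes (hypothetical helper): print the bytes read from stdin as
# space-separated hexadecimal values on a single line.
hex_bytes() {
  # od -An -v -tx1 dumps every byte in hex without the offset column;
  # the unquoted expansion re-splits od's output on whitespace, so the
  # padding differences between od implementations do not matter.
  set -- $(od -An -v -tx1)
  echo "$*"
}

printf '\a\b\t\n\r' | hex_bytes   # prints: 07 08 09 0a 0d
```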
When you run at a shell prompt in a terminal:
printf 'foo\nbar\n'
printf writes foo\nbar\n to /dev/tty<something>, and the tty line discipline of that device translates it to foo\r\nbar\r\n, which is why you see bar at the start of the next line after foo.
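To see that the CR really comes from the tty and not from printf, you can send the same output to a pipe, where no line discipline is involved. This is only a sketch; count_cr is a hypothetical helper name.

```shell
# count_cr (hypothetical helper): count the CR (0xd) bytes on stdin.
count_cr() {
  # tr -dc '\r' deletes everything except CR; wc -c counts what is left;
  # the last tr strips the padding some wc implementations add.
  tr -dc '\r' | wc -c | tr -d ' '
}

printf 'foo\nbar\n' | count_cr   # prints: 0 (printf only wrote LF)
printf 'foo\rbar\n' | count_cr   # prints: 1

# In a terminal, the LF to CR LF translation is usually governed by the
# onlcr output setting, which "stty -a" should list.
```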
printf 'foo\rbar\n'
That would have the terminal overwrite foo with bar, since the CR moves the cursor back to the first column of the same row.
If your file contains control characters, you could either remove them, or give them a textual representation (for instance ^M or \r for the CR 0xd character) if you want to check for their presence.
You may not want to do that for the LF and TAB characters though. So:
LC_ALL=C tr -d '\0-\10\13-\37\177' < file # to remove them
cat -v < file # to display as ^M
sed -n l < file # to display as \r (also converts TAB to \t)
# and marks the end of lines with $
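For illustration, here is what those three commands give on a small input containing a CR. The sample function is a stand-in invented for this example, used in place of a real file.

```shell
# sample (hypothetical): emits "foo", a CR, "bar" and a newline.
sample() {
  printf 'foo\rbar\n'
}

sample | LC_ALL=C tr -d '\0-\10\13-\37\177'   # prints: foobar
sample | cat -v                               # prints: foo^Mbar
sample | sed -n l                             # prints: foo\rbar$
```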
Note that the sed and cat commands above would also transform non-ASCII characters. You could instead do:
LC_ALL=C sed "$(printf 's/[^\t -\176\200-\377]/^&/g')" < file |
LC_ALL=C tr '\0-\10\13-\37\177' '@-HK-_?'
This converts only the ASCII control characters (except TAB and LF) to their ^X visual form. Note, though, that not all sed implementations support input containing NUL characters.
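As a sketch of how that pipeline behaves, it can be wrapped in a function that reads standard input (show_ctrl is a hypothetical name) and fed a line that mixes UTF-8 text, a CR and an ESC sequence. The sample input is made up for the example.

```shell
# show_ctrl (hypothetical wrapper): same sed + tr pipeline as above,
# reading from stdin instead of a file.
show_ctrl() {
  # sed prefixes each control byte (other than TAB; LF never appears
  # inside a line) with ^, then tr maps the control byte itself to the
  # character 64 above it (DEL becomes ?).
  LC_ALL=C sed "$(printf 's/[^\t -\176\200-\377]/^&/g')" |
    LC_ALL=C tr '\0-\10\13-\37\177' '@-HK-_?'
}

# \303\251 is the UTF-8 encoding of é; \033 is ESC.
printf 'caf\303\251\rok\033[m\n' | show_ctrl
# prints (on a UTF-8 terminal): café^Mok^[[m
```

The bytes of the é pass through unchanged because they fall in the \200-\377 range that the sed expression excludes.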