Unit separators
Unit separators are part of the original ASCII control codes. They go largely unused in practice, even though Unix treats everything as a text data stream (the universal format).
| Type | Escape Code | Level |
|---|---|---|
| File | `\x1C` | 4 |
| Group | `\x1D` | 3 |
| Record | `\x1E` | 2 |
| Unit | `\x1F` | 1 |
The two most common separators I see are commas and tabs, for CSV and TSV respectively. But ASCII also specifies a few more.
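Awk supports setting the field (unit) separator with the `-F` option and the record separator with the `RS` variable, for example `awk -F '\037' -v RS='\036'` using octal escapes.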
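As a rough illustration of how the Unit and Record levels fit together, here is a minimal Python sketch that does the same kind of split awk would do. The names `encode` and `decode` are made up for this example, and it assumes field values never contain the separator bytes themselves (there is no escaping).

```python
# ASCII separator characters, named as in the table.
RECORD_SEP = "\x1e"  # level 2: separates rows
UNIT_SEP = "\x1f"    # level 1: separates fields within a row


def encode(rows):
    """Join fields with the unit separator and rows with the record separator."""
    return RECORD_SEP.join(UNIT_SEP.join(fields) for fields in rows)


def decode(text):
    """Inverse of encode: split into rows, then each row into fields."""
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)]


rows = [
    ["name", "note"],
    ["Ada", 'commas, tabs\tand "quotes" need no escaping'],
    ["Bob", "even a\nnewline is fine"],
]
assert decode(encode(rows)) == rows
```

Because commas, tabs, quotes, and newlines are ordinary data here, none of the quoting rules that CSV needs come into play.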
Utilities
- csv-to-usv: converts between CSV and the Unicode equivalent of the separators rather than the ASCII codes themselves. This is controversial, but it was done because the ASCII control codes are never rendered visibly, which makes them hard to work with.
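
To make the idea concrete, here is a small Python sketch of a CSV to USV-style conversion. It is not the csv-to-usv tool's implementation. It assumes "the Unicode equivalent" means the visible Control Pictures symbols U+241F (symbol for unit separator) and U+241E (symbol for record separator), and it assumes a unit symbol between fields and a record symbol after every row. The function name `csv_to_usv_like` is hypothetical.

```python
import csv
import io

# Assumption: visible Unicode stand-ins for the ASCII control codes.
UNIT_SYMBOL = "\u241f"    # SYMBOL FOR UNIT SEPARATOR
RECORD_SYMBOL = "\u241e"  # SYMBOL FOR RECORD SEPARATOR


def csv_to_usv_like(csv_text):
    """Parse CSV text and re-emit it with visible separator symbols."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return "".join(UNIT_SYMBOL.join(row) + RECORD_SYMBOL for row in reader)


# The quoted comma survives as data; the CSV quoting itself disappears.
print(csv_to_usv_like('a,b\n"x,y",z\n'))
```

Unlike `\x1F` and `\x1E`, these symbols show up in a terminal or editor, which is the trade-off the tool makes.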
Alternatives
The byte `\xFF` can never appear in valid UTF-8, so it makes sense to use that as a separator.
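
A minimal Python sketch of that approach, with hypothetical helper names `join_fields` and `split_fields`. The splitting has to happen on raw bytes before decoding, because the joined buffer as a whole is no longer valid UTF-8.

```python
SEP = b"\xff"  # this byte value is never produced by UTF-8 encoding


def join_fields(fields):
    """Encode each field as UTF-8 and join the results with the 0xFF byte."""
    return SEP.join(field.encode("utf-8") for field in fields)


def split_fields(data):
    """Split on 0xFF at the byte level, then decode each chunk as UTF-8."""
    return [chunk.decode("utf-8") for chunk in data.split(SEP)]


fields = ["plain", "naïve café", "日本語", "tab\tand,comma", "\x1f"]
packed = join_fields(fields)
assert split_fields(packed) == fields
```

Any text at all can be a field here, including the ASCII separators themselves, at the cost of the file no longer being readable as a single UTF-8 string.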