Originally Posted by Adak
Any character set with the lower 7 bit values the same as the original ASCII set, can be called ASCII
No, Adak. It can only be called ASCII-compatible.
If you output data that only uses ASCII characters, that is codes 0..127, then your output is ASCII.
If you output data that has codes > 127, it is no longer ASCII output, but something else. Your example program outputs CP437 or CP850 data.
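To make the distinction concrete, here is a minimal sketch of a check for whether a buffer is pure ASCII. The name is_ascii() is my own invention for illustration; it is not a standard function.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every byte in buf[0..len-1] is a 7-bit ASCII code (0..127),
   and 0 if any byte has its high bit set. is_ascii() is a hypothetical
   helper name, not part of any standard library. */
int is_ascii(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++)
        if ((unsigned char)buf[i] > 127)
            return 0;   /* CP437, CP850, UTF-8, or something else */

    return 1;
}
```

A plain string like "Box" passes the check, while a CP437 box character such as "\315" or any UTF-8 multibyte sequence fails it.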
However, the principle behind the logic you showed is practical if you switch from single characters to strings. One can make the simple assumption that Windows users have CP437 or CP850, and everybody else has UTF-8.
It's not perfect, but it is practical, since the other common alternatives (like Windows-1252 "Western European" or the ISO-8859 variants) do not have any box drawing characters anyway.
For example, box.h:
Code:
#ifndef BOX_H
#define BOX_H
#define CP437 437
#define CP850 850
#define UTF8 8
#ifndef CHARSET
#ifdef _WIN32
#define CHARSET CP437
#else
#define CHARSET UTF8
#endif
#endif
#if (CHARSET == CP437 || CHARSET == CP850)
/* Code page 437 box drawing characters; CP850 lacks the ones marked below */
#define BOX_DLR "\315" /* ═ */
#define BOX_DUD "\272" /* ║ */
#define BOX_DUL "\274" /* ╝ */
#define BOX_DUR "\310" /* ╚ */
#define BOX_DDL "\273" /* ╗ */
#define BOX_DDR "\311" /* ╔ */
#define BOX_DUDL "\271" /* ╣ */
#define BOX_DUDR "\314" /* ╠ */
#define BOX_DULR "\312" /* ╩ */
#define BOX_DDLR "\313" /* ╦ */
#define BOX_DUDLR "\316" /* ╬ */
#define BOX_DU_SL "\275" /* ╜, not in CP850 */
#define BOX_DU_SR "\323" /* ╙, not in CP850 */
#define BOX_DD_SL "\267" /* ╖, not in CP850 */
#define BOX_DD_SR "\326" /* ╓, not in CP850 */
#define BOX_DL_SU "\276" /* ╛, not in CP850 */
#define BOX_DL_SD "\270" /* ╕, not in CP850 */
#define BOX_DR_SU "\324" /* ╘, not in CP850 */
#define BOX_DR_SD "\325" /* ╒, not in CP850 */
#define BOX_DU_SLR "\320" /* ╨, not in CP850 */
#define BOX_DD_SLR "\322" /* ╥, not in CP850 */
#define BOX_DL_SUD "\265" /* ╡, not in CP850 */
#define BOX_DR_SUD "\306" /* ╞, not in CP850 */
#define BOX_DLR_SU "\317" /* ╧, not in CP850 */
#define BOX_DLR_SD "\321" /* ╤, not in CP850 */
#define BOX_DLR_SUD "\330" /* ╪, not in CP850 */
#define BOX_DUD_SL "\266" /* ╢, not in CP850 */
#define BOX_DUD_SR "\307" /* ╟, not in CP850 */
#define BOX_DUD_SLR "\327" /* ╫, not in CP850 */
#define BOX_SLR "\304" /* ─ */
#define BOX_SUD "\263" /* │ */
#define BOX_SUL "\331" /* ┘ */
#define BOX_SUR "\300" /* └ */
#define BOX_SDL "\277" /* ┐ */
#define BOX_SDR "\332" /* ┌ */
#define BOX_SULR "\301" /* ┴ */
#define BOX_SDLR "\302" /* ┬ */
#define BOX_SUDL "\264" /* ┤ */
#define BOX_SUDR "\303" /* ├ */
#define BOX_SUDLR "\305" /* ┼ */
#elif CHARSET == UTF8
/* UTF-8 box drawing characters */
#define BOX_DLR "\342\225\220" /* ═ */
#define BOX_DUD "\342\225\221" /* ║ */
#define BOX_DUL "\342\225\235" /* ╝ */
#define BOX_DUR "\342\225\232" /* ╚ */
#define BOX_DDL "\342\225\227" /* ╗ */
#define BOX_DDR "\342\225\224" /* ╔ */
#define BOX_DUDL "\342\225\243" /* ╣ */
#define BOX_DUDR "\342\225\240" /* ╠ */
#define BOX_DULR "\342\225\251" /* ╩ */
#define BOX_DDLR "\342\225\246" /* ╦ */
#define BOX_DUDLR "\342\225\254" /* ╬ */
#define BOX_DU_SL "\342\225\234" /* ╜ */
#define BOX_DU_SR "\342\225\231" /* ╙ */
#define BOX_DD_SL "\342\225\226" /* ╖ */
#define BOX_DD_SR "\342\225\223" /* ╓ */
#define BOX_DL_SU "\342\225\233" /* ╛ */
#define BOX_DL_SD "\342\225\225" /* ╕ */
#define BOX_DR_SU "\342\225\230" /* ╘ */
#define BOX_DR_SD "\342\225\222" /* ╒ */
#define BOX_DU_SLR "\342\225\250" /* ╨ */
#define BOX_DD_SLR "\342\225\245" /* ╥ */
#define BOX_DL_SUD "\342\225\241" /* ╡ */
#define BOX_DR_SUD "\342\225\236" /* ╞ */
#define BOX_DLR_SU "\342\225\247" /* ╧ */
#define BOX_DLR_SD "\342\225\244" /* ╤ */
#define BOX_DLR_SUD "\342\225\252" /* ╪ */
#define BOX_DUD_SL "\342\225\242" /* ╢ */
#define BOX_DUD_SR "\342\225\237" /* ╟ */
#define BOX_DUD_SLR "\342\225\253" /* ╫ */
#define BOX_SLR "\342\224\200" /* ─ */
#define BOX_SUD "\342\224\202" /* │ */
#define BOX_SUL "\342\224\230" /* ┘ */
#define BOX_SUR "\342\224\224" /* └ */
#define BOX_SDL "\342\224\220" /* ┐ */
#define BOX_SDR "\342\224\214" /* ┌ */
#define BOX_SULR "\342\224\264" /* ┴ */
#define BOX_SDLR "\342\224\254" /* ┬ */
#define BOX_SUDL "\342\224\244" /* ┤ */
#define BOX_SUDR "\342\224\234" /* ├ */
#define BOX_SUDLR "\342\224\274" /* ┼ */
#else
#error "Box drawing characters are not defined for this charset."
#endif
#endif /* BOX_H */
You'd then use it very simply:
Code:
#include <stdio.h>
#include "box.h"
int main(void)
{
    /* Adjacent string literals are concatenated by the compiler */
    fputs(BOX_DDR BOX_DLR BOX_DLR BOX_DLR BOX_DDL "\n", stdout);
    fputs(BOX_DUD "Box" BOX_DUD "\n", stdout);
    fputs(BOX_DUR BOX_DLR BOX_DLR BOX_DLR BOX_DUL "\n", stdout);
    return 0;
}
If you wish to override the detection, use -DCHARSET=CP437 or -DCHARSET=UTF8 when compiling.
For a C introduction course, having that header file available would be useful, I think: both for better output, and as an illustration of the preprocessor and compile-time options.
Here is the simple Bash script I used to generate the definitions for any character set, in case you want to support some other character sets at compilation time:
Code:
#!/bin/bash
# Print usage to standard error if no charset is given or help is requested.
if [ $# -lt 1 ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
    exec >&2
    printf '\n'
    printf 'Usage: %s [ -h | --help ]\n' "$0"
    printf '       %s CHARSET(s)...\n' "$0"
    printf '\n'
    printf 'This script will output the box drawing strings in the\n'
    printf 'specified charset(s) as preprocessor macro definitions.\n'
    printf '\n'
    exit 0
fi
# Each command-line argument is a charset name understood by iconv.
while [ $# -gt 0 ]; do
    CHARSET="$1"
    shift 1
    printf '/* Charset %s */\n' "$CHARSET"
    while read -r NAME ORIGINAL ; do
        printf '#define %-11s ' "$NAME"
        # Convert the character, dump its bytes in octal, and turn the
        # dump into C escapes: strip the offset column, drop empty lines,
        # and replace each separating space with a backslash.
        STRING="$(printf '%s' "$ORIGINAL" | iconv -t "$CHARSET" 2>/dev/null | od -vt o1 | sed -e 's|^[0-9A-Fa-f]*||; /^ *$/d; s| |\\|g')" || exit $?
        if [ -n "$STRING" ]; then
            printf '"%s"\t/* %s */\n' "$STRING" "$ORIGINAL"
        else
            printf '""\t/* %s not supported in %s */\n' "$ORIGINAL" "$CHARSET"
        fi
    done << END
BOX_DLR ═
BOX_DUD ║
BOX_DUL ╝
BOX_DUR ╚
BOX_DDL ╗
BOX_DDR ╔
BOX_DUDL ╣
BOX_DUDR ╠
BOX_DULR ╩
BOX_DDLR ╦
BOX_DUDLR ╬
BOX_DU_SL ╜
BOX_DU_SR ╙
BOX_DD_SL ╖
BOX_DD_SR ╓
BOX_DL_SU ╛
BOX_DL_SD ╕
BOX_DR_SU ╘
BOX_DR_SD ╒
BOX_DU_SLR ╨
BOX_DD_SLR ╥
BOX_DL_SUD ╡
BOX_DR_SUD ╞
BOX_DLR_SU ╧
BOX_DLR_SD ╤
BOX_DLR_SUD ╪
BOX_DUD_SL ╢
BOX_DUD_SR ╟
BOX_DUD_SLR ╫
BOX_SLR ─
BOX_SUD │
BOX_SUL ┘
BOX_SUR └
BOX_SDL ┐
BOX_SDR ┌
BOX_SULR ┴
BOX_SDLR ┬
BOX_SUDL ┤
BOX_SUDR ├
BOX_SUDLR ┼
END
    printf '\n'
done
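If you are curious where the UTF-8 strings in box.h come from, here is a rough C sketch of what the iconv, od, and sed pipeline does, limited to UTF-8 only. The function names utf8_encode() and octal_escape() are made up for this example, and the encoder only handles code points up to U+FFFF, which is enough for the box drawing block (U+2500 to U+257F).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Encode code point cp as UTF-8 into out[]. Returns the number of bytes
   written (1..3), or 0 if cp is a surrogate or above U+FFFF.
   Hypothetical helper; larger code points are left out on purpose. */
size_t utf8_encode(unsigned long cp, unsigned char out[3])
{
    assert(out != NULL);

    if (cp < 0x80UL) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800UL) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000UL && !(cp >= 0xD800UL && cp <= 0xDFFFUL)) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    return 0;
}

/* Write the n bytes in src as C octal escapes (four characters per byte,
   like \342) into dst, followed by a terminating NUL.
   Returns 0 on success, -1 if dst is too small. */
int octal_escape(const unsigned char *src, size_t n, char *dst, size_t dstsize)
{
    size_t i;

    assert(src != NULL && dst != NULL);

    if (dstsize < 4 * n + 1)
        return -1;

    for (i = 0; i < n; i++)
        sprintf(dst + 4 * i, "\\%03o", (unsigned int)src[i]);
    dst[4 * n] = '\0';

    return 0;
}
```

For U+2550 (═), utf8_encode() yields the three bytes 0xE2 0x95 0x90, and octal_escape() turns those into the text \342\225\220, which matches the BOX_DLR definition in the UTF-8 branch of the header. For other character sets you still need a conversion table or iconv, which is why I used the script.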
If you write a library or just utility functions to draw boxes, then with a little more effort you can define static const sets of strings, and switch between them at runtime. You could even have one set using just basic ASCII characters (like -, +, |, #); it would be ugly, but it would still work for users who do not have any box-drawing characters available at that moment. You could try autodetecting, but a command-line option (with a default set at compile time based on the OS) would be sufficient, in my opinion.
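A minimal sketch of that runtime approach might look like the following. Everything named here (struct box_set, box_find(), box_draw(), BOX_DEFAULT, and the set names "utf8", "cp437", "ascii") is hypothetical, and to keep it short it only carries the six strings needed for a plain double-line frame.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One set of box drawing strings (double-line frame pieces only). */
struct box_set {
    const char *name;   /* name the user would give on the command line */
    const char *dlr;    /* horizontal line */
    const char *dud;    /* vertical line */
    const char *ddr;    /* top left corner */
    const char *ddl;    /* top right corner */
    const char *dur;    /* bottom left corner */
    const char *dul;    /* bottom right corner */
};

static const struct box_set box_sets[] = {
    { "utf8",  "\342\225\220", "\342\225\221", "\342\225\224",
               "\342\225\227", "\342\225\232", "\342\225\235" },
    { "cp437", "\315", "\272", "\311", "\273", "\310", "\274" },
    { "ascii", "-", "|", "+", "+", "+", "+" },  /* ugly, but always works */
};

/* Compile-time default, based on the same OS assumption as box.h. */
#ifdef _WIN32
#define BOX_DEFAULT "cp437"
#else
#define BOX_DEFAULT "utf8"
#endif

/* Look up a set by name; returns NULL if the name is not known. */
const struct box_set *box_find(const char *name)
{
    size_t i;

    assert(name != NULL);

    for (i = 0; i < sizeof box_sets / sizeof box_sets[0]; i++)
        if (strcmp(box_sets[i].name, name) == 0)
            return &box_sets[i];

    return NULL;
}

/* Draw text inside a one-line frame. Assumes text is plain ASCII,
   so that one byte occupies one column on screen. */
void box_draw(FILE *out, const struct box_set *set, const char *text)
{
    size_t i, width;

    assert(out != NULL && set != NULL && text != NULL);

    width = strlen(text);

    fputs(set->ddr, out);
    for (i = 0; i < width; i++)
        fputs(set->dlr, out);
    fputs(set->ddl, out);
    fputc('\n', out);

    fprintf(out, "%s%s%s\n", set->dud, text, set->dud);

    fputs(set->dur, out);
    for (i = 0; i < width; i++)
        fputs(set->dul == NULL ? "" : set->dlr, out);
    fputs(set->dul, out);
    fputc('\n', out);
}
```

In main() you would call something like box_find(argc > 1 ? argv[1] : BOX_DEFAULT), complain if it returns NULL, and pass the result to box_draw(stdout, set, "Box"). With the "ascii" set that prints +---+ above and below |Box|.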