From RogueBasin

The version of the popular ncurses library that handles wide characters, or Unicode, is surprisingly difficult to get working with C programs. This article is intended to be a checklist for developers so that they can effectively use the library.

As with most development articles, this will be a bit too specific in terms of platform. This article was written with respect to a Linux development platform running Debian Linux. To the extent that your platform is different, there are likely to be important things I don't know about getting development with this library working.

First, you have to be using a UTF-8 locale (Mine is en_US.UTF-8; I imagine others will have different choices). Type 'locale' at a shell prompt to be sure.

Second, you have to have a terminal program that can display non-ASCII characters. Most of them can handle that these days, but there are still a few holdouts. rxvt-unicode and konsole, popular terminal programs on Linux, are both good.

Third, you have to use a console font which contains glyphs for the non-ASCII characters that you use. Again, most default console fonts can handle that these days, but it's still another gotcha, and if you routinely pick some random Blambot font to use on the console, you're likely to end up with missing glyphs.

Try typing a non-ASCII character at the console prompt just to make sure you see it. If you don't know how to type non-ASCII characters from the keyboard, that's beyond the scope of what's covered here and you'll need to go and read some documentation and possibly set some keyboard preferences. Anyway, if you see it, then you've got the first, second, and third things covered.

Fourth, you have to have ncurses configured to deal with wide characters. For most Linux distributions, that means your ncurses distribution is based on version 5.4 or later (mine is 5.7) but *NOT* on version 11. I have no idea where version 11 came from, but it's definitely a fork based on a pre-5.4 ncurses version, and it hasn't got the Unicode extensions.

You must also have the 'ncursesw' builds of the library, which are the ones configured for wide characters. How this works depends on your distribution. On Debian, you need the runtime package to run ncurses programs that use wide characters and the matching '-dev' package to compile them; the current versions are libncursesw5 and libncursesw5-dev. But there's an apparent packaging mistake: the wide-character dev package, libncursesw5-dev, does not contain any documentation for the wide-character functions! If you want the man pages for the wide-character curses functions, you also have to install libncurses5-dev, which comes with a "wrong" version of ncurses that *doesn't* have the wide-character functions. Don't think too much about why anyone would do this; you'll only break your head.

The short version of the story is that you pretty much have to install libncurses5, libncurses5-dev, libncursesw5, and libncursesw5-dev, all at the same time, and then be very careful never to use the library versions that don't actually have the wide-character functions in them.

Fifth, your program has to call "setlocale" immediately after it starts up, before it starts curses or does any I/O. Calling it with an empty string as the locale name, as in setlocale(LC_ALL, ""), adopts whatever locale the environment specifies. If it doesn't call setlocale, your program will remain in the 'C' locale, which assumes that the terminal cannot display any characters outside the ASCII set. If you do any input or output, or start curses, before calling setlocale, you will force your runtime to commit to some settings before it knows the locale, and a later call to setlocale won't have all of the desired effects. Your program is likely to print ASCII transliterations for characters outside the ASCII range if this happens.
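
A minimal sketch of this point, assuming a POSIX system that provides nl_langinfo. The helper names is_utf8_codeset and init_utf8_locale are made up for illustration. It adopts the environment's locale and then checks that the resulting character encoding is UTF-8, which also verifies the first point from inside the program.

```c
#include <langinfo.h>   /* nl_langinfo, CODESET (POSIX) */
#include <locale.h>     /* setlocale, LC_ALL */
#include <stddef.h>     /* NULL */
#include <strings.h>    /* strcasecmp (POSIX) */

/* Returns 1 if a codeset name denotes UTF-8, 0 otherwise.
 * Accepts both the "UTF-8" and "utf8" spellings, ignoring case. */
int is_utf8_codeset(const char *codeset)
{
    if (codeset == NULL)
        return 0;
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "UTF8") == 0;
}

/* Hypothetical helper: call it first thing in main(), before initscr()
 * and before any I/O. Returns 1 if the environment's locale was adopted
 * and its encoding is UTF-8, 0 otherwise. */
int init_utf8_locale(void)
{
    /* "" means "take the locale from LANG / LC_*", not the 'C' locale. */
    if (setlocale(LC_ALL, "") == NULL)
        return 0;   /* the environment names a locale that is not installed */
    return is_utf8_codeset(nl_langinfo(CODESET));
}
```

A program could call init_utf8_locale() at the very top of main and, if it returns 0, either refuse to start curses or fall back to plain ASCII output.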

Sixth, you have to #define _XOPEN_SOURCE_EXTENDED in your source before any library #include statements. The wide-character curses functions are part of the X/Open Curses standard, and preprocessing conditionals check this symbol to see whether your program expects to use that standard. If this symbol is found, and you've included the RIGHT headers (see the seventh point), then the headers you include will actually declare the documented wide-character functions. But it's not just the 'curses' headers that depend on it; you will get bugs and linking problems with other libraries if you have this symbol defined for some includes but not others, so it's very important to put it before ALL include statements.

Unfortunately, it's not mentioned in the man pages of any of the functions that won't link if you don't do it. You have to hunt through a bunch of not-very-related man pages before you find the only page that mentions it.

Seventh, you have to include the right header file rather than the one the documentation tells you to include. This isn't a joke. The man page tells you that you have to include "curses.h" to get any of the wide-character functions working, but the header that actually contains the wide-character function declarations is "ncursesw/curses.h". I hope this gets fixed soon, but it's been this way for at least a couple of years, so somebody may well think this isn't a bug.

Eighth, you have to link with the -lncursesw option (as opposed to the -lncurses option) when you build your executable. But DON'T use -Werror or -Wall at the same time on any gcc invocation where you're using -lncursesw. Although these are standard build options which I otherwise recommend because they help you keep your code clean, in my experience there's a bug somewhere in the curses headers which causes gcc to fail to find declarations for the wide curses functions when you use these options with -lncursesw.

Once you cover these eight points, you should be able to develop a roguelike game in C with Unicode characters using the ncursesw library. Best of luck.