UNICODE and printer drivers

What is UNICODE?

In short, UNICODE is just a standard in which each character of each language has its own number. As a result, one string may contain characters from many languages. In regular, old-style code pages each character also has its own number, but one string may contain characters of one language only. This is mostly because of a limitation of the old single-byte code pages: no more than 256 characters. Usually far fewer are left for letters, since we have to exclude digits, spaces and system symbols.
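The code page limitation is easy to reproduce in a few lines of Python. This is only an illustrative sketch: the helper name `encodable_code_pages` and the choice of the Windows code pages cp1252 (Western European), cp1251 (Cyrillic) and cp874 (Thai) are assumptions made for this example.

```python
def encodable_code_pages(text, code_pages=("cp1252", "cp1251", "cp874")):
    """Return the code pages (from the given list) that can represent all of `text`."""
    usable = []
    for name in code_pages:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no number in this code page.
            continue
        usable.append(name)
    return usable


english = "Hello"
russian = "Привет"
thai = "สวัสดี"
mixed = english + " " + russian + " " + thai

print(encodable_code_pages(english))  # plain ASCII fits every code page listed
print(encodable_code_pages(russian))  # only the Cyrillic code page
print(encodable_code_pages(thai))     # only the Thai code page
print(encodable_code_pages(mixed))    # no single code page can hold the mixed string

# The same mixed string is no problem as UNICODE (UTF-16, as used by Windows).
utf16 = mixed.encode("utf-16-le")
print(len(utf16) // 2, "16-bit code units")
```

With code pages, the mixed string would have to be split into three runs, each sent with its own code page. With UNICODE it stays one string.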

A single 16-bit UNICODE value, as used by Windows, covers 65536 symbols, and the standard as a whole has room for more than a million. This is great and makes life much simpler. We can just write what we want and, when it comes to printing, print what we want, without caring about code pages or splitting strings.

OK, that was some background. I only want to say a few words about UNICODE and printer drivers. When we started the priPrinter project, we checked text output in all European languages, Japanese, Russian, etc. Text may come to the driver as regular text or as glyph indices. Basically there is no big difference for us: we just use different flags or another function. Everything was easy. But then we received a few letters from our customers saying that some local symbols were not handled properly. This was the case for Arabic, Devanagari, Gujarati and Thai text. The difficulty is that it is really hard for us to see the problem: to a European eye these characters look like icons or symbols, so it is hard to notice that anything is wrong.

We found that some characters, or some combinations of characters, have to be handled in a very special way. Sometimes this is true for the text form, sometimes for glyph indices. For instance, the Thai characters 0E4D 0E32 should be replaced with 0E33. This was pretty strange and unexpected for us. Even when the text in Notepad is correct (it may contain 0E33), the printer driver receives 0E4D 0E32, and we have to change it back to 0E33, since the regular functions do not handle 0E4D 0E32 properly.
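The Thai replacement can be sketched in Python as follows. This is a minimal illustration, not the driver's actual code: the function name `recompose_sara_am` is hypothetical, and it only handles the adjacent pair 0E4D 0E32 described above. The sketch also shows why the pair appears at all: in the UNICODE character database, 0E33 (SARA AM) has a compatibility decomposition into 0E4D (NIKHAHIT) followed by 0E32 (SARA AA), and standard normalization does not put the pair back together.

```python
import unicodedata

NIKHAHIT = "\u0E4D"  # THAI CHARACTER NIKHAHIT
SARA_AA = "\u0E32"   # THAI CHARACTER SARA AA
SARA_AM = "\u0E33"   # THAI CHARACTER SARA AM


def recompose_sara_am(text):
    """Replace every adjacent 0E4D 0E32 pair with the single character 0E33."""
    return text.replace(NIKHAHIT + SARA_AA, SARA_AM)


def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# 0E33 decomposes into the pair under compatibility normalization (NFKD).
print(code_points(unicodedata.normalize("NFKD", SARA_AM)))  # 0E4D 0E32

# NFKC does not recompose the pair, so an explicit replacement is needed.
print(code_points(unicodedata.normalize("NFKC", NIKHAHIT + SARA_AA)))  # 0E4D 0E32

# Text as the driver might receive it: 0E17 (THO THAHAN) followed by the pair.
received = "\u0E17" + NIKHAHIT + SARA_AA
print(code_points(recompose_sara_am(received)))  # 0E17 0E33
```

Text that already contains 0E33, or that contains no Thai at all, passes through unchanged, so the function is safe to apply to every string before output.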

This was just one example, and the solution is fairly easy. Arabic, Devanagari and Gujarati have similar problems, but they have to be handled differently. Of course, once we had discovered this problem in one language, we were prepared for the others. What is this: some weakness in the Windows UNICODE engine, or our own problem? Who knows, but we do know that even obvious things may give you some unusual problems.
