Superp, posted March 7, 2022 at 10:18 AM

The OLED 128x64 Bricklet 2.0 has an embedded font. Code points 32 through 126 match ASCII. Is there a table somewhere that maps Unicode characters to the remaining code points, where a match exists? E.g. "½" => 171?

rtrbt, posted March 7, 2022 at 11:07 AM

Hi,

The embedded font is Code page 437. For the openHAB bindings I've used the following mapping to convert from UTF-16:

    import java.util.Arrays;
    import java.util.List;

    // Index in this list = CP437 code point, value = Unicode code point of the glyph.
    static final List<Integer> CP437 = Arrays.asList(
        0x0000, 0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022,
        0x25D8, 0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C,
        0x25BA, 0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8,
        0x2191, 0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC,
        0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027,
        0x0028, 0x0029, 0x002A, 0x002B, 0x002C, 0x002D, 0x002E, 0x002F,
        0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037,
        0x0038, 0x0039, 0x003A, 0x003B, 0x003C, 0x003D, 0x003E, 0x003F,
        0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047,
        0x0048, 0x0049, 0x004A, 0x004B, 0x004C, 0x004D, 0x004E, 0x004F,
        0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057,
        0x0058, 0x0059, 0x005A, 0x005B, 0x005C, 0x005D, 0x005E, 0x005F,
        0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067,
        0x0068, 0x0069, 0x006A, 0x006B, 0x006C, 0x006D, 0x006E, 0x006F,
        0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077,
        0x0078, 0x0079, 0x007A, 0x007B, 0x007C, 0x007D, 0x007E, 0x2302,
        0x00C7, 0x00FC, 0x00E9, 0x00E2, 0x00E4, 0x00E0, 0x00E5, 0x00E7,
        0x00EA, 0x00EB, 0x00E8, 0x00EF, 0x00EE, 0x00EC, 0x00C4, 0x00C5,
        0x00C9, 0x00E6, 0x00C6, 0x00F4, 0x00F6, 0x00F2, 0x00FB, 0x00F9,
        0x00FF, 0x00D6, 0x00DC, 0x00A2, 0x00A3, 0x00A5, 0x20A7, 0x0192,
        0x00E1, 0x00ED, 0x00F3, 0x00FA, 0x00F1, 0x00D1, 0x00AA, 0x00BA,
        0x00BF, 0x2310, 0x00AC, 0x00BD, 0x00BC, 0x00A1, 0x00AB, 0x00BB,
        0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
        0x2555, 0x2563, 0x2551, 0x2557, 0x255D, 0x255C, 0x255B, 0x2510,
        0x2514, 0x2534, 0x252C, 0x251C, 0x2500, 0x253C, 0x255E, 0x255F,
        0x255A, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256C, 0x2567,
        0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256B,
        0x256A, 0x2518, 0x250C, 0x2588, 0x2584, 0x258C, 0x2590, 0x2580,
        0x03B1, 0x00DF, 0x0393, 0x03C0, 0x03A3, 0x03C3, 0x00B5, 0x03C4,
        0x03A6, 0x0398, 0x03A9, 0x03B4, 0x221E, 0x03C6, 0x03B5, 0x2229,
        0x2261, 0x00B1, 0x2265, 0x2264, 0x2320, 0x2321, 0x00F7, 0x2248,
        0x00B0, 0x2219, 0x00B7, 0x221A, 0x207F, 0x00B2, 0x25A0, 0x00A0);

    public static String utf16ToCP437(String utf16) {
        StringBuilder result = new StringBuilder();
        // Look up each Unicode code point in the table; characters without
        // a CP437 glyph are replaced by 0xDB (the full block).
        utf16.codePoints()
             .map(c -> CP437.indexOf(c))
             .map(i -> i == -1 ? 0xDB : i)
             .forEach(c -> result.append((char) c));
        return result.toString();
    }

The lookup table should work in any language. Many programming language standard libraries can do this conversion. For example, in Python:

    'test ½'.encode('cp437', 'replace')

returns b'test \xab'. Using 'replace' inserts an encoded '?' wherever a Unicode character cannot be encoded.

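As a small illustration of the Python route mentioned above, the sketch below compares the standard error handlers of `str.encode` for the built-in `cp437` codec. The sample string is made up for this example; U+2039 (‹) is used as a character that has no CP437 equivalent.

```python
# Sample text: "test ½ ‹" written with escapes so the source stays ASCII.
text = "test \u00bd \u2039"

# "replace": every unencodable character becomes an encoded "?".
replaced = text.encode("cp437", "replace")

# "ignore": unencodable characters are silently dropped.
ignored = text.encode("cp437", "ignore")

# Default ("strict"): an unencodable character raises UnicodeEncodeError.
try:
    text.encode("cp437")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The individual byte values are the code points to send to the display.
print(list(replaced))
```

Which handler fits depends on whether a visible placeholder on the display is preferable to silently losing a character.
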
Superp (author), posted March 7, 2022 at 12:10 PM, edited March 7, 2022 at 12:22 PM (clarify invalid chars in example)

Genius. This saves me a lot of work. I completely missed that it is the character set of the IBM PC. In Ruby:

    'test ½'.encode('cp437') => "test \xAB"
    'test ½'.encode('cp437', :undef => :replace, :replace => '?') => "test \xAB"
    'test ‹'.encode('cp437', :undef => :replace, :replace => '?') => "test ?"

Thanks!

Superp (author), posted March 17, 2022 at 07:32 AM

...but things are not that simple. The IBM437 encoding in Ruby (and some other languages) does not include the glyphs at code points 0..31 and 127, which traditionally were control characters like bell and tab. This means you can either:

1. Build your own complete lookup table with 256 code points, duplicating the encoding already available on your system, but adding 0..31 and 127. Not DRY.
2. Use the encoding, and miss •, ○, ♫, →, ♥, ⌂ and other useful characters.
3. Use the encoding, with a fallback table for 0..31 and 127.

I opted to do 3. Commit here.

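Option 3 could look roughly like the following in Python, whose built-in `cp437` codec likewise treats 0..31 and 127 as control characters. This is only a sketch, not the code from the commit mentioned above: the names `LOW_GLYPHS`, `FALLBACK` and `to_display_bytes` are made up for this example, the glyph code points are copied from the Java table earlier in the thread, and 0xDB as the "unknown" glyph follows the Java snippet.

```python
# Unicode code points of the glyphs the font draws at code points 1..31
# (copied from the CP437 table posted earlier in this thread).
LOW_GLYPHS = [
    0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022, 0x25D8,
    0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C, 0x25BA,
    0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8, 0x2191,
    0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC,
]

# Fallback table: character -> display code point, for 1..31 and 127.
FALLBACK = {chr(cp): i for i, cp in enumerate(LOW_GLYPHS, start=1)}
FALLBACK[chr(0x2302)] = 127  # the "house" glyph


def to_display_bytes(text: str, unknown: int = 0xDB) -> bytes:
    """Encode text for the display: fallback table first, then the codec."""
    out = bytearray()
    for ch in text:
        if ch in FALLBACK:
            out.append(FALLBACK[ch])
        else:
            try:
                out += ch.encode("cp437")
            except UnicodeEncodeError:
                out.append(unknown)  # no matching glyph in the font
    return bytes(out)


# "♥ ½ → ⌂" written with escapes
print(list(to_display_bytes("\u2665 \u00bd \u2192 \u2302")))
```

Checking the fallback table before the codec keeps the table small (32 entries) while the codec still handles the other 224 code points. Note that real control characters in the input, such as a tab, still pass through the codec unchanged and would be drawn as the glyph at that code point.
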