character-encoding - w3toppers.com

UTF-8 character encoding battles json_encode() [duplicate]

// Create an empty array for the encoded resultset
$rows = array();
// Loop over the db resultset and put encoded values into $rows
while ($row = mysql_fetch_assoc($result)) {
    $rows[] = array_map('utf8_encode', $row);
}
// Output $rows
echo json_encode($rows);

All inclusive Charset to avoid “java.nio.charset.MalformedInputException: Input length = 1”?

You probably want to have a list of supported encodings. For each file, try each encoding in turn, maybe starting with UTF-8. Every time you catch the MalformedInputException, try the next encoding.
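The try-each-encoding loop can be sketched in Python, where `UnicodeDecodeError` plays the role that `MalformedInputException` plays in Java. This is only an illustration: the function name `decode_with_fallback` and the candidate list are assumptions, not part of the original answer. Note that `latin-1` accepts every byte sequence, so it only makes sense as the last candidate.

```python
from typing import Iterable, Tuple


def decode_with_fallback(
    data: bytes,
    encodings: Iterable[str] = ("utf-8", "cp1252", "latin-1"),  # assumed candidates
) -> Tuple[str, str]:
    """Return (decoded_text, encoding_used) for the first encoding that works."""
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the input")
```

A successful decode only means the bytes were valid in that encoding, not that the result is the text the author intended, so the order of the candidates matters.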

What is the range of Unicode Printable Characters?

See http://en.wikipedia.org/wiki/Unicode_control_characters and look especially at the C0 and C1 control characters (http://en.wikipedia.org/wiki/C0_and_C1_control_codes). According to the wiki, the C0 control characters are in the range U+0000 to U+001F plus U+007F (the same code points as the ASCII control characters), and the C1 control characters are in the range U+0080 to U+009F. Other than the C0 and C1 control characters, Unicode also has hundreds of formatting …
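As a quick check of those ranges, Python's standard `unicodedata` module reports the general category `Cc` for control characters. The helper name `is_control` is made up for this sketch, and the scan covers only the C0/C1 controls, not the formatting characters mentioned above.

```python
import unicodedata


def is_control(ch: str) -> bool:
    """True if the character is in Unicode general category Cc (control)."""
    return unicodedata.category(ch) == "Cc"


# Collect every code point that Python's Unicode database marks as a control.
control_points = [cp for cp in range(0x110000) if is_control(chr(cp))]

# Group them into the three blocks described in the text.
c0 = [cp for cp in control_points if cp <= 0x1F]
delete = [cp for cp in control_points if cp == 0x7F]
c1 = [cp for cp in control_points if 0x80 <= cp <= 0x9F]
```

Filtering out `Cc` is only a first step towards "printable": other categories (for example unassigned or format characters) usually need to be excluded as well, depending on what printable should mean.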

C++ Visual Studio character encoding issues

Before I go any further, I should mention that what you are doing is not C/C++ compliant. The specification states in 2.2 what character sets are valid in source code. There ain’t much in there, and all the characters used are in ASCII. So… Everything below is about a specific implementation (as it happens, VC2008 …

Save all files in Visual Studio project as UTF-8

Since you’re already in Visual Studio, why not just simply write the code?
foreach (var f in new DirectoryInfo(@"…").GetFiles("*.cs", SearchOption.AllDirectories)) {
    string s = File.ReadAllText(f.FullName);
    File.WriteAllText(f.FullName, s, Encoding.UTF8);
}
Only three lines of code! I’m sure you can write this in less than a minute 🙂
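For comparison, the same read-then-rewrite idea can be sketched in Python. The function name `resave_as_utf8` and the `source_encoding` default are assumptions for illustration: unlike the C# snippet, this sketch does not try to detect the existing encoding, so you have to say what the files are currently in. It works on raw bytes so that line endings are left untouched, and it writes UTF-8 without a byte order mark (use `"utf-8-sig"` if you want one).

```python
from pathlib import Path


def resave_as_utf8(root: Path, pattern: str = "*.cs",
                   source_encoding: str = "cp1252") -> int:
    """Re-encode every matching file under root to UTF-8; return the file count."""
    count = 0
    for path in root.rglob(pattern):
        # Decode with the assumed current encoding, then write back as UTF-8.
        text = path.read_bytes().decode(source_encoding)
        path.write_bytes(text.encode("utf-8"))
        count += 1
    return count
```

Run it on a copy of the project first: a wrong `source_encoding` will silently produce garbled text in the rewritten files.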

PHP messing with HTML Charset Encoding

You have probably mixed encoding types. For example, a page that is sent as ISO-8859-1 but gets UTF-8 encoded text from MySQL or XML would typically fail. To solve this problem, you must keep control of the encoding of your input in relation to the encoding you have chosen to use internally. If …
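The failure mode is easy to reproduce outside PHP. The following Python sketch (variable names are illustrative) takes UTF-8 bytes, as a database might return them, and interprets them as ISO-8859-1, as a browser would if the page were declared with that charset:

```python
text = "café"

# Bytes as they would arrive from a UTF-8 source such as the database.
utf8_bytes = text.encode("utf-8")

# Interpreting those bytes as ISO-8859-1 turns each multi-byte character
# into several unrelated single-byte characters.
garbled = utf8_bytes.decode("iso-8859-1")

# Using one encoding consistently on both sides round-trips cleanly.
consistent = utf8_bytes.decode("utf-8")

print(garbled)     # cafÃ©
print(consistent)  # café
```

The "Ã©" pattern in place of an accented letter is the typical sign that UTF-8 data is being displayed under a single-byte charset.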

UTF-8 safe equivalent of ord or charCodeAt() in PHP

mbstring version:
function utf8_char_code_at($str, $index) {
    $char = mb_substr($str, $index, 1, 'UTF-8');
    if (mb_check_encoding($char, 'UTF-8')) {
        $ret = mb_convert_encoding($char, 'UTF-32BE', 'UTF-8');
        return hexdec(bin2hex($ret));
    } else {
        return null;
    }
}
Using htmlspecialchars and htmlspecialchars_decode for getting one character:
function utf8_char_code_at($str, $index) {
    $char = "";
    $str_index = 0;
    $str = utf8_scrub($str);
    $len = strlen($str);
    for ($i = …
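The trick behind the mbstring version is that a character encoded as UTF-32BE is simply its code point written as a 4-byte big-endian integer. A rough Python rendering of the same idea follows; it is a sketch rather than a port, and returning `None` for an out-of-range index is an assumption of this example.

```python
from typing import Optional


def utf8_char_code_at(data: bytes, index: int) -> Optional[int]:
    """Code point of the character at `index` in UTF-8 encoded bytes, or None."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return None  # not valid UTF-8
    if not 0 <= index < len(text):
        return None  # assumed behaviour for a missing character
    # UTF-32BE stores the code point itself as four big-endian bytes.
    return int.from_bytes(text[index].encode("utf-32-be"), "big")
```

In everyday Python the built-in `ord()` gives the same number directly; the detour through UTF-32BE is shown only because it mirrors what the PHP code does.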

python 3 open() default encoding

The default UTF-8 encoding of Python 3 only extends to byte->str conversions. open() instead uses your environment to choose an appropriate encoding. From the Python 3 docs for open():
    encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is …
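To see what your environment picks, and to avoid depending on it, something like the following can help. It assumes that `locale.getpreferredencoding(False)` reflects the encoding `open()` falls back to, which has been the documented behaviour for many Python 3 releases but may differ in yours; the temporary file is only there to make the sketch self-contained.

```python
import locale
import os
import tempfile

# The locale-dependent encoding that text-mode open() is assumed to use
# when no encoding argument is given.
default_encoding = locale.getpreferredencoding(False)
print("environment default:", default_encoding)

# Passing encoding= explicitly makes the result independent of the environment.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with open(path, "w", encoding="utf-8") as f:
        f.write("naïve café")
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()
finally:
    os.remove(path)
```

Always passing `encoding=` is the simplest way to get the same behaviour on every machine.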

How to detect the right encoding for read.csv?

First of all, based on a more general question on Stack Overflow, it is not possible to detect the encoding of a file with 100% certainty. I’ve struggled with this many times and settled on a non-automatic solution. Use iconvlist to get all possible encodings:
codepages <- setNames(iconvlist(), iconvlist())
Then read the data using each of them:
x <- lapply(codepages, function(enc) try(read.table("encoding.asc", …
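The same "decode with everything, then look at the results yourself" approach can be sketched in Python. Here `decode_with_each` is a hypothetical helper, and the caller supplies the list of candidate encodings, since this sketch makes no attempt to enumerate every codec the way `iconvlist()` does.

```python
from typing import Dict, Iterable, Optional


def decode_with_each(data: bytes, encodings: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each candidate encoding to the decoded text, or None if it failed."""
    results: Dict[str, Optional[str]] = {}
    for enc in encodings:
        try:
            results[enc] = data.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this encoding, or an unknown encoding name.
            results[enc] = None
    return results
```

Several encodings will usually decode without error, so the final choice still comes down to reading the candidates and picking the one that looks right.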

Detect file encoding in PHP

Try using the mb_detect_encoding function. This function will examine your string and attempt to “guess” what its encoding is. You can then convert it as desired. As brulak suggested, however, you’re probably better off converting to UTF-8 rather than from, to preserve the data you’re transmitting.