In short, because this has been discussed a thousand times before:
- PHP holds a string, say "漢字", encoded in UTF-8. The bytes for this are E6 BC A2 E5 AD 97.
- It sends this string over a database connection which is set to latin1.
- The database receives the bytes E6 BC A2 E5 AD 97, thinking those represent latin1 characters.
- The database stores the characters æ¼¢å (the characters that E6 BC A2 E5 AD 97 maps to in latin1).
- The same process reversed makes PHP receive the same bytes, which it then treats as UTF-8. The roundtrip works fine for PHP, even though the database doesn’t treat the characters as it should.
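The steps above can be reproduced outside the database. The following is only a rough sketch in Python rather than PHP, and it assumes Python's "latin-1" codec (ISO-8859-1) is a close enough stand-in for the connection's latin1 setting; the database's own latin1 mapping may differ for some bytes.

```python
# Sketch of the round trip described above.
# Assumption: Python's "latin-1" codec stands in for the latin1 connection.

text = "漢字"

# The application holds the string as UTF-8 bytes.
sent = text.encode("utf-8")
print(sent.hex(" ").upper())  # E6 BC A2 E5 AD 97

# The database interprets each byte as one latin1 character,
# so it stores six characters instead of two.
stored = sent.decode("latin-1")
print(len(stored))  # 6

# Reading back over the same latin1 connection yields the same bytes,
# which the application decodes as UTF-8 again.
received = stored.encode("latin-1")
print(received.decode("utf-8"))  # 漢字
```

The round trip is lossless, which is why the mistake goes unnoticed until something else reads the table with a correctly configured connection.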
So the problem here was that the database connection was set incorrectly when the data was entered into the database. You’ll have to convert the data in the database to the correct characters. Try this:
SELECT CONVERT(BINARY CONVERT(field_name USING latin1) USING utf8) FROM table_name
Maybe utf8 isn’t the right target character set here, so experiment. If that works, change this into an UPDATE statement to update the data permanently.
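To see what the nested CONVERT is doing, here is a minimal Python sketch of the same idea. The function name repair is made up for illustration, and Python's "latin-1" codec is again assumed to approximate the database's latin1; treat it as a way to reason about the query, not as a replacement for it.

```python
# Sketch of the repair idea behind the nested CONVERT query.
# Assumptions: "repair" is a hypothetical helper; Python's "latin-1"
# codec approximates the database's latin1 character set.

def repair(mojibake: str) -> str:
    # Like CONVERT(field USING latin1): turn the wrongly stored
    # characters back into the bytes that were originally sent.
    raw = mojibake.encode("latin-1")
    # Like BINARY followed by CONVERT(... USING utf8): forget the old
    # character set and reinterpret the raw bytes as UTF-8.
    return raw.decode("utf-8")

# Simulate a value that was stored through a latin1 connection.
broken = "漢字".encode("utf-8").decode("latin-1")
print(repair(broken))  # 漢字
```

In this sketch, a value that was already stored correctly and contains characters outside latin1 makes the encode step raise UnicodeEncodeError, which is a reminder to check the SELECT output on real rows before running the UPDATE.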