Replacing invalid UTF-8 characters by question marks, mbstring.substitute_character seems ignored

You can use mb_convert_encoding() or, since PHP 5.4, the ENT_SUBSTITUTE option of htmlspecialchars(). Of course, you can use preg_match() too. If you use intl, you can use UConverter since PHP 5.5. The recommended substitute character for an invalid byte sequence is U+FFFD. See “3.1.2 Substituting for Ill-Formed Subsequences” in UTR #36: Unicode Security Considerations for the details. When using …
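As a minimal sketch of the mb_convert_encoding() route, the snippet below converts a string from UTF-8 to UTF-8 so that mbstring swaps each ill-formed byte for a substitute character. The helper name replace_invalid_utf8() and its $substitute parameter are hypothetical, chosen only for illustration; the sketch assumes PHP 7 or later with the mbstring extension loaded and a UTF-8 internal encoding.

```php
<?php
// Hypothetical helper, not part of PHP or of the original answer.
// Replaces ill-formed UTF-8 bytes with a substitute code point
// (U+FFFD by default; pass 0x3F to get a question mark instead).
function replace_invalid_utf8(string $input, int $substitute = 0xFFFD): string
{
    // Remember the current substitute setting so it can be restored afterwards.
    $previous = mb_substitute_character();
    mb_substitute_character($substitute);

    // Converting UTF-8 to UTF-8 makes mbstring validate every byte sequence
    // and emit the substitute character for each invalid one.
    $clean = mb_convert_encoding($input, 'UTF-8', 'UTF-8');

    mb_substitute_character($previous);
    return $clean;
}

// "\x80" is a stray continuation byte, so it is not valid UTF-8 on its own.
$dirty = "a\x80b";
var_dump(replace_invalid_utf8($dirty) === "a\u{FFFD}b");
var_dump(replace_invalid_utf8($dirty, 0x3F) === "a?b");
```

Setting the substitute character at runtime with mb_substitute_character() and restoring it afterwards keeps the change local to this one call, which may be easier to reason about than relying on the mbstring.substitute_character ini setting.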

Multibyte trim in PHP?

The standard trim function trims a handful of space and space-like characters. These are defined as ASCII characters, which means certain specific bytes in the range 0 to 0111 1111. Proper UTF-8 input will never contain multi-byte characters that are made up of bytes of the form 0xxx xxxx. All the bytes in proper UTF-8 multibyte characters start with 1xxx …
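The point about byte ranges can be checked with a short sketch: trim() only removes specific ASCII bytes, so it leaves the bytes of multibyte characters untouched, which also means it does not remove non-ASCII spaces such as U+3000. The helper trim_unicode_whitespace() below is a hypothetical name and one possible approach (a Unicode-aware regular expression), not something taken from the original answer; it assumes PHP 7 or later with PCRE Unicode support.

```php
<?php
// Hypothetical helper, not from the original answer: trims ASCII whitespace
// plus Unicode separator characters (\p{Z}) from both ends of a UTF-8 string.
function trim_unicode_whitespace(string $text): string
{
    $result = preg_replace('/\A[\s\p{Z}]+|[\s\p{Z}]+\z/u', '', $text);

    // preg_replace() returns null on failure (for example, invalid UTF-8 input);
    // in that case this sketch falls back to returning the input unchanged.
    return $result ?? $text;
}

// Plain trim() is safe on UTF-8: the two bytes of "é" are never stripped.
var_dump(trim("  caf\u{E9}\n") === "caf\u{E9}");

// Plain trim() does not touch U+3000 (ideographic space), which is three non-ASCII bytes.
var_dump(trim("\u{3000}x") === "\u{3000}x");

// The regex-based helper removes it, along with U+00A0 and ordinary spaces.
var_dump(trim_unicode_whitespace("\u{3000}\u{65E5}\u{672C}\u{00A0} ") === "\u{65E5}\u{672C}");
```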

How to install PHP mbstring on CentOS 6.2

Do the following:

    sudo nano /etc/yum.repos.d/CentOS-Base.repo

Under the updates section, comment out the mirrorlist line (put a # in front of the line), then on a new line write:

    baseurl=http://centos.intergenia.de/$releasever/updates/$basearch/

Now try:

    yum install php-mbstring

(Afterwards you'll probably want to uncomment the mirrorlist line and comment out the baseurl line.)