preg_match and UTF-8 in PHP

The u modifier makes preg_match treat both the pattern and the subject as UTF-8, but the offsets returned with PREG_OFFSET_CAPTURE are still counted in bytes, not characters.

To convert a byte offset into a character offset, take the bytes that precede the match with substr and count their characters with mb_strlen:

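$str = "\xC2\xA1Hola!"; // "¡Hola!", where "¡" occupies two bytes in UTF-8
preg_match('/H/u', $str, $a_matches, PREG_OFFSET_CAPTURE);
// $a_matches[0][1] holds the byte offset of the match (2 here)
echo mb_strlen(substr($str, 0, $a_matches[0][1]), 'UTF-8'); // prints 1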
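
If you need this in more than one place, the conversion above can be wrapped in a small helper. This is only a sketch: the function name utf8_match_offset is hypothetical (it is not a PHP built-in), and it assumes the subject is valid UTF-8, the pattern uses the u modifier, and PHP 7.1 or later is available for the nullable return type.

```php
<?php
// Hypothetical helper (not part of PHP): returns the character offset of the
// first match of $pattern in $subject, or null when nothing matched.
// Assumes $subject is valid UTF-8 and $pattern carries the u modifier.
function utf8_match_offset(string $pattern, string $subject): ?int
{
    if (preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE) !== 1) {
        return null; // no match, or a PCRE error such as malformed UTF-8
    }

    // preg_match reports the position of the match in bytes.
    $byteOffset = $matches[0][1];

    // Count the characters contained in the bytes that precede the match.
    return mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');
}

$str = "\xC2\xA1Hola!";
var_dump(utf8_match_offset('/H/u', $str)); // int(1)
var_dump(utf8_match_offset('/x/u', $str)); // NULL
```

Passing 'UTF-8' explicitly to mb_strlen avoids depending on the configured internal encoding. Returning null keeps "no match" distinct from a match at offset 0.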
