How to change PHP’s eregi to preg_match [duplicate]

Perl-style regex patterns always need to be delimited. The very first character in the string is considered the delimiter, so something like this:
    function validate_email($email) {
        // The leading and trailing "/" are the delimiters; the "i" after the closing one makes the match case-insensitive.
        if (!preg_match("/^[[:alnum:]][a-z0-9_.-]*@[a-z0-9.-]+\.[a-z]{2,4}$/i", $email)) {
            echo 'bad email';
        } else {
            echo 'good email';
        }
    }
Your initial attempt didn’t work because it was trying to use …

replace a part of a string with REGEXP in sqlite3

SQLite does not provide a regex_replace function by default; you need to load it as an extension. Here is how I managed to do it. Download this C code for the extension (icu_replace). Compile it using:
    gcc -shared -fPIC -I sqlite-autoconf-3071100 icu_replace.c -o icu_replace.so
Then, in sqlite3, run the following command once the command above has finished …
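
When SQLite is driven from an application rather than the sqlite3 shell, a similar function can sometimes be registered from the host language instead of compiling an extension. The following is a minimal Python sketch of that alternative, not the icu_replace approach described above. The SQL name `regex_replace` and its argument order (pattern, replacement, value) are assumptions chosen for illustration, and the regex dialect is Python's `re`, not ICU.

```python
import re
import sqlite3


def regex_replace(pattern, replacement, value):
    """Replace every match of pattern in value; a NULL value stays NULL."""
    if value is None:
        return None
    return re.sub(pattern, replacement, value)


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under a hypothetical name, taking 3 arguments.
conn.create_function("regex_replace", 3, regex_replace)

conn.execute("CREATE TABLE t (s TEXT)")
conn.execute("INSERT INTO t VALUES ('abc123def')")
row = conn.execute("SELECT regex_replace('[0-9]+', '-', s) FROM t").fetchone()
print(row[0])
```

A function registered this way exists only on that connection, so it would need to be registered again for every new connection.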

Extra backslash needed in PHP regexp pattern

You need 4 backslashes to represent 1 in a regex because:
- 2 backslashes are consumed when the string literal is unescaped ("\\\\" -> \\)
- 1 backslash is consumed when the regex engine unescapes the result (\\ -> \)
From the PHP doc, escaping any other character will result in the backslash being printed too [1]. Hence for \\\[, 1 …
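
The same two layers of unescaping exist outside PHP. As an illustration only, here is a small Python sketch (Python, not PHP, so the string-literal rules differ in detail) showing four backslashes in the source shrinking to one literal backslash in the match. The variable names are made up for the example.

```python
import re

# Layer 1: the string parser collapses each pair, so four typed backslashes
# become a two-character pattern.
pattern = "\\\\"
pattern_length = len(pattern)

# Layer 2: the regex engine reads those two characters as one escaped,
# literal backslash.
subject = "a\\b"  # three characters: a, backslash, b
match = re.search(pattern, subject)
matched_text = match.group(0)
matched_span = match.span()

# A raw string skips layer 1, so two backslashes in the source are enough.
raw_span = re.search(r"\\", subject).span()
```

Here `pattern_length` is 2 and `matched_text` is a single backslash, which mirrors the 4 -> 2 -> 1 reduction described above.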

Unicode Regex; Invalid XML characters

I know this isn’t exactly an answer to your question, but it’s helpful to have it here. Regular expression to match valid XML characters:
    [\u0009\u000a\u000d\u0020-\uD7FF\uE000-\uFFFD]
So to remove invalid chars from XML, you’d do something like:
    // filters control characters but allows only properly-formed surrogate sequences
    private static Regex _invalidXMLChars = new Regex(
        @"(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|[\x00-\x08\x0B\x0C\x0E-\x1F\x7F-\x9F\uFEFF\uFFFE\uFFFF]",
        RegexOptions.Compiled);
    …
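
For comparison, a rough Python sketch of the same clean-up. Python strings are sequences of code points rather than UTF-16 units, so the surrogate-pair lookarounds are not needed, and supplementary-plane characters are allowed by an explicit range. The function name `strip_invalid_xml_chars` is hypothetical. This sketch removes only characters outside the XML 1.0 `Char` production, so it is less strict than the C# pattern above (it keeps \x7F-\x9F and \uFEFF, for instance).

```python
import re

# Everything NOT allowed by the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Return text with characters that XML 1.0 forbids removed."""
    return _INVALID_XML_CHARS.sub("", text)


cleaned = strip_invalid_xml_chars("ok\x00\x0bbad\ufffe")
print(cleaned)
```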

What’s the technical reason for “lookbehind assertion MUST be fixed length” in regex?

Lookahead and lookbehind aren’t nearly as similar as their names imply. The lookahead expression works exactly the same as it would if it were a standalone regex, except it’s anchored at the current match position and it doesn’t consume what it matches. Lookbehind is a whole different story. Starting at the current match position, it …
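
The asymmetry is easy to observe in an engine that enforces the restriction. The sketch below uses Python's standard `re` module as one such engine; other engines may behave differently, and the patterns are made up for the demonstration.

```python
import re

# A variable-length lookahead compiles without complaint.
lookahead = re.compile(r"foo(?=\d+bar)")

# A fixed-length lookbehind is accepted as well.
fixed = re.compile(r"(?<=\d{3})bar")

# A variable-length lookbehind is rejected when the pattern is compiled.
try:
    re.compile(r"(?<=\d+)bar")
    rejected = False
except re.error as exc:
    rejected = True
    print(exc)

lookahead_hit = lookahead.search("foo42bar")
fixed_hit = fixed.search("123bar")
```

The lookahead match covers only `foo`, since the text it inspects is not consumed, while the lookbehind pattern with `\d+` never gets as far as matching.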

Is it possible to define a pattern and reuse it to capture multiple groups?

To reuse a pattern, you could use (?n), where n is the number of the group to repeat. For example, your actual pattern:
    (PAT),(PAT), … ,(PAT)
can be replaced by:
    (PAT),(?1), … ,(?1)
(?1) is the same pattern as (PAT), whatever PAT is. You may have multiple patterns:
    (PAT1),(PAT2),(PAT1),(PAT2),(PAT1),(PAT2),(PAT1),(PAT2)
may be reduced to:
    (PAT1),(PAT2),(?1),(?2),(?1),(?2),(?1),(?2)
or: …
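
In engines without the (?n) syntax, such as Python's standard `re` module, a similar effect can be approximated by defining the sub-pattern once and assembling the full pattern as a string. This is a different technique from (?n), shown only as a sketch. `PAT` stands in for whatever the real sub-pattern is, and here every copy is its own capture group.

```python
import re

PAT = r"\d+"  # hypothetical sub-pattern, defined once

# Build "(PAT),(PAT),(PAT)" without writing PAT out three times.
pattern = re.compile(",".join(f"({PAT})" for _ in range(3)))

match = pattern.fullmatch("10,20,30")
groups = match.groups()
print(groups)
```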

Match a^n b^n c^n (e.g. “aaabbbccc”) using regular expressions (PCRE)

Inspired by NullUserException’s answer (which he already deleted as it failed for one case), I think I have found a solution myself:
    $regex = '~^ (?=(a(?-1)?b)c) a+(b(?-1)?c) $~x';
    var_dump(preg_match($regex, 'aabbcc'));    // 1
    var_dump(preg_match($regex, 'aaabbbccc')); // 1
    var_dump(preg_match($regex, 'aaabbbcc'));  // 0
    var_dump(preg_match($regex, 'aaaccc'));    // 0
    var_dump(preg_match($regex, 'aabcc'));     // 0
    var_dump(preg_match($regex, 'abbcc'));     // 0
Try it yourself: …