regex - w3toppers.com

Regex how to match an optional character

Use [A-Z]? to make the letter optional; {1} is redundant. (You could also write [A-Z]{0,1}, which means the same thing, but that’s what the ? is there for.) You could improve your regex to

^([0-9]{5})+\s+([A-Z]?)\s+([A-Z])([0-9]{3})([0-9]{3})([A-Z]{3})([A-Z]{3})\s+([A-Z])[0-9]{3}([0-9]{4})([0-9]{2})([0-9]{2})

And since \d is the same as [0-9] in most regex dialects:

^(\d{5})+\s+([A-Z]?)\s+([A-Z])(\d{3})(\d{3})([A-Z]{3})([A-Z]{3})\s+([A-Z])\d{3}(\d{4})(\d{2})(\d{2})

But: do you really need … Read more
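As a quick check of the claim that ? and {0,1} mean the same thing, here is a minimal Python sketch. The short pattern (three digits followed by an optional capital letter) and the sample strings are made up for illustration; they are not the regex from the question.

```python
import re

# Hypothetical pattern for illustration: three digits, then an optional capital letter.
with_question = re.compile(r"^\d{3}[A-Z]?$")
with_braces = re.compile(r"^\d{3}[A-Z]{0,1}$")

samples = ["123", "123A", "123AB", "12A"]

# For each sample, record whether each spelling of the quantifier matches.
results = {
    s: (bool(with_question.match(s)), bool(with_braces.match(s)))
    for s in samples
}

for sample, outcome in results.items():
    print(f"{sample!r}: {outcome}")
```

Both spellings accept "123" and "123A" and reject the other two samples, so the choice between them is purely a matter of readability.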

Regular expression to allow spaces between words

tl;dr Just add a space in your character class.

^[a-zA-Z0-9_ ]*$

Now, if you want to be strict…

The above isn’t exactly correct. Because * means zero or more, it would match all of the following cases that one would not usually mean to match: An empty string, "". A … Read more
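The behaviour of the tl;dr pattern can be checked with a small Python sketch. The sample strings are assumptions chosen for illustration; the pattern itself is the one quoted above.

```python
import re

# The pattern from the excerpt: letters, digits, underscores and spaces only.
pattern = re.compile(r"^[a-zA-Z0-9_ ]*$")

# Hypothetical inputs: two ordinary phrases, the empty string, and a rejected one.
candidates = ["hello world", "snake_case 42", "", "hello-world"]

matches = {text: bool(pattern.match(text)) for text in candidates}

for text, ok in matches.items():
    print(f"{text!r}: {ok}")
```

The empty string is accepted because * allows zero repetitions, which is the kind of unintended match the excerpt warns about; the hyphenated input is rejected because - is not in the character class.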

How to use regex with find command?

find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"

Note that you need to specify .*/ at the beginning because find matches the whole path.

Example:

susam@nifty:~/so$ find . -name "*.jpg"
./foo-111.jpg
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
susam@nifty:~/so$
susam@nifty:~/so$ find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg

My version of find:

$ find --version
find (GNU findutils) 4.4.2
Copyright (C) 2007 … Read more
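To see why the leading .*/ matters, the whole-path matching can be imitated in Python with re.fullmatch. This is only an illustration: Python's re module is not the sed dialect that find uses (the braces are written without backslashes here), and the path list is copied from the example output above rather than read from a real directory.

```python
import re

# Python spelling of the pattern; find's sed dialect writes the braces as \{36\}.
WITH_PREFIX = r".*/[a-f0-9\-]{36}\.jpg"
WITHOUT_PREFIX = r"[a-f0-9\-]{36}\.jpg"

# Paths as printed by `find . -name "*.jpg"` in the example session.
paths = [
    "./foo-111.jpg",
    "./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg",
    "./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg",
]

# fullmatch requires the pattern to cover the entire path, as find -regex does.
whole_path_matches = [p for p in paths if re.fullmatch(WITH_PREFIX, p)]
without_prefix_matches = [p for p in paths if re.fullmatch(WITHOUT_PREFIX, p)]

print(whole_path_matches)
print(without_prefix_matches)
```

With the prefix, the two 36-character names are selected; without it, nothing matches, because every path starts with ./ and the pattern has to account for that part too.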

How do I grep for all non-ASCII characters?

You can use the command:

grep --color="auto" -P -n "[\x80-\xFF]" file.xml

This will give you the line number, and will highlight non-ASCII characters in red. On some systems, depending on your settings, the above will not work, so you can grep by the inverse:

grep --color="auto" -P -n "[^\x00-\x7F]" file.xml

Note also, that the important … Read more
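The inverse character class can also be tried outside grep. The following Python sketch applies [^\x00-\x7F] to decoded text and reports line numbers, roughly like grep -n; the function name non_ascii_lines and the sample text are hypothetical.

```python
import re

# Anything outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

def non_ascii_lines(text):
    """Return (line_number, line) pairs for lines containing non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if NON_ASCII.search(line)
    ]

# Hypothetical sample: lines 2 and 4 contain non-ASCII characters.
sample = "plain ascii\ncaf\u00e9 au lait\nmore ascii\n\u65e5\u672c"

hits = non_ascii_lines(sample)
for number, line in hits:
    print(f"{number}:{line}")
```

Note that this works on Unicode code points in an already decoded string, whereas grep -P operates on the file's bytes or on characters depending on the locale, so the two are not guaranteed to agree on every input.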

How to use conditionals when replacing in Notepad++ via regex

The syntax in the conditional replacement is (?{GROUP_MATCHED?}REPLACEMENT_IF_YES:REPLACEMENT_IF_NO) The { and } are necessary to avoid ambiguity when you deal with groups higher than 9 and with named capture groups. Since Notepad++ uses Boost-Extended Format String Syntax, see this Boost documentation: The character ? begins a conditional expression, the general form is: ?Ntrue-expression:false-expression where N … Read more

Replace single backslash in R

One quite universal solution is

gsub("\\\\", "", str)

Thanks to the comment above.

Extract info inside all parenthesis in R

Here is an example:

> gsub("[\\(\\)]", "", regmatches(j, gregexpr("\\(.*?\\)", j))[[1]])
[1] "wonder" "groan" "Laugh"

I think this should work well:

> regmatches(j, gregexpr("(?=\\().*?(?<=\\))", j, perl=T))[[1]]
[1] "(wonder)" "(groan)" "(Laugh)"

but the result includes the parentheses… why? This works:

regmatches(j, gregexpr("(?<=\\().*?(?=\\))", j, perl=T))[[1]]

Thanks @MartinMorgan for the comment.
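The difference between the two lookaround orders can be reproduced in Python, assuming the goal is to pull out the text between parentheses. The sample sentence is hypothetical, since the contents of j are not shown.

```python
import re

# Hypothetical input; the original vector j is not shown in the excerpt.
text = "I (wonder) if they (groan) or (Laugh)"

# Lookahead first: the match starts AT "(" and must end just AFTER ")",
# so both parentheses are consumed by .*? and end up in the result.
lookahead_first = re.findall(r"(?=\().*?(?<=\))", text)

# Lookbehind first: the match starts just AFTER "(" and stops just BEFORE ")",
# so neither parenthesis is part of the result.
lookbehind_first = re.findall(r"(?<=\().*?(?=\))", text)

print(lookahead_first)
print(lookbehind_first)
```

Lookarounds are zero-width: they only constrain where the match may begin and end. In the first pattern the constraints sit outside the parentheses, which is why the parentheses are included in the output.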

Decimal number regular expression, where digit after decimal is optional

Use the following:

/^\d*\.?\d*$/

^ – Beginning of the line;
\d* – 0 or more digits;
\.? – An optional dot (escaped, because in regex, . is a special character);
\d* – 0 or more digits (the decimal part);
$ – End of the line.

This allows for .5 decimal rather than requiring the leading … Read more
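A short Python sketch shows which inputs this pattern accepts. The sample strings are assumptions picked to exercise each part of the pattern.

```python
import re

# Same pattern as above, without the surrounding / delimiters.
decimal = re.compile(r"^\d*\.?\d*$")

# Hypothetical inputs covering integers, decimals and edge cases.
samples = ["12.5", ".5", "12.", "12", "", ".", "1.2.3", "abc"]

accepted = [s for s in samples if decimal.match(s)]
rejected = [s for s in samples if not decimal.match(s)]

print("accepted:", accepted)
print("rejected:", rejected)
```

Because every part of the pattern is optional, the empty string and a lone "." are accepted as well; if those should be rejected, a stricter pattern or an extra check would be needed.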

Regex Last occurrence?

Your negative lookahead solution would e.g. be this:

\\(?:.(?!\\))+$

See it here on Regexr
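A minimal Python sketch of the negative-lookahead idea, assuming the task is to find the last backslash in a string and everything after it. The pattern \\(?:.(?!\\))+$ and the Windows-style sample path are assumptions made for this illustration.

```python
import re

# A backslash, then one or more characters none of which is followed by
# another backslash, up to the end of the string.
last_segment = re.compile(r"\\(?:.(?!\\))+$")

# Hypothetical sample path.
path = r"C:\dir\sub\file.txt"

match = last_segment.search(path)
if match:
    print(match.start(), match.group(0))
```

Earlier backslashes cannot start a successful match: the repeated group stops at the character just before the next backslash, and the $ anchor then fails. Only the final backslash lets the pattern reach the end of the string. For this particular job, path.rfind("\\") gives the same position without a regex.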

How to check that a string is a palindrome using regular expressions?

The answer to this question is that “it is impossible”. More specifically, the interviewer is wondering if you paid attention in your computational theory class. In your computational theory class you learned about finite state machines. A finite state machine is composed of nodes and edges. Each edge is annotated with a letter from a … Read more