Remove Characters from URL with htaccess

The URI is URL-decoded before it’s sent through the rewrite engine, so you want to match the actual characters (a literal comma, for example) and not their percent-encoded counterparts (%2C):

RewriteRule ^(.*),(.*)$ /$1$2 [L]
RewriteRule ^(.*):(.*)$ /$1$2 [L]
RewriteRule ^(.*)\'(.*)$ /$1$2 [L]
RewriteRule ^(.*)\"(.*)$ /$1$2 [L]
# etc...

RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]

The REDIRECT_STATUS check lets mod_rewrite know whether any of the above rules got applied (an internal rewrite sets the redirect status value to 200). If one did, we need to redirect, but that rule isn’t reached until the URI has cleared all of the special character checks.

You’d want all of these rules placed before any of the redirects, so that they can loop and remove multiple instances of any of those characters. Then, once there are no more special characters, the rewrite engine can trickle down to where your redirects are.
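
To make the looping concrete, here is a rough trace of a made-up request for /foo,bar:baz through the comma and colon rules. It assumes the rules live in a per-directory .htaccess file and that no other rules interfere; the path is purely illustrative.

```apache
# Hypothetical request: /foo,bar:baz
#
# Pass 1: the comma rule matches  -> internal rewrite to /foobar:baz
#         ([L] ends this pass; the rule set runs again on the new URI)
# Pass 2: no comma left, the colon rule matches -> internal rewrite to /foobarbaz
# Pass 3: no special characters left, and REDIRECT_STATUS is now 200
#         -> 301 redirect to http://www.mysite.com/foobarbaz
RewriteRule ^(.*),(.*)$ /$1$2 [L]
RewriteRule ^(.*):(.*)$ /$1$2 [L]

RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]
```

Because (.*) is greedy, each pass removes the last remaining instance of a character, so a URI with several commas takes one pass per comma. Apache also caps the number of internal redirects (LimitInternalRecursion, 10 by default), so a URI with many such characters could hit that limit.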

I’d suggest that you remove the mod_alias RedirectMatch directive and replace it with a rewrite rule. Combining the two modules and having both of them affect a single URI can sometimes lead to unexpected results. So before all of the above rules, you’d have:

RewriteRule ^Shop/(.*)$ /$1 [L]

This adds the removal of /Shop/ to the chain of special character rules. Your last rule would then follow:

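# Requires "RewriteMap lc int:tolower" in the server or virtual host config
# (RewriteMap cannot be declared in .htaccess)
RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]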
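
Putting the pieces in the order described, the whole file might look roughly like the sketch below. The RewriteEngine line and the comments are additions for illustration, and only two of the special character rules are shown. It also assumes a map named lc has been defined with RewriteMap lc int:tolower in the server or virtual host configuration.

```apache
RewriteEngine On

# 1. Strip the /Shop/ prefix (replaces the mod_alias RedirectMatch)
RewriteRule ^Shop/(.*)$ /$1 [L]

# 2. Strip special characters, one instance per pass
RewriteRule ^(.*),(.*)$ /$1$2 [L]
RewriteRule ^(.*):(.*)$ /$1$2 [L]
# etc...

# 3. Nothing above matched on this pass; redirect if an earlier pass rewrote the URI
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]

# 4. Lowercase any remaining uppercase URI, skipping static assets
RewriteCond %{REQUEST_URI} [A-Z]
RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]
```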
