Want to extract values from text file using regex

I’ve got a pattern that does what you want, but it isn’t pretty:

^"((?:\d\d?\d?\.){3}\d\d?\d?)" ((?:\d\d?\d?\.){3}\d\d?\d?) (-) (-) (\[\d\d\/\w+\/\d{4}(?::\d\d){3} -\d{4}\]) "(.*?)" (\d{3})

To break it down a bit (because it’s nasty):

^ anchors the match to the beginning of the string.

((?:\d\d?\d?\.){3}\d\d?\d?) will match and capture the first IP address, with each element being composed of between 1 and 3 digits. The same pattern is then used to match the second IP address as well.
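If the repeated \d\d?\d? is hard on the eyes, the bounded quantifier \d{1,3} should mean the same thing. Here is a rough sketch in Python (my choice for illustration; the sample addresses are made up, not taken from your file) that checks the two spellings agree. Note that neither form validates the range of each element, so something like 999.1.1.1 still passes.

```python
import re

# Sub-pattern from the answer, and an equivalent shorter spelling.
IP_ORIGINAL = re.compile(r'(?:\d\d?\d?\.){3}\d\d?\d?')
IP_SHORT = re.compile(r'(?:\d{1,3}\.){3}\d{1,3}')

# Made-up sample values for illustration only.
samples = ["192.168.1.10", "10.0.0.5", "999.1.1.1", "1.2.3", "1234.1.1.1"]

# Map each sample to (matches original form, matches short form).
results = {
    s: (IP_ORIGINAL.fullmatch(s) is not None, IP_SHORT.fullmatch(s) is not None)
    for s in samples
}

for s, (original_ok, short_ok) in results.items():
    print(f"{s:15} original={original_ok} short={short_ok}")
```

Both forms accept the first three samples and reject the last two, so swapping one for the other only changes readability.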

Each (-) captures one of the two hyphens; I'm not sure why you want them, but they're in your desired output.

(\[\d\d\/\w+\/\d{4}(?::\d\d){3} -\d{4}\]) captures the timestamp (the bit in the square brackets).
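One thing to watch: the -\d{4} at the end of that group only matches a negative UTC offset such as -0500. The sketch below, in Python with made-up timestamps, shows the difference and a hypothetical variant using [+-] that accepts either sign. Whether you need it depends on what your log actually contains.

```python
import re

# Timestamp sub-pattern exactly as written in the answer.
TS_ORIGINAL = re.compile(r'\[\d\d\/\w+\/\d{4}(?::\d\d){3} -\d{4}\]')
# Hypothetical variant: [+-] accepts either sign of the UTC offset.
TS_EITHER_SIGN = re.compile(r'\[\d\d\/\w+\/\d{4}(?::\d\d){3} [+-]\d{4}\]')

# Made-up timestamps for illustration only.
west = "[12/Mar/2019:10:15:32 -0500]"
east = "[12/Mar/2019:10:15:32 +0100]"

original_west = TS_ORIGINAL.fullmatch(west) is not None
original_east = TS_ORIGINAL.fullmatch(east) is not None
either_west = TS_EITHER_SIGN.fullmatch(west) is not None
either_east = TS_EITHER_SIGN.fullmatch(east) is not None

print(original_west, original_east)  # True False
print(either_west, either_east)      # True True
```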

"(.*?)" will match and capture the text string.

Finally, (\d{3}) will capture the HTTP status code.

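Taken together, the pattern matches the string you provided and captures each of the seven fields (two IP addresses, two hyphens, the timestamp, the text string and the status code) in its own group.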
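To show how the whole pattern might be applied, here is a minimal sketch in Python. The language, the helper name extract_fields and the sample line are all my assumptions, since I am only guessing at the exact layout of your file; adjust the sample to a real line before relying on it.

```python
import re

# The full pattern from above, split across two raw strings for readability.
LOG_PATTERN = re.compile(
    r'^"((?:\d\d?\d?\.){3}\d\d?\d?)" ((?:\d\d?\d?\.){3}\d\d?\d?) (-) (-) '
    r'(\[\d\d\/\w+\/\d{4}(?::\d\d){3} -\d{4}\]) "(.*?)" (\d{3})'
)

def extract_fields(line):
    """Return the seven captured fields as a tuple, or None if no match."""
    match = LOG_PATTERN.match(line)
    return match.groups() if match else None

# Hypothetical sample line, invented to fit the pattern.
sample_line = (
    '"192.168.1.10" 10.0.0.5 - - [12/Mar/2019:10:15:32 -0500] '
    '"GET /index.html HTTP/1.1" 200 1234'
)

fields = extract_fields(sample_line)
if fields is not None:
    for number, value in enumerate(fields, start=1):
        print(f"group {number}: {value}")
```

For a whole file, you would call extract_fields on each line as you iterate over it and skip any line that returns None.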
