How to extract information between two unique words in a large text file

You can use regular expressions for that.

>>> st = "alpha here is my text bravo"
>>> import re
>>> re.findall(r'alpha(.*?)bravo',st)
[' here is my text ']
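
The `?` in `(.*?)` makes the match non-greedy, which matters as soon as the markers occur more than once. A small sketch of the difference (the sample string here is made up for illustration):

```python
import re

# Hypothetical sample with two alpha...bravo pairs.
text = "alpha one bravo filler alpha two bravo"

# Non-greedy: each match stops at the first "bravo" after its "alpha".
lazy = re.findall(r'alpha(.*?)bravo', text)

# Greedy: a single match runs from the first "alpha" to the last "bravo".
greedy = re.findall(r'alpha(.*)bravo', text)

print(lazy)
print(greedy)
```

With the non-greedy pattern you get one result per pair; the greedy one swallows everything in between.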

My test.txt file:

alpha here is my line
yipee
bravo

Now use open to read the file and then apply the regular expression. The re.DOTALL flag lets . match newlines as well, so a match can span several lines.

>>> with open('test.txt', 'r') as f:
...     data = f.read()
...
>>> x = re.findall(r'alpha(.*?)bravo', data, re.DOTALL)
>>> x
[' here is my line\nyipee\n']
>>> "".join(x).replace('\n', ' ')
' here is my line yipee '
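
Since the question mentions a large file, reading everything with f.read() may not always be practical. A possible alternative, sketched below, walks the file line by line and keeps only the text currently between the markers. The helper name `extract_between` is hypothetical, and the sketch assumes that a marker word never straddles a line break.

```python
import io

def extract_between(lines, start="alpha", end="bravo"):
    """Yield each stretch of text found between start and end markers.

    `lines` can be any iterable of strings, for example an open file.
    Assumption: neither marker is split across two lines.
    """
    buffer = None  # None means we are currently outside a start/end pair
    for line in lines:
        rest = line
        while rest:
            if buffer is None:
                # Look for the opening marker; discard whatever precedes it.
                _, found, rest = rest.partition(start)
                if found:
                    buffer = []
            else:
                # Collect text until the closing marker shows up.
                head, found, rest = rest.partition(end)
                buffer.append(head)
                if found:
                    yield "".join(buffer)
                    buffer = None

# Stand-in for the test.txt file shown above.
sample = io.StringIO("alpha here is my line\nyipee\nbravo\n")
result = list(extract_between(sample))
print(result)
```

With a real file you would pass the file object itself, e.g. `with open('test.txt') as f: ...` and iterate over `extract_between(f)`, so only the text between the current pair of markers is held in memory.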
