parsing a fasta file using a generator ( python )

Have you considered using BioPython. They have a sequence reader that can read fasta files. And if you are interested in coding one yourself, you can take a look at BioPython’s code.

Edit: Code added

def read_fasta(fp):
    name, seq = None, []
    for line in fp:
        line = line.rstrip()
        if line.startswith(">"):
            if name: yield (name, ''.join(seq))
            name, seq = line, []
        else:
            seq.append(line)
    if name: yield (name, ''.join(seq))

with open('f.fasta') as fp:
    for name, seq in read_fasta(fp):
        print(name, seq)

Leave a Comment