fasta
Remove line breaks in a FASTA file
This awk program:

    % awk '!/^>/ { printf "%s", $0; n = "\n" } /^>/ { print n $0; n = "" } END { printf "%s", n }' input.fasta

Will yield:

    >accession1
    ATGGCCCATGGGATCCTAGC
    >accession2
    GATATCCATGAAACGGCTTA

Explanation: On lines that don't start with a >, print the line without a line break and store …
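For readers who prefer Python, the same unwrapping idea can be sketched roughly as follows. This is not part of the original answer: the function name `unwrap_fasta` is hypothetical, and the wrapped input is an assumed example, chosen so that joining it reproduces the output shown for the awk program.

```python
import io


def unwrap_fasta(lines):
    """Yield header lines unchanged and each record's sequence joined onto one line."""
    seq_parts = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if it had any sequence.
            if seq_parts:
                yield "".join(seq_parts)
                seq_parts = []
            yield line
        else:
            # Sequence line: collect it without its line break.
            seq_parts.append(line)
    # Flush the sequence of the last record.
    if seq_parts:
        yield "".join(seq_parts)


# Assumed input: two records wrapped at 10 characters per line.
wrapped = io.StringIO(
    ">accession1\nATGGCCCATG\nGGATCCTAGC\n"
    ">accession2\nGATATCCATG\nAAACGGCTTA\n"
)
unwrapped = list(unwrap_fasta(wrapped))
print("\n".join(unwrapped))
```

Collecting the pieces in a list and joining once per record avoids repeated string concatenation on long sequences.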
Parsing a FASTA file using a generator (Python)
Have you considered using BioPython? It has a sequence reader that can read FASTA files. And if you are interested in coding one yourself, you can take a look at BioPython's code.

Edit: Code added

    def read_fasta(fp):
        name, seq = None, []
        for line in fp:
            line = line.rstrip()
            if line.startswith(">"):
                if name: yield (name, …
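Since the excerpt is cut off, here is a minimal, self-contained sketch of how a generator-based FASTA reader along these lines might look. It is an illustration under assumptions, not the answer's actual code: the name `iter_fasta`, the choice to strip the leading ">" from the header, and the sample data are all hypothetical.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence) tuples from an open FASTA file handle."""
    name, seq = None, []
    for line in handle:
        line = line.rstrip()
        if line.startswith(">"):
            # A new header: emit the record collected so far, if any.
            if name is not None:
                yield name, "".join(seq)
            # Assumption: drop the leading ">" from the header text.
            name, seq = line[1:], []
        elif line:
            # Non-empty sequence line: collect it for the current record.
            seq.append(line)
    # Emit the final record once the input is exhausted.
    if name is not None:
        yield name, "".join(seq)


# Made-up sample data standing in for a real file.
example = io.StringIO(">seq1 demo\nACGT\nTTGA\n>seq2\nGGCC\n")
records = list(iter_fasta(example))
for rec_name, rec_seq in records:
    print(rec_name, rec_seq)
```

Because the function is a generator, records are produced one at a time, so a large file does not need to fit in memory. With a real file, the handle would come from something like `open("input.fasta")` instead of `io.StringIO`.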