How to detect invalid UTF-8 Unicode/binary in a text file
Assuming your locale is set to UTF-8 (check the output of locale), this works well for finding invalid UTF-8 sequences:

grep -axv '.*' file.txt

Explanation (from the grep man page):

-a, --text: treats the file as text; this is essential because it prevents grep from aborting once it finds an invalid (non-UTF-8) byte sequence
-v, --invert-match: inverts the output, showing lines …
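As a cross-check that does not depend on the shell locale, the same idea can be sketched in Python: split the raw bytes into lines and report every line that fails strict UTF-8 decoding. This is only a minimal sketch, not a replacement for the grep command; the function name find_invalid_utf8_lines and the sample bytes are made up for illustration.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8.

    Hypothetical helper for illustration; line numbers are 1-based.
    """
    bad = []
    # Split on newline bytes only, so each element is one raw line.
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            # Strict decoding raises UnicodeDecodeError on any invalid sequence.
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((lineno, line))
    return bad


# Made-up sample: two valid lines, one with stray bytes, one with a
# truncated two-byte sequence.
sample = b"plain ascii\nvalid: caf\xc3\xa9\nbroken: \xff\xfe here\ntruncated: \xc3\n"

for lineno, line in find_invalid_utf8_lines(sample):
    print(lineno, line)
```

To check a real file, read it in binary mode first, for example with open("file.txt", "rb"), and pass the resulting bytes to the function.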