Generating a CSV file for Excel: how to have a newline inside a value

Put space characters at the start of a field ONLY where they are part of the data. Excel does not strip leading spaces, so you will get unwanted spaces in your headings and data fields. Worse, the " that should be “protecting” the line break in the third column will be ignored because it is not at the start of the field.
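As a rough illustration of that layout, here is a minimal Python sketch that lets the standard csv module do the quoting, so the opening " always sits directly after the comma. The helper name rows_to_csv and the sample rows are made up for this example; they are not from the original question.

```python
import csv
import io

def rows_to_csv(rows):
    """Serialize rows to CSV text (hypothetical helper for illustration)."""
    # newline="" keeps the buffer from translating line endings.
    buf = io.StringIO(newline="")
    # The writer emits no padding after the delimiter and quotes any field
    # that contains a line break, so the quote starts the field.
    writer = csv.writer(buf, lineterminator="\r\n")
    writer.writerows(rows)
    return buf.getvalue()

# Made-up sample data: the third column holds a value with an embedded newline.
rows = [
    ["Name", "Age", "Notes"],
    ["Alice", 30, "first line\nsecond line"],
]
csv_text = rows_to_csv(rows)
print(repr(csv_text))
```

The point of the sketch is only the shape of the output: no space between the comma and the opening quote, and the line break kept inside the quoted field.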

If you have non-ASCII characters (encoded in UTF-8) in the file, you should have a UTF-8 BOM (3 bytes, hex EF BB BF) at the start of the file. Otherwise Excel will interpret the data according to your locale’s default encoding (e.g. cp1252) instead of UTF-8, and your non-ASCII characters will be trashed.
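One way to get that BOM from Python, sketched under the assumption that the file is written with the standard library: the "utf-8-sig" codec emits the three BOM bytes before the first character. The function name write_excel_csv, the file name and the sample rows are hypothetical.

```python
import codecs
import csv
import os
import tempfile

def write_excel_csv(path, rows):
    """Write rows as UTF-8 CSV with a leading BOM (hypothetical helper)."""
    # "utf-8-sig" prepends EF BB BF; newline="" lets the csv module
    # control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)

# Made-up sample data containing non-ASCII characters.
path = os.path.join(tempfile.mkdtemp(), "example.csv")
write_excel_csv(path, [["Name", "City"], ["Zoë", "Kraków"]])

# Read the raw bytes back to confirm the file starts with the BOM.
with open(path, "rb") as f:
    raw = f.read()
print(raw[:3] == codecs.BOM_UTF8)
```

Checking the first three bytes like this is a quick sanity test before handing the file to Excel.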

The following comments apply to Excel 2003, 2007 and 2013; they were not tested on Excel 2000.

If you open the file by double-clicking on its name in Windows Explorer, everything works OK.

If you open it from within Excel, the results vary:

  1. You have only ASCII characters in the file (and no BOM): works.
  2. You have non-ASCII characters (encoded in UTF-8) in the file, with a UTF-8 BOM at the start: Excel recognises that your data is encoded in UTF-8, but it ignores the csv extension and drops you into the Text Import not-a-Wizard, and unfortunately you end up with the line-break problem.

Options include:

  1. Train the users not to open the files from within Excel 🙁
  2. Consider writing an XLS file directly; there are packages/libraries available for doing that in Python, Perl, PHP, .NET, etc.
