According to the HTML specifications, browsers are required to canonicalize line breaks in user input to CR LF (\r\n), and I don’t think any browser gets this wrong. Reference: clause 17.13.4, Form content types, in the HTML 4.01 spec.
In HTML5 drafts, the situation is more complicated, since they also deal with the processes inside a browser, not just the data that gets sent to a server-side form handler when the form is submitted. According to them (and browser practice), the textarea element value exists in three variants:
- the raw value as entered by the user, unnormalized; it may contain CR, LF, or CR LF pairs;
- the internal value, called “API value”, where line breaks are normalized to LF (only);
- the submission value, where line breaks are normalized to CR LF pairs, as per Internet conventions.
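
The relationship between the three variants can be sketched in a few lines of Python. This is only an illustration of the normalization rules listed above, not how any browser implements them; the function names `to_api_value` and `to_submission_value` are made up for this example.

```python
def to_api_value(raw: str) -> str:
    """Normalize line breaks to LF only (the "API value").

    CR LF pairs are replaced first so that a pair becomes a single LF
    rather than two; any remaining lone CR is then replaced as well.
    """
    return raw.replace("\r\n", "\n").replace("\r", "\n")


def to_submission_value(raw: str) -> str:
    """Normalize line breaks to CR LF pairs (the "submission value").

    Going through the LF-only form first avoids turning an existing
    CR LF pair into CR CR LF.
    """
    return to_api_value(raw).replace("\n", "\r\n")


# A raw value mixing all three line break styles: CR, LF, and CR LF.
raw = "one\rtwo\nthree\r\nfour"

print(repr(to_api_value(raw)))         # 'one\ntwo\nthree\nfour'
print(repr(to_submission_value(raw)))  # 'one\r\ntwo\r\nthree\r\nfour'
```

A server-side form handler that wants LF-only text could apply something like `to_api_value` to the submitted field, since what arrives over the wire should be the CR LF variant.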