UTF-8 file output in R

The problem is due to some R-Windows special behaviour (using the default system coding / or using some system write functions; I do not know the specifics but the behaviour is actually known)

To write text UTF8 encoding on Windows one has to use the useBytes=T options in functions like writeLines or readLines:

txt <- "在"
writeLines(txt, "test.txt", useBytes=T)

readLines("test.txt", encoding="UTF-8")
[1] "在"

Find here a really well written article by Kevin Ushey: http://kevinushey.github.io/blog/2018/02/21/string-encoding-and-r/ going into much more detail.

Leave a Comment