VBA Output to file using UTF-16

Your point about UTF-8 not being able to store all characters you need is invalid.
UTF-8 is able to store every character defined in the Unicode standard.
The only difference is that, for text in certain languages, UTF-8 can take more space to store its codepoints than, say, UTF-16. The opposite is also true: for certain other languages, such as English, using UTF-8 saves space.

VB6 and VBA, although store strings in memory in Unicode, implicitly switch to ANSI (using the current system code page) when doing file IO. The resulting file you get is NOT in UTF-8. It is in your current system codepage, which, as you can discover in this helpful article, looks just like UTF-8 if you’re from USA.

Try:

Dim s As String
s = "<?xml version='1.0' encoding='utf-16'?>"
s = s & ChrW$(&H43F&) & ChrW$(&H440&) & ChrW$(&H43E&) & ChrW$(&H432&) & ChrW$(&H435&) & ChrW$(&H440&) & ChrW$(&H43A&) & ChrW$(&H430&)

Dim b() As Byte
b = s

Open "Unicode.txt" For Binary Access Write As #1
Put #1, , b
Close #1

And if you absolutely must have UTF-8, you can make yourself some:

Option Explicit

Private Declare Function WideCharToMultiByte Lib "kernel32.dll" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByRef lpMultiByteStr As Byte, ByVal cchMultiByte As Long, ByVal lpDefaultChar As String, ByRef lpUsedDefaultChar As Long) As Long

Private Const CP_UTF8 As Long = 65001
Private Const ERROR_INSUFFICIENT_BUFFER As Long = 122&


Public Function ToUTF8(s As String) As Byte()

  If Len(s) = 0 Then Exit Function


  Dim ccb As Long
  ccb = WideCharToMultiByte(CP_UTF8, 0, StrPtr(s), Len(s), ByVal 0&, 0, vbNullString, ByVal 0&)

  If ccb = 0 Then
    Err.Raise 5, , "Internal error."
  End If

  Dim b() As Byte
  ReDim b(1 To ccb)

  If WideCharToMultiByte(CP_UTF8, 0, StrPtr(s), Len(s), b(LBound(b)), ccb, vbNullString, ByVal 0&) = 0 Then
    Err.Raise 5, , "Internal error."
  Else
    ToUTF8 = b
  End If

End Function
Sub Test()
  Dim s As String
  s = "<?xml version='1.0' encoding='utf-8'?>"
  s = s & ChrW$(&H43F&) & ChrW$(&H440&) & ChrW$(&H43E&) & ChrW$(&H432&) & ChrW$(&H435&) & ChrW$(&H440&) & ChrW$(&H43A&) & ChrW$(&H430&)

  Dim b() As Byte
  b = ToUTF8(s)

  Open "utf-8.txt" For Binary Access Write As #1
  Put #1, , b
  Close #1
End Sub

Leave a Comment