How can I get the Unicode code point(s) of a Character?

From what I can gather from the documentation, Swift wants you to get Character values from a String because the String provides context: is the Character encoded as UTF-8, UTF-16, or 21-bit code points (Unicode scalars)?

If you look at how Character is defined in the Swift standard library, it is actually an enum. This is probably because of the different representations exposed by String.utf8, String.utf16, and String.unicodeScalars.

It seems you are not expected to work with Character values directly but with Strings, and you as the programmer decide which view of the String to read, so the encoding is preserved.
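To make the three views concrete, here is a minimal sketch that reads one String in each of them. It assumes a recent Swift toolchain (Swift 5 or later); the constant names are only illustrative.

```swift
// One visible character, "é", written as "e" followed by the
// combining acute accent U+0301.
let text = "e\u{301}"

let characterCount = text.count                          // Characters (grapheme clusters)
let scalarValues = text.unicodeScalars.map { $0.value }  // 21-bit Unicode scalars
let utf16Units = Array(text.utf16)                       // 16-bit code units
let utf8Units = Array(text.utf8)                         // 8-bit code units

print(characterCount) // 1
print(scalarValues)   // [101, 769]
print(utf16Units)     // [101, 769]
print(utf8Units)      // [101, 204, 129]
```

The same single Character shows up as two scalars, two UTF-16 code units, or three UTF-8 code units, depending on which view you ask for.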

That said, if you need to get the code points in a concise manner, I would recommend an extension like this:

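extension Character
{
    // Returns the value of the first Unicode scalar in this Character.
    // A Character built from several scalars (for example a base letter
    // plus a combining mark) has more; they are ignored here.
    func unicodeScalarCodePoint() -> UInt32
    {
        let characterString = String(self)
        let scalars = characterString.unicodeScalars

        return scalars[scalars.startIndex].value
    }
}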

Then you can use it like so:

let char: Character = "A"
char.unicodeScalarCodePoint() // 65
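Note that the extension above returns only the first scalar, while a single Character can be made of several. If you need all of them, a variant along these lines might work. This is a sketch assuming a recent Swift toolchain; unicodeScalarCodePoints() is a hypothetical name, not part of the standard library.

```swift
extension Character
{
    // Hypothetical helper: the values of all Unicode scalars
    // that make up this Character, in order.
    func unicodeScalarCodePoints() -> [UInt32]
    {
        return String(self).unicodeScalars.map { $0.value }
    }
}

let plain: Character = "A"
let accented: Character = "e\u{301}" // "e" + combining acute accent

print(plain.unicodeScalarCodePoints())    // [65]
print(accented.unicodeScalarCodePoints()) // [101, 769]
```

Returning an array keeps the single-scalar case simple (one element) without silently dropping the rest of a composed character.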

In summary, string and character encoding is tricky once you factor in all the possibilities. Swift went with this scheme so that each of those representations can be expressed.

Also remember that this is a 1.0 release; I’m sure Swift’s syntactic sugar will be expanded soon.
