Update for Swift 4 (Xcode 9)
As of Swift 4, an “emoji sequence” is treated as a single grapheme
cluster (according to the Unicode 9 standard):
    let s = "a👍🏼b👨‍❤️‍💋‍👨"
    print(s.count) // 4
so the other workarounds are not needed anymore.
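To make the difference between the string's views visible, here is a minimal sketch that spells the example string with Unicode escapes and compares the counts. It assumes a Swift 5 toolchain with an up-to-date Unicode implementation; the constant name sample is only illustrative.

```swift
// The example string, written with escapes so that every scalar is visible:
// "a" + thumbs up + skin tone modifier + "b" + zero-width-joiner "kiss" sequence
let sample = "a\u{1F44D}\u{1F3FC}b\u{1F468}\u{200D}\u{2764}\u{FE0F}\u{200D}\u{1F48B}\u{200D}\u{1F468}"

print(sample.count)                 // extended grapheme clusters: 4
print(sample.unicodeScalars.count)  // Unicode scalars: 12
print(sample.utf16.count)           // UTF-16 code units: 17
```

Only count reflects what a user perceives as characters; the other two views expose the underlying encoding.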
(Old answer for Swift 3 and earlier:)
A possible option is to enumerate and count the
“composed character sequences” in the string:
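    import Foundation  // needed for enumerateSubstringsInRange

    let s = "a👍🏼b👨‍❤️‍💋‍👨"
    var count = 0
    // Swift 2 syntax: the closure is called once per composed character sequence
    s.enumerateSubstringsInRange(s.startIndex..<s.endIndex,
                                 options: .ByComposedCharacterSequences) {
        (char, _, _, _) in
        if char != nil {
            count += 1
        }
    }
    print(count) // 4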
Another option is to find the range of the composed character
sequences at a given index:
    let s = "👨‍❤️‍💋‍👨"
    if s.rangeOfComposedCharacterSequenceAtIndex(s.startIndex) == s.characters.indices {
        print("This is a single composed character")
    }
As String extension properties:
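    // Swift 2.2:
    import Foundation  // provides the NSString-based methods used below

    extension String {
        // Number of composed character sequences in the string
        var composedCharacterCount: Int {
            var count = 0
            enumerateSubstringsInRange(characters.indices, options: .ByComposedCharacterSequences) {
                (_, _, _, _) in count += 1
            }
            return count
        }
        // True if the first composed character sequence spans the whole string
        var isSingleComposedCharacter: Bool {
            return rangeOfComposedCharacterSequenceAtIndex(startIndex) == characters.indices
        }
    }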
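    // Swift 3:
    import Foundation  // provides the NSString-based methods used below

    extension String {
        // Number of composed character sequences in the string
        var composedCharacterCount: Int {
            var count = 0
            enumerateSubstrings(in: startIndex..<endIndex, options: .byComposedCharacterSequences) {
                (_, _, _, _) in count += 1
            }
            return count
        }
        // True if the first composed character sequence spans the whole string
        var isSingleComposedCharacter: Bool {
            return rangeOfComposedCharacterSequence(at: startIndex) == startIndex..<endIndex
        }
    }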
Examples:
    "👍🏼".composedCharacterCount // 1
    "👍🏼".characters.count // 2
    "👨‍❤️‍💋‍👨".composedCharacterCount // 1
    "👨‍❤️‍💋‍👨".characters.count // 4
    "🇩🇪🇨🇦".composedCharacterCount // 2
    "🇩🇪🇨🇦".characters.count // 1
As you can see, in Swift 3 and earlier the number of Swift characters (extended grapheme clusters)
can be greater or smaller than the number of composed character sequences.
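For comparison, on Swift 4 and later both extension properties reduce to the plain count property. This is a minimal sketch, assuming a Swift 5 toolchain with an up-to-date Unicode implementation; isSingleCharacter is a hypothetical helper name, not part of the standard library, and the three example strings are written with Unicode escapes so that their scalars are visible.

```swift
extension String {
    // Hypothetical helper: true if the string consists of exactly one
    // extended grapheme cluster (one user-perceived character).
    var isSingleCharacter: Bool {
        return count == 1
    }
}

let thumbsUp = "\u{1F44D}\u{1F3FC}"  // thumbs up + skin tone modifier
let kiss = "\u{1F468}\u{200D}\u{2764}\u{FE0F}\u{200D}\u{1F48B}\u{200D}\u{1F468}"  // zero-width-joiner sequence
let flags = "\u{1F1E9}\u{1F1EA}\u{1F1E8}\u{1F1E6}"  // two flags: DE and CA

print(thumbsUp.count, thumbsUp.isSingleCharacter)  // 1 true
print(kiss.count, kiss.isSingleCharacter)          // 1 true
print(flags.count, flags.isSingleCharacter)        // 2 false
```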