How can you strip non-ASCII characters from a string? (in C#)
string s = "søme string"; s = Regex.Replace(s, @"[^\u0000-\u007F]+", string.Empty);
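For comparison, the same character class can be sketched in Python. The function name strip_non_ascii is hypothetical and only used for illustration here; it is not part of the original answer.

```python
import re

# Matches runs of characters outside the 7-bit ASCII range U+0000..U+007F,
# mirroring the character class used in the C# call above.
_NON_ASCII = re.compile(r"[^\x00-\x7F]+")


def strip_non_ascii(text: str) -> str:
    """Return text with every non-ASCII character removed."""
    return _NON_ASCII.sub("", text)


print(strip_non_ascii("søme string"))  # prints "sme string"
```

Note that the offending characters are dropped, not transliterated, so "ø" disappears instead of becoming "o".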
7-bit ASCII? If your Tardis just landed in 1963, and you just want the 7-bit printable ASCII chars, you can rip out everything from 0-31 and 127-255 with this: $string = preg_replace('/[\x00-\x1F\x7F-\xFF]/', '', $string); It matches anything in the ranges 0-31 and 127-255 and removes it. 8-bit extended ASCII? You fell into a Hot …
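A rough Python counterpart of the byte-level pattern above, as a sketch only. It assumes the input is first encoded as UTF-8 so that the regex sees bytes, as the PHP call does; printable_ascii_only is a made-up name.

```python
import re

# Control bytes 0-31 plus everything from 127 to 255, as in the PHP pattern.
_UNWANTED = re.compile(rb"[\x00-\x1F\x7F-\xFF]")


def printable_ascii_only(text: str) -> str:
    """Keep only the bytes 32..126 of the UTF-8 encoding of text."""
    raw = text.encode("utf-8")          # assumption: UTF-8 input
    cleaned = _UNWANTED.sub(b"", raw)   # drop control and high bytes
    return cleaned.decode("ascii")      # only printable ASCII remains


print(repr(printable_ascii_only("café\tbar\n")))  # prints 'cafbar'
```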
Your ''.join() expression is filtering, removing anything non-ASCII; you could use a conditional expression instead: return ''.join([i if ord(i) < 128 else ' ' for i in text]) This handles characters one by one and would still use one space per character replaced. Your regular expression should just replace consecutive non-ASCII characters with a space: …
From here: The function ord() gets the int value of the char. And in case you want to convert back after playing with the number, function chr() does the trick. >>> ord('a') 97 >>> chr(97) 'a' >>> chr(ord('a') + 3) 'd' >>> In Python 2, there was also the unichr function, returning the Unicode character …
ANSI encoding is a slightly generic term used to refer to the standard code page on a system, usually Windows. It is more properly referred to as Windows-1252 on Western/U.S. systems. (It can represent certain other Windows code pages on other systems.) This is essentially an extension of the ASCII character set in that it …
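As a small illustration of the "extension of ASCII" point, here is a sketch that assumes Python's built-in cp1252 codec as the stand-in for Windows-1252:

```python
# ASCII-range text encodes to identical bytes in ASCII and Windows-1252.
ascii_bytes = "hello".encode("ascii")
cp1252_bytes = "hello".encode("cp1252")
print(ascii_bytes == cp1252_bytes)  # prints True

# Characters beyond ASCII use the upper half of the byte range (0x80-0xFF).
print("é".encode("cp1252"))  # prints b'\xe9'
print("€".encode("cp1252"))  # prints b'\x80'
```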
For ASCII characters in the range [ -~] on Python 2: >>> import binascii >>> bin(int(binascii.hexlify('hello'), 16)) '0b110100001100101011011000110110001101111' In reverse: >>> n = int('0b110100001100101011011000110110001101111', 2) >>> binascii.unhexlify('%x' % n) 'hello' In Python 3.2+: >>> bin(int.from_bytes('hello'.encode(), 'big')) '0b110100001100101011011000110110001101111' In reverse: >>> n = int('0b110100001100101011011000110110001101111', 2) >>> n.to_bytes((n.bit_length() + 7) // 8, 'big').decode() 'hello' To support all …
When it comes to using ASCII art with the grid-template-areas property, there is an important limitation currently in place: Named grid areas must be rectangular. In other words, tetris-shaped grid areas of the same name are not allowed. This behavior is defined in two parts of the spec. 7.3. Named Areas: the grid-template-areas property If …
My favorite way to read a small file is to use a BufferedReader and a StringBuilder. It is very simple and to the point (though not particularly efficient, but good enough for most cases): BufferedReader br = new BufferedReader(new FileReader("file.txt")); try { StringBuilder sb = new StringBuilder(); String line = br.readLine(); while (line != null) …
A simple implementation of the Caesar Cipher is to use a string of valid characters and the remainder operator, %. char Encrypt_Via_Caesar_Cipher(char letter, unsigned int shift) { static const std::string vocabulary = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"; const std::string::size_type position = vocabulary.find(letter); char c = letter; if (position != std::string::npos) { const std::string::size_type length = vocabulary.length(); c = vocabulary[(position …
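The vocabulary-plus-remainder idea can be sketched in Python as follows. Because the C++ excerpt is cut off, the wrap-around index (position + shift) % length is an assumption about how it continues, and the names caesar_encrypt_char and caesar_encrypt are hypothetical.

```python
VOCABULARY = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar_encrypt_char(letter: str, shift: int) -> str:
    """Shift one character within VOCABULARY; leave unknown characters alone."""
    position = VOCABULARY.find(letter)
    if position == -1:
        return letter  # not in the vocabulary, pass through unchanged
    # Assumed wrap-around step: the remainder keeps the index in range.
    return VOCABULARY[(position + shift) % len(VOCABULARY)]


def caesar_encrypt(text: str, shift: int) -> str:
    """Apply the per-character shift to a whole string."""
    return "".join(caesar_encrypt_char(c, shift) for c in text)


print(caesar_encrypt("XYZ", 3))   # prints "012"
print(caesar_encrypt("A1 b", 1))  # prints "B2 b"
```

Decryption under the same assumption is a call with the negated shift, since Python's % always returns a non-negative result for a positive modulus.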