Text - ASCII vs. ANSI

ASCII and ANSI refer to character encodings.

A character encoding defines a mapping between a code point (an integer number) and a character.
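
As a small illustration of that mapping, here is a minimal Python sketch. The helper names code_points and from_code_points are made up for this example; ord and chr are Python built-ins that convert between a character and its code point.

```python
def code_points(text):
    """Return the integer code point of each character in text."""
    return [ord(ch) for ch in text]


def from_code_points(points):
    """Rebuild a string from a list of integer code points."""
    return "".join(chr(p) for p in points)


points = code_points("Hi!")
print(points)                    # [72, 105, 33]
print(from_code_points(points))  # Hi!
```

The mapping works in both directions: the same table that turns 'H' into 72 turns 72 back into 'H'.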


ASCII is a 7-bit character encoding. 7 bits can encode 128 different characters.
ANSI is a term used by Microsoft to mean an 8-bit string (based on ASCII).
Or, more specifically, an 8-bit string (based on ASCII) encoded with the current codepage. 8 bits can encode 256 different characters.
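
To make the 7-bit versus 8-bit arithmetic concrete, here is a rough Python sketch. The function fits_in_ascii is a hypothetical helper written for this example; it relies on Python's built-in "ascii" codec, which rejects any character above code point 127.

```python
# 7 bits give 2**7 distinct values (0 through 127);
# 8 bits give 2**8 distinct values (0 through 255).
ASCII_SIZE = 2 ** 7
EIGHT_BIT_SIZE = 2 ** 8


def fits_in_ascii(text):
    """Return True if every character in text is a 7-bit ASCII character."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(ASCII_SIZE)                   # 128
print(EIGHT_BIT_SIZE)               # 256
print(fits_in_ascii("hello"))       # True
print(fits_in_ascii("caf\u00e9"))   # False, U+00E9 is code point 233
```

The extra 128 values that an 8-bit string gains over ASCII are exactly the ones whose meaning depends on the codepage.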

So, technically, ANSI isn't a single character encoding.

ANSI is a generic term for an 8-bit character encoding that is based on ASCII.
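
The following sketch shows why the codepage matters: the same 8-bit string decodes to different text under different codepages. It assumes Python's codec names cp1252 (Western European), cp1251 (Cyrillic) and cp1253 (Greek), picked here purely as examples; decode_with is a hypothetical helper.

```python
def decode_with(raw, codepage):
    """Interpret an 8-bit string using the given codepage."""
    return raw.decode(codepage)


# Four bytes: 'c', 'a', 'f' are ASCII; 0xE9 is above 127, so its
# meaning depends on the codepage.
raw = b"caf\xe9"

for codepage in ("cp1252", "cp1251", "cp1253"):
    print(codepage, decode_with(raw, codepage))

# cp1252 gives "caf" + U+00E9 (Latin small e with acute)
# cp1251 gives "caf" + U+0439 (Cyrillic small short i)
# cp1253 gives "caf" + U+03B9 (Greek small iota)
```

The first three bytes come out the same every time because all three codepages agree with ASCII for values below 128.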

Let's look at a Windows API function to see how ANSI strings are used.

MessageBox(string text, string caption)

Before Windows supported Unicode, there was only one version of MessageBox: the version above, which takes ANSI (8-bit) strings. Since we don't pass a codepage to the function, Windows has to assume that the strings 'text' and 'caption' are encoded with the current codepage. This is a sensible default, since it spares every call from having to name an encoding.

When Microsoft added support for Unicode, every function that takes a string was split in two. We got an ANSI version (suffix A) for backwards compatibility, and a wide version (suffix W) for Unicode.

MessageBoxA(string text, string caption)      (8-bit ANSI strings)
MessageBoxW(wstring text, wstring caption)    (16-bit wide strings)
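
To see what the two versions actually receive, here is a hedged Python sketch of the bytes behind each kind of string. The helpers to_ansi_bytes and to_wide_bytes are hypothetical, and cp1252 is assumed as the current codepage for the sake of the example. Wide strings on Windows are UTF-16, and both kinds of string end with a null terminator.

```python
def to_ansi_bytes(text, codepage="cp1252"):
    """Encode text as a null-terminated 8-bit string in the given codepage."""
    return text.encode(codepage) + b"\x00"


def to_wide_bytes(text):
    """Encode text as a null-terminated UTF-16 (little-endian) string."""
    return text.encode("utf-16-le") + b"\x00\x00"


text = "H\u00e9"  # 'H' followed by U+00E9

print(to_ansi_bytes(text))  # b'H\xe9\x00'
print(to_wide_bytes(text))  # b'H\x00\xe9\x00\x00\x00'
```

The ANSI form can only carry characters that exist in the chosen codepage. In this sketch, to_ansi_bytes("\u0439") raises UnicodeEncodeError because that Cyrillic letter has no slot in cp1252, while to_wide_bytes handles it without trouble.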

The Windows API contains thousands of functions like MessageBox, which is why, within Microsoft documentation, the term ANSI became a synonym for an 8-bit string.


© Richard McGrath