Reality → Tech → Computers → Digitization
A bit (binary digit) is the smallest possible unit of information, with just one of two possible states (e.g., 'off' or 'on', i.e., '0' or '1'), while a byte is commonly defined as a group of 8 bits [1]. Computer performance increases vastly with the number of bits processed in parallel [2]. Any information, regardless of the medium, can be encoded by bits. Text (including numbers and symbols) can be encoded with the ASCII or Unicode standards [3], audio by sampling sound waves [4], and images and video by rasterization [5]. The digitized data is then amenable to electronic manipulation by a set of machine-specific instructions that, like data, are also encoded in bits. The binary numeral system lends itself to easy arithmetic operations and their automation. Negative numbers can be represented by the two's complement method without the need to code a minus sign, and very large or very small numbers can be expressed in floating-point (scientific, or exponential) notation. Boolean algebra can be realized electronically in logic gates [6].
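As a concrete illustration of two's complement, the short Python sketch below (the function name and the 8-bit word size are illustrative choices, not taken from the text) prints the bit patterns of 5 and -5 and shows that adding them wraps around to zero, so no separate minus sign is needed.

```python
# Minimal sketch: two's-complement encoding in an 8-bit word (illustrative choice).
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the bit pattern of a signed integer in two's complement."""
    return format(value & (2**bits - 1), f"0{bits}b")

print(to_twos_complement(5))     # 00000101
print(to_twos_complement(-5))    # 11111011

# Their sum, kept within 8 bits, is zero: 5 + (-5) = 0 with no sign handling.
a = int(to_twos_complement(5), 2)
b = int(to_twos_complement(-5), 2)
print(format((a + b) & 0xFF, "08b"))  # 00000000
```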
The 8-bit byte (an octet) has a strong historical and practical background. The highly successful, pathbreaking IBM System/360 mainframe of 1964 was the first major computer family with byte-addressable memory based on 8-bit bytes, and the trendsetting 8-bit Intel 8080 of 1974 became one of the most widely adopted early mass-produced microprocessors. The complete set of ASCII characters can be encoded with 7 bits, leaving the 8th bit available for parity checking or extended character sets. While modern computers are 64-bit machines, 8-bit microprocessors continue to be used as embedded systems in monitors, keyboards, and mice. The 8-bit byte is also used as a unit to describe memory and storage capacities, which are measured in orders of magnitude of bytes (B), e.g., MB, GB, or TB.
In binary code, the capacity to represent data rises dramatically with the number of bits used in unison; e.g., the number of different sequences of '0' and '1' available for encoding is 256 for 8 bits, 65,536 for 16 bits, about 4 billion for 32 bits, and about 18 million trillion for 64 bits (see Sheet).
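These figures are simply powers of two; a short Python loop (a minimal sketch, not from the text) reproduces them exactly:

```python
# Number of distinct bit patterns for common word sizes is 2 ** n.
for n in (8, 16, 32, 64):
    print(f"{n:2d} bits -> {2**n:,} distinct values")
#  8 bits -> 256
# 16 bits -> 65,536
# 32 bits -> 4,294,967,296                 (about 4 billion)
# 64 bits -> 18,446,744,073,709,551,616    (about 18 million trillion)
```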
ASCII encodes 128 characters consisting of 10 digits (0-9), 52 letters (lower- and upper-case Latin alphabet), 32 punctuation marks and symbols, 33 control characters, and 1 blank space. The 128 characters can be encoded with 7 binary bits (which yield 2^7 = 128 values). In hexadecimal (base 16) notation, any of the 128 characters can be written with 2 hexadecimal digits (which yield 16^2 = 256 values). The symbols used to describe the 16 values of a hexadecimal digit consist of the numbers 0-9 followed by 6 letters (A-F). Unicode, with a capacity to encode more than 1 million characters, provides standards for the encoding of all scripts of the world (with particular challenges in the representation of Chinese ideographs and in input methods).
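The mapping from characters to 7-bit codes, and the fact that any such code fits into two hexadecimal digits, can be checked with a short Python sketch (the sample characters are an arbitrary choice):

```python
# Each ASCII character maps to a 7-bit code, shown here in decimal,
# 7-bit binary, and two hexadecimal digits.
for ch in ("A", "a", "0", " "):
    code = ord(ch)                 # ASCII code point of the character
    print(repr(ch), code, format(code, "07b"), format(code, "02X"))
# 'A' 65 1000001 41
# 'a' 97 1100001 61
# '0' 48 0110000 30
# ' ' 32 0100000 20
```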
An analog sound signal can be digitized by sampling its amplitude at regular intervals not longer than half the period of its shortest-period (highest-frequency) constituent wave. Since the human ear cannot hear frequencies higher than about 20 kHz, a common sampling rate for high-quality recording has been set at 48 kHz.
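A minimal sketch of this sampling-and-quantization step, assuming a 1 kHz test tone and 16-bit samples (both illustrative choices not taken from the text):

```python
import math

SAMPLE_RATE = 48_000                 # samples per second (48 kHz, as above)
TONE_HZ = 1_000                      # assumed test-tone frequency
BIT_DEPTH = 16                       # assumed bits per sample
MAX_AMP = 2 ** (BIT_DEPTH - 1) - 1   # 32767, the largest 16-bit signed value

# Sample the sine wave at regular intervals and quantize each amplitude
# to the nearest integer representable in 16 bits.
samples = [
    round(MAX_AMP * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))
    for n in range(10)               # first 10 samples only
]
print(samples)
```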
An image can be rasterized into pixels whose composite color is encoded through quantization of its red, green, and blue components. A video commonly runs at 30 frames (images) per second, each frame being pixelated and each pixel color quantized. Clever compression algorithms and hardware designs reduce the enormous amount of data for high-quality video and audio to roughly 10% of its raw size for transmission and storage, although the data normally must be decompressed again for processing.
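A back-of-the-envelope Python calculation (assuming a 1920x1080 frame and one byte per color channel, figures not stated in the text) shows why compression to roughly 10% matters:

```python
# Raw data rate of uncompressed video: pixels x bytes per pixel x frames per second.
WIDTH, HEIGHT = 1920, 1080        # assumed frame size
BYTES_PER_PIXEL = 3               # one byte each for red, green, and blue
FPS = 30                          # frames per second, as in the text

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
print(f"raw:              {bytes_per_second / 1e6:.0f} MB/s")         # about 187 MB/s
print(f"compressed ~10%:  {bytes_per_second * 0.10 / 1e6:.0f} MB/s")  # about 19 MB/s
```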
Transistors can be connected in countless ways to form all kinds of logic gates and any combination thereof. NAND gates became a standard because every other gate type can be implemented by combining NAND gates alone, which offers technical and economic advantages. The tedious design work of creating logic circuits is aided by specialized computer programs.
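The universality of the NAND gate can be demonstrated in a few lines; the sketch below (plain Python functions standing in for hardware gates) builds NOT, AND, OR, and XOR from NAND alone and prints their truth tables.

```python
# NAND is functionally complete: every other gate can be built from it.
def NAND(a: int, b: int) -> int:
    return 0 if (a and b) else 1

def NOT(a):    return NAND(a, a)
def AND(a, b): return NOT(NAND(a, b))
def OR(a, b):  return NAND(NOT(a), NOT(b))
def XOR(a, b):                     # classic four-NAND construction
    n = NAND(a, b)
    return NAND(NAND(a, n), NAND(b, n))

print("a b -> AND OR XOR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", AND(a, b), OR(a, b), XOR(a, b))
```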