Data, Bits and Binary

About Data

Our PCs are data processors. The PC's function is simple: to process data, and the processing is done electronically inside the CPU and between the other components. That sounds simple, but what are data, and how are they processed electronically in a PC?


Analog Data

The signals that we send each other to communicate are data. Our daily data take many forms: sound, letters, numbers and other characters (handwritten or printed), photos, graphics, film. All these data are analog by nature, which means that they vary smoothly and can take any value within a range, like the loudness of a voice or the shade of a color in a photo. In this form, they are unusable in a PC. The PC can only process concise, simple data formats, and it processes such data very effectively.

Digital Data

The PC is an electric unit. Therefore, it can only deal with data that are represented electrically. That is accomplished using electric switches, which are either off or on. You can compare them with regular household switches. If the switch is off, the PC reads the numeral 0. If it is on, the PC reads the numeral 1.

[Illustration: a switch that is off reads as 0; a switch that is on reads as 1]

With our electric switches, we can write 0 or 1. We can now start our data processing!

The PC is filled with these switches (in the form of transistors). There are literally millions of those in the electronic components. Each represents either a 0 or a 1, so we can process data with millions of 0s and 1s.


Bits

Each 0 or 1 is called a bit. Bit is an abbreviation of the expression BInary digiT. It is called binary, since it is derived from the binary number system:

0          1 bit
1          1 bit
0110       4 bits
01101011   8 bits

The Binary Number System

The binary number system is made up of digits, just like our common decimal system (10 digit system). But, while the decimal system uses digits 0 through 9, the binary system only uses digits 0 and 1.
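To see how a number is built from only 0s and 1s, here is a small sketch in Python. It is an illustration only: the function name to_binary is made up for this example, and dividing by 2 over and over is just one common way of doing the conversion by hand.

```python
def to_binary(n):
    """Return the binary digits of a whole number (0 or larger) as text."""
    if n < 0:
        raise ValueError("this sketch only handles 0 and positive numbers")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next digit, right to left
        n //= 2                    # continue with the next higher position
    return "".join(reversed(digits))


# Print the decimal numbers 0 through 9 next to their binary form.
for number in range(10):
    print(number, to_binary(number))
```

Each position in a binary number is worth twice the position to its right (1, 2, 4, 8, ...), just as each position in a decimal number is worth ten times the one to its right.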

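If you are interested in understanding the binary number system, here is a brief course. Try to follow the system. The table below shows the 8-bit ASCII code that the PC uses for each of the characters 0 through 9. Look at the last four bits of each code: they count from 0000 to 1001, which is how the numbers 0 to 9 are constructed in the binary system, using only 0s and 1s:

Character Binary system (ASCII code)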
0 00110000
1 00110001
2 00110010
3 00110011
4 00110100
5 00110101
6 00110110
7 00110111
8 00111000
9 00111001
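The table above can be reproduced with a few lines of Python. This is a sketch, not part of the original course; the helper name ascii_bits is invented here. It relies on Python's built-in ord, which gives a character's code as a number, and format, which writes that number as 8 bits.

```python
def ascii_bits(ch):
    """Return the 8-bit ASCII code of one character as a string of 0s and 1s."""
    return format(ord(ch), "08b")


for ch in "0123456789":
    bits = ascii_bits(ch)
    low_four = bits[4:]  # the last four bits of the code
    # Print the character, its full code, the last four bits,
    # and the number those four bits stand for.
    print(ch, bits, low_four, int(low_four, 2))
```

For these ten characters the first four bits are always 0011, and the last four bits are the digit's own value written in binary.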
 

Here is a longer list, covering the uppercase and lowercase letters and a few punctuation characters.
Character Binary system (ASCII code)
A 01000001
B 01000010
C 01000011
D 01000100
E 01000101
F 01000110
G 01000111
H 01001000
I 01001001
J 01001010
K 01001011
L 01001100
M 01001101
N 01001110
O 01001111
P 01010000
Q 01010001
R 01010010
S 01010011
T 01010100
U 01010101
V 01010110
W 01010111
X 01011000
Y 01011001
Z 01011010
a 01100001
b 01100010
c 01100011
d 01100100
e 01100101
f 01100110
g 01100111
h 01101000
i 01101001
j 01101010
k 01101011
l 01101100
m 01101101
n 01101110
o 01101111
p 01110000
q 01110001
r 01110010
s 01110011
t 01110100
u 01110101
v 01110110
w 01110111
x 01111000
y 01111001
z 01111010
Period 00101110
Comma 00101100
Space 00100000
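As a rough sketch of how the list above is used, the following Python turns a piece of text into its 8-bit codes and back again. The function names text_to_bits and bits_to_text are made up for this example.

```python
def text_to_bits(text):
    """Look up each character's ASCII code and write it as 8 bits."""
    return [format(code, "08b") for code in text.encode("ascii")]


def bits_to_text(groups):
    """Turn a list of 8-bit groups back into the characters they stand for."""
    return bytes(int(group, 2) for group in groups).decode("ascii")


word = "Jeff"
groups = text_to_bits(word)
print(groups)                # one 8-bit group per character
print(bits_to_text(groups))  # the original text again
```

Comparing A (01000001) with a (01100001) in the list shows that an uppercase letter and its lowercase partner differ in a single bit.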

Digital Data

We have seen that the PC can handle data if it receives them as 0s and 1s. This data format is called digital. If we can translate our daily data from their analog format to digital format, they will appear as chains of 0s and 1s, and then the PC can handle them.

So, we must be able to digitize our data. Picture pouring text, sounds, and pictures into a funnel, from which they emerge as 0s and 1s.
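One way to picture the funnel for sound is the sketch below. It is only an assumption about how digitizing might be done, not a description of any particular device: it measures a smooth wave at a few evenly spaced moments and rounds each measurement to one of 256 whole-number steps, so that each measurement fits in 8 bits. The names digitize and wave are invented for this example.

```python
import math


def digitize(signal, samples, levels=256):
    """Measure `signal` at evenly spaced moments and round each
    measurement to one of `levels` whole-number steps."""
    values = []
    for i in range(samples):
        t = i / samples                           # a moment between 0 and 1
        x = signal(t)                             # analog value, assumed to lie between -1 and 1
        step = round((x + 1) / 2 * (levels - 1))  # map -1..1 onto 0..levels-1
        values.append(step)
    return values


def wave(t):
    """One cycle of a pure tone, standing in for an analog sound."""
    return math.sin(2 * math.pi * t)


measurements = digitize(wave, 8)
print([format(v, "08b") for v in measurements])
```

The more measurements we take and the more steps we allow, the closer the chain of 0s and 1s follows the original wave.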


(This) Bytes!

The most basic data processing is word processing. Let us use that as an example. When we do word processing, we work at a keyboard similar to a typewriter's. It has 101 keys, on which we find the entire alphabet A, B, C, etc. We also find the digits from 0 to 9 and all the other characters we need: , . - ; ( ) : _ ? ! " # * % & and so on.

All these characters must be digitized. They must be expressed in 0s and 1s. Bits are organized in groups of 8. A group of 8 bits is called a byte.

8 bits = 1 byte, that is the system.
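Grouping bits by eight can be sketched as follows. The function name split_into_bytes is made up for this example, and the three codes in the sample stream are taken from the ASCII list earlier on this page.

```python
def split_into_bytes(bits):
    """Cut a string of 0s and 1s into groups of 8 bits."""
    if len(bits) % 8 != 0:
        raise ValueError("the number of bits must be a multiple of 8")
    return [bits[i:i + 8] for i in range(0, len(bits), 8)]


# 24 bits in a row: the codes for H, i and the period.
stream = "010010000110100100101110"
for group in split_into_bytes(stream):
    print(group, chr(int(group, 2)))  # each byte and the character it stands for
```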

Then, what can we do with bytes? First, let us see how many different bytes we can construct. A byte is an 8 digit number. We link 0s and 1s in a pattern. How many different ones can we make? Here is one: 01110101, and here is another: 10010101.

We can calculate that there are 2 x 2 x 2 x 2 x 2 x 2 x 2 x 2 different patterns, since each of the 8 bits can have 2 values.

2^8 (two to the power of eight) is 256. So there are 256 different bytes!
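If you would rather count the patterns than calculate them, this short sketch writes out every possible byte. It uses product from Python's standard itertools module; the variable name patterns is our own.

```python
from itertools import product

# Every way of choosing 0 or 1 for each of the 8 positions.
patterns = ["".join(bits) for bits in product("01", repeat=8)]

print(len(patterns))              # how many different bytes there are
print(patterns[0], patterns[-1])  # the first and the last pattern
print(len(patterns) == 2 ** 8)    # does the count match the calculation?
```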

Now we assign a byte to each letter and other characters. And since we have 256 patterns to choose from, there is plenty of room for all.


With that out of the way, let us look at the two most important systems developed for representing symbols with binary numbers, or bits: EBCDIC and ASCII. We will also look at a newer standard, Unicode.

EBCDIC

Among the first complete systems for representing symbols with bits was the Binary Coded Decimal (BCD) system. IBM defined BCD for one of its early computers. BCD codes consisted of 6-bit words, which allowed a maximum of 64 possible symbols. BCD computers could only work with uppercase letters and very few other symbols. This system was not adequate for long.

The need to represent lowercase as well as uppercase letters, which takes 52 codes for the alphabet alone, led to IBM's development of the EBCDIC system. EBCDIC, pronounced "EB-si-dic," is an acronym for Extended Binary Coded Decimal Interchange Code.
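The arithmetic behind this shortage can be sketched in a few lines. The count of ten digits is our own addition for the sake of the example; the text above only mentions the 52 letters. The function name code_count is made up.

```python
def code_count(bits):
    """Number of different patterns that fit in a word of `bits` bits."""
    return 2 ** bits


letters = 2 * 26  # uppercase and lowercase alphabet
digits = 10       # assumption: we also want the digits 0 through 9

# Codes left over for punctuation and everything else.
spare_in_six_bits = code_count(6) - letters - digits
spare_in_eight_bits = code_count(8) - letters - digits

print(code_count(6), spare_in_six_bits)
print(code_count(8), spare_in_eight_bits)
```

With 6 bits, letters and digits alone use up nearly every code. With 8 bits, there is room to spare.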

EBCDIC is an 8-bit code that defines 256 symbols. EBCDIC is still commonly used in IBM mainframe and mid-range systems, but is rarely used in personal computers. By the time small computers were being developed, the American National Standards Institute (ANSI) had swung into action to define a standard character code for computers.

ASCII

The ANSI organization's solution to representing symbols with bits of data was the ASCII character set. Today, the ASCII character set is by far the most common. Initially, ASCII (which stands for American Standard Code for Information Interchange) was an 8-bit code, but the eighth bit served a special purpose and was called the parity bit. So, effectively, the original ASCII was a 7-bit code that defined 128 symbols.
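To give an idea of what a parity bit does, here is a sketch. It makes two assumptions that the text above does not state: that the parity bit is the leftmost of the 8 bits, and that it is chosen so the total number of 1s is even (so-called even parity). The function names are made up for this example.

```python
def add_even_parity(code7):
    """Put a parity bit in front of a 7-bit code so that the
    resulting byte holds an even number of 1s."""
    if not 0 <= code7 < 128:
        raise ValueError("a 7-bit code must be between 0 and 127")
    ones = bin(code7).count("1")
    parity = ones % 2  # 1 only when the 7-bit code has an odd number of 1s
    return (parity << 7) | code7


def parity_ok(byte):
    """True when the byte still holds an even number of 1s."""
    return bin(byte).count("1") % 2 == 0


for ch in "AC":
    byte = add_even_parity(ord(ch))
    print(ch, format(byte, "08b"), parity_ok(byte))

# Flip one bit, as a faulty connection might, and check again.
damaged = add_even_parity(ord("C")) ^ 0b00000100
print(format(damaged, "08b"), parity_ok(damaged))
```

If a single bit is flipped on the way, the count of 1s becomes odd and the receiver can tell that something went wrong, although not which bit it was.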

Later, parity bits became unimportant, so IBM took charge again and developed an enhanced version of ASCII that made use of the eighth bit, allowing ASCII to describe 256 symbols. When IBM did this, they did not change any of the original 128 codes, which allowed programs and software designed to work with the original ASCII to continue to work with data in the new character set.
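We can check that the original 128 codes were left alone with a short sketch. As a stand-in for the enhanced character set it uses code page 437, the character set of the original IBM PC, which Python knows under the name cp437. Treating cp437 as the set described above is our assumption.

```python
ENHANCED = "cp437"  # assumption: stand-in for the enhanced 8-bit set

# The original 128 codes, 0 through 127.
old_codes = bytes(range(128))
unchanged = old_codes.decode("ascii") == old_codes.decode(ENHANCED)
print(unchanged)  # do both character sets agree on the first 128 codes?

# The 128 codes that the eighth bit adds, 128 through 255.
new_codes = bytes(range(128, 256))
extra_symbols = new_codes.decode(ENHANCED)
print(len(extra_symbols))        # how many symbols the eighth bit adds
print(ascii(extra_symbols[:4]))  # the first few, shown in escaped form
```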

Unicode

A newer standard for data representation, Unicode, was designed to provide two bytes for representing each symbol. With two bytes, a Unicode character can be any one of 65,536 (2^16) different characters or symbols, which was intended to be enough for every character and symbol in the world, including the vast Chinese, Korean, and Japanese character sets. Later versions of Unicode extend beyond this 16-bit range, so some characters need more than two bytes. With a single character set that covers all the languages of the world, computer programs and data become far easier to interchange.
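Here is a sketch of what a two-byte code looks like. It assumes the two bytes are stored with the high byte first (Python calls this encoding utf-16-be); the text above does not name a particular storage format. The function name two_byte_code is made up for this example.

```python
def two_byte_code(ch):
    """Return one character's code as 16 bits, plus its two bytes in hex."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("this character does not fit in two bytes")
    return format(code, "016b"), ch.encode("utf-16-be").hex()


# Two Latin letters, a Japanese hiragana character and a Chinese character.
samples = ["A", "z", "\u3042", "\u4e2d"]
for ch in samples:
    bits, hex_bytes = two_byte_code(ch)
    print(ascii(ch), bits, hex_bytes)
```

For the letter A, the last 8 of the 16 bits are the same as its ASCII code, 01000001, so the familiar codes keep their values.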

 
© Copyright 1999 -- Jeffrey M. Johnson
Last Updated 10/7/99