Digital Storage Basics
Number systems
The easiest way to understand digital storage is to start with something you already know: digits and the base-10, or decimal, number system. Our base-10 number system most likely arose because we have 10 fingers. A decimal digit is a single place that can hold a numerical value between 0 and 9. Digits are normally combined in groups to create larger numbers. For example, the number 6,357 has four digits. It is understood that in the number 6,357, the 7 fills the 1s place, the 5 fills the 10s place, the 3 fills the 100s place and the 6 fills the 1,000s place. So you could express things this way if you wanted to be explicit:
(6 x 1,000) + (3 x 100) + (5 x 10) + (7 x 1) = 6,000 + 300 + 50 + 7 = 6,357
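The place-value reading of 6,357 described above can be sketched in a few lines of Python. This is only an illustration; the function name place_values is made up for this example and is not part of the original text.

```python
def place_values(number: int, base: int = 10) -> list[tuple[int, int]]:
    """Return (digit, place value) pairs, most significant digit first."""
    if number == 0:
        return [(0, 1)]
    pairs = []
    place = 1
    while number > 0:
        # Peel off the lowest digit, then move up to the next place.
        number, digit = divmod(number, base)
        pairs.append((digit, place))
        place *= base
    return list(reversed(pairs))


pairs = place_values(6357)
print(" + ".join(f"({digit} * {place})" for digit, place in pairs))
print(sum(digit * place for digit, place in pairs))
```

Passing base=2 to the same function gives the place values of a binary number, which is the idea the next section builds on.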
The Bs: Binary, Bits and Bytes
Binary means that something can be one of only two ways. For example, an answer could be yes or no. An answer could be true or false. In digital logic and number systems, binary means that something can have a value of either 0 or 1. The word bit is a shortening of the words "Binary digIT." It refers to a single digit in a binary number. Counting from zero through 20 in decimal and in binary looks like this:
 0 = 0
 1 = 1
 2 = 10
 3 = 11
 4 = 100
 5 = 101
 6 = 110
 7 = 111
 8 = 1000
 9 = 1001
10 = 1010
11 = 1011
12 = 1100
13 = 1101
14 = 1110
15 = 1111
16 = 10000
17 = 10001
18 = 10010
19 = 10011
20 = 10100
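The counting sequence above can be reproduced with a short Python sketch that converts a number to binary by repeatedly dividing by 2 and collecting the remainders. The helper name to_binary is an assumption made for this example; Python's built-in format(n, "b") gives the same result.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to a string of binary digits."""
    if n == 0:
        return "0"
    bits = ""
    while n > 0:
        # Each remainder is the next binary digit, lowest place first.
        n, remainder = divmod(n, 2)
        bits = str(remainder) + bits
    return bits


for n in range(21):
    print(f"{n:2d} = {to_binary(n)}")
```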
Bits are rarely seen alone in computers. They are almost always bundled together into 8-bit collections called bytes, much as we group eggs into a collection called a dozen. With 8 bits in a byte, you can represent 256 values ranging from 0 to 255, as shown here:
  0 = 00000000
  1 = 00000001
  2 = 00000010
  ...
254 = 11111110
255 = 11111111
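As a quick check of the 256-value claim, the following Python sketch prints the number of values an 8-bit byte can hold and the bit patterns for the values listed above. The names BITS_PER_BYTE and byte_pattern are illustrative only.

```python
BITS_PER_BYTE = 8


def byte_pattern(value: int) -> str:
    """Return the 8-digit binary pattern for a value that fits in one byte."""
    if not 0 <= value < 2 ** BITS_PER_BYTE:
        raise ValueError("value does not fit in one byte")
    return format(value, "08b")  # zero-padded to 8 binary digits


print(2 ** BITS_PER_BYTE, "distinct values")
for value in (0, 1, 2, 254, 255):
    print(f"{value:3d} = {byte_pattern(value)}")
```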
Bytes: ASCII
Bytes are frequently used to hold individual characters in a text document. In the ASCII character set, each binary value between 0 and 127 is given a specific character. Most computers extend the ASCII character set to use the full range of 256 characters available in a byte. The upper 128 characters handle special things like accented characters from common foreign languages. You can see the 128 standard ASCII codes below.
Computers store text documents, both on disk and in memory, using these codes. For example, if you use Notepad in Windows to create a text file containing the words "Four score and seven years ago," Notepad would use 1 byte of memory per character (including 1 byte for each space character between the words -- ASCII character 32). When Notepad stores the sentence in a file on disk, the file will also contain 1 byte per character and per space.
Try this experiment: Open a new file in Notepad and type the sentence "Four score and seven years ago" into it. Save the file to disk under the name getty.txt. Then use Windows Explorer to look at the size of the file. You will find that the file has a size of 30 bytes on disk: 1 byte for each character. If you add another word to the end of the sentence and re-save it, the file size will grow by 1 byte for every character you added. Each character consumes a byte.
If you were to look at the file as a computer looks at it, you would find that each byte contains not a letter but a number -- the number is the ASCII code corresponding to the character (see below). So on disk, the numbers look like this (shown here for the words "Four and seven"):
F   o   u   r   (space)  a   n   d   (space)  s   e   v   e   n
70  111 117 114 32       97  110 100 32       115 101 118 101 110
By looking in the ASCII table, you can see a one-to-one correspondence between each character and its ASCII code. Note the use of 32 for each space. We could expand these decimal numbers out to binary numbers (so 32 = 00100000) if we wanted to be technically correct -- that is how the computer really deals with things.
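The character-to-number correspondence can be checked with a short Python sketch. It encodes the sentence from the Notepad experiment as ASCII, counts the bytes, and lists the codes for the words shown in the table above. The variable names are illustrative.

```python
sentence = "Four score and seven years ago"

# Encoding as ASCII yields exactly one byte per character.
encoded = sentence.encode("ascii")
print(len(encoded), "bytes")

# Each byte is just a number: the ASCII code of the character.
codes = list("Four and seven".encode("ascii"))
print(codes)

# The same code written out as the 8 bits the computer stores.
space_bits = format(ord(" "), "08b")
print("space =", ord(" "), "=", space_bits)
```

The byte count matches the file size only when the editor saves plain one-byte-per-character text. An editor that saves in another encoding, or that writes a byte-order mark at the start of the file, may report a different size.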
Standard ASCII Character Set
The first 32 values (0 through 31) are codes for things like carriage return and line feed. The space character is the 33rd value, followed by punctuation, digits, uppercase characters and lowercase characters.
  0 NUL     1 SOH     2 STX     3 ETX     4 EOT     5 ENQ     6 ACK     7 BEL
  8 BS      9 TAB    10 LF     11 VT     12 FF     13 CR     14 SO     15 SI
 16 DLE    17 DC1    18 DC2    19 DC3    20 DC4    21 NAK    22 SYN    23 ETB
 24 CAN    25 EM     26 SUB    27 ESC    28 FS     29 GS     30 RS     31 US
 32 SP     33 !      34 "      35 #      36 $      37 %      38 &      39 '
 40 (      41 )      42 *      43 +      44 ,      45 -      46 .      47 /
 48 0      49 1      50 2      51 3      52 4      53 5      54 6      55 7
 56 8      57 9      58 :      59 ;      60 <      61 =      62 >      63 ?
 64 @      65 A      66 B      67 C      68 D      69 E      70 F      71 G
 72 H      73 I      74 J      75 K      76 L      77 M      78 N      79 O
 80 P      81 Q      82 R      83 S      84 T      85 U      86 V      87 W
 88 X      89 Y      90 Z      91 [      92 \      93 ]      94 ^      95 _
 96 `      97 a      98 b      99 c     100 d     101 e     102 f     103 g
104 h     105 i     106 j     107 k     108 l     109 m     110 n     111 o
112 p     113 q     114 r     115 s     116 t     117 u     118 v     119 w
120 x     121 y     122 z     123 {     124 |     125 }     126 ~     127 DEL
SP denotes the space character.
Lots of Bytes
When you start talking about lots of bytes, you get into prefixes like kilo, mega and giga, as in kilobyte, megabyte and gigabyte (also shortened to K, M and G, as in Kbytes, Mbytes and Gbytes or KB, MB and GB). The following table shows the binary multipliers:
Name    Abbr.   Size
Kilo    K       2^10 = 1,024
Mega    M       2^20 = 1,048,576
Giga    G       2^30 = 1,073,741,824
Tera    T       2^40 = 1,099,511,627,776
Peta    P       2^50 = 1,125,899,906,842,624
Exa     E       2^60 = 1,152,921,504,606,846,976
Zetta   Z       2^70 = 1,180,591,620,717,411,303,424
Yotta   Y       2^80 = 1,208,925,819,614,629,174,706,176
You can see in this chart that kilo is about a thousand, mega is about a million, giga is about a billion, and so on. So when someone says, "This computer has a 2 gig hard drive," what he or she means is that the hard drive stores 2 gigabytes, or approximately 2 billion bytes, or exactly 2,147,483,648 bytes. How could you possibly need 2 gigabytes of space? When you consider that one CD holds 650 megabytes, you can see that just three CDs' worth of data will fill the whole thing! Terabyte databases are fairly common these days, and there are probably a few petabyte databases floating around the Pentagon by now.
Some common examples:
A double-spaced typewritten page, with 10 characters per inch, has about 1,750 bytes.
A single-spaced typewritten page of the same kind holds about double that, or about 3,500 bytes (3.5 kilobytes, or KB).
A text-only page of the New York Times (that is, with no pictures), with 6 columns of text, 155 lines per column and 35 characters per line, ends up being about 32,550 characters, or 32.5 KB.
The average hardcover novel of about 300 pages has about 1 million characters, which would take about 1 MB to store on a computer. For some real examples, Catcher in the Rye is about 400 KB and Gone with the Wind is about 2.5 MB.
As of 2000, the United States Library of Congress had about 20 million books, for a total of about 20 trillion characters, or 20 TB.
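The arithmetic behind these figures can be reproduced with a small Python sketch. It uses the binary multipliers (powers of 2) and the page and book figures quoted above. The variable names are illustrative, and the one-million-characters-per-book figure is the text's own rough average, not a measured value.

```python
# Binary multipliers: 2^10, 2^20, 2^30, 2^40.
KB, MB, GB, TB = (2 ** (10 * n) for n in range(1, 5))

# A "2 gig" hard drive, and how many 650 MB CDs it takes to fill it.
two_gig_drive = 2 * GB
cd_capacity = 650 * MB
cds_to_fill = two_gig_drive / cd_capacity
print(f"2 GB = {two_gig_drive:,} bytes, or about {cds_to_fill:.2f} CDs")

# Newspaper page: 6 columns x 155 lines x 35 characters, 1 byte each.
nyt_page_chars = 6 * 155 * 35
print(f"Newspaper page: {nyt_page_chars:,} characters")

# Library estimate: 20 million books at roughly 1 million characters each.
novel_chars = 1_000_000
library_chars = 20_000_000 * novel_chars
print(f"Library: {library_chars:,} characters")
```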
Digital Pictures
All forms of digital media are really just bytes of information. Digital cameras capture images as a grid of dots (also known as pixels). The total number of pixels determines the image resolution. The more pixels a camera has, the more detail it can capture and the larger a picture can be without becoming blurry or "grainy." Some typical resolutions include:
256x256 - Found on very cheap cameras, this resolution is so low that the picture quality is almost always unacceptable. This is about 65,000 total pixels.
640x480 - This is the low end on most "real" cameras. This resolution is ideal for e-mailing pictures or posting pictures on a Web site.
1216x912 - This is a "megapixel" image size -- about 1,109,000 total pixels -- good for printing pictures.
1600x1200 - With almost 2 million total pixels, this is "high resolution." You can print a 4x5 inch print taken at this resolution with the same quality that you would get from a photo lab.
2240x1680 - Found on 4 megapixel cameras -- the current standard -- this allows even larger printed photos, with good quality for prints up to 16x20 inches.
4064x2704 - A top-of-the-line digital camera with 11.1 megapixels takes pictures at this resolution. At this setting, you can create 13.5x9 inch prints with no loss of picture quality.
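The pixel totals quoted for each resolution are simply width times height. A minimal Python sketch, with illustrative names, makes the calculation explicit for the sizes listed above:

```python
# Resolutions from the list above, as (width, height) in pixels.
RESOLUTIONS = [
    (256, 256),
    (640, 480),
    (1216, 912),
    (1600, 1200),
    (2240, 1680),
    (4064, 2704),
]


def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in an image of the given dimensions."""
    return width * height


for width, height in RESOLUTIONS:
    total = pixel_count(width, height)
    print(f"{width}x{height}: {total:,} pixels ({total / 1_000_000:.2f} megapixels)")
```

Camera makers round these totals in different ways, so an advertised megapixel rating may not match width times height exactly.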
No matter what type of storage they use, all digital cameras need lots of room for pictures. They usually store images in one of two formats -- TIFF, which is uncompressed, and JPEG, which is compressed. Most cameras use the JPEG file format for storing pictures, and they sometimes offer quality settings (such as medium or high). The following chart will give you an idea of the file sizes you might expect with different picture sizes.
Image Size   TIFF (uncompressed)   JPEG (high quality)   JPEG (medium quality)
640x480      1.0 MB                300 KB                90 KB
800x600      1.5 MB                500 KB                130 KB
1024x768     2.5 MB                800 KB                200 KB
1600x1200    6.0 MB                1.7 MB                420 KB
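The uncompressed sizes in the chart are roughly what you get by multiplying the number of pixels by the bytes stored per pixel. The sketch below assumes 3 bytes per pixel (24-bit color, one byte each for red, green and blue). The text does not state the color depth, so treat that figure, and the names used here, as assumptions of this example.

```python
BYTES_PER_PIXEL = 3  # assumption: 24-bit color, not stated in the text

IMAGE_SIZES = [(640, 480), (800, 600), (1024, 768), (1600, 1200)]


def uncompressed_bytes(width: int, height: int,
                       bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Estimate the raw size of an image with no compression or file header."""
    return width * height * bytes_per_pixel


for width, height in IMAGE_SIZES:
    size = uncompressed_bytes(width, height)
    print(f"{width}x{height}: {size:,} bytes (about {size / 1_000_000:.1f} million bytes)")
```

The results come out slightly below the chart's TIFF column, which is presumably rounded up and includes some file overhead. Dividing an estimate by the matching JPEG figure gives a rough idea of how much the compression saves.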
Music
Like pictures, music can be stored on a computer or on storage media in a digital format. CDs and DVDs hold music and movies in a number of formats. MP3 is now a very popular format for music, and it also uses compression. CD-quality audio is sampled 44,100 times per second, and a typical MP3 file compresses it to a bit rate of 128 kilobits per second. At that rate a minute of music is about 1 MB, so a song like Little Wing by Government Mule, which is 4:00 minutes long, is 3.67 MB. So, to store 500 songs, you need about 2 GB, and to store 1,000 songs, about 4 GB.
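The one-megabyte-per-minute rule of thumb can be checked with a short Python sketch. It assumes a constant bit rate of 128 kilobits per second, a common MP3 setting. Both that bit rate and the function name mp3_bytes are assumptions of this example, and real files vary with the encoder settings.

```python
def mp3_bytes(duration_seconds: int, kilobits_per_second: int = 128) -> int:
    """Approximate size in bytes of a constant-bit-rate audio file."""
    bits = duration_seconds * kilobits_per_second * 1000
    return bits // 8  # 8 bits per byte


one_minute = mp3_bytes(60)
four_minutes = mp3_bytes(4 * 60)

print(f"1 minute:  {one_minute:,} bytes")
print(f"4 minutes: {four_minutes:,} bytes ({four_minutes / 2 ** 20:.2f} MB)")
print(f"500 songs: {500 * four_minutes / 2 ** 30:.2f} GB")
print(f"1000 songs: {1000 * four_minutes / 2 ** 30:.2f} GB")
```

Under these assumptions a four-minute song comes out a little under 4 MB, which is consistent with the rounded totals quoted for 500 and 1,000 songs.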
Storage Devices
There are many types of devices that can store digital data. Computer disk drives, CDs, DVDs, flash drives, iPods, etc. can all store some or all of the file types we have discussed: text files, pictures, programs, music, etc.
Some common examples:
Device                       Capacity
Computer disk drive          20 GB to 400 GB or more; 80 GB is typical
CD-ROM disk                  650-700 MB
DVD-ROM disk, single-sided   4 GB
DVD-ROM disk, double-sided   8 GB
Flash drive                  128 MB, 256 MB, 512 MB, 1 GB
MP3 player                   1 GB
iPod nano                    2 GB, 4 GB and 8 GB
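To put these capacities in perspective, the sketch below estimates how many songs fit on a few of the devices listed. It assumes about 4 MB per song, in line with the music section's figure of roughly 4 GB for 1,000 songs. The song size, the choice of devices and the names used are all assumptions of this example.

```python
MB = 2 ** 20
GB = 2 ** 30
SONG_BYTES = 4 * MB  # assumption: about 4 MB per song

# One representative capacity per device, taken from the list above.
DEVICES = {
    "Computer disk drive (typical)": 80 * GB,
    "CD-ROM disk": 650 * MB,
    "DVD-ROM disk, single-sided": 4 * GB,
    "Flash drive": 1 * GB,
    "iPod nano": 8 * GB,
}


def songs_that_fit(capacity_bytes: int, song_bytes: int = SONG_BYTES) -> int:
    """Number of whole songs that fit in the given capacity."""
    return capacity_bytes // song_bytes


for name, capacity in DEVICES.items():
    print(f"{name}: about {songs_that_fit(capacity):,} songs")
```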