4.2. The Storage Spectrum

Present-day computers use a variety of storage technologies. Each technology is geared toward a specific function, with speeds and capacities to match. These technologies are:

  • CPU registers

  • Cache memory

  • RAM

  • Hard drives

  • Off-line backup storage

In terms of capabilities and cost, these technologies form a spectrum. For example, CPU registers are:

However, at the other end of the spectrum, off-line backup storage is:

By using different technologies with different capabilities, it is possible to fine-tune system design for maximum performance at the lowest possible cost. The following sections explore each technology in the spectrum.

4.2.1. CPU Registers

Every present-day CPU design includes registers for a variety of purposes, from storing the address of the currently-executed instruction to more general-purpose data storage and manipulation. CPU registers run at the same speed as the rest of the CPU; otherwise, they would be a serious bottleneck to overall system performance. The reason for this is that nearly all operations performed by the CPU involve the registers in one way or another.

The number of CPU registers (and their uses) depends strictly on the architectural design of the CPU itself. There is no way to change the number of CPU registers, short of migrating to a CPU with a different architecture. For these reasons, the number of CPU registers can be considered a constant, as it cannot be changed without great pain.

4.2.2. Cache Memory

The purpose of cache memory is to act as a buffer between the very limited, very high-speed CPU registers and the relatively slower and much larger main system memory — usually referred to as RAM[1]. Cache memory has an operating speed similar to the CPU itself, so that when the CPU accesses data in cache, the CPU is not kept waiting for the data.

Cache memory is configured such that, whenever data is to be read from RAM, the system hardware first checks to see if the desired data is in cache. If the data is in cache, it is quickly retrieved, and used by the CPU. However, if the data is not in cache, the data is read from RAM and, while being transferred to the CPU, is also placed in cache (in case it will be needed again). From the perspective of the CPU, all this is done transparently, so that the only difference between accessing data in cache and accessing data in RAM is the amount of time it takes for the data to be returned.
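
The read path described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the ReadCache class, its hit and miss counters, and the four-entry RAM list are hypothetical names invented for illustration, and the cache here is unbounded, which no real cache is.

```python
class ReadCache:
    """Toy model of the read path: check cache first, fall back to RAM."""

    def __init__(self, ram):
        self.ram = ram          # backing store: a list indexed by address
        self.lines = {}         # address -> cached value (unbounded, for simplicity)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:       # data is in cache: return it quickly
            self.hits += 1
            return self.lines[address]
        self.misses += 1                # not in cache: read from RAM...
        value = self.ram[address]
        self.lines[address] = value     # ...and keep a copy in case it is needed again
        return value


ram = [10, 20, 30, 40]
cache = ReadCache(ram)
values = [cache.read(a) for a in (2, 2, 3, 2)]
print(values, cache.hits, cache.misses)
```

The caller of read() gets the same value either way; only the counters (standing in for elapsed time) reveal whether the data came from cache or from RAM.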

In terms of storage capacity, cache is much smaller than RAM. Therefore, not every byte in RAM can have its own location in cache. As such, it is necessary to split cache up into sections that can be used to cache different areas of RAM, and to have a mechanism that allows each area of cache to cache different areas of RAM at different times. However, given the sequential and localized nature of storage access, a small amount of cache can effectively speed access to a large amount of RAM.
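
One simple way to let each section of cache serve different areas of RAM at different times is direct mapping, where every block of RAM may occupy exactly one cache line. The text does not say which mechanism a given system uses, so the following is a sketch of one possibility only; the line size, the number of lines, and the hit_rate function are hypothetical values and names chosen for illustration.

```python
LINE_SIZE = 16      # bytes per cache line (hypothetical)
NUM_LINES = 8       # lines in the cache (hypothetical), giving 128 bytes of cache


def hit_rate(addresses):
    """Return the fraction of accesses served from a direct-mapped cache."""
    addresses = list(addresses)
    tags = [None] * NUM_LINES           # which RAM block each line currently holds
    hits = 0
    for address in addresses:
        block = address // LINE_SIZE    # RAM block containing this byte
        index = block % NUM_LINES       # the one cache line this block may use
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block         # evict the old block, load the new one
    return hits / len(addresses)


# Sequential walk over 4 KB of RAM, 32 times the size of the cache.
sequential = hit_rate(range(4096))
# Touch only one byte per block, so no access ever benefits from locality.
strided = hit_rate(range(0, 4096, LINE_SIZE))
print(sequential, strided)
```

In this model the sequential walk is served from cache on 15 of every 16 accesses, even though the cache is a small fraction of the RAM being read, while the strided walk never hits at all.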

When writing data from the CPU, things get a bit more complicated. There are two different approaches that can be used. In both cases, the data is first written to cache. However, since the purpose of cache is to function as a very fast copy of the contents of selected portions of RAM, any time a piece of data changes its value, that new value must be written to both cache memory and RAM. Otherwise, the data in cache and the data in RAM will no longer match.

The two approaches differ in how this is done. One approach, known as write-through cache, immediately writes the modified data to RAM. Write-back cache, however, delays the writing of modified data back to RAM. The reason for doing this is to reduce the number of times a frequently-modified piece of data will be written back to RAM.

Write-through cache is a bit simpler to implement; for this reason it is most common. Write-back cache is a bit trickier to implement: in addition to storing the actual data, it is necessary to maintain some sort of flag that marks the cached data as clean (the data in cache is the same as the data in RAM), or dirty (the data in cache has been modified, meaning that the data in RAM is no longer current). Because of this, it is also necessary to implement a way of periodically flushing dirty cache entries back to RAM.
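
The difference between the two write policies can be illustrated with a small Python sketch. The class names, the ram_writes counter, and the explicit flush() method are hypothetical, introduced only for this example; real hardware decides for itself when dirty entries are written back.

```python
class WriteThroughCache:
    def __init__(self, ram):
        self.ram = ram
        self.lines = {}
        self.ram_writes = 0

    def write(self, address, value):
        self.lines[address] = value
        self.ram[address] = value       # RAM is updated immediately
        self.ram_writes += 1


class WriteBackCache:
    def __init__(self, ram):
        self.ram = ram
        self.lines = {}
        self.dirty = set()              # addresses whose cached value is newer than RAM
        self.ram_writes = 0

    def write(self, address, value):
        self.lines[address] = value
        self.dirty.add(address)         # mark dirty; RAM is now stale

    def flush(self):
        for address in sorted(self.dirty):
            self.ram[address] = self.lines[address]
            self.ram_writes += 1
        self.dirty.clear()              # everything is clean again


ram_a, ram_b = [0] * 4, [0] * 4
through, back = WriteThroughCache(ram_a), WriteBackCache(ram_b)
for i in range(100):                    # a frequently modified value at address 1
    through.write(1, i)
    back.write(1, i)
back.flush()
print(through.ram_writes, back.ram_writes, ram_a == ram_b)
```

Both copies of RAM end up with the same contents, but in this model the write-through cache performs one RAM write per update while the write-back cache performs a single RAM write at flush time.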

4.2.2.1. Cache Levels

Cache subsystems in present-day computer designs may be multi-level; that is, there might be more than one set of cache between the CPU and main memory. The cache levels are often numbered, with lower numbers being closer to the CPU. Many systems have two cache levels:

  • L1 cache is often located directly on the CPU chip itself and runs at the same speed as the CPU

  • L2 cache is often part of the CPU module, runs at CPU speeds (or nearly so), and is usually a bit larger and slower than L1 cache

Some systems (normally high-performance servers) also have L3 cache, which is usually part of the system motherboard. As might be expected, L3 cache would be larger (and most likely slower) than L2 cache.

In any case, the goal of all cache subsystems, whether single- or multi-level, is to reduce the average access time to the RAM.
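
A rough way to see how cache levels reduce average access time is to weight each level's access time by how often a request has to go past it. The sketch below assumes a simple model in which every access pays the hit time of each level it reaches; the function name and all timings and miss rates are hypothetical figures chosen for illustration, not measurements of any real system.

```python
def average_access_time(levels, ram_time):
    """Estimate average access time for a cache hierarchy.

    levels   -- list of (hit_time, miss_rate) tuples, ordered from L1 outward
    ram_time -- time to access RAM, in the same units as hit_time
    """
    total = ram_time
    # Work from the level nearest RAM back toward the CPU.
    for hit_time, miss_rate in reversed(levels):
        total = hit_time + miss_rate * total
    return total


# Hypothetical figures, in nanoseconds.
no_cache = average_access_time([], 100.0)
one_level = average_access_time([(1.0, 0.10)], 100.0)
two_level = average_access_time([(1.0, 0.10), (10.0, 0.20)], 100.0)
print(round(no_cache, 2), round(one_level, 2), round(two_level, 2))
```

With these made-up numbers, each added level lowers the estimate, which is the effect the cache hierarchy is designed to produce.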

4.2.3. Main Memory — RAM

RAM makes up the bulk of electronic storage on present-day computers. It holds both data and programs while they are in use. The speed of RAM in most systems today lies between that of cache memory and that of hard drives, and is much closer to the former than the latter.

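The basic operation of RAM is quite straightforward. At the lowest level, there are the RAM chips: integrated circuits that do the actual "remembering." These chips have four types of connections to the outside world:

  • Power connections (to operate the circuitry within the chip)

  • Data connections (to transfer data into or out of the chip)

  • Read/write connections (to control whether data is stored into or retrieved from the chip)

  • Address connections (to determine where in the chip the data is read or written)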

Here are the steps required to store data in RAM:

  1. The data to be stored is presented to the data connections.

  2. The address at which the data is to be stored is presented to the address connections.

  3. The read/write connection is set to write mode.

Retrieving data is just as simple:

  1. The address of the desired data is presented to the address connections.

  2. The read/write connection is set to read mode.

  3. The desired data is read from the data connections.

While these steps are simple, they take place at very high speeds, with the time spent at each step measured in nanoseconds.
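
The store and retrieve sequences above can be mimicked with a toy Python model. The RamChip class and its attribute names are hypothetical, invented for this illustration; the cycle() method stands in for the chip acting on whatever is present on its connections, and timing is not modeled at all.

```python
class RamChip:
    """Toy model of a RAM chip's address, data, and read/write connections."""

    READ, WRITE = "read", "write"

    def __init__(self, size):
        self._cells = [0] * size
        self.address = 0        # value presented on the address connections
        self.data = 0           # value on the data connections
        self.mode = self.READ   # state of the read/write connection

    def cycle(self):
        """Perform one access using whatever is on the connections."""
        if self.mode == self.WRITE:
            self._cells[self.address] = self.data
        else:
            self.data = self._cells[self.address]


def store(chip, address, value):
    chip.data = value           # 1. present the data
    chip.address = address      # 2. present the address
    chip.mode = RamChip.WRITE   # 3. set write mode
    chip.cycle()


def retrieve(chip, address):
    chip.address = address      # 1. present the address
    chip.mode = RamChip.READ    # 2. set read mode
    chip.cycle()
    return chip.data            # 3. read the data connections


chip = RamChip(size=16)
store(chip, 5, 42)
store(chip, 6, 7)
result = retrieve(chip, 5)
print(result)
```

Because any address can be presented at any time, the second store does not disturb the first, and the value at address 5 can be retrieved without reading anything else first.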

Nearly all RAM chips created today are sold as modules. Each module consists of a number of individual RAM chips attached to a small circuit board. The mechanical and electrical layout of the module adheres to various industry standards, making it possible to purchase memory from a variety of vendors.

Note

The main benefit to a system that uses industry-standard RAM modules is that it tends to keep the cost of RAM low, due to the ability to purchase the modules from more than just the system manufacturer.

Although most computers use industry-standard RAM modules, there are exceptions. Most notable are laptops (and even here some standardization is starting to take hold) and high-end servers. However, even in these instances, it is likely that you will be able to find third-party RAM modules, assuming the system is relatively popular and is not a completely new design.

4.2.4. Hard Drives

All the technologies that have been discussed so far are volatile in nature. In other words, data contained in volatile storage is lost when the power is turned off.

Hard drives, on the other hand, are non-volatile — the data they contain remains there, even after the power is removed. Because of this, hard drives occupy a special place in the storage spectrum. Their non-volatile nature makes them ideal for storing programs and data for longer-term use. Another unique aspect to hard drives is that, unlike RAM and cache memory, it is not possible to execute programs directly when they are stored on hard drives; instead, they must first be read into RAM.

Also different from cache and RAM is the speed of data storage and retrieval; hard drives are at least an order of magnitude slower than the all-electronic technologies used for cache and RAM. The difference in speed is due mainly to their electromechanical nature. Here are the four distinct phases that take place during each data transfer to/from a hard drive. The times shown reflect how long it would take a typical high-performance drive, on average, to complete each phase:

Of these, only the last phase is not dependent on any mechanical operation.

Note

There is much more to learn about hard drives; disk storage technologies are discussed in more depth in Chapter 5, Managing Storage. For the time being, it is only necessary to recognize two things: disk-based technologies are far slower than RAM, and their storage capacity usually exceeds that of RAM by a factor of at least 10, and often by 100 or more.

4.2.5. Off-Line Backup Storage

Off-line backup storage takes a step beyond hard drive storage in terms of capacity (higher) and speed (slower). Here, capacities are effectively limited only by your ability to procure and store the removable media.

The actual technologies used in these devices can vary widely. Here are the more popular types:

Of course, having removable media means that access times become even longer, particularly when the desired data is on media that is not currently in the storage device. This situation is alleviated somewhat by the use of robotic devices to automatically load and unload media, but the media storage capacities of such devices are finite. Even in the best of cases, access times are measured in seconds, which is a far cry even from the slow multi-millisecond access times for a high-performance hard drive.

Now that we have briefly studied the various storage technologies in use today, let us explore basic virtual memory concepts.

Notes

[1] While "RAM" is an acronym for "Random Access Memory," and a term that could easily apply to any storage technology that allowed the non-sequential access of stored data, when system administrators talk about RAM they invariably mean main system memory.