Note: This article references commands, behaviors, and outputs generated by Linux-based operating systems, such as CentOS or Ubuntu. Some information will not be relevant to other operating systems, such as Windows.
When describing computer instruction sets in a previous article, I mentioned how various x86-64 instructions are used to load data into processor registers. When data is loaded into the appropriate registers, the processor can be instructed to perform a computation, the results of which will also be stored in registers.
Registers are essentially a form of computer memory built into the circuitry of the processor, and each core in a processor has its own set of registers. Processors use registers for a variety of purposes, such as holding the numeric results of a mathematical calculation or holding the memory address of the program instruction that is being executed (i.e., the instruction pointer). Registers can have varying sizes, such as 32 or 64 bits, and individual processors will generally have multiple registers of multiple sizes.
A core can access its registers without delay while executing instructions. In contrast, instructions that load data from or store data to anywhere other than a register encounter varying delays, whether the location is logically and physically close to the core, such as the on-chip L1 cache, or far away, such as main memory.
In this article, we'll take a look at computer memory architecture and the implications for applications.
Throughout history, there have been many designs for memory devices. For the purposes of this article, we focus on synchronous dynamic random access memory (SDRAM) and synchronous static random access memory (SSRAM), the two technologies used extensively in modern general purpose computers.
Note that asynchronous versions of both technologies exist, and thus DRAM and SRAM are common terms used when describing computer memory, but in practice, almost all components of modern general purpose computers are clocked and hence run in a synchronous manner. I'll use the more generic DRAM and SRAM abbreviations and switch to the more specific SDRAM and SSRAM abbreviations when the content applies solely to the synchronous implementations.
The main building block of both technologies is the memory cell, a circuit designed to store a single bit, a logical 0 or 1 value. A memory cell can generally be described as having two lines, one used to transmit the value to be read/written and one used to activate the cell. However, you may want to think of a memory cell as having four lines, one used to pass input to the cell, one used to read output from the cell, one used to select whether the cell is to be read or written, and one used to select the cell.
When thinking of a memory cell as having four lines, to read the cell's value, the read/write line would be set to read, the select line would be activated, and then the cell's value would be present on the output line. Similarly, to write to the cell, the read/write line would be set to write, the value to be written would be placed on the input line, and the select line would be activated. This may be helpful for conceptualizing memory cell function, but it is not how SRAM and DRAM actually work.
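The four-line conceptual model can be sketched in a few lines of Python. This is only an illustration of the mental model, not of real circuitry; the class name `FourLineCell` and its `cycle` method are hypothetical names chosen for this example.

```python
class FourLineCell:
    """Conceptual one-bit memory cell with input, output, read/write, and select lines."""

    def __init__(self):
        self._stored = 0      # the bit held by the cell
        self.output = None    # output line; None means the cell is not driving it

    def cycle(self, select, write, data_in=0):
        """Apply one set of line values and return the state of the output line."""
        if not select:
            # An unselected cell ignores the other lines and drives nothing.
            self.output = None
        elif write:
            # Read/write line set to write: latch the value on the input line.
            self._stored = data_in & 1
            self.output = None
        else:
            # Read/write line set to read: present the stored value on the output line.
            self.output = self._stored
        return self.output


cell = FourLineCell()
cell.cycle(select=True, write=True, data_in=1)   # write a 1
print(cell.cycle(select=True, write=False))      # read the value back
print(cell.cycle(select=False, write=False))     # unselected cell drives nothing
```

Note that a write attempted while the select line is inactive has no effect, which is what allows many cells to share the same input and read/write lines.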
In actuality, SRAM cells are generally designed with three lines (word, bit, and inverse bit) and DRAM cells with two lines (word and bit). SRAM memory cells are composed entirely of transistors, while DRAM memory cells are composed of a transistor and a capacitor. The differing designs result in different methods for reading and writing the cells and different performance characteristics. In both cases, the cells have a "word" line that activates the cell. DRAM cells then have a single "bit" line that carries the cell's logical value. SRAM cells also have a bit line carrying the cell's logical value, plus an inverse bit line carrying the opposite of that value (e.g., the inverse bit line carries a 1 if the bit line carries a 0).
To read the value stored by a DRAM cell, the word line is activated, which allows energy stored in the capacitor to flow onto the bit line. When the word line is activated, if the capacitor has sufficient stored energy, the voltage present on the bit line will be sufficient to be interpreted as a 1. If the capacitor does not have sufficient energy, then the state of the bit line would be interpreted as a 0. The very act of reading a 1 from the DRAM cell requires discharging the energy stored in the capacitor. Thus, after reading a 1, the capacitor needs to be recharged such that it can place a sufficient voltage on the bit line during a subsequent read. In other words, after reading a DRAM cell, its value needs to be written back to it.
To store a 0 in a DRAM cell, the word line is activated, and the energy in the capacitor is allowed to discharge on the bit line. To store a 1, the word line is activated, and voltage is introduced to the bit line from outside the cell. While the word line is active, the capacitor is storing energy based on the voltage from outside the cell.
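The DRAM read and write behavior described above can be sketched as a toy model. This is a logical simplification under stated assumptions: charge is a normalized number, the constants `FULL_CHARGE` and `THRESHOLD` are hypothetical, and the class and method names are invented for this example. The point it shows is that a read drains the capacitor, so the sensed value has to be written back.

```python
class DramCell:
    """Toy DRAM cell: one capacitor gated by a word line, attached to one bit line."""

    FULL_CHARGE = 1.0   # hypothetical normalized charge representing a stored 1
    THRESHOLD = 0.5     # hypothetical level at which the bit line is read as a 1

    def __init__(self):
        self.charge = 0.0

    def write(self, bit):
        # Word line active: the capacitor charges from the bit line (1)
        # or discharges onto it (0).
        self.charge = self.FULL_CHARGE if bit else 0.0

    def raw_read(self):
        # Word line active: the stored charge flows onto the bit line and is sensed.
        bit = 1 if self.charge >= self.THRESHOLD else 0
        self.charge = 0.0   # the act of reading drains the capacitor
        return bit

    def read(self):
        # A usable read senses the value and then writes it back to the cell.
        bit = self.raw_read()
        self.write(bit)
        return bit


cell = DramCell()
cell.write(1)
print(cell.raw_read(), cell.raw_read())   # the second raw read finds the cell drained
cell.write(1)
print(cell.read(), cell.read())           # read with write-back preserves the value
```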
Capacitors take time to charge and discharge. Thus, the speed with which a DRAM cell can be read is most significantly affected by the time it takes the capacitor to discharge energy such that sufficient voltage is reached on the bit line. And the speed with which it can be written is most significantly affected by the time it takes the capacitor to absorb an amount of energy sufficient to produce the required voltage for a subsequent read.
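To get a feel for why charging time matters, the sketch below uses the textbook formula for an ideal capacitor charging through a resistor, V(t) = Vs * (1 - e^(-t/RC)), solved for the time needed to reach a given fraction of the supply voltage. The resistance and capacitance values are hypothetical placeholders, not measurements of any real DRAM cell, and a real cell is not an ideal RC circuit.

```python
import math


def time_to_reach(fraction, resistance_ohms, capacitance_farads):
    """Time for an ideal RC circuit to charge to `fraction` of the supply voltage."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    # Solve fraction = 1 - exp(-t / (R * C)) for t.
    return -resistance_ohms * capacitance_farads * math.log(1 - fraction)


R = 10_000    # hypothetical resistance in ohms
C = 30e-15    # hypothetical capacitance in farads

for f in (0.5, 0.9, 0.99):
    t_ns = time_to_reach(f, R, C) * 1e9
    print(f"{f:.0%} of supply voltage: {t_ns:.3f} ns")
```

The time grows quickly as the target fraction approaches 1, which is one way to see why a cell is treated as written once it holds enough charge for a later read rather than a full charge.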
Understanding how values are stored in SRAM cells requires understanding how transistors behave. Transistors can be used in two ways, either as a signal amplifier or as a switch. The most common SRAM cell design used for general purpose computer memory is composed of six transistors. Two of those transistors act as switches: they are controlled by the word line and connect the cell to the bit and inverse bit lines. The remaining four transistors are organized as two inverters, each composed of two transistors. The two inverters are connected to each other such that the output of each inverter is connected to the input of the other.
For instance, the output of the first inverter can be a logical 1, which flows to the input of the second inverter. Given that input, the second inverter will produce an output of a logical 0, which flows back as input to the first inverter. This creates a stable state and represents the mechanism the SRAM cell uses to store its logical value.
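The stability of this arrangement can be checked with a small logical sketch. It treats each inverter as an ideal logic gate and ignores voltages and timing; the function names `inverter` and `settle` are hypothetical names for this example.

```python
def inverter(x):
    """Ideal logic inverter: 0 becomes 1 and 1 becomes 0."""
    return 1 - x


def settle(a_out, b_out, steps=4):
    """Repeatedly feed each inverter's output into the other inverter's input."""
    for _ in range(steps):
        a_out, b_out = inverter(b_out), inverter(a_out)
    return a_out, b_out


# Both complementary states reproduce themselves, so either can hold a bit.
print(settle(1, 0))
print(settle(0, 1))
```

The two complementary states are the only ones that survive the feedback loop unchanged, which is why the pair of inverters can hold exactly one bit.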
To read the value stored by an SRAM cell, a charge is first placed on the bit and inverse bit lines from outside the cell, then a moderate charge is placed on the word line, and then the external charge being placed on the bit and inverse bit lines is removed. Based on the states of the inverters, the bit line will either maintain its charged state to represent a 1 or the voltage will fall towards 0, with the inverse bit line demonstrating the opposite behavior of the bit line.
The inverters in an SRAM cell enter a stable state representing a logical 0 or 1 by maintaining a minimum voltage on the input of one of the inverters. To store a value in an SRAM cell, a higher voltage must be placed on the input of one of the inverters such that it can overpower the minimum voltage that was maintaining the original state.
For instance, in an SRAM cell storing a logical value of 0, the input of inverter A may be at the voltage representing a logical 1, such as 1.5 V, which means the output of inverter A would essentially be 0 V. Consequently, the input of inverter B would be 0 V and the output of inverter B would be 1.5 V. To store a value of 1 in the cell, a significantly higher voltage than the one on the input of inverter A, such as 5 V, needs to be placed on the input of inverter B, which drives the output of inverter B to 0 V. As the state of inverter B changes, the voltage on the line connecting the output of inverter B to the input of inverter A falls to 0 V, which results in the output of inverter A rising from 0 V to something higher. After enough time has passed, the higher voltage that was introduced to the input of inverter B can be removed, and the inverters will maintain the new stable state representing a logical value of 1.
To store a value in an SRAM cell, the charge level representing the desired value is placed on the bit line, and the opposite charge level is placed on the inverse bit line (e.g., 0 V on the bit line and 1.5 V on the inverse bit line to store a 0). Then, the word line is activated with a sufficiently high voltage to overpower the voltage already present on the inverters. The word line is deactivated after enough time has passed to be certain the inverters have entered the desired stable state.
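The SRAM read and write sequences can be sketched at the logical level as follows. This is a simplified illustration, not a circuit simulation: voltages are reduced to 0 and 1, the "overpowering" of the inverters is modeled as a plain assignment, and the names `SramCell`, `q`, and `q_bar` are hypothetical labels for the two inverter outputs.

```python
class SramCell:
    """Toy SRAM cell: two cross-coupled inverters holding complementary values."""

    def __init__(self):
        self.q, self.q_bar = 0, 1   # stable state representing a stored 0

    def write(self, bit):
        # Drive the desired value on the bit line and its opposite on the inverse bit line.
        bit_line, inverse_bit_line = bit, 1 - bit
        # Word line active: the externally driven lines overpower the inverters,
        # which then settle into the new stable state.
        self.q, self.q_bar = bit_line, inverse_bit_line

    def read(self):
        # Charge both lines from outside the cell before activating the word line.
        bit_line, inverse_bit_line = 1, 1
        # Word line active: the side of the cell holding a 0 pulls its line low,
        # while the other line stays charged.
        bit_line &= self.q
        inverse_bit_line &= self.q_bar
        return bit_line, inverse_bit_line


cell = SramCell()
cell.write(1)
print(cell.read())   # bit line and inverse bit line after reading a stored 1
print(cell.read())   # reading again gives the same result; no write-back is needed
```

Unlike a DRAM cell, reading does not disturb the stored state in this model, because the inverters keep reinforcing each other.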
Ideally, memory can be read and written without delay, can be produced with extreme density allowing large amounts of data to be stored on very small chips, will use very little energy, and will maintain the integrity of stored data indefinitely. In practice, some of these goals must be prioritized over others and trade-offs must be made. While SRAM and DRAM cells serve the same logical function, storing a 0 or a 1, the difference in circuit design has significant effects.
Compared to an SRAM cell, a DRAM cell can take more than 5x as long to access due to the nature of the capacitors used in the circuitry. Despite the slower speed, DRAM is used for the vast majority of memory in modern general purpose computers due to its significantly greater bit density. An SRAM cell typically takes more than 50x the physical space of a DRAM cell. Imagine if the DRAM module in your typical computer were 50x its current size. At that scale, physical and electrical effects would eliminate the speed benefits of SRAM.
It's desirable to use SRAM in a computer, as it can be accessed significantly faster than DRAM, but the total amount of SRAM has to be limited based on the point at which the physical size it consumes will negate its speed benefits. Thus, a comparatively small amount of SRAM is incorporated directly in CPU chips and acts as a cache for a portion of the much larger amount of DRAM used as main memory. SRAM caching will be discussed in more detail later in this series.