Chapter 14: Future Directions in Computer Architectures – Modern Computer Architecture and Organization

Chapter 14: Future Directions in Computer Architectures

This chapter anticipates the road ahead for computer architecture design. We will review the significant technological advances and ongoing trends that have led us to the current state of computer architectures. We will then extrapolate from current trends and identify some of the directions that computing system designs are likely to take in the future. We will also examine some potentially disruptive technologies that may alter the evolution of future computer architectures.

This chapter offers some suggested approaches for the professional development of the computer architect. By following these recommendations, you should be able to maintain a skill set that remains relevant and tolerant of future advances, whatever they turn out to be.

After completing this chapter, you will understand the historical evolution of computer architecture that led to its current state and will be familiar with ongoing trends in computer design that are likely to indicate future technological directions. You will have a basic level of knowledge of some potentially disruptive technologies that might substantially alter future computer architectures. You will also have learned some useful techniques for maintaining an ongoing, current skill set in the field of computer architecture.

The following topics will be presented in this chapter:

  • The ongoing evolution of computer architectures
  • Extrapolating current trends into the future
  • Potentially disruptive technologies
  • Building a future-tolerant skill set

The ongoing evolution of computer architectures

Chapter 1, Introducing Computer Architecture, presented a brief history of automated computing devices from the mechanical design of Babbage's Analytical Engine to the advent of the x86 architecture that continues to serve as the basis for most modern personal computers. This progress has relied on several groundbreaking technological achievements, most notably the invention of the transistor and the development of integrated circuit manufacturing processes.

Through the decades since the introduction of the Intel 4004 in 1971, processors have grown dramatically in terms of the sheer number of transistors and other circuit components integrated on a single circuit die. In concert with the growth in the number of circuit elements per chip, the clock speed of modern devices has increased by several orders of magnitude.

This increase in processor capability and instruction execution speed has unleashed the growth of software development as an enormous, worldwide industry. In the early days of digital computers, software was developed by small teams of highly trained specialists in a research setting. Today, powerful personal computers are available at a comparatively low cost, and software development tools such as programming language compilers and interpreters are widely available, often for free. As processors have increased in capability, the availability of widespread computing power has created a strong demand for software to run on those devices.

Modern processors have evolved to coalesce far more functionality into the processor's integrated circuit than early devices, such as the 6502. The 6502, in essence, contains the minimum component set required to perform useful processing: a control unit, a register set, an ALU, and an external bus for accessing instructions, data, and peripherals.

The most sophisticated modern processors targeted at business and home users incorporate basic functionality similar to the capabilities of the 6502, along with substantial added features and extensions, such as the following:

  • Up to 16 processor cores, each supporting simultaneous multithreading
  • Multilevel instruction and data cache memory
  • A µ-op cache to avoid the processing delay associated with instruction-decode operations
  • A memory-management unit supporting paged virtual memory
  • Integrated multichannel high-speed serial I/O capability
  • An integrated graphics processor generating digital video output

To summarize the technological evolution from the 6502 processor to the modern x64 processor, modern processors provide multiple 64-bit cores operating in parallel compared to the 6502's single 8-bit core, and they implement numerous additional features specifically designed to accelerate execution speed.

In addition to the raw computing capability of modern PC processors, the x86/x64 instruction set provides instructions to implement a wide variety of operations, ranging from simple to extremely complex. Modern RISC processors, such as ARM and RISC-V, on the other hand, implement intentionally slimmed-down instruction sets, with the goal of breaking complex operations into sequences of simpler steps, each of which executes at very high speed while working within a larger register set.

The high-level configurations of computer architectures have, arguably, not undergone drastic disruption since the days of the 6502. With each extension of the processor architecture's instruction set or the introduction of additional caching technology, these changes have incrementally expanded the functionality available to software developers or increased the speed at which algorithms execute. The expansion to multiple cores and to multithreading within a single core allows multiple independent execution threads to execute simultaneously rather than running in a time-sliced manner on a single core.

Much of the incrementalism during this evolution has been intentional, to avoid introducing changes in processor architectures that would inhibit backward compatibility with the immense universe of already-developed operating system and application software. The net result has been a series of processor generations that gradually become faster and more capable over time, but do not implement any disruptive breaks from past technology.

In the next section, we will attempt to extrapolate from the current generation of high-performance computing systems discussed in Chapter 13, Domain-Specific Computer Architectures, to predict the advances in computer architectures likely to occur in the next one to two decades.

Extrapolating from current trends

The capabilities of current-generation processor technology are beginning to push up against some significant physical limits that we can expect to constrain the rate of growth going forward. These limits certainly will not lead to an abrupt end of improvements in circuit density and clock speed; rather, capability improvements for future processor generations may take place in directions that differ from traditional semiconductor capability improvement patterns. To look more closely at future processor performance growth expectations, we begin by returning to Moore's law and examining its applicability to the future of semiconductor technology.

Moore's law revisited

The revised version of Moore's law, published by Gordon Moore in 1975, predicted the number of integrated circuit components per device would double roughly every two years. This law has demonstrated remarkable predictive accuracy for several decades, but as of 2015, according to Intel, the growth rate had slowed to doubling approximately every two and a half years. This indicates the rate of growth in integrated circuit density has begun to slow, but it certainly has not ended, and is not expected to end in the foreseeable future.
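To make the difference between these two doubling periods concrete, the following minimal sketch compares the growth each one implies over a fixed span of time. The function name growth_factor and the ten-year horizon are illustrative choices made here, not figures taken from Intel or from Moore's paper.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Return the multiplicative growth after `years` for a fixed doubling period."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    horizon = 10  # years; an arbitrary horizon chosen for illustration
    for period in (2.0, 2.5):
        factor = growth_factor(horizon, period)
        print(f"Doubling every {period} years -> {factor:.0f}x in {horizon} years")
```

Over ten years, a two-year doubling period yields a 32-fold increase, while a two-and-a-half-year period yields a 16-fold increase. A seemingly modest change in the doubling period therefore halves the density gain accumulated over a decade.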

Integrated circuit technology will continue to improve, resulting in denser and more highly capable devices for many years to come. We can, however, expect the rate of growth in circuit density to decrease over time because of the physical limits associated with the construction of single-digit nanometer-scale circuit components.

The slower rate of increase in circuit density does not mean that the trend is near an end. As of 2020, current mass-produced integrated circuit technology is based on circuit features with dimensions as small as 10 nm. Work is in progress to develop the next generation of circuit technology with 7 nm feature sizes. A future generation with feature sizes of 5 nm is in the planning stages. Although these increased circuit densities are likely to be realized at some point, each technological advance comes with increasing cost and technical challenges that result in delays in deployment to production lines. The most advanced integrated circuit production technologies are so costly to develop and difficult to implement that only a handful of massive semiconductor companies have the financial resources and technical expertise to bring such processes online.

Given the ongoing decline in the rate of improvement in circuit density, semiconductor manufacturers have begun to focus on alternative methods for packing smaller components together on a chip. Traditionally, integrated circuits have been viewed as primarily two-dimensional entities constructed in layers, as follows:

  • Different types of material are laid out in a sequence of masking operations to create doped regions of transistors, as well as other circuit components, such as capacitors and diodes
  • Conductive traces serving as wires are deposited on the devices as additional layers

Communication between circuit elements within a two-dimensional device layout involves electrical interactions between components placed some distance from each other on the chip's surface. The chip is small, so the time that the electrical signal takes to propagate between components is usually not significant.

You may wonder if it is possible to organize the components of an integrated circuit in a manner other than effectively spreading them around on a flat surface. It is indeed possible to stack components on top of one another on an integrated circuit die. We will look at this design approach in the next section.

The third dimension

By developing techniques for stacking components atop one another on a single integrated circuit die, semiconductor manufacturers have taken a step toward extending Moore's law. One of the early targets for stacked-component integrated circuit configurations is the ubiquitous n-channel and p-channel MOS transistor pair in CMOS circuit designs.

Intel publicly described advances achieved by its researchers in the area of stacked CMOS transistor pairs in early 2020. Not only has the company shown an ability to stack devices on a silicon die, it has also demonstrated how to use differing fabrication technologies in each device layer to achieve maximum performance from the transistor pair.

Silicon n-channel transistors exhibit good performance characteristics, but p-channel transistors constructed on silicon have a relatively slower switching speed. P-channel transistors implemented with a germanium transistor channel instead of silicon provide increased switching speed, improving the performance of the CMOS pair. In a demonstration of Intel's mixed-technology device integration, silicon n-channel transistors were constructed on a base silicon die with germanium p-channel devices stacked on top of them. If this technique can be scaled to support integrated circuit production, it holds the promise of continued increases in device density and improved clock speeds.

Another density-increasing approach is to combine multiple separately constructed integrated circuit dies in a vertical stack, with connections between the layers for power and communication. You can think of this technique as a method of soldering integrated circuit dies on top of each other in a manner that is similar to the way surface-mounted components are soldered onto a circuit board.

Separately fabricated integrated circuits combined within a single package are referred to as chiplets. Chiplets can be laid out side by side on a silicon base or they can be stacked atop one another, depending on the needs of the device. This approach allows each of the chiplets in a complex device to be constructed using the most appropriate technology for that component. For example, one fabrication method may be most appropriate for a core processor, while a different process might be more suitable for a memory chiplet integrated with the processor. An integrated cellular radio interface in the same device package may be constructed using yet another process.

The use of the vertical dimension in the construction of individual integrated circuits and in the construction of complex devices composed of multiple chiplets within a single package enables a higher level of system-on-chip (SoC) integration and higher overall performance. As these techniques continue to be refined and rolled out into production lines, we can expect the increasing circuit complexity and functionality predicted by Moore's law to continue in future years, though perhaps at a reduced growth rate.

The next trend we will examine is the ongoing growth of the use of highly specialized processing devices in place of general-purpose processors.

Increased device specialization

In previous chapters, we explored a few specialized processing technologies targeted at application areas such as digital signal processing, three-dimensional graphical image generation, and neural network processing. It is certainly possible for all of the computations performed by these devices to be carried out by ordinary, general-purpose processors. The important difference in the processing performed by these specialized devices is the increased execution speed, with throughput that is sometimes hundreds or even thousands of times faster than an ordinary processor could achieve.

The growing importance of machine learning and autonomous technologies will continue to drive innovation in the computer architectures that underpin future digital systems. As automobiles and other complex systems gain autonomous features that either augment or replace functionality traditionally performed by human operators, the underlying processing architectures will continue to evolve to provide higher levels of performance tailored to specific tasks while minimizing power consumption.

Specialized processors will take advantage of the advances discussed earlier in this chapter while optimizing individual device designs for particular application niches. The trend toward increased specialization of processing devices will continue and may even accelerate in the coming years.

This discussion has focused on the continuation of ongoing trends into future years. The next section will examine the possibility that a technological force may arise that substantially alters the path from continued incremental improvements in computer architecture to something that is entirely different.

Potentially disruptive technologies

So far, this chapter has focused on trends currently in progress and the potential effects of their extension into the future. As we saw with the introduction of the transistor, it is always possible that some new technology will appear that creates a drastic break with past experience and leads the future of computing technology in a new direction.

In this section, we will attempt to identify some potential sources of such technological advances in the coming years.

Quantum physics

Charles Babbage's Analytical Engine tried to take the capabilities of purely mechanical computing devices to an extreme that had not been achieved previously. His attempt, while ambitious, was ultimately unsuccessful. The development of practical automated computing devices had to wait until the introduction of vacuum tube technology provided a suitable basis for the implementation of complex digital logic.

Later, the invention of the transistor moved computing technology onto a trajectory of increasing capability and sophistication that ultimately brought us to the state of computing we enjoy today. Ever since the introduction of the Intel 4004, advances in computing technology have taken the form of incremental improvements to what is fundamentally the same underlying silicon transistor technology.

Transistor operation is based on the properties of semiconducting materials, such as silicon, and the application of those properties to implement digital switching circuits. Digital circuits constructed with semiconductors generally perform operations using discrete binary data values. These devices are designed to generate reliably repeatable results when given the same input on a subsequent execution of the same sequence of instructions.

As an alternative to this approach, numerous research efforts are underway around the world exploring the possibility of employing aspects of quantum physics in computing technology. Quantum physics describes the behavior of matter at the level of individual atoms and subatomic particles. The behavior of particles at the subatomic level differs in significant and surprising ways from the familiar behaviors of the macro-scale objects we interact with every day under the laws of classical physics. The laws of quantum physics have been progressively discovered and formalized in theories since the early 1900s.

Quantum physics is rigorously defined by a set of theories that have demonstrated remarkable predictive powers. For example, Wolfgang Pauli postulated the existence of the neutrino particle within the framework of quantum physics in 1930. Neutrinos are comparatively tiny subatomic particles that have barely any interaction with other particles, making them extremely difficult to detect. Neutrinos were not proven to exist by scientific experiments until the 1950s.

Several other types of subatomic particle have been predicted by theory and ultimately shown to exist in experiments. Quantum physics, including the strange behaviors exhibited in the subatomic world, offers a promising new direction for future computer architectures.

Physical parameters associated with human-scale objects, such as the speed of a moving vehicle, seem to vary in a continuous manner as a car accelerates or slows. The electrons within an atom, on the other hand, can only exist at specific, discrete energy levels. The energy level of an electron in an atom corresponds roughly to the speed of a particle moving in an orbit around a central body in classical physics.

There is no possibility for an electron in an atom to be between two energy levels. It is always precisely in one discrete energy level or another. These discrete energy levels lead to the use of the term quantum to describe such phenomena.

Spintronics

In addition to the energy level of an electron in an atom, electrons exhibit a property analogous to the spinning of an object in classical physics. As with the energy level, this spin state is quantized. Researchers have demonstrated the ability to control and measure the spin behavior of electrons in a manner that may prove suitable for use in practical digital switching circuits. The use of electron spin as a component of a digital switching circuit is referred to as spintronics, combining the terms spin and electronics.

This technology uses the quantum spin state of electrons to hold information in a manner similar to the charge state of capacitors in traditional electronics. The spin of an elementary atomic particle is a type of angular momentum conceptually similar to the momentum of a spinning basketball balanced on a fingertip.

There are some significant differences in the spin behavior of electrons compared to basketballs. Electrons do not actually rotate; however, their spin behavior obeys the mathematical laws of angular momentum in a quantized form. A basketball can be made to spin at an arbitrarily selected rotational speed, while electrons can only exhibit spin at one discrete, quantized level. The spin of an elementary particle is determined by its particle type, and electrons always have a spin of 1/2, which represents a quantum number.

The spin of a basketball can be fully characterized by the combination of its rotational speed and the axis about which the rotation is taking place. A spinning ball balanced on a fingertip rotates about the vertical axis. The entirety of the ball's rotational motion can be described by a vector pointing along the axis of rotation (in this case, upward) with a magnitude equal to its rotational speed.

Electrons always have the same spin value of 1/2, defining the angular momentum vector length, so the only way to differentiate the spin of one electron from another is the direction of the spin vector. Practical devices have been created that can enable the alignment of electron spin vectors in two different orientations, referred to as up and down.

Electron spin generates a tiny magnetic field. Materials in which most electron spins are aligned directionally produce a magnetic field with the same orientation as the aligned electrons. The effect of these aligned electrons is apparent in common devices, such as refrigerator magnets.

The magnetic field produced by electron spin cannot be explained by classical physics. Magnetism is purely an effect of quantum physics.

A switching device called a spin valve can be constructed from a channel with a magnetic layer at each end. The magnetic layers function as gates. If the gates are of the same spin polarity, a current consisting of spin-polarized electrons can flow through the device. If the gates have opposite polarities, the current is blocked. A spin valve can be switched on and off by reversing the polarity of one of the magnets by applying current to it with the opposite spin direction.
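The on/off rule just described can be captured in a toy logical model. The sketch below encodes only the idealized behavior stated in the text (aligned gates conduct, opposed gates block); it is not a physical simulation, and the names Spin, spin_valve_conducts, and flip are hypothetical.

```python
from enum import Enum


class Spin(Enum):
    UP = "up"
    DOWN = "down"


def spin_valve_conducts(gate_a: Spin, gate_b: Spin) -> bool:
    """Idealized rule: current flows only when both magnetic gates share a polarity."""
    return gate_a is gate_b


def flip(spin: Spin) -> Spin:
    """Reverse a gate's polarity, modeling the switching action applied to one magnet."""
    return Spin.DOWN if spin is Spin.UP else Spin.UP


if __name__ == "__main__":
    fixed_gate, switched_gate = Spin.UP, Spin.UP
    print(spin_valve_conducts(fixed_gate, switched_gate))  # gates aligned
    switched_gate = flip(switched_gate)
    print(spin_valve_conducts(fixed_gate, switched_gate))  # gates opposed
```

In this model the valve behaves like a switch controlled by the polarity of a single gate, which is the role a transistor plays in a conventional digital circuit.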

Switching electron spin directions can be much faster while consuming much less power than the process of charging and discharging capacitors that underlies the functioning of today's CMOS digital devices. This is the key feature providing a glimpse of the potential for spintronics to eventually augment or replace CMOS circuitry in high-performance digital devices.

Spintronics is an area of ongoing, active research. The commercialization and production of digital devices that outperform today's CMOS processors is not likely to occur for several years, if the technology turns out to be viable at all.

Spintronics relies on the laws of quantum physics to perform digital switching. Quantum computing, the subject of the next section, directly exploits quantum-mechanical phenomena to perform analog and digital processing.

Quantum computing

Quantum computing holds the promise of dramatic execution speed improvements for certain classes of problems. Quantum computing uses quantum-mechanical phenomena to perform processing, and can employ analog or digital approaches to solve problems.

Digital quantum computing uses quantum logic gates to perform computing operations. Quantum logic gates are based on circuits called quantum bits, or qubits. Qubits are analogous in some ways to the bits in traditional digital computers, but there are significant differences. Traditional bits can take on only the values 0 and 1. A qubit can be in the 0 or 1 quantum state; however, it can also be in a superposition of the 0 and 1 states. The principle of quantum superposition states that any two quantum states can be added together and the result is a valid quantum state.

Whenever the value of a qubit is read, the result returned is always either 0 or 1. This is due to the collapse of the superposition of quantum states to a single state. If, prior to the readout, the qubit held the quantum value corresponding to the binary value 0 or 1, the output of the read operation will equal the binary value. If, on the other hand, the qubit contained a superposition of states, the value returned by the read operation will be a probabilistic function of the superposition of states.

In other words, the likelihood of receiving a 0 or 1 as the result of reading the qubit depends on the characteristics of its quantum state. The value returned by the read operation will not be predictable. The reason for this unpredictability is not simply a lack of knowledge; in quantum physics, a particle simply does not have a defined state until a measurement has been taken. This is one of the counterintuitive and, frankly, mind-bending features of quantum physics.

A qubit state that is close to the binary value 1 will have a higher probability of returning a value of 1 when read than one that is closer to the binary value of 0. Performing a read operation on multiple qubits that all begin in identical quantum states will not always produce the same result because of the probabilistic nature of the read operation.
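The probabilistic readout described above can be imitated on an ordinary computer. The following sketch assumes the standard textbook representation of a qubit state as two amplitudes, alpha for the 0 state and beta for the 1 state, with the probability of reading 1 equal to the squared magnitude of beta. That representation is not introduced in this chapter, and the function names measure and fraction_of_ones are hypothetical. This is a classical simulation using a pseudorandom number generator, not a quantum program.

```python
import math
import random


def measure(alpha: complex, beta: complex, rng: random.Random) -> int:
    """Simulate reading a qubit in the state alpha|0> + beta|1>.

    The probability of reading 1 is |beta|^2. The amplitudes are assumed
    to be normalized so that |alpha|^2 + |beta|^2 equals 1.
    """
    norm = abs(alpha) ** 2 + abs(beta) ** 2
    if not math.isclose(norm, 1.0, rel_tol=1e-9):
        raise ValueError("state is not normalized")
    return 1 if rng.random() < abs(beta) ** 2 else 0


def fraction_of_ones(alpha: complex, beta: complex, shots: int, seed: int = 0) -> float:
    """Prepare the same state `shots` times, read each copy, and return the share of 1s."""
    rng = random.Random(seed)
    return sum(measure(alpha, beta, rng) for _ in range(shots)) / shots


if __name__ == "__main__":
    # A qubit holding a definite 1 always reads as 1.
    print(fraction_of_ones(0.0, 1.0, shots=1000))
    # A superposition "closer to 1": most reads return 1, but not all of them.
    print(fraction_of_ones(math.sqrt(0.1), math.sqrt(0.9), shots=1000))
```

Reading many identically prepared qubits gives a mixture of 0s and 1s whose proportions reflect the underlying state, even though no single read can be predicted in advance.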

Qubit circuits can demonstrate and exploit the properties of quantum entanglement, a central principle of quantum physics. Quantum entanglement occurs when multiple particles are linked in a manner that causes the measurement of one of the particles to affect the measurement of the linked particles. The most surprising aspect of this linkage is that it remains in effect even when the particles are separated by great distances. The entanglement effect appears to propagate instantaneously, unrestricted by the speed of light. While this behavior may seem like science fiction, it has been demonstrated experimentally and has even been used in the communication technology of the NASA Lunar Atmosphere and Dust Environment Explorer (LADEE) that orbited the moon from 2013 to 2014.

Quantum computers are capable of exploiting entanglement in information processing. If you work through the examples at the end of this chapter, you will have an opportunity to develop a program for a quantum computer that exhibits the effects of quantum entanglement, and you will run this program on an actual quantum computer.
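Before moving to real hardware, it can help to see the correlation that entanglement produces in a small classical simulation. The sketch below tracks the four amplitudes of a two-qubit system, applies a Hadamard gate followed by a controlled-NOT gate (a common textbook recipe for an entangled pair, assumed here rather than taken from this chapter), and then reads both qubits. All function names are hypothetical, and the code uses only the Python standard library rather than any quantum computing toolkit.

```python
import math
import random

# The state is a list of four amplitudes over |00>, |01>, |10>, |11>,
# where the first digit is qubit A and the second digit is qubit B.


def hadamard_on_a(state):
    """Apply a Hadamard gate to qubit A, placing it in an equal superposition."""
    s = 1 / math.sqrt(2)
    a00, a01, a10, a11 = state
    return [s * (a00 + a10), s * (a01 + a11), s * (a00 - a10), s * (a01 - a11)]


def cnot_a_controls_b(state):
    """Apply a controlled-NOT gate: flip qubit B whenever qubit A is 1."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


def bell_pair():
    """Start in |00>, then entangle the two qubits."""
    return cnot_a_controls_b(hadamard_on_a([1.0, 0.0, 0.0, 0.0]))


def measure_both(state, rng):
    """Read both qubits; each outcome occurs with probability |amplitude|^2."""
    probabilities = [abs(a) ** 2 for a in state]
    outcome = rng.choices(range(4), weights=probabilities)[0]
    return outcome >> 1, outcome & 1  # (bit read from A, bit read from B)


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(5):
        print(measure_both(bell_pair(), rng))
```

Each individual read of qubit A is unpredictable, yet the value read from qubit B always matches it. This matching of otherwise random results is the signature of entanglement that the end-of-chapter exercises explore on actual quantum hardware.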

The somewhat unpredictable nature of the results returned by reading a qubit would seem to argue against using this technology as the basis for a digital computing system. This partial unpredictability is one reason why quantum computers are envisioned as useful for only certain classes of problems. Most customers would not appreciate a bank using a computer that calculates different account balances each time the computation is run because of quantum uncertainty.

Two key application categories currently envisioned for quantum computers are as follows:

  • Quantum cryptography: Quantum cryptography uses digital quantum computing techniques to break modern cryptographic codes. Many cryptographic algorithms in use today are based on the assumption that it is computationally infeasible to determine the factors of a large number (containing perhaps hundreds of decimal digits) that is the product of two large prime numbers. Factoring such a number on modern computers, even with a supercomputer or relying on thousands of processors operating in parallel in a cloud environment, cannot be expected to produce a correct result in a reasonable period of time.

    Shor's algorithm, developed by Peter Shor in 1994, describes the steps a quantum computer must perform to identify the prime factors of a given number. A quantum computer running Shor's algorithm can potentially factor a very large number in a much shorter time than ordinary computers, thereby rendering modern cryptographic systems based on public key cryptography vulnerable to such attacks. To date, quantum computing has only demonstrated the ability to factor relatively small numbers, such as 21, but the potential threat is recognized by organizations and governments that require high levels of communication security. The future may bring quantum computing systems capable of cracking the codes we use today for securing websites and online banking.

    However, there is probably little reason to be concerned about the security of your bank account against quantum attacks. An assortment of quantum-computing-resistant public key encryption algorithms are being researched. Collectively, these algorithms are referred to as post-quantum cryptography. We can expect a large-scale transition to quantum-resistant cryptographic algorithms in the event that the quantum threat to current cryptography methods becomes real.

  • Adiabatic quantum computation: This is an analog quantum computing approach that holds the promise of efficiently solving a wide variety of practical optimization problems. Imagine that you are in a rectangular region of hilly terrain surrounded by a fence. You need to find the lowest point within the fenced boundary. In this scenario, it is very foggy, and you cannot see the surrounding terrain. The only clue you have is the slope of the surface under your feet. You can follow the slope downward, but when you reach a level area, you can't be sure if you're in a local basin or have truly found the lowest point in the entire bounded region.

    This is an example of a simple two-dimensional optimization problem. The goal is to find the x and y coordinates of the lowest altitude in the entire region, called the global minimum, without being sidetracked and getting stuck in a basin at a higher altitude, which is referred to as a local minimum. You don't need anything as fancy as quantum computing to find the lowest point in a hilly region, but many real-world optimization problems have a larger number of inputs, perhaps 20 to 30, that must all be adjusted in the search for the global minimum. The computational power required to solve such problems is beyond the capability of even today's fastest supercomputers.

    The quantum computing approach to solving such problems begins by setting up a configuration of qubits containing the superposition of all possible solutions to the problem, then slowly reducing the superposition effect. By constraining the state of the quantum circuit configuration during this process, it is possible to ensure that the solution that remains after superposition has been removed, and all of the quantum bits are resolved to discrete 0 or 1 values, is the global minimum. The term adiabatic in the name of this approach refers to an analogy between the process of removing the superposition and a thermodynamic system that neither loses nor gains heat as it operates.

    Adiabatic quantum optimization is an area of active research. It remains to be seen what level of capability this technology can ultimately bring to the solution of complex optimization problems.
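The foggy-terrain scenario in the adiabatic quantum computation item can be reproduced with a few lines of classical code. The sketch below is not a quantum algorithm; it implements the slope-following walker from the analogy, using ordinary gradient descent, to show how the starting position determines whether the walker ends in a local basin or at the global minimum. The terrain function altitude, the step size, and the iteration count are all invented for illustration.

```python
def altitude(x: float, y: float) -> float:
    """A made-up terrain with two basins; the one at negative x is deeper."""
    return (x * x - 1.0) ** 2 + 0.3 * x + y * y


def slope(x: float, y: float) -> tuple[float, float]:
    """Partial derivatives of altitude with respect to x and y."""
    return 4.0 * x * (x * x - 1.0) + 0.3, 2.0 * y


def descend(x: float, y: float, step: float = 0.01, iterations: int = 5000) -> tuple[float, float]:
    """Follow the local slope downhill, as the walker in the fog would."""
    for _ in range(iterations):
        dx, dy = slope(x, y)
        x -= step * dx
        y -= step * dy
    return x, y


if __name__ == "__main__":
    for start in ((1.5, 1.0), (-1.5, 1.0)):
        x, y = descend(*start)
        print(f"start={start} -> x={x:.3f}, y={y:.3f}, altitude={altitude(x, y):.3f}")
```

The walker that starts at positive x settles in the shallower basin and has no way of knowing that a lower point exists elsewhere. Avoiding this trap without exhaustively searching the terrain is the difficulty that grows rapidly as the number of inputs increases.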

The term quantum supremacy describes the transition point at which quantum computing exceeds the capability of traditional digital computing in a particular problem domain. There is spirited debate among researchers as to whether quantum supremacy has been achieved by any of the major organizations developing quantum computing technologies; when this point may be reached at a future date; or whether such a transition is ever going to occur.

A number of substantial barriers stand in the way of the widespread deployment of quantum computing in a manner similar to the ubiquitous use of CMOS-based computing devices by users around the world today. Some of the most pressing issues to be addressed are as follows:

  • Increasing the number of qubits in a computer to support the solution of large, complex problems
  • Providing the ability to initialize qubits to arbitrary values
  • Providing mechanisms to reliably read the state of qubits
  • Eliminating the effects of quantum decoherence
  • Reducing the cost and improving the availability of the components required for quantum computers

Quantum decoherence refers to the loss of phase coherence in a quantum system. For a quantum computer to function properly, phase coherence must be maintained within the system. Quantum decoherence results from interference from the outside world in the internal operation of the quantum system, or from interference generated internally within the system. A quantum system that remains perfectly isolated can maintain phase coherence indefinitely. Disturbing the system, for example by reading its state, disrupts the coherence and may lead to decoherence. The management and correction of decoherence effects is referred to as quantum error correction. The effective management of decoherence is one of the greatest challenges in quantum computing.

Current quantum computer designs rely on exotic materials such as Helium-3, which is produced by nuclear reactors, and they require superconducting cables. Quantum-computing systems must be cooled to temperatures near absolute zero during operation. Current quantum computers are mostly laboratory-based systems that require a dedicated staff of experts for their construction and operation. This situation is somewhat analogous to the early days of vacuum-tube-based computers. One major difference from the vacuum tube days is that today we have the Internet, which provides ordinary users with a degree of access to quantum-computing capabilities.

Current quantum-computing systems contain at most a few dozen qubits and are mainly accessible only to the commercial, academic, and government organizations that fund their development. There are, however, some unique opportunities for students and individuals to gain access to real quantum computers.

One example is the IBM Quantum Experience at https://www.ibm.com/quantum-computing/. With this free collection of resources, IBM provides a set of tools, including a quantum algorithm development environment called Qiskit, available at https://www.qiskit.org/. Using the Qiskit tools, developers can learn to code quantum algorithms and can even submit programs for execution in batch mode on a real quantum computer. The exercises at the end of this chapter suggest steps you can take to get started in this domain.

Quantum computing shows great promise for addressing particular categories of problems, though the widespread commercialization of the technology is most likely several years away.

The next technology we will examine is the carbon nanotube, which has the potential to move digital processing at least partially away from the world of silicon.

Carbon nanotubes

The carbon nanotube field-effect transistor (CNTFET) is a transistor that uses either a single carbon nanotube or an array of carbon nanotubes as the gate channel rather than the silicon channel of the traditional MOSFET. A carbon nanotube is a tubular structure constructed from carbon atoms with a diameter of approximately 1 nanometer.

Carbon nanotubes are exceptionally good electrical conductors, exhibit high tensile strength, and conduct heat very well. A carbon nanotube can sustain current densities over 1,000 times greater than metals such as copper. Unlike in metals, electrical current can propagate only along the axis of the nanotube.

Compared to MOSFETs, CNTFETs have the following advantages:

  • Higher drive current.
  • Substantially reduced power dissipation.
  • Resilience to high temperatures.
  • Excellent heat dissipation, allowing for high-density packing of the devices.
  • The performance characteristics of n-channel and p-channel CNTFET devices match closely. In CMOS devices, on the other hand, there can be substantial variation between the performance of the n-channel and p-channel transistors. This limits overall circuit performance to the capabilities of the lower performing devices.

As with the other emerging technologies discussed in this chapter, CNTFET technology faces some substantial barriers to commercialization and widespread use:

  • Production of CNTFETs is very challenging because of the need to place and manipulate the nanometer-scale tubes.
  • Production of the nanotubes required for CNTFETs is also very challenging. The nanotubes can be thought of as starting from flat sheets of carbon fabric that must be rolled into tubes along a specific axis in order to produce a material with the desired semiconducting properties.
  • Carbon nanotubes degrade rapidly when exposed to oxygen. Fabrication technologies must take this into account to ensure the resulting circuit is durable and reliable.

Given the challenges of mass-producing CNTFETs, it will likely be several years before commercial devices begin to make wide use of carbon nanotube-based transistors.

The preceding sections have identified some advanced technologies (spintronics, quantum computing, and carbon-nanotube-based transistors) as promising areas that may someday contribute substantially to the future of computing. None of these technologies are in wide-scale use at the time of writing, but research has shown promising results, and many government, university, and commercial laboratories are hard at work developing these technologies and finding ways to put them to use in the computing devices of the future.

In addition to technologies such as these that are widely reported and appear to be advancing along at least a semipredictable path, there is always the possibility that an organization or individual may announce an unanticipated technological breakthrough. This may occur at any time, and such an event may upend the conventional wisdom regarding the anticipated path for the future. Only time will tell.

In the context of the uncertainty of the road ahead for computer architectures, it is prudent for the architecture professional to devise a strategy to ensure ongoing relevance regardless of the twists and turns future technology takes. The next section presents some suggestions for staying up to date with technological advances.

Building a future-tolerant skill set

Given the technological transitions that kicked off the era of transistor-based digital computing, and the likelihood of similar future events, it is important for professionals in the field of computer architecture to keep up with ongoing advances and to develop some intuition as to the likely directions the technology will take in the future. This section provides some recommended practices for keeping up with state-of-the-art technology.

Continuous learning

Computer architecture professionals must be willing to embrace the idea that technology continues to evolve rapidly, and they must devote substantial ongoing efforts to monitoring advances and factoring new developments into their day-to-day work and career-planning decisions.

The prudent professional relies on a wide variety of information sources to track technological developments and assess their impact on career goals. Some sources of information, such as traditional news reports, can be skimmed quickly and fully absorbed. Other sources, such as scientific literature and websites curated by experts in particular technologies, require time to digest complex technical information. More advanced topics, such as quantum computing, may require extended study just to grasp the fundamentals and begin to appreciate potential applications of the technology.

Even with a clear understanding of a particular technology, it can be challenging, or even impossible, to accurately predict its impact on the industry and, ultimately, the ways it will be integrated into the architectures of computing systems used by governments, businesses, and the public.

A practical and easy-to-implement approach for information gathering is to develop a collection of trusted sources for both mainstream and technical news and keep up to date with the information they offer. Mainstream news organizations, including television news, newspapers, magazines, and websites, often publish articles about promising technological developments and the impacts digital devices are having on societies around the world. In addition to discussing the purely technical aspects of computing systems (to some degree), these sources provide information on the social impact of computing technologies, such as concerns about their use for government and corporate surveillance and their employment in the spread of disinformation.

Technical websites operated by research organizations, individual technology experts, and enthusiastic users offer an immense quantity of information related to advances in computer architecture. As with all information accessible on the Internet, it is advisable to consider the reliability of the source whenever you encounter surprising information. While there are many spirited debates underway regarding the efficacy of individual early-stage technologies, there are also some people who appear to disagree with published information just to hear themselves argue. It is ultimately up to you to determine how much credence you should grant any opinions expressed on a web page.

Although individuals will have their own preferences, and the landscape of technology news sources is ever-changing, the following list provides a few fairly reliable sources of news on computing technology, in no particular order:

  • https://techcrunch.com/: TechCrunch reports on the business of the tech industry.
  • https://www.wired.com/: Wired is a monthly magazine and website that focuses on how emerging technologies affect culture, the economy, and politics.
  • https://arstechnica.com/: Ars Technica, founded in 1998, publishes information targeted at technologists and information technology professionals.
  • https://www.tomshardware.com/: Tom's Hardware provides news, articles, price comparisons, and reviews of computer hardware and high-technology devices.
  • https://www.engadget.com/: Engadget, founded in 2004, covers the intersection of gaming, technology, and entertainment.
  • https://gizmodo.com/: Gizmodo focuses on design, technology, and science fiction. The website tagline is "We come from the future."
  • https://thenextweb.com/: TNW was started in 2006 to bring insight and meaning to the world of technology.

This list, while by no means complete, provides some starting points for gathering information on the current state and near future of computing technology and its applications.

Information retrieved online can, when approached from a reasonably skeptical viewpoint, provide a current and accurate picture of the state of advances in computer architecture. Information consumed in this manner does not, however, provide an education with the rigor associated with formal schooling, nor does it provide any form of public declaration that you have absorbed this information and are capable of making use of it in a professional context.

A college degree, the subject of the next section, provides a thorough grounding in a subject discipline and is generally accepted by potential employers and clients as evidence of the attainment of professional skills.

College education

If it has been a few years since you last attended college, or if you began your career without a college degree, it may be time to consider enrolling in a degree program. If even the thought of undertaking such a journey seems out of the question because of work or family responsibilities, consider that many accredited institutions offering excellent programs in areas of study directly related to computer architecture provide fully online education experiences. Online classes, combined with proctored examinations, can lead to Bachelor's and Master's degrees in technical disciplines from some of the most respected universities in the world.

For workers with a degree who have been in the workforce for several years, the technology and analytical methods learned in school may have become stale and obsolete to some degree. To restore relevance and remain fully informed about the forefront of technologies involved in the design and production of modern computer systems, the best approach may be a return to the classroom to gain a deeper understanding of technical advances that have occurred in the intervening years.

If you are not prepared to commit to a degree program, many institutions offer online courses leading to a certificate in a subject area such as computer hardware engineering or computer engineering technology. While providing a lesser credential than a Bachelor's or Master's degree, completion of a technology certificate program nevertheless demonstrates a level of educational attainment and knowledge of the subject matter.

There will be some expense for tuition and books when taking college courses, whether the learning venue is in-person or online. Some employers are willing to provide partial or complete funding for the participation of employees in accredited degree programs. This funding may be accompanied by a mandatory commitment by the student to remain with the employer for some period following completion of the coursework. Students should take care to fully understand any obligations they may incur if circumstances require them to withdraw from school or leave the employer.

Many websites are available to assist with a search for an online college degree or certificate program that meets your needs.

Without being too repetitive with our warnings, you should carefully scrutinize any information gleaned from the Internet regarding online colleges. Ensure any institution under consideration is appropriately accredited and that the degrees it confers are accepted and valued by employers.

Those with the necessary resources, possibly with support provided by an employer, may even consider becoming a full-time student for the duration of a degree program. Employers who pay for degree programs will typically expect the student to agree to a binding commitment to the organization following completion of such a program. This approach can provide the quickest turnaround to a college degree and, in many cases, presents opportunities for participation in cutting-edge research on some of the most advanced computing technologies under development.

While a college degree from a respected institution in a relevant field of study is the gold-standard credential sought by employers and recognized by peers, opportunities are available to keep up with the latest research findings through participation in conferences and by reading scientific literature. These learning options are explored in the next section.

Conferences and literature

For professionals interested in keeping up with the leading edge of research in technologies related to the computer architectures of the future, there may be no better forum than hearing about the latest developments from the researchers themselves. There are regular conferences at locations around the world on every advanced computing topic you can imagine. For example, a list of worldwide conferences on the subject of quantum behavior, including many focusing on aspects of quantum computing, is available at http://quantum.info/conf/index.html.

As with other information from the Internet, it is helpful to view any unfamiliar conference with a degree of skepticism until you have vetted it thoroughly. There is, unfortunately, a phenomenon known as junk conferences, in which predatory individuals or organizations arrange conferences for the purpose of revenue generation rather than for sharing scientific knowledge. Be sure that any conference you sign up for and attend is overseen by a reputable organization and contains presentations by legitimate researchers in subject areas relevant to the conference.

There is a wide variety of scientific literature related to ongoing advances in technologies related to computer architecture. Professional organizations, such as IEEE, publish numerous scholarly journals devoted to the cutting edge of current research. Journals such as these are intended to communicate directly from researcher to researcher, so the level of technical knowledge expected of readers is quite high. If you have the necessary background and the willingness to appreciate the details in the papers published in scientific journals, you can read them to establish and maintain a level of knowledge on par with that of the scientists and engineers developing the next generation of computing technology.

Summary

Let's briefly review the topics we've discussed and learned about in the chapters of this book:

  • In Chapter 1, Introducing Computer Architecture, we began with the earliest design of an automated computing machine, Babbage's Analytical Engine, and traced the course of digital computer history from the earliest vacuum tube-based computers through to the first generations of processors. We also looked at the architecture of an early, but still prevalent, microprocessor: the 6502.
  • In Chapter 2, Digital Logic, we learned the basics of transistor technology, digital logic, registers, and sequential logic. We also discussed the use of hardware description languages in the development of complex digital devices.
  • Chapter 3, Processor Elements, covered the fundamental components of processors, including the control unit, the ALU, and the register set. The chapter introduced concepts related to the processor instruction set, including details on 6502 addressing modes, instruction categories, interrupt processing, and I/O operations.
  • Chapter 4, Computer System Components, introduced the MOSFET transistor and described its use in DRAM circuit technology. The chapter covered the processing and communication subsystems of modern computers, including the I/O subsystem, graphics displays, the network interface, and interfaces for the keyboard and mouse.
  • In Chapter 5, Hardware-Software Interface, we learned about the inner workings of drivers and how the BIOS firmware of the original PC has transitioned to UEFI in modern computers. This chapter covered the boot process and the concepts associated with processes and threads in modern operating systems.
  • Chapter 6, Specialized Computing Domains, introduced the unique features of real-time computing, digital signal processing, and GPU processing. Examples of specialized computing architectures relying on unique processing capabilities were presented, including cloud computer servers, business desktop computers, and high-performance gaming computers.
  • Chapter 7, Processor and Memory Architectures, addressed processor and memory architectures, including the unique features of the von Neumann, Harvard, and modified Harvard architectures. The chapter described the distinction between physical and virtual memory, and introduced the architecture of paged virtual memory, including the functions of an MMU.
  • In Chapter 8, Performance-Enhancing Techniques, we learned about a variety of techniques used in modern processors to accelerate instruction execution speed. Topics included cache memory, instruction pipelining, superscalar processing, simultaneous multithreading, and SIMD processing.
  • Chapter 9, Specialized Processor Extensions, addressed several auxiliary processor capabilities, including privileged execution modes, floating-point mathematics, power management, and system security management.
  • Chapter 10, Modern Processor Architectures and Instruction Sets, delved into the details of the architectures and instruction sets of the most prevalent 32-bit and 64-bit modern processors. For each of the x86, x64, 32-bit ARM, and 64-bit ARM processor architectures, the chapter introduced the register set, addressing modes, and instruction categories, and presented a simple but functional assembly language program.
  • Chapter 11, The RISC-V Architecture and Instruction Set, examined the features of the RISC-V architecture in detail. The chapter introduced the base 32-bit architecture, including the register set, instruction set, and standard extensions to the instruction set. Additional topics included the 64-bit version of the architecture and standard configurations available as commercially produced RISC-V processors. The chapter included a simple RISC-V assembly language program and provided guidance for implementing a RISC-V processor in a low-cost FPGA device.
  • Chapter 12, Processor Virtualization, introduced concepts associated with processor virtualization, including challenges that virtualization tools must overcome. The techniques used to implement virtualization in modern processor families, including x86, ARM, and RISC-V, were discussed. Several popular virtualization tools were described, and virtualization approaches used in cloud computing environments were presented.
  • Chapter 13, Domain-Specific Computer Architectures, examined some specific computer architectures, including smartphones, personal computers, warehouse-scale cloud-computing environments, and neural networks. The unique processing requirements associated with each of these domains were examined, and the tailoring of processor hardware to optimize the trade-off between cost, performance, and power consumption in each case was discussed.

In this chapter, we attempted to gain some perspective on the road ahead for computer architectures. We reviewed the major advances and ongoing trends that have led to the current state of computer design and attempted to extrapolate forward to identify the directions the development of computing system architectures is likely to take in the future. We also examined some potentially disruptive technologies that could alter the path of future computer architectures. To get a tiny glimpse into this future, if you work through the exercises at the end of this chapter, you will develop a quantum computing algorithm and run it on an actual quantum computer, for free!

This chapter also reviewed some suggested approaches for professional development for the computer architect that should lead to a skill set that remains relevant and tolerant of future advances, whatever they may be.

Having completed this chapter, and this book, you will have a good understanding of the evolution of computer architecture design from the earliest days to its current state, and will be familiar with ongoing trends in computer architecture that are likely to indicate future technological directions. You will also be aware of some potentially disruptive technologies that may substantially alter computer architectures in the future. Finally, you will have learned some useful techniques for maintaining a current skill set in the field of computer architecture.

This brings us to the end of the book. I hope you have enjoyed reading it and working through the exercises as much as I have enjoyed writing it and working through the exercises myself.

Exercises

  1. Install the Qiskit quantum processor software development framework by following the instructions at https://qiskit.org/documentation/install.html. The instructions suggest the installation of the Anaconda (https://www.anaconda.com/) data science and machine learning tool set. After installing Anaconda, create a Conda virtual environment named qiskitenv to contain your work on quantum code and install Qiskit in this environment with the command pip install qiskit. Make sure that you install the optional visualization dependencies with the pip install qiskit-terra[visualization] command.
  2. Create a free IBM Quantum Experience account at https://quantum-computing.ibm.com/. Locate your IBM Quantum Services API token at https://quantum-computing.ibm.com/account and install it into your local environment using the instructions at https://qiskit.org/documentation/install.html.
  3. Work through the example quantum program at https://qiskit.org/documentation/tutorials/fundamentals/1_getting_started_with_qiskit.html. This example creates a quantum circuit containing three qubits that implements a Greenberger–Horne–Zeilinger (GHZ) state. The GHZ state exhibits key properties of quantum entanglement. Execute the code in a simulation environment on your computer.
  4. Execute the code from Exercise 3 on an IBM quantum computer.
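
Before working through Exercise 3, it may help to see what a three-qubit GHZ circuit computes. The following is a minimal plain-Python sketch, not Qiskit code and not the tutorial's implementation. It assumes the common GHZ construction of one Hadamard gate on qubit 0 followed by CNOT gates from qubit 0 to qubits 1 and 2, and it tracks the eight amplitudes of the state vector directly. The function names are illustrative, and the convention that qubit q corresponds to bit q of the state index is a choice made for this sketch.

```python
import math


def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a state vector."""
    h = 1 / math.sqrt(2)
    mask = 1 << qubit
    new_state = state[:]
    for index in range(len(state)):
        if (index & mask) == 0:
            a0 = state[index]          # amplitude with this qubit = 0
            a1 = state[index | mask]   # amplitude with this qubit = 1
            new_state[index] = h * (a0 + a1)
            new_state[index | mask] = h * (a0 - a1)
    return new_state


def apply_cnot(state, control, target):
    """Flip the target qubit in every basis state where the control is 1."""
    control_mask = 1 << control
    target_mask = 1 << target
    new_state = state[:]
    for index in range(len(state)):
        if index & control_mask:
            new_state[index ^ target_mask] = state[index]
    return new_state


def ghz_state():
    """Build the three-qubit GHZ state starting from all qubits in state 0."""
    state = [0.0] * 8
    state[0] = 1.0
    state = apply_hadamard(state, 0)
    state = apply_cnot(state, 0, 1)
    state = apply_cnot(state, 0, 2)
    return state


def measurement_probabilities(state):
    """Map each bit string with nonzero amplitude to its probability."""
    return {format(i, "03b"): abs(a) ** 2
            for i, a in enumerate(state) if abs(a) > 1e-12}


if __name__ == "__main__":
    for bits, p in measurement_probabilities(ghz_state()).items():
        print(f"{bits}: {p:.3f}")
```

Only the outcomes 000 and 111 appear, each with a probability of about 0.5. The three qubits always agree when measured, even though no individual result is predictable in advance, which is the entanglement property the exercise demonstrates. A simulation run on your computer should produce counts close to this distribution, while results from real hardware typically also include a small number of other outcomes caused by noise.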