The traditional definition of Moore's Law is: the density of transistors on a semiconductor chip doubles, on average, every 18 to 24 months.

It was first proposed in April 1965 by Gordon Moore, co-founder of the chip company Intel, in a paper titled "Cramming More Components onto Integrated Circuits".

In August 2013, Bob Colwell, formerly Intel's chief chip architect, declared at an industry conference in Chicago that Moore's Law in the chip industry is about to end:

"From the planning time frame, I think 2020 may be the earliest end of Moore's Law. You may be able to convince me to stretch it to 2022, but whether it [the gate length, the minimum feature size of the transistor gate] is 7 nanometers or 5 nanometers, this [the end of Moore's Law] is a big deal." (For comparison, the diameter of an ordinary human hair is about 75,000 nanometers.)

Colwell was neither the first nor the last to predict the end of Moore's Law.

Moore himself predicted in 1995 that Moore's Law would end by 2005.

In 2015, Moore predicted once again that Moore's Law would end by 2025.

The main reason behind the most recent predictions of its end is this: if by 2025 the gate length shrinks to only 3 nanometers, it will be just ten atoms long. At that scale, the behaviour of electrons enters the territory of the quantum-mechanical uncertainty principle, and the reliability of transistors can no longer be guaranteed.
In addition, at that scale, transistor heat dissipation and chip production cost control appear to pose insurmountable technical challenges.

Will Moore's Law really end?

If so, does it mean that the development of science and technology will stagnate, leaving humanity to sit idle on Earth and wait for the end?

If not, what does it mean for the future progress of human civilization?

Before looking to the future, it is well worth reviewing how Moore's Law has evolved over the past fifty years.

The transistor density Moore originally discussed in his 1965 paper was not the maximum number of transistors that could be crammed onto a chip, but the optimal number from the perspective of production cost.

When producing chips, increasing the number of transistors generally reduces the unit cost per transistor. Beyond a critical point, however, the probability of production defects rises and begins to offset the benefit of increased density. Integrated-circuit design and production must therefore seek a sweet spot.

Moore predicted in 1965 that transistor density would double every year for the next ten years: the number of transistors on a chip would grow from 64 in 1965 to 65,000 by 1975.

A memory chip Intel produced in 1975 (with an area of a quarter of a square inch, about 161 square millimetres) reached 32,000 transistors, quite close to Moore's initial prediction.

In 1975, Moore summarized in a paper the main reasons chip density had increased over the previous ten years:
1. Miniaturization of transistors

2. Increased chip area

3. New design techniques that improve space utilization

Improvements in space utilization are ultimately limited, however, so in 1975 Moore revised his forecast, changing the growth rate of transistor density from doubling every year to doubling every two years.

Take memory chips as an example. In 2000, a DRAM chip held 256,000,000 transistors on an area of 204 square millimetres. Compared with 1975, transistor density had increased about 6,300-fold in 25 years. (Doubling every two years, per the revised Moore's Law, predicts roughly a 5,800-fold increase over 25 years, which is fairly close.)

The storage capacity of the corresponding chips grew from 0.001 Mb to 256 Mb, an increase of about 250,000 times.

Traditional engineering design usually requires weighing multiple factors against one another. But for a long time, miniaturizing transistors not only increased density in practice, it also made transistors faster and less power-hungry, with no other limiting factor to worry about.

On average, a new generation of chip process technology arrived every two years: the gate length shrank by 30% (×0.7), transistor density correspondingly doubled, the delay between transistors fell by 30%, the corresponding clock frequency rose by 40%, transistor voltage dropped by 30%, and energy consumption per transistor switch halved. Since the total number of transistors doubled, overall energy consumption stayed the same while the overall circuit ran 40% faster.

At the beginning of this century, however, the miniaturization of transistors hit a bottleneck.
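These per-generation scaling rules compound geometrically. A minimal sketch in Python, using the approximate rules of thumb quoted above (illustrative factors, not measured process data):

```python
# Per-generation scaling factors quoted in the text (classic
# Dennard-style scaling; illustrative rules of thumb, not exact data).
DENSITY = 2.0            # transistor density doubles
FREQUENCY = 1.4          # clock frequency rises ~40%
ENERGY_PER_SWITCH = 0.5  # switching energy per transistor halves

def after_generations(n):
    """Cumulative factors after n process generations (~2 years each)."""
    return {
        "density": DENSITY ** n,
        "frequency": FREQUENCY ** n,
        # Doubled transistor count times halved per-switch energy:
        # overall energy consumption stays flat, as the text notes.
        "energy": (DENSITY * ENERGY_PER_SWITCH) ** n,
    }

# 25 years is ~12.5 generations: density grows about 5,800-fold,
# matching the DRAM comparison above.
print(round(after_generations(12.5)["density"]))
```

Running the last line reproduces the roughly 5,800× density figure cited for the 1975-to-2000 DRAM comparison.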
When the gate length dropped below 100 nanometers, transistor leakage became serious and could no longer be ignored.

The essence of a transistor is to use its "on" and "off" states to represent the 1 and 0 of binary.

The field-effect transistor used in integrated circuits consists mainly of three parts: the source, the gate, and the drain. The gate is essentially a capacitor. When a voltage is applied to it, the channel under the gate connects the source and the drain and the transistor turns on, representing a "1". When the voltage is removed, the current drops to zero and the transistor turns off, representing a "0".

The clock frequency people usually quote for a CPU is the speed of this transistor switching: 1 GHz means the transistors can switch one billion times per second.

Why did the human computing revolution choose the transistor?

Because the continuous miniaturization of transistors has made computing power per unit of production cost grow exponentially.

By contrast, the ancient abacus saw no substantial improvement in either the speed of flicking its beads (the analogue of transistor switching speed) or its data capacity in more than two thousand years.

As transistors kept shrinking, various leakage problems became a major obstacle to the continuation of Moore's Law.
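The idealized on/off behaviour described above can be caricatured in a few lines of Python (a toy model; the threshold voltage is an illustrative value), and it is precisely this ideal that heavily miniaturized transistors fail to live up to:

```python
def ideal_fet_current(gate_v, threshold_v=0.7):
    """Idealized field-effect transistor: the channel conducts
    (logical 1) only when the gate voltage exceeds the threshold.
    The 0.7 V threshold is an illustrative value, not process data."""
    return 1.0 if gate_v > threshold_v else 0.0

# An idealized transistor draws no current at all when "off"...
print(ideal_fet_current(0.0))
# ...and conducts fully when "on".
print(ideal_fet_current(1.0))
```

In real miniaturized transistors, a small current flows even at zero gate voltage, which is the leakage problem discussed next.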
Leakage means a sharp increase in energy consumption: the chip overheats or even fails.

A typical type of leakage is so-called "gate oxide leakage".

Under the gate of a traditional field-effect transistor lies a layer of silicon dioxide, whose thickness decreases along with the miniaturization of the transistor (otherwise it would reduce the gate capacitance and hurt the transistor's performance). By the time the gate length shrank to the order of 45 nanometers, the effective thickness of the silicon dioxide was only about one nanometer. At that thickness, quantum tunnelling causes serious leakage through the gate.

Intel's eventual solution, after tens of thousands of experiments, was to replace the silicon dioxide with a "high dielectric constant" (high-k) material based on hafnium oxide. Its physical thickness does not have to shrink, yet the gate capacitance is unaffected.

The 45-nanometer chips Intel introduced in 2007 cut gate leakage by more than 90% compared with the previous generation of technology.

Another type of leakage comes from the so-called "short channel effect". In short, as the gate length of the transistor keeps shrinking, the threshold voltage at which the transistor conducts keeps dropping, and a weak current passes even at zero gate voltage.

The essence of the problem is that when the gate is short, the drain itself acts as a capacitor and competes with the gate. The smaller the gate, the harder it becomes to control the leakage current flowing between source and drain along paths far from the gate. As shown in the figure below.

In 1996, when the industry was still producing 250 nm chips, the received view was that miniaturizing transistors below 100 nm was all but impossible.
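Returning for a moment to the high-k fix described above: its logic follows from the parallel-plate capacitance formula C = k·ε₀·A/d, where raising the permittivity k lets the physical thickness d grow without lowering C. A sketch with typical textbook permittivities (assumed values, not Intel's actual process parameters):

```python
# Relative permittivities: typical textbook values, used here as
# illustrative assumptions, not Intel's actual process parameters.
K_SIO2 = 3.9   # silicon dioxide
K_HFO2 = 25.0  # hafnium-oxide-based high-k dielectric

def equivalent_oxide_thickness(physical_nm, k):
    """Thickness of SiO2 giving the same gate capacitance,
    since C = k * eps0 * A / d."""
    return physical_nm * K_SIO2 / k

# A ~3 nm hafnium-oxide layer matches the capacitance of roughly
# 0.5 nm of SiO2, while staying thick enough to suppress tunnelling.
print(equivalent_oxide_thickness(3.0, K_HFO2))
```

The physically thicker layer is what chokes off the tunnelling current while preserving the gate's electrical behaviour.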
Yet even in 1996, the Defense Advanced Research Projects Agency (DARPA) was already thinking ahead to miniaturization at 25 nm, where leakage from the short channel effect would be a central challenge.

Professor Chenming Hu of the University of California, Berkeley, received DARPA funding in 1997 and proposed the FinFET design concept. The essence of the idea is to wrap the gate around three sides of the channel, so that no path between source and drain is ever far from the gate, greatly reducing the leakage caused by the short channel effect.

Because the raised channel is shaped like a fin, the design is called the FinFET. (FET is the abbreviation of "field-effect transistor".)

More than ten years later, after overcoming numerous production challenges, Intel used FinFET technology for the first time in its 22 nm chips in 2011. Gordon Moore called it "the most radical change in the semiconductor industry in the past 40 years".

If history is any guide, breaking the physical limits of transistor miniaturization is not as hopeless as today's pessimists suggest. Problems that at first seem insurmountable find unexpected solutions when viewed from a different angle.

Moore's Law originally spoke of transistor density.

Increased density means miniaturization of transistors. But miniaturization is only the surface: improving computing power at the same production cost and energy consumption is the essence of Moore's Law. Seen this way, there are actually many ways to keep Moore's Law going.

Before 2002, as chip density increased, CPU clock frequency kept rising with it. For ordinary consumers, the frequency of the CPU stood for the speed of the computer.
The first IBM PC, shipped in 1981, had a CPU running at 4.77 MHz, i.e. 4.77 million clock cycles per second. Assuming the CPU executes one instruction per clock cycle, the higher the frequency, the faster the machine.

In 1995, the clock frequency of the Pentium chip reached 100 MHz, more than twenty times that of the first IBM PC.

In 2002, the clock frequency of Intel's new Pentium chips exceeded 3,000 MHz (3 GHz) for the first time.

The first major physical constraint on clock frequency is the signal propagation delay between transistors. This is why greater transistor density allows higher clock frequencies.

After 2002, the rise in CPU clock frequency hit a second technical bottleneck: energy consumption.

Simply put, the energy consumption of a CPU is roughly proportional to the cube of its clock frequency. Beyond about 3 GHz, further increases in frequency overheat the chip and risk burning it out.

Indeed, since 2002 the clock frequencies of Intel CPUs have mostly stayed between 2 GHz and 4 GHz, with no substantial increase in fourteen years.

But a flat clock frequency does not mean CPU performance is stagnant, just as the human brain has not changed essentially in 200,000 years, yet that has not prevented ground-breaking progress in human civilization.

The most useful idea at such moments is to find a new dimension along which to attack the problem.

If the clock speed of the CPU is like the calculation speed of the human brain, then the CPU's memory read speed is like the speed at which a person can retrieve information.
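Before moving on to that new dimension, the cubic rule of thumb quoted above is worth putting in numbers, because it shows why squeezing out even one more gigahertz became prohibitively expensive:

```python
def relative_power(f_new_ghz, f_old_ghz):
    """Approximate relative CPU energy consumption under the text's
    rule of thumb that power scales with the cube of clock frequency."""
    return (f_new_ghz / f_old_ghz) ** 3

# Pushing a 3 GHz part to 4 GHz costs roughly 2.4x the power,
# even though the clock is only ~33% faster.
print(relative_power(4.0, 3.0))
```

A 33% frequency gain for 2.4× the power (and heat) is a losing trade, which is why the industry went looking elsewhere.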
Memory read speed is the first of these new dimensions for improving CPU performance.

Anyone with basic work or research experience knows the feeling: most of the time, the bottleneck limiting productivity is looking things up and finding things. If you cannot find what you need, you can only flail about.

Twenty years ago, a researcher hunting for material had to go to the library, and if a small library did not have it, on to a larger one. Before computerized search, one had to flip through index cards one by one; finding the material took hours or more, exceeding the time spent on the actual research and analysis. This is completely different from today, when most of the world's papers can be precisely searched and downloaded over the Internet within ten seconds.

A computer's memory architecture is subdivided into registers, cache, memory, and disk. The cache is in turn subdivided into a level-1 cache, a level-2 cache, a level-3 cache, and sometimes even a level-4 cache.

By analogy: the data in a register is like the note written on the piece of paper in your hand. The amount of information is tiny, but it is available with no wait at all.

The level-1 cache is like a book on your desk: more information, available at arm's reach.

The level-2 cache is like a book in a drawer: still quickly available once you open the drawer.

Memory is like a book on the shelf: you have to stand up to fetch it.

The disk is like the material in the library: it takes hours to go out and retrieve it.

A researcher who cannot quickly get the material he needs has to run to the library every day. Even a reborn Newton or Einstein would find his brilliant brain, like a fast CPU, idling ineffectively as he trudges painfully back and forth to the library.
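The library analogy can be made quantitative: at a given clock frequency, every nanosecond of read latency is so many wasted clock cycles. A small sketch (the 3.4 GHz clock and the latencies below are typical illustrative values, not measurements of any particular chip):

```python
def cycles_lost(latency_ns, clock_ghz=3.4):
    """Clock cycles a core idles while waiting on a single read.
    The 3.4 GHz default is an illustrative clock frequency."""
    return latency_ns * clock_ghz

print(cycles_lost(3))          # cache hit: ~10 cycles
print(cycles_lost(70))         # main-memory (DRAM) access: ~240 cycles
print(cycles_lost(4_000_000))  # disk access: ~13.6 million cycles
```

The three orders of magnitude between a cache hit and a disk access are exactly the gap between the note in your hand and the trip to the library.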
Take Intel's i7-4770 CPU as an example: its clock frequency is 3.4 GHz. Reading from the level-1 or level-2 cache has a latency of roughly 5 to 12 clock cycles, about 2 to 4 nanoseconds. Reading from memory lags by about 70 nanoseconds, more than 200 clock cycles. If the data is not in memory either, it must unfortunately be fetched from disk, with a delay above 4 milliseconds (4 million nanoseconds); at that point even the fastest CPU clock can only sit and wait.

As Moore's Law advanced, the gap between CPU clock speed and the read latency of ordinary memory (DRAM) kept widening, at a rate of about 50% per year.

To ease this contradiction, the cache first appeared, in external form, with Intel's 386 processor in 1985.

The first cache built into the chip itself appeared with the 486 processor in 1989, with a capacity of only 8 KB; it was increased to 16 KB in the 1990s.

If a single cache grows too large, lookups slow down, hence the second- and third-level caches. There are many subtle design details that will not be covered here.

The cache is essentially memory built from SRAM (static random-access memory); an SRAM cell is a logic unit composed of six transistors, as shown in the figure below.

As transistors shrank, chip designers kept adding more built-in cache to the CPU chip.

Take the 14-nanometer i7-6560U processor Intel released in September 2015: it has two cores, each with 64 KB of L1 cache and 256 KB of L2 cache, sharing a 4 MB L3 cache.

The share of a CPU chip's transistors devoted to cache has grown from about 40% in the 486 era to nearly 90% on many CPUs today.
(The data comes from a paper by Doug Burger of the University of Wisconsin, "System-level Implications of Processor-Memory Integration".)

In other words, nearly 90% of managing computation is actually managing memory.

Whatever industry you are in, if you can search and store massive amounts of data efficiently, you may already be 90% of the way to success.

Another dimension along which to attack the CPU clock bottleneck is to increase the parallelism of the system: do more things at the same time.

Traditionally, a CPU chip contained only one processor (also called a core). When the clock speed of a single CPU became hard to raise, the chip designers' other idea was to add more cores to the same chip, letting multiple cores process parts of the computation in parallel.
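The multicore idea can be sketched in software with a thread pool splitting independent work across workers, much as a multicore CPU spreads computations across its cores (Python's concurrent.futures; a software analogy, not how hardware cores are actually programmed):

```python
from concurrent.futures import ThreadPoolExecutor

def work(x):
    # Stand-in for an independent unit of computation.
    return x * x

# Split eight independent tasks across four workers, the way a
# four-core CPU would spread independent computations across cores.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, range(8)))

print(results)
```

The catch, of course, is the word "independent": work that cannot be divided into parallel pieces gains nothing from extra cores.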