Processor Architectures Overview
RISC-CISC-Harvard-Von Neumann
    PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.
                              PDF generated at: Thu, 23 Feb 2012 19:39:43 UTC
Contents
Articles
   Von Neumann architecture
   Harvard architecture
   Complex instruction set computing
   Reduced instruction set computing
References
   Article Sources and Contributors
   Image Sources, Licenses and Contributors
Article Licenses
   License
Von Neumann architecture
    History
    The earliest computing machines had fixed programs. Some very simple computers still use this design, either for
    simplicity or training purposes. For example, a desk calculator (in principle) is a fixed program computer. It can do
    basic mathematics, but it cannot be used as a word processor or a gaming console. Changing the program of a
    fixed-program machine requires re-wiring, re-structuring, or re-designing the machine. The earliest computers were
    not so much "programmed" as they were "designed". "Reprogramming", when it was possible at all, was a laborious
    process, starting with flowcharts and paper notes, followed by detailed engineering designs, and then the
    often-arduous process of physically re-wiring and re-building the machine. It could take three weeks to set up a
    program on ENIAC and get it working.[4]
    With the proposal of the stored-program computer this changed. A stored-program computer includes by design an
    instruction set and can store in memory a set of instructions (a program) that details the computation.
    A stored-program design also allows for self-modifying code. One early motivation for such a facility was the need
    for a program to increment or otherwise modify the address portion of instructions, which had to be done manually
    in early designs. This became less important when index registers and indirect addressing became usual features of
    machine architecture. Another use was to embed frequently used data in the instruction stream using immediate
    addressing. Self-modifying code has largely fallen out of favor, since it is usually hard to understand and debug, as
    well as being inefficient under modern processor pipelining and caching schemes.
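
    The address-modification technique described above can be illustrated with a toy machine. The following is a
    minimal sketch, not a model of any historical computer: the instruction names (ADD, ADDX, INCA, INCX, DJNZ,
    HALT), the tuple encoding, and the count register are all assumptions made for illustration. The first program
    sums three values by rewriting the address field of its own ADD instruction; the second does the same work
    with an index register and leaves the program unchanged.

```python
# Toy stored-program machine. Instructions and data share one memory list,
# so a program can overwrite its own instructions. The instruction set is
# hypothetical and exists only for this illustration.

def run(memory, count):
    """Execute from address 0 until HALT and return the accumulator."""
    acc = index = pc = 0
    while True:
        op, arg = memory[pc]
        pc += 1
        if op == "ADD":        # acc += memory[arg]
            acc += memory[arg]
        elif op == "ADDX":     # indexed form: acc += memory[arg + index]
            acc += memory[arg + index]
        elif op == "INCA":     # self-modification: bump the address field of instruction `arg`
            target_op, target_arg = memory[arg]
            memory[arg] = (target_op, target_arg + 1)
        elif op == "INCX":     # bump the index register instead
            index += 1
        elif op == "DJNZ":     # decrement count, jump to `arg` while it is non-zero
            count -= 1
            if count != 0:
                pc = arg
        elif op == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {op!r}")

DATA = [10, 20, 30]  # stored at addresses 4, 5 and 6
self_modifying = [("ADD", 4), ("INCA", 0), ("DJNZ", 0), ("HALT", 0)] + DATA
indexed = [("ADDX", 4), ("INCX", 0), ("DJNZ", 0), ("HALT", 0)] + DATA

sum_self_modifying = run(self_modifying, count=3)
sum_indexed = run(indexed, count=3)
```

    Both runs compute the same sum, but after the first run the instruction at address 0 has been altered, while the
    indexed program is left exactly as it was loaded.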
    On a large scale, the ability to treat instructions as data is what makes assemblers, compilers and other automated
    programming tools possible. One can "write programs which write programs".[5] On a smaller scale, repetitive
    I/O-intensive operations such as the BITBLT image manipulation primitive or pixel and vertex shaders in modern 3D
    graphics were considered inefficient to run without custom hardware. These operations could be accelerated on
    general purpose processors with "on the fly compilation" ("just-in-time compilation") technology, e.g.,
    code-generating programs, one form of self-modifying code that has remained popular.
    There are drawbacks to the Von Neumann design. Aside from the Von Neumann bottleneck described below,
    program modifications can be quite harmful, either by accident or design. In some simple stored-program computer
    designs, a malfunctioning program can damage itself, other programs, or the operating system, possibly leading to a
    computer crash. Memory protection and other forms of access control can usually protect against both accidental and
    malicious program modification.
    Both von Neumann's and Turing's papers described stored-program computers, but von Neumann's earlier paper
    achieved greater circulation and the computer architecture it outlined became known as the "von Neumann
    architecture". In the 1953 publication Faster than Thought: A Symposium on Digital Computing Machines (edited by
    B.V. Bowden), a section in the chapter on Computers in America reads as follows:[14]
          THE MACHINE OF THE INSTITUTE FOR ADVANCED STUDIES, PRINCETON
          In 1945, Professor J. von Neumann, who was then working at the Moore School of Engineering in
          Philadelphia, where the E.N.I.A.C. had been built, issued on behalf of a group of his co-workers a
          report on the logical design of digital computers. The report contained a fairly detailed proposal for the
          design of the machine which has since become known as the E.D.V.A.C. (electronic discrete variable
          automatic computer). This machine has only recently been completed in America, but the von Neumann
          report inspired the construction of the E.D.S.A.C. (electronic delay-storage automatic calculator) in
          Cambridge (see page 130).
          In 1947, Burks, Goldstine and von Neumann published another report which outlined the design of
          another type of machine (a parallel machine this time) which should be exceedingly fast, capable
          perhaps of 20,000 operations per second. They pointed out that the outstanding problem in constructing
          such a machine was in the development of a suitable memory, all the contents of which were
          instantaneously accessible, and at first they suggested the use of a special tube, called the Selectron,
          which had been invented by the Princeton Laboratories of the R.C.A. These tubes were expensive and
          difficult to make, so von Neumann subsequently decided to build a machine based on the Williams
          memory. This machine, which was completed in June, 1952 in Princeton has become popularly known
          as the Maniac. The design of this machine has inspired that of half a dozen or more machines which are
          now being built in America, all of which are known affectionately as "Johniacs."
    In the same book, the first two paragraphs of a chapter on ACE read as follows:[15]
          AUTOMATIC COMPUTATION AT THE NATIONAL PHYSICAL LABORATORY
          One of the most modern digital computers which embodies developments and improvements in the
          technique of automatic electronic computing was recently demonstrated at the National Physical
          Laboratory, Teddington, where it has been designed and built by a small team of mathematicians and
          electronics research engineers on the staff of the Laboratory, assisted by a number of production
          engineers from the English Electric Company, Limited. The equipment so far erected at the Laboratory
          is only the pilot model of a much larger installation which will be known as the Automatic Computing
          Engine, but although comparatively small in bulk and containing only about 800 thermionic valves, as
          can be judged from Plates XII, XIII and XIV, it is an extremely rapid and versatile calculating machine.
          The basic concepts and abstract principles of computation by a machine were formulated by Dr. A. M.
          Turing, F.R.S., in a paper read before the London Mathematical Society in 1936, but work on such
          machines in Britain was delayed by the war. In 1945, however, an examination of the problems was
          made at the National Physical Laboratory by Mr. J. R. Womersley, then superintendent of the
          Mathematics Division of the Laboratory. He was joined by Dr. Turing and a small staff of specialists,
          and, by 1947, the preliminary planning was sufficiently advanced to warrant the establishment of the
          special group already mentioned. In April, 1948, the latter became the Electronics Section of the
          Laboratory, under the charge of Mr. F. M. Colebrook.
    Evolution
    Through the decades of the 1960s and 1970s computers generally
    became both smaller and faster, which led to some evolutions in their
    architecture. For example, memory-mapped I/O allows input and
    output devices to be treated the same as memory.[20] A single system
    bus could be used to provide a modular system with lower cost. This is
    sometimes called a "streamlining" of the architecture.[21] In subsequent
    decades, simple microcontrollers would sometimes omit features of the
    model to lower cost and size. Larger computers added features for
    higher performance.
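
    Memory-mapped I/O can be sketched as a single address space in which some addresses reach a device instead of
    memory. The class below is an illustrative assumption, not a description of any particular bus: the name Bus, the
    device base address 0xF0, and the character-output device are all hypothetical.

```python
class Bus:
    """Hypothetical single system bus: RAM and an output device share one address space."""

    def __init__(self, ram_size, device_base):
        self.ram = [0] * ram_size
        self.device_base = device_base  # addresses at or above this reach the device
        self.output = []                # stands in for an output device

    def write(self, address, value):
        if address >= self.device_base:
            # An ordinary store to this range drives the device.
            self.output.append(chr(value))
        else:
            self.ram[address] = value

    def read(self, address):
        if address >= self.device_base:
            return 0  # the device exposes no readable state in this sketch
        return self.ram[address]


bus = Bus(ram_size=16, device_base=0xF0)
bus.write(3, 42)              # plain memory write
for ch in "hi":
    bus.write(0xF0, ord(ch))  # same operation, but the address selects the device
printed = "".join(bus.output)
```

    The processor needs no dedicated I/O instructions in this arrangement: the same write operation serves both
    memory and the device, and only the address decides which one responds.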
    [Figure: Single system bus evolution of the architecture]
    The term "von Neumann bottleneck" was coined by John Backus in his 1977 ACM Turing Award lecture. According
    to Backus:
          Surely there must be a less primitive way of making big changes in the store than by pushing vast
          numbers of words back and forth through the von Neumann bottleneck. Not only is this tube a literal
          bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has
          kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger
          conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous
          traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant
          data itself, but where to find it.[22][23]
    The performance problem can be alleviated (to some extent) by several mechanisms. Providing a cache between the
    CPU and the main memory, providing separate caches or separate access paths for data and instructions (the
    so-called Modified Harvard architecture), using branch predictor algorithms and logic, and providing a limited CPU
    stack to reduce memory access are four of the ways performance is increased. The problem can also be sidestepped
    somewhat by using parallel computing, using for example the Non-Uniform Memory Access (NUMA)
    architecture; this approach is commonly employed by supercomputers. It is less clear whether the intellectual
    bottleneck that Backus criticized has changed much since 1977. Backus's proposed solution has not had a major
    influence. Modern functional programming and object-oriented programming are much less geared towards "pushing
    vast numbers of words back and forth" than earlier languages like Fortran were, but internally, that is still what
    computers spend much of their time doing, even highly parallel supercomputers.
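
    The effect of a cache on the performance problem can be estimated with the usual textbook relation for average
    memory access time (hit time plus miss rate times miss penalty). The sketch below uses assumed figures (a 1-cycle
    cache hit, a 100-cycle main memory access, and a 5% miss rate) that are chosen for illustration and do not
    describe any particular machine.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average cycles per memory access: hit_time + miss_rate * miss_penalty."""
    return hit_time + miss_rate * miss_penalty


MAIN_MEMORY_CYCLES = 100  # assumed cost of going to main memory

uncached = MAIN_MEMORY_CYCLES  # every access pays the full memory latency
cached = average_access_time(hit_time=1, miss_rate=0.05,
                             miss_penalty=MAIN_MEMORY_CYCLES)
speedup = uncached / cached
```

    Under these assumptions the average access costs about 6 cycles instead of 100. The bottleneck is reduced rather
    than removed, since every miss still pays the full main-memory latency.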
    In some cases, emerging memristor technology may be able to circumvent the von Neumann bottleneck.[24]
References
    Inline
    [1] von Neumann 1945
    [2] Ganesan 2009
    [3] Markgraf, Joey D. (2007), The Von Neumann bottleneck (http://aws.linnbenton.edu/cs271c/markgrj/), retrieved August 24, 2011
    [4] Copeland 2006, p. 104
    [5] MFTL (My Favorite Toy Language) entry, Jargon File 4.4.7 (http://catb.org/~esr/jargon/html/M/MFTL.html), retrieved 2008-07-11
    [6] Turing, A.M. (1936), "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London
        Mathematical Society, 2 42: 230–65, 1937, doi:10.1112/plms/s2-42.1.230 (and Turing, A.M. (1938), "On Computable Numbers, with an
        Application to the Entscheidungsproblem: A correction", Proceedings of the London Mathematical Society, 2 43 (6): 544–6, 1937,
        doi:10.1112/plms/s2-43.6.544)
    [7] The Life and Work of Konrad Zuse Part 10: Konrad Zuse and the Stored Program Computer
        (http://web.archive.org/web/20080601160645/http://www.epemag.com/zuse/part10.htm), archived from the original
        (http://www.epemag.com/zuse/part10.htm) on June 1, 2008, retrieved 2008-07-11
    [8] Lukoff, Herman (1979), From Dits to Bits...: A Personal History of the Electronic Computer, Robotics Press, ISBN 978-0-89661-002-6
    [9] ENIAC project administrator Grist Brainerd's December 1943 progress report for the first period of the ENIAC's development implicitly
        proposed the stored program concept (while simultaneously rejecting its implementation in the ENIAC) by stating that "in order to have the
        simplest project and not to complicate matters" the ENIAC would be constructed without any "automatic regulation".
    [10] Copeland 2006, p. 113
    [11] Copeland, Jack (2000), A Brief History of Computing: ENIAC and EDVAC (http://www.alanturing.net/turing_archive/pages/Reference
        Articles/BriefHistofComp.html#ACE), retrieved 27 January 2010
    [12] Copeland, Jack (2000), A Brief History of Computing: ENIAC and EDVAC (http://www.alanturing.net/turing_archive/pages/Reference
        Articles/BriefHistofComp.html#ACE), retrieved 27 January 2010, which cites Randell, B. (1972), Meltzer, B.; Michie, D., eds., "On Alan
        Turing and the Origins of Digital Computers", Machine Intelligence 7 (Edinburgh: Edinburgh University Press): 10, ISBN 0902383264
    [13] Copeland 2006, pp. 108–111
    [14] Bowden 1953, pp. 176, 177
    [15] Bowden 1953, p. 135
    [16] "Electronic Computer Project" (http://www.ias.edu/people/vonneumann/ecp/). Institute for Advanced Study. Retrieved May 26, 2011.
    [17] Illiac Design Techniques, report number UIUCDCS-R-1955-146, Digital Computer Laboratory, University of Illinois at
        Urbana-Champaign, 1955
    [18] F.E. Hamilton, R.R. Seeber, R.A. Rowley, and E.S. Hughes (January 19, 1949). "Selective Sequence Electronic Calculator"
        (http://patft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/PTO/srchnum.htm&r=1&f=G&l=50&s1=2636672.PN.&OS=PN/2636672&RS=PN/2636672).
        US Patent 2,636,672. Retrieved April 28, 2011. Issued April 28, 1953.
    [19] Herbert R.J. Grosch (1991), Computer: Bit Slices From a Life (http://www.columbia.edu/acis/history/computer.html), Third
        Millennium Books, ISBN 0-88733-085-1
    [20] C. Gordon Bell; R. Cady; H. McFarland; J. O'Laughlin; R. Noonan; W. Wulf (1970), "A New Architecture for Mini-Computers – The DEC
        PDP-11" (http://research.microsoft.com/en-us/um/people/gbell/CGB Files/New Architecture PDP11 SJCC 1970 c.pdf), Spring Joint
        Computer Conference: pp. 657–675
    [21] Linda Null; Julia Lobur (2010), The essentials of computer organization and architecture
        (http://books.google.com/books?id=f83XxoBC_8MC&pg=PA36) (3rd ed.), Jones & Bartlett Learning, pp. 36, 199–203, ISBN 9781449600068
    [22] Backus, John W. "Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs"
        (http://www.cs.cmu.edu/~crary/819-f09/Backus78.pdf). Retrieved 2012-01-20.
    [23] Dijkstra, Edsger W. "E. W. Dijkstra Archive: A review of the 1977 Turing Award Lecture"
        (http://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD692.html). Retrieved 2008-07-11.
    [24] Mouttet, Blaise L (2009), "Memristor Pattern Recognition Circuit Architecture for Robotics"
        (http://www.iiis.org/CDs2008/CD2009SCI/CITSA2009/PapersPdf/I086AI.pdf), Proceedings of the 2nd International Multi-Conference on
        Engineering and Technological Innovation II: 65–70
    [25] "COP8 Basic Family Users Manual" (http://www.national.com/appinfo/mcu/files/Basic_user1.pdf). National Semiconductor.
        Retrieved 2012-01-20.
    [26] "COP888 Feature Family Users Manual" (http://www.national.com/appinfo/mcu/files/Feature_user.pdf). National Semiconductor.
        Retrieved 2012-01-20.
    General
    • Bowden, B.V., ed. (1953), Faster Than Thought: A Symposium on Digital Computing Machines, London: Sir
      Isaac Pitman and Sons Ltd.
    • Rojas, Raúl; Hashagen, Ulf, eds. (2000), The First Computers: History and Architectures, MIT Press,
      ISBN 0-262-18197-5
    • Davis, Martin (2000), The universal computer: the road from Leibniz to Turing, New York: W W Norton &
      Company Inc., ISBN 0-393-04785-7
    • Can Programming be Liberated from the von Neumann Style?, John Backus, 1977 ACM Turing Award Lecture.
      Communications of the ACM, August 1978, Volume 21, Number 8. Online PDF (http://www.stanford.edu/
      class/cs242/readings/backus.pdf)
    • C. Gordon Bell and Allen Newell (1971), Computer Structures: Readings and Examples, McGraw-Hill Book
      Company, New York. Massive (668 pages)
    • Copeland, Jack (2006), "Colossus and the Rise of the Modern Computer", in Copeland, B. Jack, Colossus: The
      Secrets of Bletchley Park's Codebreaking Computers, Oxford: Oxford University Press, ISBN 978-0-19-284055-4
    • von Neumann, John (1945), First Draft of a Report on the EDVAC (http://qss.stanford.edu/~godfrey/
      vonNeumann/vnedvac.pdf), retrieved August 24, 2011
    • Ganesan, Deepak (2009), The Von Neumann Model (http://none.cs.umass.edu/~dganesan/courses/fall09/
      handouts/Chapter4.pdf), retrieved October 22, 2011
    External links
    • Harvard vs von Neumann (http://www.pic24micro.com/harvard_vs_von_neumann.html)
    • A tool that emulates the behavior of a von Neumann machine (http://home.gna.org/vov/)
    Harvard architecture
    The Harvard architecture is a computer architecture with physically separate storage and signal pathways for
    instructions and data. The term originated from the Harvard Mark I relay-based computer, which stored
    instructions on punched tape (24 bits wide) and data in electro-mechanical counters. These early machines had
    data storage entirely contained within the central processing unit, and provided no access to the instruction
    storage as data. Programs needed to be loaded by an operator; the processor could not boot itself.
    [Figure: Harvard architecture]
    Memory details
    In a Harvard architecture, there is no need to make the two memories share characteristics. In particular, the word
    width, timing, implementation technology, and memory address structure can differ. In some systems, instructions
    can be stored in read-only memory while data memory generally requires read-write memory. In some systems, there
    is much more instruction memory than data memory so instruction addresses are wider than data addresses.
    Another modification provides a pathway between the instruction memory (such as ROM or flash) and the CPU to
    allow words from the instruction memory to be treated as read-only data. This technique is used in some
    microcontrollers, including the Atmel AVR. This allows constant data, such as text strings or function tables, to be
    accessed without first having to be copied into data memory, preserving scarce (and power-hungry) data memory for
    read/write variables. Special machine language instructions are provided to read data from the instruction memory.
    (This is distinct from instructions which themselves embed constant data, although for individual constants the two
    mechanisms can substitute for each other.)
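
    The separation of the two memories, and the special pathway for reading constants out of instruction memory, can
    be modelled in a few lines. This is an illustrative sketch only: the class and method names are hypothetical, and
    the 16-bit instruction and 8-bit data widths are assumed values chosen to show that the two memories need not
    share a word size.

```python
class HarvardMachine:
    """Toy model with separate instruction and data stores of different word widths."""

    INSTR_MASK = 0xFFFF  # assumed 16-bit instruction words
    DATA_MASK = 0xFF     # assumed 8-bit data words

    def __init__(self, program_words, data_size):
        # Instruction memory is fixed after loading, like ROM or flash.
        self.program = tuple(w & self.INSTR_MASK for w in program_words)
        # Data memory is read-write and has its own, independent address space.
        self.data = [0] * data_size

    def store(self, address, value):
        self.data[address] = value & self.DATA_MASK

    def load(self, address):
        return self.data[address]

    def load_program_memory(self, address):
        """Stand-in for a special instruction that reads instruction memory as data."""
        return self.program[address]


machine = HarvardMachine(program_words=[0x1234, 0x00FF, 0xBEEF], data_size=4)
machine.store(0, 0x1FF)                       # truncated to the 8-bit data width
constant = machine.load_program_memory(2)     # read-only table entry, never copied to data memory
```

    Address 0 of the data memory and address 0 of the instruction memory are unrelated locations here, and the
    constant is read without occupying any of the four data words.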
    Speed
    In recent years, the speed of the CPU has grown many times in comparison to the access speed of the main memory.
    Care needs to be taken to reduce the number of times main memory is accessed in order to maintain performance. If,
    for instance, every instruction run in the CPU requires an access to memory, the computer gains nothing for
    increased CPU speed, a problem referred to as being "memory bound".
    It is possible to make extremely fast memory but this is only practical for small amounts of memory for cost, power
    and signal routing reasons. The solution is to provide a small amount of very fast memory known as a CPU cache
    which holds recently accessed data. As long as the data that the CPU needs is in the cache, the performance hit is
    much smaller than it is when the cache has to turn around and get the data from the main memory.
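
    How a small cache of recently accessed data absorbs most memory traffic can be seen in a short simulation. The
    sketch below assumes a direct-mapped cache with one word per line, which is a deliberate simplification; the
    function name, the cache size, and the access trace are all hypothetical.

```python
def count_hits(addresses, num_lines):
    """Simulate a direct-mapped cache of single-word lines; return (hits, misses)."""
    lines = [None] * num_lines  # each entry remembers which address it currently holds
    hits = misses = 0
    for address in addresses:
        slot = address % num_lines  # direct mapping: each address has exactly one slot
        if lines[slot] == address:
            hits += 1
        else:
            misses += 1             # fetch from main memory and keep a copy
            lines[slot] = address
    return hits, misses


loop_trace = [0, 1, 2, 3] * 10      # a small loop touching four addresses, ten times over
hits, misses = count_hits(loop_trace, num_lines=8)
```

    Only the first pass through the loop goes to main memory; every later access is served from the cache, which is
    why programs that reuse recently accessed data benefit so strongly.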
    • Microcontrollers are characterized by having small amounts of program (flash memory) and data (SRAM)
      memory, with no cache, and take advantage of the Harvard architecture to speed processing by concurrent
      instruction and data access. The separate storage means the program and data memories can have different bit
      widths, for example using 16-bit wide instructions and 8-bit wide data. It also means that instruction prefetch
      can be performed in parallel with other activities. Examples include the AVR by Atmel Corp, the PIC by
      Microchip Technology, Inc. and the ARM Cortex-M3 processor (not all ARM chips have Harvard architecture).
    Even in these cases, it is common to have special instructions to access program memory as data for read-only tables,
    or for reprogramming.
    External links
    • Harvard vs Von Neumann [1]
    References
    [1] http://www.pic24micro.com/harvard_vs_von_neumann.html
    New instructions
    In the 1970s, analysis of high-level languages showed that some constructs required complex machine language
    implementations, and it was determined that new instructions could improve performance. Some instructions were
    added that were never
    intended to be used in assembly language but fit well with compiled high level languages. Compilers were updated
    to take advantage of these instructions. The benefits of semantically rich instructions with compact encodings can be
    seen in modern processors as well, particularly in the high performance segment where caches are a central
    component (as opposed to most embedded systems). This is because these fast, but complex and expensive,
    memories are inherently limited in size, making compact code beneficial. Of course, the fundamental reason they are
needed is that main memories (i.e. dynamic RAM today) remain slow compared to a (high performance) CPU-core.
    Design issues
    While many designs achieved the aim of higher throughput at lower cost and also allowed high-level language
    constructs to be expressed by fewer instructions, it was observed that this was not always the case. For instance,
    low-end versions of complex architectures (i.e. using less hardware) could lead to situations where it was possible to
    improve performance by not using a complex instruction (such as a procedure call or enter instruction), but instead
    using a sequence of simpler instructions.
    One reason for this was that architects (microcode writers) sometimes "over-designed" assembly language
    instructions, including features that could not be implemented efficiently on the basic hardware available.
    This could, for instance, be "side effects" (above conventional flags), such as the setting of a register or memory
    location that was perhaps seldom used; if this was done via ordinary (non duplicated) internal buses, or even the
    external bus, it would demand extra cycles every time, and thus be quite inefficient.
    Even in balanced high performance designs, highly encoded and (relatively) high-level instructions could be
    complicated to decode and execute efficiently within a limited transistor budget. Such architectures therefore
    required a great deal of work on the part of the processor designer in cases where a simpler, but (typically) slower,
    solution based on decode tables and/or microcode sequencing is not appropriate. At a time when transistors and other
    components were a limited resource, this also left fewer components and less opportunity for other types of
    performance optimizations.
    Superscalar
    In a more modern context, the complex variable length encoding used by some of the typical CISC architectures
    makes it complicated, but still feasible, to build a superscalar implementation of a CISC programming model
    directly; the in-order superscalar original Pentium and the out-of-order superscalar Cyrix 6x86 are well known
    examples of this. The frequent memory accesses for operands of a typical CISC machine may limit the instruction
    level parallelism that can be extracted from the code, although this is strongly mediated by the fast cache structures
    used in modern designs, as well as by other measures. Due to inherently compact and semantically rich instructions,
    the average amount of work performed per machine code unit (i.e. per byte or bit) is higher for a CISC than a RISC
    processor, which may give it a significant advantage in a modern cache based implementation. (Whether the
    downsides versus the upsides justifies a complex design or not is food for a never-ending debate in certain circles.)
    Transistors for logic, PLAs, and microcode are no longer scarce resources; only large high-speed cache memories are
    limited by the maximum number of transistors today. Although complex, the transistor count of CISC decoders does
    not grow exponentially like the total number of transistors per processor (the majority typically used for caches).
    Together with better tools and enhanced technologies, this has led to new implementations of highly encoded and
    variable length designs without load-store limitations (i.e. non-RISC). This governs re-implementations of older
    architectures such as the ubiquitous x86 (see below) as well as new designs for microcontrollers for embedded
    systems, and similar uses. The superscalar complexity in the case of modern x86 was solved with dynamically issued
    and buffered micro-operations, i.e. indirect and dynamic superscalar execution; the Pentium Pro and AMD K5 are
    early examples of this. It allows a fairly simple superscalar design to be located after the (fairly complex) decoders
    (and buffers), giving, so to speak, the best of both worlds in many respects.
    Notes
    [1] Patterson, D. A. and Ditzel, D. R. 1980. The case for the reduced instruction set computing. SIGARCH Comput. Archit. News 8, 6 (October
        1980), 25-33. DOI= http://doi.acm.org/10.1145/641914.641917
    [2] http://www.cs.uiowa.edu/~jones/arch/cisc/
    References
    This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed
    under the GFDL.
    • Tanenbaum, Andrew S. (2006) Structured Computer Organization, Fifth Edition, Pearson Education, Inc. Upper
      Saddle River, NJ.
    External links
    • RISC vs. CISC comparison (http://www.pic24micro.com/cisc_vs_risc.html)
    logic for dealing with the delay in completing a memory access (cache miss, etc.) to only two instructions. This led
    to RISC designs being referred to as load/store architectures.[5]
    One more issue is that complex instructions are difficult to restart, e.g. following a page fault. In some cases,
    restarting from the beginning will work (although wasteful), but in many this would give incorrect results. Therefore
    the machine needs to have some hidden state to remember which parts went through and what needs to be done.
    With a load/store machine, the PC supplies all information.
    Alternatives
    RISC was developed as an alternative to what is now known as CISC. Over the years, other strategies have been
    implemented as alternatives to RISC and CISC. Some examples are VLIW, MISC, OISC, massively parallel
    processing, systolic array, reconfigurable computing, and dataflow architecture.
    • Identical general purpose registers, allowing any register to be used in any context, simplifying compiler design
      (although normally there are separate floating point registers);
    • Simple addressing modes, with complex addressing performed via sequences of arithmetic and/or load-store
      operations;
    • Few data types in hardware: some CISCs have byte string instructions or support complex numbers, which is so
      far unlikely to be found on a RISC.
    Exceptions abound, of course, within both CISC and RISC.
    RISC designs are also more likely to feature a Harvard memory model, where the instruction stream and the data
    stream are conceptually separated; this means that modifying the memory where code is held might not have any
    effect on the instructions executed by the processor (because the CPU has a separate instruction and data cache), at
    least until a special synchronization instruction is issued. On the upside, this allows both caches to be accessed
    simultaneously, which can often improve performance.
    Many early RISC designs also shared the characteristic of having a branch delay slot. A branch delay slot is an
    instruction space immediately following a jump or branch. The instruction in this space is executed, whether or not
    the branch is taken (in other words the effect of the branch is delayed). This instruction keeps the ALU of the CPU
    busy for the extra time normally needed to perform a branch. Nowadays the branch delay slot is considered an
    unfortunate side effect of a particular strategy for implementing some RISC designs, and modern RISC designs
    generally do away with it (such as PowerPC and more recent versions of SPARC and MIPS).
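
    The behaviour of a branch delay slot can be shown with a small interpreter. This is a sketch under stated
    assumptions rather than a model of any real pipeline: instructions are hypothetical (label, target) pairs, every
    branch is taken, and no branch is placed in a delay slot.

```python
def run(program, delay_slot):
    """Return the labels of executed instructions.

    Each instruction is a (label, branch_target) pair; branch_target is None
    for non-branches. All branches are taken in this sketch.
    """
    executed = []
    pc = 0
    pending = None  # target of a branch whose effect is delayed by one instruction
    while pc < len(program):
        label, target = program[pc]
        executed.append(label)
        next_pc = pc + 1
        if pending is not None:
            # The delay-slot instruction has just run; the branch takes effect now.
            next_pc = pending
            pending = None
        if target is not None:
            if delay_slot:
                pending = target  # defer the jump until after the next instruction
            else:
                next_pc = target  # jump immediately
        pc = next_pc
    return executed


program = [("branch", 3), ("slot", None), ("skipped", None), ("target", None)]
with_slot = run(program, delay_slot=True)
without_slot = run(program, delay_slot=False)
```

    With the delay slot enabled, the instruction labelled "slot" executes even though the branch before it is taken;
    without it, control passes straight from the branch to its target.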
    Early RISC
    The first system that would today be known as RISC was the CDC 6600 supercomputer, designed in 1964, a decade
    before the term was invented. The CDC 6600 had a load-store architecture with only two addressing modes
    (register+register, and register+immediate constant) and 74 opcodes (whereas an Intel 8086 has 400). The 6600 had
    eleven pipelined functional units for arithmetic and logic, plus five load units and two store units; the memory had
    multiple banks so all load-store units could operate at the same time. The basic clock cycle/instruction issue rate was
    10 times faster than the memory access time. Jim Thornton and Seymour Cray designed it as a number-crunching
    CPU supported by 10 simple computers called "peripheral processors" to handle I/O and other operating system
    functions.[9] Thus the joking comment later that the acronym RISC actually stood for "Really Invented by Seymour
    Cray".
    Another early load-store machine was the Data General Nova minicomputer, designed in 1968 by Edson de Castro.
    It had an almost pure RISC instruction set, remarkably similar to that of today's ARM processors; however it has not
    been cited as having influenced the ARM designers, although Novas were in use at the University of Cambridge
    Computer Laboratory in the early 1980s.
    The earliest attempt to make a chip-based RISC CPU was a project at IBM which started in 1975. Named after the
    building where the project ran, the work led to the IBM 801 CPU family which was used widely inside IBM
    hardware. The 801 was eventually produced in a single-chip form as the ROMP in 1981, which stood for 'Research
    OPD [Office Products Division] Micro Processor'. As the name implies, this CPU was designed for "mini" tasks, and
    when IBM released the IBM RT-PC based on the design in 1986, the performance was not acceptable. Nevertheless
    the 801 inspired several research projects, including new ones at IBM that would eventually lead to their POWER
    system.
    The most public RISC designs, however, were the results of university research programs run with funding from the
    DARPA VLSI Program. The VLSI Program, practically unknown today, led to a huge number of advances in chip
    design, fabrication, and even computer graphics.
    The Berkeley RISC project started in 1980 under the direction of David Patterson and Carlo H. Sequin, based on
    gaining performance through the use of pipelining and an aggressive use of a technique known as register
    windowing. In a normal CPU, one has a small number of registers, and a program can use any register at any time. In
    a CPU with register windows, there are a huge number of registers, e.g. 128, but programs can only use a small
    number of them, e.g. eight, at any one time. A program that limits itself to eight registers per procedure can make
    very fast procedure calls: The call simply moves the window "down" by eight, to the set of eight registers used by
    that procedure, and the return moves the window back. (On a normal CPU, most calls must save at least a few
    registers' values to the stack in order to use those registers as working space, and restore their values on return.)
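    The window mechanism described above can be sketched as a toy model. This is only an illustration under
    stated assumptions: the class and method names are hypothetical, the sizes (128 physical registers, eight
    visible) are the example figures from the text, and the windows do not overlap, which is a simplification.

```python
class RegisterWindowFile:
    """Toy model of register windowing (hypothetical names, non-overlapping windows)."""

    def __init__(self, total_registers=128, window_size=8):
        self.regs = [0] * total_registers   # the large physical register file
        self.window_size = window_size      # how many registers a procedure can see
        self.base = 0                       # first physical register of the current window

    def _physical(self, n):
        # Map a visible register number (0..window_size-1) to a physical index.
        if not 0 <= n < self.window_size:
            raise IndexError("register outside the visible window")
        return self.base + n

    def read(self, n):
        return self.regs[self._physical(n)]

    def write(self, n, value):
        self.regs[self._physical(n)] = value

    def call(self):
        # A call only moves the window; nothing is saved to the stack.
        if self.base + 2 * self.window_size > len(self.regs):
            raise OverflowError("out of windows; a real CPU would have to spill to memory")
        self.base += self.window_size

    def ret(self):
        # A return moves the window back, exposing the caller's registers again.
        if self.base == 0:
            raise RuntimeError("return without a matching call")
        self.base -= self.window_size


rf = RegisterWindowFile()
rf.write(0, 42)          # caller stores a value in its register 0
rf.call()                # callee receives a fresh window
rf.write(0, 7)           # callee's register 0 is a different physical register
callee_r0 = rf.read(0)
rf.ret()
caller_r0 = rf.read(0)   # caller's value is untouched, with no save/restore
```

    In this sketch the caller's value survives the call without any explicit save or restore, which is the
    saving the text attributes to register windows. The overflow check marks the point where the model runs
    out of windows.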
    The RISC project delivered the RISC-I processor in 1982. Consisting of only 44,420 transistors (compared with
    averages of about 100,000 in newer CISC designs of the era) RISC-I had only 32 instructions, and yet completely
    outperformed any other single-chip design. The team followed this up with the 40,760-transistor,
    39-instruction RISC-II in 1983, which ran over three times as fast as RISC-I.
    At about the same time, John L. Hennessy started a similar project called MIPS at Stanford University in 1981.
    MIPS focused almost entirely on the pipeline, making sure it could be run as "full" as possible. Although pipelining
    was already in use in other designs, several features of the MIPS chip made its pipeline far faster. The most
    important, and perhaps annoying, of these features was the demand that all instructions be able to complete in one
    cycle. This demand allowed the pipeline to be run at much higher data rates (there was no need for induced delays)
    and is responsible for much of the processor's performance. However, it also had the negative side effect of
    eliminating many potentially useful instructions, like a multiply or a divide.
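    The effect of the one-cycle rule on pipeline throughput can be illustrated with a small calculation. This
    is a simplified sketch, not a model of the actual MIPS hardware: the function name is hypothetical, the
    five-stage depth is an assumption (the text gives no stage count), and an instruction that needs more
    than one cycle to execute is assumed to stall everything behind it for the extra cycles.

```python
def pipeline_cycles(instruction_latencies, stages=5):
    """Total cycles for a simple in-order pipeline (assumed 5 stages).

    Each entry in instruction_latencies is the number of cycles that
    instruction spends in the execute stage. A latency of 1 never stalls;
    a latency of L adds L - 1 stall cycles for the instructions behind it.
    """
    if not instruction_latencies:
        return 0
    stalls = sum(latency - 1 for latency in instruction_latencies)
    # Fill the pipeline once, then retire one instruction per cycle, plus stalls.
    return stages + len(instruction_latencies) - 1 + stalls


# 100 instructions that all complete in one cycle: the pipeline stays full.
ideal = pipeline_cycles([1] * 100)            # 104 cycles

# Same count, but 10 of them are hypothetical 4-cycle multiplies.
mixed = pipeline_cycles([1] * 90 + [4] * 10)  # 134 cycles
```

    Under these assumptions, a handful of multi-cycle instructions adds a noticeable number of stall cycles,
    which is the kind of induced delay the single-cycle requirement was meant to avoid.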
    In the early years, the RISC efforts were well known, but largely confined to the university labs that had created
    them. The Berkeley effort became so well known that it eventually became the name for the entire concept. Many in
    the computer industry argued that the performance benefits were unlikely to translate into real-world settings,
    because needing multiple instructions for the same work made less efficient use of memory, and that this was the
    reason no one was using the designs. But starting in 1986, all of the RISC research projects started delivering products.
    Later RISC
    Berkeley's research was not directly commercialized, but the RISC-II design was used by Sun Microsystems to
    develop the SPARC, by Pyramid Technology to develop their line of mid-range multi-processor machines, and by
    almost every other company a few years later. It was Sun's use of a RISC chip in their new machines that
    demonstrated that RISC's benefits were real, and their machines quickly outpaced the competition and essentially
    took over the entire workstation market.
    John Hennessy left Stanford (temporarily) to commercialize the MIPS design, starting the company known as MIPS
    Computer Systems. Their first design was a second-generation MIPS chip known as the R2000. MIPS designs went
    on to become some of the most widely used RISC chips when they were included in the PlayStation and Nintendo 64
    game consoles. Today they are among the most common embedded processors in use for high-end applications.
    IBM learned from the RT-PC failure and went on to design the RS/6000 based on their new POWER architecture.
    They then moved their existing AS/400 systems to POWER chips, and found much to their surprise that even the
    very complex instruction set ran considerably faster. POWER would also find itself moving "down" in scale to
    produce the PowerPC design, which eliminated many of the "IBM only" instructions and created a single-chip
    implementation. Today the PowerPC is one of the most commonly used CPUs for automotive applications (some
    cars have more than 10 of them inside). It was also the CPU used in most Apple Macintosh machines from 1994 to
    2006. (Starting in February 2006, Apple switched their main production line to Intel x86 processors.)
    Almost all other vendors quickly joined. From the UK, similar research efforts resulted in the INMOS transputer, the
    Acorn Archimedes and the Advanced RISC Machine line, which is a huge success today. Most mobile phones and
    MP3 players use ARM processors. Companies with existing CISC designs also quickly joined the revolution. Intel
    released the i860 and i960 by the late 1980s, although they were not very successful. Motorola built a new design
    called the 88000 in homage to their famed CISC 68000, but it saw almost no use. The company eventually
    abandoned it and joined IBM to produce the PowerPC. AMD released their 29000 which would go on to become the
Further reading
    Television
     Computer Chronicles (1986). " RISC (http://www.archive.org/details/RISC1986)".
    External links
     RISC vs. CISC (http://www-cs-faculty.stanford.edu/~eroberts/courses/soco/projects/2000-01/risc/risccisc/)
     What is RISC (http://www-cs-faculty.stanford.edu/~eroberts/courses/soco/projects/2000-01/risc/whatis/)
     RISC vs. CISC from historical perspective (http://www.cpushack.net/CPU/cpuAppendA.html)
Article Sources and Contributors
    Harvard architecture Source: http://en.wikipedia.org/w/index.php?oldid=477983174 Contributors: AgadaUrbanit, Ale And Quail, Alex Pascual, Amarco90, Anomalocaris, Antandrus,
    Arjun01, AxelBoldt, Bachrach44, Bitflung, BokicaK, Bozoid, CanisRufus, CryptoDerk, DHR, Dcljr, DexDor, Doradus, Dyl, Eaglizard, Epinheiro, Fanopanic, Frap, Furrykef, Fuzheado,
    GrahamDavies, Guy Macon, HenkeB, Henriok, Iain.mcclatchie, J.delanoy, JamesMLane, Jni, JonHarder, JorgePeixoto, Jwortzel, Kbdank71, Krauss, Kvng, Levin, LittleDan, LokiClock,
    Lordofcode, Magic5ball, Malleus Fatuorum, Maury Markowitz, Michael Hardy, Moilforgold, NathanBeach, Neelix, Nessa los, Oosterwal, Oxymoron83, Pion, Pjrm, Plugwash, Psycotica0, R. S.
    Shaw, RTC, Rdsmith4, Remi0o, Reswobslc, Ric8ard, Robert K S, Rwwww, Sepia tone, Shadowjams, Sietse Snel, SpeedyGonsales, Srasku, Sw1974, Toddintr, Transcendent, Unyoyega,
    Wernher, Witguiota, Ykhwong, Zarek, 130 anonymous edits
    Complex instruction set computing Source: http://en.wikipedia.org/w/index.php?oldid=475827237 Contributors: 209.239.198.xxx, Alimentarywatson, Andrejj, Arndbergmann, Blazar,
    Buybooks Marius, CanisRufus, Carbuncle, Cassie Puma, Cdleary, Chris Howard, Collabi, Conversion script, Cybercobra, DMTagatac, DaleDe, Davnor, Deflective, Destynova, DmitryKo, Dyl,
    Edward, Ejrrjs, EncMstr, EoGuy, Epbr123, Eras-mus, Ergbert, Ethancleary, EvanCarroll, Eyreland, Fejesjoco, Flying Bishop, Frap, Galain, Gardar Rurak, Graham Chapman, Guy Harris,
    HenkeB, James Foster, Jason Quinn, Joanjoc, JonHarder, JonathonReinhart, Jpfagerback, Kallikanzarid, Karl-Henner, Kbdank71, Kelly Martin, Kwertii, Liao, Lion10, MFH, Materialscientist,
    MattGiuca, Mike4ty4, Mudlock, Murray Langton, NapoliRoma, Neilc, Nikto parcheesy, Nxavar, Nyat, Optakeover, OrgasGirl, PS2pcGAMER, Pgquiles, PokeYourHeadOff, Prodego, Qbeep,
    Quuxplusone, R'n'B, R. S. Shaw, Rdnk, RekishiEJ, Rich Farmbrough, Rilak, Robert Merkel, Rwwww, Saaya, Sct72, SimDoc, SimonP, Skittleys, Slady, Sopoforic, Stephan Leeds, Swiftly,
    Template namespace initialisation script, Tesi1700, Thincat, Tirppa, TutterMouse, UnicornTapestry, Urhixidur, VampWillow, Virtualphtn, Whaa?, WhiteTimberwolf, Wiki alf, Wws, 119
    anonymous edits
    Reduced instruction set computing Source: http://en.wikipedia.org/w/index.php?oldid=475286182 Contributors: 15.253, 16@r, 18.94, 209.239.198.xxx, 62.253.64.xxx, Adam Bishop,
    AgadaUrbanit, Ale07, Alecv, Ancheta Wis, Andre Engels, Andrew.baine, Aninhumer, Anss123, Autarchprinceps, AvayaLive, Bcaff05, Beanyk, Betacommand, Bobanater, Bobblewik, Brianski,
    Bryan Derksen, Btwied, C xong, C. A. Russell, Cambrant, Capricorn42, Cbturner46, Charles Matthews, Christan80, Cliffster1, Cmdrjameson, Conversion script, Corti, Cybercobra, Cybermaster,
    Damian Yerrick, Darkink, Davewho2, David Gerard, David Shay, DavidCary, DavidConner, Davipo, Dbfirs, Dcoetzee, DeadEyeArrow, Deflective, Derek Ross, Dkanter, Dmsar, Donreed, Dr
    zepsuj, DragonHawk, Drcwright, Drj, Dro Kulix, Dyl, Eclipsed aurora, EdBever, EdgeOfEpsilon, Edward, Eloquence, EnOreg, Eras-mus, Evice, Finlay McWalter, Fonzy, Frap, Fredrik,
    Fromageestciel, Fujimuji, Furrykef, G3pro, GCFreak2, Gaius Cornelius, Gazno, Gesslein, Gjs238, Graham87, GregLindahl, GregorB, Guy Harris, Hadal, Hardyplants, Heirpixel, HenkeB,
    Henriok, Hephaestos, HubmaN, ICE77, ISC PB, Iain.mcclatchie, Ianw, Imroy, In2thats12, IvanLanin, JVz, Jack1956, Jamesmusik, Jasongagich, Jay.slovak, JayC, Jengod, Jerryobject, Jesse
    Viviano, Jevansen, Jfmantis, Jiang, JoanneB, Joey Eads, Johncatsoulis, JonHarder, Josh Grosse, JulesH, Kaszeta, Kate, Kbdank71, Kelly Martin, Kevin, Kman543210, Knutux, Koper, Koyaanis
    Qatsi, Kristof vt, Kwamikagami, Kwertii, Labalius, Larowebr, Leszek Jaczuk, Levin, Liao, Liftarn, Ligulem, Littleman TAMU, Lorkki, Lquilter, MER-C, MFH, Magus732, Marcosw, Mark
    Richards, MarkMLl, Matsuiny2004, MattGiuca, Mattpat, Maurreen, Maury Markowitz, Mav, Mdz, MehrdadAfshari, MetaNest, Michael Hardy, Micky750k, Mike4ty4, MikeCapone, Mikeblas,
    Milan Kerlger, Mintleaf, Miremare, Modster, Moxfyre, MrPrada, Mrand, Mrwojo, Murray Langton, Nasukaren, Nate Silva, Neilc, Nikevich, Nurg, Nutschig, OCNative, Odysseus1479,
    Optakeover, Orichalque, Owengerig, Parklandspanaway, Paul D. Anderson, Paul Foxworthy, Pgquiles, Phil webster, Philippe, Pixel8, Plr4ever, Pnm, PrimeHunter, Ptoboley, QTCaptain,
    Quuxplusone, Qwertyus, R. S. Shaw, RAMChYLD, RadicalBender, Radimvice, Rajrajmarley, Rat144, Raysonho, RedWolf, Rehnn83, Remi0o, Retodon8, Rilak, Robert K S, Robert Merkel,
    Romanm, Ruud Koot, Rwwww, Saaya, Sbierwagen, Scepia, Scootey, Self-Perfection, Senpai71, Shieldforyoureyes, Shirifan, Sietse Snel, Simetrical, SimonW, Snoyes, Solipsist, Sonu mangla,
    SpeedyGonsales, SpuriousQ, Stan Shebs, Stephan Leeds, Stewartadcock, StuartBrady, Surturz, Susvolans, T-bonham, The Appleton, TheMandarin, Thecheesykid, Thorpe, Thumperward,
    Thunderbrand, TimBentley, Tksharpless, Toresbe, Toussaint, Trevj, UncleDouggie, UnicornTapestry, Unyoyega, Uriyan, VampWillow, Vishwastengse, Watcharakorn, Wcooley, Weeniewhite,
    Weevil, Wehe, Wernher, Wik, WojPob, Worthawholebean, Wws, Xyb, Yurik, Zachlipton, ZeroOne, ^demon, 405 anonymous edits
Image Sources, Licenses and Contributors
    License
    Creative Commons Attribution-Share Alike 3.0 Unported
    //creativecommons.org/licenses/by-sa/3.0/