## **:: MID TERM ASSIGNMENT ::**

**ID: 11533** 

Name: Ashir Ali Khan

**Subject: Computer Architecture** 

**Teacher: Sir Muhammad Amin** 



**Iqra National University** 

**QUESTION 1: Give answers of the following.**

**PART A**

**ANSWER:**

[Figure: core layout showing the IFU, ISU, RU, BFU, I-cache, XU, LSU, Instr-L2, and Data-L2 areas.]

The following functional areas are on each core, as shown in the above figure:

- **Instruction sequence unit (ISU):** This unit enables the out-of-order (OOO) pipeline. It tracks register names, OOO instruction dependency, and handling of instruction resource dispatch. This unit is also central to performance measurement through a function called instrumentation.
- **Instruction fetching unit (IFU) (prediction):** This unit contains the instruction cache, branch prediction logic, instruction fetching controls, and buffers. Its relative size is the result of the elaborate branch prediction design.
- **Instruction decode unit (IDU):** The IDU is fed from the IFU buffers and is responsible for parsing and decoding all z/Architecture operation codes.
- **Load-store unit (LSU):** The LSU contains the data cache. It is responsible for handling all types of operand accesses of all lengths, modes, and formats as defined in the z/Architecture.
- **Translation unit (XU):** The XU has a large translation lookaside buffer (TLB) and the Dynamic Address Translation (DAT) function that handles the dynamic translation of logical to physical addresses.

- **Fixed-point unit (FXU):** The FXU handles fixed-point arithmetic.
- **Binary floating-point unit (BFU):** The BFU handles all binary and hexadecimal floating-point and fixed-point multiplication operations.
- **Decimal floating-point unit (DFU):** The DFU runs both floating-point and decimal fixed-point operations and fixed-point division operations.
- **Recovery unit (RU):** The RU keeps a copy of the complete state of the system that includes all registers, collects hardware fault signals, and manages the hardware recovery actions.
- **Dedicated co-processor (COP):** The dedicated co-processor is responsible for data compression and encryption functions for each core.
- **I-cache:** This is a 64 KB L1 instruction cache, allowing the IFU to prefetch instructions before they are needed.
- **L2 control:** This is the control logic that manages the traffic

through the two L2 caches.
- **Data-L2:** A 1 MB L2 data cache for all memory traffic other than instructions.
- **Instr-L2:** A 1 MB L2 instruction cache.

**PART B: Discuss the IAS operation in detail.**

[Flowchart: partial flowchart of IAS operation, summarized below.]

Fetch cycle:

1. Start: is the next instruction in the IBR?
   - Yes (no memory access required): IR ← IBR(0:7), MAR ← IBR(8:19), then PC ← PC + 1.
   - No: MAR ← PC, then MBR ← M(MAR). Is the left instruction required?
     - Yes: IBR ← MBR(20:39), IR ← MBR(0:7), MAR ← MBR(8:19).
     - No: IR ← MBR(20:27), MAR ← MBR(28:39), then PC ← PC + 1.
2. Decode the instruction in IR.

Execution cycle (examples shown in the flowchart):

- AC ← M(X): MBR ← M(MAR), then AC ← MBR.
- Go to M(X, 0:19): PC ← MAR.
- If AC > 0 then go to M(X, 0:19): test whether AC > 0; if yes, PC ← MAR.
- AC ← AC + M(X): MBR ← M(MAR), then AC ← AC + MBR.

Here M(X) = contents of the memory location whose address is X, and (i:j) = bits i through j.
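The fetch cycle in the flowchart can be sketched in Python to make the register transfers concrete. This is only a minimal sketch under stated assumptions: the names `IAS`, `bits`, `pack`, and the `left_required` flag are hypothetical helpers introduced for illustration, memory is modeled as a dictionary of 40-bit words, and the execution cycle (including how a branch would clear `left_required`) is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

WORD_BITS = 40  # one IAS memory word holds two 20-bit instructions


def bits(word: int, i: int, j: int, width: int = WORD_BITS) -> int:
    """Return bits i..j (inclusive) of `word`, counting bit 0 as the leftmost bit."""
    return (word >> (width - 1 - j)) & ((1 << (j - i + 1)) - 1)


def pack(left_op: int, left_addr: int, right_op: int, right_addr: int) -> int:
    """Hypothetical helper: build a 40-bit word from two (opcode, address) pairs."""
    return (left_op << 32) | (left_addr << 20) | (right_op << 12) | right_addr


@dataclass
class IAS:
    memory: Dict[int, int] = field(default_factory=dict)
    pc: int = 0
    ir: int = 0
    mar: int = 0
    mbr: int = 0
    ibr: Optional[int] = None   # buffered right-hand instruction, if any
    left_required: bool = True  # assumption: a branch to a right-hand instruction would clear this

    def fetch(self) -> Tuple[int, int]:
        """Run one fetch cycle and return (opcode in IR, address in MAR)."""
        if self.ibr is not None:
            # Next instruction is already in IBR: no memory access required.
            self.ir = bits(self.ibr, 0, 7, width=20)
            self.mar = bits(self.ibr, 8, 19, width=20)
            self.ibr = None
            self.pc += 1
        else:
            self.mar = self.pc
            self.mbr = self.memory.get(self.mar, 0)
            if self.left_required:
                # Keep the right-hand instruction for the next fetch.
                self.ibr = bits(self.mbr, 20, 39)
                self.ir = bits(self.mbr, 0, 7)
                self.mar = bits(self.mbr, 8, 19)
            else:
                self.ir = bits(self.mbr, 20, 27)
                self.mar = bits(self.mbr, 28, 39)
                self.pc += 1
                self.left_required = True
        return self.ir, self.mar


# One word at address 0 holding two instructions (example opcodes and addresses).
machine = IAS(memory={0: pack(0x01, 0x100, 0x05, 0x101)})
first = machine.fetch()   # left instruction, fetched from memory
second = machine.fetch()  # right instruction, taken from IBR
```

The second call touches no memory: it reads the opcode and address from the IBR and only then increments the PC, which matches the "no memory access required" branch of the flowchart.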

The IAS instructions can be grouped as follows:

- **Data transfer:** Move data between memory and ALU registers or between two ALU registers.
- **Unconditional branch:** Normally, the control unit executes instructions in sequence from memory. This sequence can be changed by a branch instruction, which facilitates repetitive operations.
- **Conditional branch:** The branch can be made dependent on a condition, thus allowing decision points.
- **Arithmetic:** Operations performed by the ALU.
- **Address modify:** Permits addresses to be computed in the ALU and then inserted into instructions stored in memory. This allows a program considerable addressing flexibility.

**PART C**

**ANSWER:** An embedded system is a combination of computer hardware and software. As with any electronic system, it requires a hardware platform, which is built around a microprocessor or microcontroller. Embedded system hardware includes elements such as the user interface, input/output interfaces, display, and memory. Generally, an embedded system comprises a power supply, processor, memory, timers,

serial communication ports, and system application-specific circuits.

Embedded system software is written in a high-level language and then compiled to achieve a specific function within a non-volatile memory in the hardware. Embedded system software is designed with three limits in view: the availability of system memory, the processor speed, and, when the system runs continuously, the need to limit power dissipation for events such as run, stop, and wake up.

Examples of embedded systems:

- Digital alarm clocks
- Smart watches and digital wrist watches
- Washing machines and dishwashers
- Air conditioners and thermostats
- Traffic lights
- Printers, photocopiers, fax machines, and scanners
- Digital and video cameras
- Calculators
- Digital thermometers

**PART D**

Desktop applications that require the great power of contemporary microprocessor-based systems include:

- Image processing
- Speech recognition

- Video conferencing
- Multimedia authoring
- Voice and video annotation of files
- Simulation modeling

**PART E**

**ANSWER:** The techniques used in contemporary processors to increase speed are:

- **Pipelining:** The processor works on multiple instructions at once, moving each one through successive stages as it is received, so that the stages of several instructions overlap.
- **Branch prediction:** The processor predicts which instructions are likely to come next after a branch so that they can be fetched and carried out ahead of time.
- **Superscalar execution:** The processor issues more than one instruction at a time, using multiple parallel pipelines.
- **Data flow analysis:** The processor analyzes which instructions depend on each other's results or data, so that they can be scheduled in a suitable order.
- **Speculative execution:** The processor carries out instructions before it is known whether they will actually be needed, holding the results until they are confirmed.
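To give a feel for why pipelining increases speed, the short sketch below compares cycle counts with and without overlap. It assumes an idealized pipeline that is not described in the answer itself: every stage takes exactly one clock cycle and there are no stalls, branches, or dependencies. The function names are illustrative only.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """The first instruction fills the pipeline; each later one finishes one cycle after the previous."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


# Example: 100 instructions on a 5-stage pipeline (illustrative numbers).
n, k = 100, 5
speedup = cycles_unpipelined(n, k) / cycles_pipelined(n, k)
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows, which is why real processors add branch prediction and speculative execution to keep the pipeline full.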

**PART F**

**ANSWER:** As clock speeds and logic density increase, a number of obstacles become more significant, including:

- **Power:** Power density increases with logic density and clock speed. One challenge is the difficulty of dissipating the heat generated.
- **RC delay:** The speed at which electrons can flow on a chip between transistors is limited by the resistance and capacitance of the metal wires connecting them; delay increases as the RC product increases. As components on the chip decrease in size, the wires are closer together, increasing capacitance.
- **Memory latency:** Memory speeds lag processor speeds.

**PART G**

**ANSWER:** Amdahl's law deals with the potential speedup of a program using multiple processors compared to a single processor. Consider a program running on a single processor such that a fraction $(1 - f)$ of the execution time involves code that is inherently sequential, and a fraction $f$ involves code that is infinitely parallelizable with

no scheduling overhead. Let $T$ be the total execution time of the program using a single processor. Then the speedup using a parallel processor with $N$ processors that fully exploits the parallel portion of the program is as follows:

$$
\text{Speedup} = \frac{\text{Time to execute program on a single processor}}{\text{Time to execute program on } N \text{ parallel processors}} = \frac{T(1-f) + Tf}{T(1-f) + \dfrac{Tf}{N}} = \frac{1}{(1-f) + \dfrac{f}{N}}
$$

Two important conclusions can be drawn:

1. When $f$ is small, the use of parallel processors has little effect.
2. As $N$ approaches infinity, speedup is bound by $1/(1-f)$, so that there are diminishing returns for using more processors.

**PART H**

**ANSWER:**

**Multicore:** Multicore refers to an architecture in which a single physical processor incorporates the core logic of more than one processor. A single integrated circuit is used to package or hold these processors. This single integrated circuit is known as a die. Multicore architecture places multiple processor cores and bundles them as a single physical processor. The objective

is to create a system that can complete more tasks at the same time, thereby gaining better overall system performance. This technology is most commonly used in multicore processors, where two or more processor chips or cores run concurrently as a single system. Multicore-based processors are used in mobile devices, desktops, workstations, and servers.

The concept of multicore technology is mainly centered on the possibility of parallel computing, which can significantly boost computer speed and efficiency by including two or more central processing units (CPUs) in a single chip. This reduces the system's heat and power consumption, which means much better performance with less or the same amount of energy.

**MIC:** Chip manufacturers are now in the process of making a huge leap forward in the number of cores per chip, with more than 50 cores per chip. The leap in performance, as well as the challenges in developing software to exploit such a large number of cores, has led to the introduction of a new term: many integrated core (MIC).

**GPGPU:** A general-purpose GPU (GPGPU) is a graphics processing unit (GPU) that performs non-specialized calculations that would typically be conducted by the CPU (central processing unit). Ordinarily, the GPU is dedicated to

graphics rendering. GPGPUs are used for tasks that were formerly the domain of high-power CPUs, such as physics calculations, encryption/decryption, scientific computations, and the generation of cryptocurrencies such as Bitcoin. Because graphics cards are constructed for massive parallelism, they can dwarf the calculation rate of even the most powerful CPUs for many parallel processing tasks. The same shader cores that allow multiple pixels to be rendered simultaneously can similarly process multiple streams of data at the same time. Although a shader core is not nearly as complex as a CPU, a high-end GPU may have thousands of shader cores; in contrast, a multicore CPU might have eight or twelve cores.

**PART I**

**ANSWER:** QuickPath Interconnect (QPI) protocol layers: QPI is defined as a four-layer protocol architecture, encompassing the following layers:

- **Physical:** Consists of the actual wires carrying the signals, as well as circuitry and logic to support ancillary features required in the transmission and receipt of the 1s and 0s. The unit of transfer at the physical layer is 20 bits, which is called a Phit (physical unit).

- **Link:** Responsible for reliable transmission and flow control. The link layer's unit of transfer is an 80-bit Flit (flow control unit).
- **Routing:** Provides the framework for directing packets through the fabric.
- **Protocol:** The high-level set of rules for exchanging packets of data between devices. A packet is comprised of an integral number of Flits.

**PART J**

**ANSWER:** Physical and logical architecture of PCIe: A root complex device, also referred to as a chipset or a host bridge, connects the processor and memory subsystem to the PCI Express switch fabric comprising one or more PCIe and PCIe switch devices. The root complex acts as a buffering device, to deal with differences in data rates between I/O controllers and memory and processor components. The root complex also translates between PCIe transaction formats and the processor and memory signal and control requirements. The chipset will typically support multiple PCIe ports, some of which attach directly to a PCIe device, and one or more that attach to a switch that manages multiple PCIe streams.

[Figure: PCIe configuration showing two cores, the chipset, memory, a Gigabit Ethernet device, a PCIe-PCI bridge, a PCIe switch, a legacy endpoint, and PCIe endpoints.]

**QUESTION 2: Write a detailed note on each of the following.**

**PART A: Structural components of a computer**

A computer has four main components: the central processing unit (CPU), the primary memory, input units, and output units. A system bus connects all four components, passing and relaying information among them. This type of computer organization and architecture is called a "von Neumann machine" after John von Neumann, who finalized the theory and design of the first modern digital computer.

**CPU:** Computer scientists typically call the CPU the "brain" of the computer because this is where programs are executed. A program is a set of instructions that tells the computer how to accomplish a specific task, such as sending a file to the printer, opening a browser window, or playing music or video. The CPU is further broken up into three smaller components: the arithmetic unit handles all the simple mathematical computations; the control unit interprets the instructions in a computer program; and the instruction decoding unit converts computer programming instructions into machine code.

Machine code is the basic language understood by all the components in a computer.

**Memory:** Once the CPU converts a specific set of computer program instructions into machine code, it stores that machine code in primary storage, or memory. The machine code will be treated as either data or instructions. The CPU fetches data and instructions from memory, uses an instruction to manipulate the data, and then sends the result and the next set of instructions back to memory.

**Input units:** Input units are all the devices you use to feed information to the computer, such as a keyboard, a hard drive, or a networking card. These devices, in essence, bring data from the "outside world" into your computer, in much the same way that your eyes and ears bring information to your brain. Each input device has its own hardware controller that connects to the CPU and primary memory, and it has a set of instructions that tells the CPU how to use it.

**Output units:** Output units are the devices your computer uses to relay information to the user, such as printers, monitors, and speakers. For example, everything you see on your computer monitor starts as machine code in memory. The CPU takes that

machine code and converts it into a format required by your monitor's hardware. Your monitor's hardware then converts that information into different light intensities so that you see words or pictures.

**The system bus:** The system bus lets the four components of the computer communicate with one another. The system bus transmits data and instructions. It also sends addresses that tell the CPU where in primary memory the data and instructions are coming from and where the results should go.

**PART B: The key characteristics of a computer family**

- **Similar or identical instruction set:** In many cases, the exact same set of machine instructions is supported on all members of the family. Thus, a program that executes on one machine will also execute on any other. In some cases, the lower end of the family has an instruction set that is a subset of that of the top end of the family. This means that programs can move up but not down.
- **Similar or identical operating system:** The same basic operating system is available for all

family members. In some cases, additional features are added to the higher-end members.
- **Increasing speed:** The rate of instruction execution increases in going from lower to higher family members.
- **Increasing number of I/O ports:** The number of I/O ports increases in going from lower to higher family members.
- **Increasing memory size:** The size of memory increases in going from lower to higher family members.
- **Increasing cost:** The cost of a system increases in going from lower to higher family members.

**PART C: Stored-program computer**

A stored-program computer is a computer that stores program instructions in electronic memory. This contrasts with machines where the program instructions are stored on plugboards or similar mechanisms.

A computer with a von Neumann architecture stores program data and instruction data in the same memory; a computer with a Harvard architecture has separate memories for storing program and data. Both are stored-program designs.

"Stored-program computer" is sometimes used as a synonym for von Neumann architecture; however, Professor Jack Copeland considers that it is "historically inappropriate, to refer to electronic stored-program

digital computers as 'von Neumann machines'". Hennessy and Patterson write that the early Harvard machines were regarded as "reactionary by the advocates of stored-program computers".

**PART D: Moore's law**

Moore's law is a computing term which originated around 1970; the simplified version of this law states that processor speeds, or overall processing power for computers, will double every two years. A quick check among technicians in different computer companies shows that the term is not very popular but the rule is still accepted. To break down the law even further, it specifically stated that the number of transistors on an affordable CPU would double every two years (which is essentially the same thing that was stated before), but "more transistors" is more accurate.

The law is named after Intel co-founder Gordon Moore, who described the trend in his 1965 paper. The paper stated that the number of components in integrated circuits had doubled every year from the invention of the integrated circuit in 1958 until 1965 and predicted that the trend would continue "for at least ten years". His prediction has proved very accurate. The law

is used in the semiconductor industry to guide long-term planning and to set targets for research and development. The capabilities of many digital electronic devices are strongly linked to Moore's law: processing speed, memory capacity, sensors, and even the number and size of pixels in digital cameras. All of these are improving at (roughly) exponential rates as well.

**PART E: Instruction cycle state diagram**

[Figure: instruction cycle state diagram. States: instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch, data operation, operand store. Annotations: multiple operands; multiple results; return for string or vector data; instruction complete, fetch next instruction.]

- **Instruction address calculation (iac):** Determine the address of the next instruction to be executed. Usually, this involves adding a fixed number to the address of the previous instruction. For example, if each instruction is 16 bits long and memory is organized into 16-bit words, then add 1 to the previous address. If, instead, memory is organized as individually addressable 8-bit bytes, then add 2 to the previous address.
- **Instruction fetch (if):** Read the instruction from its memory location into the processor.
- **Instruction operation decoding (iod):** Analyze the instruction to determine the type of operation to be performed and the operand(s) to be used.
- **Operand address calculation (oac):** If the operation involves reference to an operand in memory or available via I/O, then determine the address of the operand.
- **Operand fetch (of):** Fetch the operand from memory or read it in from I/O.
- **Data operation (do):** Perform the operation indicated in the instruction.
- **Operand store (os):** Write the result into memory or out to I/O.
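The instruction address calculation example above (add 1 for 16-bit words, add 2 for 8-bit bytes) can be written as a small helper. This is a sketch only: the function name and parameters are illustrative, and it assumes fixed-length instructions whose size is a whole multiple of the addressable unit.

```python
def next_instruction_address(current: int,
                             instruction_bits: int = 16,
                             addressable_unit_bits: int = 16) -> int:
    """Return the address of the next sequential instruction.

    The increment is the number of addressable units one instruction occupies.
    """
    if instruction_bits % addressable_unit_bits != 0:
        raise ValueError("instruction length must be a multiple of the addressable unit")
    return current + instruction_bits // addressable_unit_bits


# 16-bit instructions in word-addressed memory (16-bit words): add 1.
word_addressed = next_instruction_address(100, 16, 16)

# 16-bit instructions in byte-addressed memory (8-bit bytes): add 2.
byte_addressed = next_instruction_address(100, 16, 8)
```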

**PART F: Classes of interrupts**

- **Program:** Generated by some condition that occurs as a result of an instruction execution, such as arithmetic overflow, division by zero, attempt to execute an illegal machine instruction, or reference outside a user's allowed memory space.
- **Timer:** Generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
- **I/O:** Generated by an I/O controller, to signal normal completion of an operation or to signal a variety of error conditions.
- **Hardware failure:** Generated by a failure such as power failure or memory parity error.

**PART G: Bus interconnection scheme**

[Figure: bus interconnection scheme. CPU, memory, and I/O modules attached to a shared bus made up of control lines, address lines, and data lines.]

A bus is a communication pathway connecting two or more devices. A key characteristic of a bus is that it is a shared transmission medium. Multiple devices connect to the bus, and a signal transmitted by any one device is available for reception by all other devices attached to the bus. If two devices transmit during the same time period, their signals will overlap and become garbled. Thus, only one device at a time can successfully transmit.

Typically, a bus consists of multiple communication pathways, or lines. Each line is capable of transmitting signals representing binary 1 and binary 0. An 8-bit unit of data can be transmitted over eight bus lines. A bus that connects major computer components (processor, memory, I/O) is called a system bus.

- **Data lines:** The data lines provide a path for moving data among system modules. These lines, collectively, are called the data bus.
- **Address lines:** The address lines are used to designate the source or destination of the data on the data bus. For example, on an 8-bit address bus, address 01111111 and below might reference locations in a memory module (module 0) with 128 words of memory, and address 10000000 and above refer to devices attached to an I/O module (module 1).
- **Control lines:** The control lines are used to control the access to and the use of the data and address lines. Control signals transmit both command and timing information among system modules. Timing signals indicate the validity of data and address information. Command signals specify operations to be performed.

Typical control lines include:

- **Memory write:** Causes data on the bus to be written into the addressed location.
- **Memory read:** Causes data from the addressed location to be placed on the bus.
- **I/O write:** Causes data on the bus to be output to the addressed I/O port.
- **I/O read:** Causes data from the addressed I/O port to be placed on the bus.
- **Transfer ACK:** Indicates that data have been accepted from or placed on the bus.
- **Bus request:** Indicates that a module needs to gain control of the bus.
- **Bus grant:** Indicates that a requesting module has been granted control of the bus.
- **Interrupt request:** Indicates that an interrupt is pending.
- **Interrupt ACK:** Acknowledges that the pending interrupt has been recognized.
- **Clock:** Is used to synchronize operations.
- **Reset:** Initializes all modules.

The operation of the bus is as follows. If one module wishes to send data to another,

it must do two things: (1) obtain the use of the bus, and (2) transfer data via the bus. If one module wishes to request data from another module, it must (1) obtain the use of the bus, and (2) transfer a request to the other module over the appropriate control and address lines. It must then wait for that second module to send the data.

**QUESTION 3: Differentiate each of the following.**

**PART A: Computer organization and computer architecture**

| No. | Computer organization | Computer architecture |
|-----|-----------------------|-----------------------|
| 1 | Computer organization is concerned with the way hardware components are connected together to form a computer system. | Computer architecture is concerned with the structure and behaviour of a computer system as seen by the user. |
| 2 | It deals with the components of a connection in a system. | It acts as the interface between hardware and software. |
| 3 | Computer organization tells us how exactly all the units in the system are arranged and interconnected. | Computer architecture helps us to understand the functionalities of a system. |
| 4 | Organization expresses the realization of architecture. | A programmer can view architecture in terms of instructions, addressing modes, and registers. |
| 5 | An organization is done on the basis of architecture. | While designing a computer system, architecture is considered first. |
| 6 | Computer organization deals with low-level design issues. | Computer architecture deals with high-level design issues. |
| 7 | Organization involves physical components (circuit design, adders). | Architecture involves logic (instruction sets, addressing modes, data types, cache optimization). |

**PART B: RISC and CISC**

| No. | RISC | CISC |
|-----|------|------|
| 1 | RISC stands for reduced instruction set computer. | CISC stands for complex instruction set computer. |
| 2 | RISC processors have simple instructions taking about one clock cycle. The average clock cycles per instruction (CPI) is 1.5. | CISC processors have complex instructions that take up multiple clock cycles for execution. The average clock cycles per instruction (CPI) is in the range of 2 to 15. |
| 3 | Performance is optimized with more focus on software. | Performance is optimized with more focus on hardware. |
| 4 | It has no memory unit and uses separate hardware to implement instructions. | It has a memory unit to implement complex instructions. |
| 5 | It has a hard-wired unit of programming. | It has a microprogramming unit. |
| 6 | The instruction set is reduced, i.e., it has only a few instructions in the instruction set. Many of these instructions are very primitive. | The instruction set has a variety of different instructions that can be used for complex operations. |
| 7 | RISC has few, simple addressing modes. | CISC has many different addressing modes and can thus be used to represent higher-level programming language statements more efficiently. |
| 8 | Complex addressing modes are synthesized using software. | CISC already supports complex addressing modes. |
| 9 | Multiple register sets are present. | It only has a single register set. |
| 10 | RISC processors are highly pipelined. | They are normally not pipelined or less pipelined. |
| 11 | The complexity of RISC lies with the compiler that executes the program. | The complexity lies in the microprogram. |
| 12 | Execution time is very low. | Execution time is very high. |
| 13 | Code expansion can be a problem. | Code expansion is not a problem. |
| 14 | Decoding of instructions is simple. | Decoding of instructions is complex. |
| 15 | It does not require external memory for calculations. | It requires external memory for calculations. |
| 16 | | Examples of CISC processors are the System/360, VAX, PDP-11, Motorola 68000 family, AMD, and Intel x86 CPUs. |
| 17 | | CISC architecture is used in low-end applications such as security systems, home automation, etc. |

**PART C: Microprocessors and microcontrollers**

| No. | Microprocessor | Microcontroller |
|-----|----------------|-----------------|
| 1 | A microprocessor is the heart of a computer system. | A microcontroller is the heart of an embedded system. |
| 2 | It is only a processor, so memory and I/O components need to be connected externally. | A microcontroller has a processor along with internal memory and I/O components. |
| 3 | Memory and I/O have to be connected externally, so the circuit becomes large. | Memory and I/O are already present, and the internal circuit is small. |

| Microprocessor | Microcontroller |
|----------------|-----------------|
| 4) You can't use it in compact systems. | 4) You can use it in compact systems. |
| 5) Cost of the entire system is high. | 5) Cost of the entire system is low. |
| 6) Due to external components, the total power consumption is high. Therefore, it is not ideal for devices running on stored power such as batteries. | 6) As external components are few, total power consumption is less, so it can be used with devices running on stored power such as batteries. |
| 7) It is mainly used in personal computers. | 7) It is used mainly in washing machines, MP3 players, and embedded systems. |
| 8) Microprocessors have a smaller number of registers, so more operations are memory based. | 8) Microcontrollers have more registers, hence the programs are easier to write. |
| 9) Microprocessors are based on the von Neumann model. | 9) Microcontrollers are based on the Harvard architecture. |
| 10) It is a central processing unit on a single silicon-based integrated chip. | 10) It is a by-product of the development of microprocessors, with a CPU along with other peripherals. |
| 11) It has no RAM, ROM, input-output units, or other peripherals on the chip. | 11) It has a CPU along with RAM, ROM, and other peripherals embedded on a single chip. |
| 12) It uses an external bus to interface to RAM, ROM, and other peripherals. | 12) It uses an internal controlling bus. |
| 13) Microprocessor-based systems can run at a very high speed because of the technology involved. | 13) Microcontroller-based systems run up to 200 MHz or more depending on the architecture. |
| 14) It's used for general purpose applications that allow you to handle loads of data. | 14) It is used for application-specific systems. |
| 15) It's complex and expensive, with a large number of instructions to process. | 15) It's simple and inexpensive, with a smaller number of instructions to process. |
| 16) Most microprocessors do not have power saving features. | 16) Most microcontrollers offer a power-saving mode. |

PART : D :-

CORTEX-A, CORTEX-R, AND CORTEX-M :-

| CORTEX-A | CORTEX-R | CORTEX-M |
|----------|----------|----------|
| Application processors. | Real-time processors. | Microcontrollers. |
| Used in a wide range of devices that have fully functional processors. | Used in critical systems where data interpretation is essential. | Designed for small devices and mixed signal processing. |
| High performance, high efficiency and high computational power. | High performance, real time and safe. | Lower performance, higher efficiency. |
| It runs at a relatively high clock frequency. | | |
| It is connected to a large amount of memory. | It is connected to less memory. | |
| It handles large [illegible] and is capable of running complex operating systems directly. | It is designed to handle fast-changing data, and to be sufficiently responsive to handle data throughput without slowing down. | It is built into microcontrollers with I/O lines and designed for small systems that rely on heavy digital input and digital output. |
| | It is cost effective, and requires low power and less physical area. | It requires less energy and has longer battery life. [illegible] code and is easy to use. |
| Applications: mobiles, cellphones, tablets, laptops, etc. | Applications: car systems. | Applications: small consumer electronics. |



The I/O program consists of three sections:

- 4: A sequence of instructions which prepare for the operation.
- I/O command: the actual I/O command.
- 5: A sequence of instructions which complete the operation.

PART : F :-

Disabled interrupt and nested interrupt processing:

DISABLED INTERRUPT PROCESSING :-

- Handle and service individual interrupts sequentially.
- High interrupt latency.
- Relatively easy to implement and debug.
- Not suitable for complex embedded systems.

[Figure: user program, interrupt handler X, interrupt handler Y]

NESTED INTERRUPT PROCESSING :-

- Handle multiple interrupts without a priority assignment.
- Medium or high interrupt latency.
- Enable interrupts before the servicing of an individual interrupt is complete.
- No prioritization, so low priority interrupts can block higher priority interrupts.

[Figure: user program, interrupt handler X, interrupt handler Y]

PART : G :-

Programming in hardware and programming in software:

| IN HARDWARE | IN SOFTWARE |
|-------------|-------------|
| 1) Programming in hardware means configuring a small set of basic logic components specifically for a particular computation. | 1) Programming in software means supplying a specific set of control signals to general-purpose hardware. |
| 2) Hardware programming languages are concurrent in nature and execute a piece of code in parallel. | 2) Software programming languages are sequential in nature and execute a piece of code sequentially. |
| 3) Hardware programming languages are often used to model digital circuits, and the circuits are synthesized to hardware. | 3) Software programming languages are executed as a sequence of instructions given to the CPU, and the code is not synthesizable. |
| 4) Hardware programming languages are used to find the timing delay of the circuit. | 4) Software languages cannot be used to find the timing delay of the circuit. |
| 5) Hardware languages are faster, and the usage of memory allocation is critical considering the formation of the chip. | 5) Software languages are slower, and the consideration of memory usage is not critical. |
| 6) For hardware languages, knowledge of digital and hardware circuits is needed. | 6) For software languages, algorithm and processor knowledge is needed. |
| 7) Some examples of HDLs (Hardware Description Languages) used as hardware languages are VHDL, Verilog, and SystemVerilog. | 7) Some examples of software languages are C and C++. |

QUESTION 4 :-

Solve each of the following.

PART : A :-

ANSWER :-

(a) Show the assembly language code for the program, starting at address 08A.

| Address | Contents |
|---------|----------|
| 08A | LOAD M(0FA)<br>STOR M(0FB) |
| 08B | LOAD M(0FA)<br>JUMP +M(08D) |
| 08C | LOAD -M(0FA)<br>STOR M(0FB) |
| 08D | |

(b) Explain what this program does.

ANSWER :- This program stores the absolute value of the content of memory location 0FA into memory location 0FB. The value is first stored unchanged; if it is non-negative, the conditional jump skips to 08D, otherwise the negated value is loaded and stored instead.

PART : B :-

ANSWER :-

| Opcode | Operand |
|--------|---------|
| 00000001 | 000000000010 |

In the beginning, the CPU has to fetch the instruction from memory. The instruction includes the address of the data that is to be loaded. During execution, memory is accessed again to load the data located at that address, for a total of two trips to memory.

PART : C :-

GIVEN :-

Clock speed of the processor = 60 MHz

Number of instructions in the executed program = 104,000

| Instruction type | Instruction count | Cycles per instruction |
|------------------|-------------------|------------------------|
| Integer arithmetic | 46000 | 1 |
| Data transfer | 33000 | 2 |
| Floating point | 16000 | 2 |
| Control transfer | 9000 | 2 |

TO FIND :-

CPI = ?

MIPS rate = ?

Execution time = ?

SOLUTION :-

Calculating the CPI:

$$CPI = \frac{\sum (\text{instruction count} \times \text{cycles per instruction})}{\text{number of instructions in the executed program}} \quad (1)$$

Substitute the values of "instruction count" and "cycles per instruction" from the above table in equation (1).

$$CPI = \frac{(46000 \times 1) + (33000 \times 2) + (16000 \times 2) + (9000 \times 2)}{104{,}000} = \frac{46000 + 66000 + 32000 + 18000}{104{,}000} = \frac{162{,}000}{104{,}000} \approx 1.56$$

Therefore, the CPI for this program is about 1.56.

Now calculating MIPS :-

$$MIPS = \frac{f}{CPI \times 10^{6}}$$

Frequency is given as 60 MHz. Converting MHz to Hz and putting in the values:

$$MIPS = \frac{60 \times 10^{6}}{1.56 \times 10^{6}} \approx 38.5$$

Calculating execution time :-

$$\text{Execution time} = CPI \times \text{instruction count} \times \text{clock period} = \frac{CPI \times \text{instruction count}}{\text{frequency}}$$
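As a cross-check of the Part C arithmetic, the three quantities can be computed directly from the instruction table. This is only a sketch: the names `MIX`, `CLOCK_HZ`, and `performance` are illustrative and not part of the assignment. Carrying full precision gives a CPI of about 1.56 rather than the truncated 1.55.

```python
# Instruction mix from the Part C table: (instruction count, cycles per instruction).
MIX = {
    "integer arithmetic": (46_000, 1),
    "data transfer": (33_000, 2),
    "floating point": (16_000, 2),
    "control transfer": (9_000, 2),
}
CLOCK_HZ = 60e6  # 60 MHz clock, as given


def performance(mix, clock_hz):
    """Return (CPI, MIPS rate, execution time in seconds) for an instruction mix."""
    instruction_count = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    cpi = total_cycles / instruction_count   # average cycles per instruction
    mips = clock_hz / (cpi * 1e6)            # millions of instructions per second
    exec_time = total_cycles / clock_hz      # equals CPI * instruction count / f
    return cpi, mips, exec_time


cpi, mips, exec_time = performance(MIX, CLOCK_HZ)
print(f"CPI = {cpi:.2f}, MIPS = {mips:.1f}, T = {exec_time * 1e3:.1f} ms")
```

Computing the execution time from the total cycle count (162,000) avoids compounding the rounding of the CPI.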

Substituting the values, with CPI × instruction count = 162,000 cycles:

$$\text{Execution time} = \frac{162{,}000}{60 \times 10^{6}} = 0.0027\ \text{s}$$

Execution time = 0.0027 s (2.7 ms)

PART : D :-

ANSWER :-

(a) Determine the average CPI :-

Since we have the same instruction mix, the additional instructions for each task can be allocated proportionally among the instruction types. Therefore, the following table is obtained:

| Instruction type | CPI | Instruction mix |
|------------------|-----|-----------------|
| Arithmetic and logic | 1 | 60% |
| Load/store with cache hit | 2 | 18% |
| Branch | 4 | 12% |
| Memory reference | 12 | 10% |

$$\text{Average } CPI = (1 \times 0.6) + (2 \times 0.18) + (4 \times 0.12) + (12 \times 0.1) = 2.64$$

Therefore, the CPI has increased, since the time for memory access has also increased.

(b) Determine the average MIPS rate:

$$MIPS = \frac{400}{2.64} \approx 152$$

There is a corresponding drop in the MIPS rate.

(c) Calculate the speedup factor.

The speedup factor equals the ratio of the execution times. The execution time is calculated as follows:

$$T = \frac{I_c}{MIPS \times 10^{6}}$$

For one processor,

$$T_1 = \frac{2 \times 10^{6}}{178 \times 10^{6}} \approx 11\ \text{ms}$$

For the 8-processor system, each processor executes 1/8 of the 2 million instructions plus the 25,000 additional instructions:

$$T_8 = \frac{\dfrac{2 \times 10^{6}}{8} + 0.025 \times 10^{6}}{152 \times 10^{6}} \approx 1.8\ \text{ms}$$

Therefore we have,

$$\text{Speedup} = \frac{\text{time to execute program on a single processor}}{\text{time to execute program on } N \text{ parallel processors}} = \frac{11}{1.8} \approx 6.11$$

(d) Compare the actual speedup factor with the theoretical speedup factor determined by Amdahl's law.
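The figures in (b) and (c), and the comparison asked for in (d), can be reproduced with a short script. This is a sketch under stated assumptions: the 400 MHz clock is inferred from the expression 400 / 2.64, the single-processor rate of 178 MIPS is taken from (c), and the helper names are illustrative only. Without intermediate rounding the speedup comes out nearer 6.2 than the 6.11 obtained from the rounded times.

```python
CLOCK_MHZ = 400            # assumed: inferred from MIPS = 400 / CPI in part (b)
INSTRUCTIONS = 2_000_000   # program length used in part (c)
OVERHEAD = 25_000          # additional instructions executed by each processor
N_PROCESSORS = 8
MIPS_SINGLE = 178          # single-processor MIPS rate quoted in part (c)

# (CPI, fraction of the instruction mix) for the multiprocessor case in part (a)
MIX = [(1, 0.60), (2, 0.18), (4, 0.12), (12, 0.10)]


def average_cpi(mix):
    """Weighted average of cycles per instruction over the mix."""
    return sum(cpi * fraction for cpi, fraction in mix)


def exec_time(instructions, mips):
    """Execution time in seconds: T = Ic / (MIPS * 10^6)."""
    return instructions / (mips * 1e6)


def amdahl(parallel_fraction, n):
    """Amdahl's law: speedup = 1 / ((1 - f) + f / N)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)


cpi = average_cpi(MIX)
mips_parallel = CLOCK_MHZ / cpi
t1 = exec_time(INSTRUCTIONS, MIPS_SINGLE)
t8 = exec_time(INSTRUCTIONS / N_PROCESSORS + OVERHEAD, mips_parallel)
speedup = t1 / t8
theoretical = amdahl(1.0, N_PROCESSORS)  # all code assumed parallelizable

print(f"CPI = {cpi:.2f}, MIPS = {mips_parallel:.0f}")
print(f"T1 = {t1 * 1e3:.1f} ms, T8 = {t8 * 1e3:.1f} ms")
print(f"actual = {speedup:.2f}, theoretical = {theoretical:.0f}, "
      f"ratio = {speedup / theoretical:.0%}")
```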

In fact, there are two inefficiencies in the parallel system. The first is that additional instructions are added to coordinate between threads. The second is that there is contention for memory access. Thus, none of the code is inherently serial, and all of it is parallelizable, but with scheduling overhead. It could be said that the memory access conflict means that, to some extent, memory reference instructions are not parallelizable. Based on the information given, it is not obvious how to quantify this effect in Amdahl's equation. Therefore, if it is supposed that the fraction of code which is parallelizable is f = 1, then Amdahl's law reduces to Speedup = N = 8. Therefore, the actual speedup is only about 75% of the theoretical speedup.

PART : E :-

ANSWER :-

STEP : 1 :-

(a) The PC contains 300, the address of the first instruction. This value is loaded into the MAR.

(b) The value in location 300 (which is the instruction with the value 1940 in hexadecimal) is loaded into the MBR, and the PC is incremented. These two steps can be done in parallel.

(c) The value in the MBR is loaded into the IR.

STEP : 2 :-

(a) The address portion of the IR (940) is loaded into the MAR.

(b) The value in location 940 is loaded into the MBR.

(c) The value in the MBR is loaded into the AC.

STEP : 3 :-

(a) The value in the PC (301) is loaded into the MAR.

(b) The value in location 301 (which is the instruction with the value 5941) is loaded into the MBR, and the PC is incremented.

(c) The value in the MBR is loaded into the IR.

STEP : 4 :-

(a) The address portion of the IR (941) is loaded into the MAR.

(b) The value in location 941 is loaded into the MBR.

(c) The old value of the AC and the value in the MBR are added, and the result is stored in the AC.

STEP : 5 :-

(a) The value in the PC (302) is loaded into the MAR.

(b) The value in location 302 (which is the instruction with the value 2941) is loaded into the MBR, and the PC is incremented.

(c) The value in the MBR is loaded into the IR.

STEP : 6 :-

(a) The address portion of the IR (941) is loaded into the MAR.

(b) The value in the AC is loaded into the MBR.

(c) The value in the MBR is stored in location 941.

PART : F :-

ANSWER :-

(A) What is the maximum directly addressable memory capacity (in bytes)?

$$2^{(32-8)} = 2^{24} = 16{,}777{,}216\ \text{bytes} = 16\ \text{MB}$$

(8 bits = 1 byte of the 32-bit instruction is used for the opcode, leaving 24 bits for the address.)

(B) Discuss the impact on the system speed if the microprocessor bus has:

(b.1) A 32-bit local address bus and a 16-bit local data bus.

Instruction and data transfers would take three bus cycles each, one for the address and two for the data. Since the address bus is 32 bits, the whole address can be transferred to memory at once and decoded there; however, since the data bus is only 16 bits, it will require 2 bus cycles to fetch the 32-bit instruction or operand.

(b.2) A 16-bit local address bus and a 16-bit local data bus.

Instruction and data transfers would take four bus cycles each, two for the address and two for the data. The processor has to perform two transmissions in order to send the whole 32-bit address to memory; this requires more complex memory interface control to latch the two halves of the address before performing an access. In addition to this two-step address issue, since the data bus is also 16 bits, the microprocessor will need 2 bus cycles to fetch the 32-bit instruction or operand.

(C) How many bits are needed for the program counter and the instruction register?

The PC needs 24 bits (24-bit addresses), and the IR needs 32 bits (32-bit instructions).

PART : G :-

ANSWER :-

First we need to find the time taken to fetch one operand from one memory location, using the clock frequency and the bus cycle.

Given clock rate = 4 MHz

$$\text{Clock period} = \frac{1}{\text{clock rate}} = \frac{1}{4\ \text{MHz}} = 0.25\ \mu s$$

Therefore, one bus cycle is 0.25 µs.

Then, a memory cycle will take 0.25 µs × 4 = 1 µs.

Hence, to fetch one operand from memory, 1 µs is required.

Applying the statement from the question, "if an odd-aligned word is referenced, two memory cycles, each consisting of four bus cycles, are required to transfer the word", we have 3 cases:

First case: both operands are even-aligned, so the time required is 1 × 2 = 2 µs to fetch both operands.

Second case: both operands are odd-aligned, so the time required is 1 × 4 = 4 µs to fetch both operands.

Third case: only one operand is odd-aligned, so the time required is 1 × 3 = 3 µs to fetch both operands.
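The three cases can be tabulated with a short sketch. It assumes, as in the answer above, that an even-aligned operand needs one memory cycle and an odd-aligned operand needs two. The function and constant names are illustrative only.

```python
CLOCK_HZ = 4e6                    # 4 MHz clock, as given
BUS_CYCLES_PER_MEMORY_CYCLE = 4   # each memory cycle consists of four bus cycles


def fetch_time_us(odd_aligned):
    """Time in microseconds to fetch operands.

    `odd_aligned` holds one boolean per operand: True if that operand is
    odd-aligned (two memory cycles), False if even-aligned (one memory cycle).
    """
    bus_cycle_us = 1e6 / CLOCK_HZ                                  # 0.25 us
    memory_cycle_us = bus_cycle_us * BUS_CYCLES_PER_MEMORY_CYCLE   # 1 us
    memory_cycles = sum(2 if odd else 1 for odd in odd_aligned)
    return memory_cycles * memory_cycle_us


for label, flags in [
    ("both even-aligned", (False, False)),
    ("both odd-aligned", (True, True)),
    ("one odd-aligned", (True, False)),
]:
    print(f"{label}: {fetch_time_us(flags)} us")
```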

## **::THE END::**