# 科目:計算機結構【資工系碩士班】

共3頁第1頁

If some questions are unclear or not well defined to you, you can make your own assumptions and state them clearly in the answer sheet.

### 1. Short Questions (15%)

Answer and explain the following questions.

What does it represent, assuming that it is

- (a) a two's complement integer? (2.5%)
- (b) a single precision floating-point number? (2-50/0)
- 1.2 Explain what "exception" is and how exceptions are handled. (5 %)
- 1.3 In the simplest implementation for MIPS instruction set, every instruction begins execution on one clock edge and completes execution on the next clock edge. Please explain disadvantages of the single-cycle implementation. (5 o/o)

#### 2. Performance Analysis (23%)

2.1 (8%) Consider the machine with three instruction classes X, Y, and Z. The corresponding CPIs for these instruction classes are 1, 2, and 3, respectively. Now suppose we measure the code for the same program from two different compilers and obtain the following data:

| Code from  | Instruction counts (in billions) for each instruction class |   |   |  |
|------------|-------------------------------------------------------------|---|---|--|
|            | X                                                           | Y | Z |  |
| Compiler 1 | 5                                                           | 1 | 1 |  |
| Compiler 2 | 10                                                          | 1 | 1 |  |

Assume that the machine's clock rate is 500 MHz. Which code sequence will execute faster according to MIPS According to execution time?

2.2 (3%) The table (Table 2) below shows the number of floating-point operations executed in two different programs and the runtime for those programs on three different machines:

| Program   | Floating-point | Execution time in seconds |           |           |
|-----------|----------------|---------------------------|-----------|-----------|
|           | operations     | Machine A                 | Machine B | Machine C |
| Program 1 | 100,000,000    | 1000                      | 100       | 40        |
| Program 2 | 1,000,000      | 1                         | 10        | 40        |

Which machine is fastest according to total execution time?

Table 2

- 2.3 (6%) Assume that equal amounts of time will be spent running each program on some machine. Which machine is fastest using the data of Table 2 and assuming a weighting that generates equal execution time for each benchmark on machine A? Which machine is fastest if we assume a weighting that generates equal execution time for each benchmark on machine B?
- 2.4 (6%) There are two possible improvements: either make multiply instruction run four times faster than before, or make memory access instructions run two times faster than before. You repeatedly run a program that takes 100 seconds to execute. Of this time, 20% is used for multiplication, 40% for memory access instructions, and 40% for other tasks. What will the speedup be if you improve only memory access? What will the speedup be if both improvements are made?

科目:計算機結構【資工系碩士班】

共3頁第2頁

#### 3. Instruction Set (16%)

- 3.1 (6%) What is the "addressing mode"? Please explain "displacement addressing" and "PC-relative addressing".
- 3.2 (10%) Memory-memory and load-store are two architectural styles of instruction sets. We can calculate the instruction bytes fetched and the memory data bytes transferred using the following assumptions about the two instruction sets:
  - The opcode is always 1 byte (8 bits)
- All memory address are 2 bytes (16 bits)
- All data operands are 4 bytes (32 bits)
- All instructions are an integral number of bytes in length
- There are no optimizations to reduce memory traffic

For the following C code, write an equivalent assembly language program in each architecture style (assume all variables are initially in memory):

a = b + c; b = a + c; d = a - b;

For each code sequence, calculate the instruction bytes fetched and the memory data bytes transferred (read or written). Which architecture is most efficient as measured by code size? Which architecture is most efficient as measured by total memory bandwidth required (code + data)?

### 4. Pipelining (22%)

- 4.1 (5%) MIPS instructions classically take five steps to execute in pipeline. Please explain the detailed operations of the five-stage pipeline used in MIPS instructions.
- 4.2 (9%) Explain the three different hazards: data hazards, control hazards, and structural hazards. Describe the schemes for resolving these hazards.
- 4.3 (8%) For each pipeline register in Fig. 1, label each portion of the pipeline register with the name of the value that is loaded into the register. Determine the length of each field in bits. For example, the IF/ID pipeline register contains two fields, one of which is an instruction field that is 32 bits wide.

### 5. Memory Hierarchy (24%)

- 5.1 (6%) Please explain what memory hierarchy is and why it is necessary.
- 5.2 (6%) Cache misses can be sorted into three simple categories: Compulsory, Capacity, and Conflict. Please explain why they occur and how to reduce them.
- 5.3 (6%) Consider three machines with different cache configurations:
  - Cache 1: Direct-mapped with one-word blocks
  - Cache 2: Direct-mapped with four-word blocks
  - Cache 3: Two-way set associative with four-word blocks

The following miss rate measurements have been made:

- Cache 1: Instruction miss rate is 4%; data miss rate is 8%
- Cache 2: Instruction miss rate is 2%; data miss rate is 6%
- Cache 3: Instruction miss rate is 2%; data miss rate is 4%

For these machines, one-half of the instructions contain a data reference. Assume that the cache miss penalty is 6 + Block size in words. The CPI for this workload was measured on a machine with cache 1 and was found to be 2.0. Determine which machine spends the most cycles on cache misses.

科目:計算機結構【資工系碩士班】

共马頁第马頁

5.4 (6%) The cycle times for the machines in Problem 5.3 are 2 ns for the first and second machines and 2.5 ns for the third machine. Determine which machine is the fastest and which is the slowest.



Fig. 1

# 科目 作業系統與資料結構【資工系碩士班甲組(含乙組選考)】 #3 頁 第/ 頁

-1-

Consider the following set of processes, with the length of the CPU-burst time given in milliseconds: <u>Process Burst Time</u> <u>Priority</u>

| Process | Burst Time | <u>Priorit</u> |  |
|---------|------------|----------------|--|
| P1      | 10         | 3              |  |
| P2      | 1          | 1              |  |
| P3      | 2          | 3              |  |
| P4      | 1          | 4              |  |
| P5      | 5          | 2              |  |

The processes are assumed to have arrived in the order P1, P2, P3, P4, P5, all at time 0.

- a. (3%) Draw four Gantt charts illustrating the execution of these processes using FCFS, SJF, a nonpreemptive priority (a smaller priority number implies a higher priority), and RR (quantum = 1) scheduling.
- b. (3%) What is the turnaround time of each process for each of the scheduling algorithms in part (a)?
- c. (3%) What is the waiting time of each process for each of the scheduling algorithms in part (a)?
- d. (1%) Which of the schedules in part (a) results in the minimal average waiting time (over all processes)?
- 2. (10%) Consider a system consisting of m resources of the same type, being shared by n processes: Resources can be requested and released by processes only one at a time. Show that the system is deadlock-free if the following two conditions hold:
  - a. The maximum need of each process is between 1 and m resources  $(5 \sqrt[4]{g})$
  - b. The sum of all maximum needs is less than  $m+n-(\mathcal{G}_{\mathcal{P}}/_{\mathcal{P}})$
- 3. (15%) Consider a segmentation-with-paging address translation scheme shown in the figure below. This scheme provides each process with eight segments, each with a maximum size of 1 Mbytes (2<sup>20</sup>bytes). This scheme is to be 64 Mbytes (2<sup>26</sup>bytes). The architecture provides a single base register (i.e., STBR) from which to start the address translation process. In the architecture, pages of size 4 Kbytes (2<sup>12</sup>bytes) are to be used. Note that one word in the memory has four bytes and a logical / physical address specifies one word in the memory. The segment table and page table are not necessarily implemented in main memory, hence are not constrained by word-wide (i.e. 4-byte) access.

Please compute the width (i.e. bit length) of variables s, d, p, f, and q in the figure. For clearness and grading, please highlight or underline your answer of each variable in your answer sheet.

### 科目:作業系統與資料結構【資工系碩士班甲組(含乙組選考)】 #3頁第二頁

- 2 -



- 4. (15%) Assume we have a demand-paged memory. The page table is held in registers. It takes 8 milliseconds to service a page fault if an empty page is available or the replaced page is not modified, and 20 milliseconds if the replaced page is modified. Memory access time is 100 nanoseconds. Assume that the page to be replaced is modified 70 percent of the time. What is the maximum acceptable page-fault rate for an effective access time of no more than 200 nanoseconds?
- 5. (15%) Let T(n) be the running time of Foo(n). Find the order of T. Procedure foo(integer n):

for i from 1 to n do x=nwhile x > 0 do

x=x-i

- 6. (10%) Suppose a special programming language has stack as built-in data type (with built-in functions for stacks) but doesn't have array as its built-in data type. Describe how one can implement arrays using stacks in this language.
- 7. (10%) Suppose the worst case of a sort algorithm is that it needs the maximum number of data exchange. Give the numbers from 1 to 10, what conditions are the worst cases when we use quick sort and heap sort. Also prove your answers.
- 8. (15%) The following is a B-tree of order 3. Please plot the tree again after deleting key "20". How many disk accesses are needed?

科目:作業系統與資料結構【資工系碩士班甲組(含乙組選考)】 共3頁第3頁



- 1. [10%] Consider an  $n \times n$  chessboard for some fixed  $n \in \mathbb{Z}^+$ . For  $1 \le k \le n$ , how many  $k \times k$  squares are contained in this chessboard? How many squares in total?
- 2. [10%] In triangle ABC, the length of side BC is 293. If the length of side AB is a perfect square, the length of side AC is a power of 2, and the length of side AC is twice the length of side AB, determine the perimeter of the triangle.
- 3. [10%] Let  $\Sigma = \{v, w, x, y, z\}$  and  $A = \bigcup_{n=1}^{6} \Sigma^{n}$ . How many strings in A have xy as a proper prefix?
- 4. [10%] Let M = (S, I, O, v, w) be a finite state machine, where  $S = \{s_0, s_1, s_2, s_3, s_4, s_5\}$  is the set of states;  $I = \{0, 1\}$  is the input alphabet for M;  $O = \{0, 1\}$  is the output alphabet for M;  $v: S \times I \rightarrow S$  is the next state function; and  $w: S \times I \rightarrow O$  is the output function defined as follows:

|                       | ν                     |             | w |   |
|-----------------------|-----------------------|-------------|---|---|
|                       | 0                     | 1           | 0 | 1 |
| s <sub>0</sub>        | <i>S</i> 4            | Sı          | 0 | 0 |
| Sì                    | S4                    | <b>s</b> 2  | 0 | 1 |
| <i>s</i> <sub>2</sub> | <b>S</b> 3            | \$5         | 0 | 0 |
| <i>S</i> 3            | <i>s</i> <sub>2</sub> | <b>\$</b> 5 | 1 | 0 |
| \$4                   | S <sub>4</sub>        | S4          | 1 | 1 |
| \$5                   | <i>s</i> <sub>2</sub> | \$3         | 0 | 1 |

Determine the transient states, sink states, and the set of states of each strongly connected submachine with the input alphabet {0, 1}.

- 5. [10%] Let  $A = \{1, 2, 3, 4, 5\} \times \{1, 2, 3, 4, 5\}$ , and define R on A by  $(x_1, y_1)$  R  $(x_2, y_2)$  if  $(x_1+y_1) = (x_2+y_2)$ , where R is an equivalence relation on A. Determine the equivalence class [(2, 4)].
- 6. [10%] Let G = (V, E) be a loop-free connected 4-regular planar graph. If |E| = 16, how many regions are there in a planar depiction of G?

# 科目:離散數學【資工系碩士班甲組(含乙組選考)】

共乙頁第乙頁

- 7. [10%] The function f(x) is the exponential generating function for the sequence  $a_0$ ,  $a_1, a_2, \ldots$ , whereas the function g(x) is the exponential generating function for the sequence  $b_0, b_1, b_2, \ldots$  Express g(x) in terms of f(x) if  $b_1 = 2$ ,  $b_2 = 4$ ,  $b_n = 2a_n$  for  $n \in \mathbb{N}$  and  $n \neq 1, 2$ .
- 8. Solve the following recurrence relations:
  - (a) [5%]  $a_{n+1} a_n = 3n^2 n$ ,  $n \ge 0$ ,  $a_0 = 3$ . (Find  $a_n = ?$ )
  - (b) [5%]  $a_{n+1} = -2a_n 4b_n$ ,  $b_{n+1} = 4a_n + 6b_n$ ,  $n \ge 0$ ,  $a_0 = 1$ ,  $b_0 = 0$ . (Find  $a_n = ?$ ,  $b_n = ?$ )
- 9. [10%] Let G be a group with subgroups H and K. If (the order of G) = 660, (the order of K) = 66, and  $K \subset H \subset G$ , what are the possible values for the order of H?
- 10. [10%] Find [17]<sup>-1</sup> in  $\mathbb{Z}_{1009}$  and [18]<sup>-1</sup> in  $\mathbb{Z}_{1024}$ ?

## 科目:工程數學【資工系碩士班乙組選考】

共3頁第1頁

1. Given the matrix

$$A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \\ 3 & 2 & 1 \end{bmatrix},$$
$$B = \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix},$$

- (a) (5%) Decompose A = LDU such that L is lower-triangular, D is diagonal and U is upper-triangular.
- (b) (5%) Compute  $B^{10}$
- (c) (5%) Compute  $B^{100}$
- 2. A Gaussian random variable has a probability density function given by

$$f(x) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left[-\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2}\right],$$

which is often denoted by  $\mathcal{N}(\mu, \sigma^2)$ .

Suppose we have two random sample generator  $X_1$  and  $X_2$ . We randomly choose one of them and get a sample of x = 3. Suppose that the distribution for samples from  $X_1$  and  $X_2$  are  $\mathcal{N}(1,4)$  and  $\mathcal{N}(4,1)$ . What is the probability that this sample comes from  $X_1$ ,

- (a) (5%) if  $X_1$  and  $X_2$  are equally likely to be chosen?
- (b) (5%) if  $X_1$  is two times more likely than  $X_2$  to be chosen?
- 3. Let X,Y be two independent random variables with the same uniform distribution U[0,1].
  - (a) (5%) What is the probability  $Pr(Y < \sin X)$ ?
  - (b) (5%) What is the probability  $Pr(X^2 + Y^2 < 1)$ ?
- 4. The Levi-Civita symbol is defined by

$$\epsilon_{ijk} = \begin{cases} 1, & \text{if } (i,j,k) \text{ is an even permutation of } (1,2,3), \\ -1, & \text{if } (i,j,k) \text{ is an odd permutation of } (1,2,3), \\ 0, & \text{otherwise.} \end{cases}$$

The Kronecker delta is defined by  $\delta_{ij} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{otherwise.} \end{cases}$ 

(a) (5%) Show that

$$\sum_{i=1}^{3} \epsilon_{ijk} \epsilon_{ilm} = \delta_{jl} \delta_{km} - \delta_{jm} \delta_{kl}.$$

(b) (5%) Show that

$$A \times (B \times C) = B(A \cdot C) - C(A \cdot B),$$

where A, B and C are vectors in 3-D space.

5. Solve the following ordinary differential equations.

(a) (5%) 
$$y' = x\sqrt{1-y^2}$$
,  $y(0) = 1$ 

(b) (5%) 
$$y' = -\frac{x+y}{x+y^2}$$
,  $y(0) = 3$ 

(c) (5%) 
$$y'' - 2y' + 5y = 0$$
,  $y(0) = 1, y'(0) = 1$ 

6. Let f(x) be periodic with one period being

$$f(x) = x^2, \quad -\pi \le x \le \pi.$$

- (a) (5%) What is the Fourier series for f(x)?
- (b) (5%) Show that

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$$

7. The Laplace transform of a function f(t) is defined by

$$F(s) = \int_0^\infty e^{-st} f(t) dt.$$

Find the Laplace transform of the following functions

(a) (5%) 
$$f(t) = \begin{cases} -2 & 0 \le t < 3 \\ 2 & 3 \le t < 6 \\ 1 & 6 \le t < \infty \end{cases}$$

# 科目:工程數學【資工系碩士班乙組選考】

共3頁第3頁

(b) 
$$(5\%)$$
  $f(t) = t \sin 2t$ 

- 8. Compute the following contour integrals, where C is the unit circle of the complex z-plane and is traversed counter-clockwise
  - (a) (5%)

$$\oint_C z^{-3} \cos z dz$$

(b) (5%)

$$\oint_C \frac{(z+1)(z+2)}{z(z-0.5)} dz$$

9. A walk in a graph G of length n from node i to node j is an sequence of nodes  $(v_0 = i, v_2, \ldots, v_n = j)$ , where there is an edge between adjacent nodes in the sequence. For example, (1, 2, 5) is a walk of length 2 from node 1 to node 5 in the given graph. The adjacency matrix A of a graph of N nodes is the  $N \times N$  matrix where

$$A_{ij} = \begin{cases} 1, & \text{if node } i \text{ and node } j \text{ is connected by an edge} \\ 0, & \text{otherwise} \end{cases}$$



For this graph,

- (a) (5%) What is the adjacency matrix?
- (b) (5%) What is the number of walks of length 4 between node 1 and node 5?

### 科目:電子學【資工系碩士班乙組選考】

共2頁第 | 頁

- 1. (40 %) The relative areas of the transistors in the following amplifier circuit are all unity except for Q7 which has a relative area of 4. Assume that  $V_T \cong 25mV$ ,  $\beta = 99$  and  $V_{BEQ} \cong 0.7V$  for all transistors.
  - (a) Calculate the voltage gain from Q5 to the output (Vo/Vx). (10%)
  - (b) Calculate the voltage ratio Vx/Vin. (10%)
  - (c) Discuss the purpose of the use of R3 and R4. (5 %)
  - (d) In additional to achieve larger voltage gain, discuss in details why this amplifier circuit uses two differential amplifying stages (5%)
  - (e) Draw a summing amplifier circuit that can compute the sum (v1+v2) of two input voltage values (v1 and v2). You should design this summing amplifier based on the circuit shown below by adding some extra resistors. (10%)



- 2. (25 %) For the following circuit, we assume  $r_o = \infty$ ,  $\frac{1}{2}k_n \frac{w}{L} = 2mA/V^2$ ,  $V_t = 1V$  and  $C_{gs} = C_{gd} = 2pF$ .
  - (a) Determine the midband voltage gain. (8%)
  - (b) Select the appropriate values for C1, C2, and CS such that the low-frequency response will be dominated by a pole at 100 Hz and the nearest pole or zero will be at least a decade away. (10 %)
  - (c) Find out the upper 3-dB frequency (7 %).



## 科目:電子學【資工系碩士班乙組選考】

共2百第2百

- 3. (35 %) Answer the following questions regarding the digital logic circuit:
  - (a) Discuss how to choose suitable transistor parameters such that there will be no static current flowing from the P-MOS directly to N-MOS for CMOS inverter in Fig. 3(a). (7%)
  - (b) Assume  $\mu_n C_{ox} = 2\mu_p C_{ox} = 20\mu A/V^2$ ,  $V_{tn} = \left|V_{tp}\right| = 2V$ ,  $(W/L)_n = 10$ , and  $(W/L)_p = 20$ , find the maximum current that this CMOS inverter can sink while  $V_{in} = 10V$  and  $V_o$  remains  $\leq 0.5V$ . (5%)
  - (c) Suppose the CMOS inverter is switched at a frequency of 2 MHz, and loaded by a 10-pf capacitance. Calculate the dynamic power dissipated. (4%)
  - (d) Draw a circuit diagram of a CMOS gate that realizes the logic function  $Y = \overline{A(B + CD)}$ . You should also specify the suitable transistor size (W/L) for all the transistors in order to achieve symmetrical pull-up and pull-down current-driving capability. (Assume  $\mu_n C_{ox} = 2\mu_p C_{ox}$ ) (12%)
  - (e) Find out the logic function of Y for the circuit shown in Fig. 3(b). (7%)



Fig. 3(a)

Fig. 3(b)