## UW Userid

## ECE 327 Final, 2016t1 (Winter)

## Instructions and General Information

- 100 marks total
- Time limit: 2.5 hours (150 minutes)
- No books, no notes, no computers. Calculators are allowed.
- If you need extra paper, request some from a proctor.
- Write neatly.
- To earn part marks, you must show the formulas you use and all of your work.
- The proctors and instructors will not answer questions, except in cases where an error on the exam is suspected. If you are confused about a question, write down your assumptions or interpretation.
- Justifications of answers will be marked according to correctness, clarity, and concision.

| Question | Total Marks | Approx. Time |
| :--- | :--- | :--- |
| Q0 | 1 | 0 min |
| Q1 | 20 | 25 min |
| Q2 | 25 | 20 min |
| Q3 | 12 | 15 min |
| Q4 | 8 | 15 min |
| Q5 | 15 | 20 min |
| Q6 | 20 | 25 min |

## Potentially Useful Information

$$
\begin{aligned}
& P=\frac{1}{2}\left(A \times C \times V^{2} \times F\right)+\left(\tau \times A \times V \times \mathrm{ISh} \times F\right)+\left(V \times \mathrm{IL}\right) \\
& T=\frac{\mathrm{Ins} \times C}{F} \\
& F \propto \frac{(V-\mathrm{Vt})^{2}}{V} \\
& P=V \times I \\
& P=\frac{W}{T} \\
& \mathrm{IL} \propto e^{\frac{-q \times \mathrm{Vt}}{k \times T}} \\
& S=\frac{\mathrm{T1}}{\mathrm{T2}} \\
& M=\frac{F / 10^{6}}{\sum_{i=0}^{n} \mathrm{PI}_{i} \times C_{i}} \\
& A^{\prime}=(1-E(1-\mathrm{Pb})) A \\
& q=1.60218 \times 10^{-19} \mathrm{~C} \\
& k=1.38066 \times 10^{-23} \mathrm{~J} / \mathrm{K} \\
& \log _{x} y=\frac{\log y}{\log x} \\
& \left(x^{y}\right)^{z}=x^{(y z)} \\
& \left(x^{y}\right)\left(x^{z}\right)=x^{(y+z)} \\
& a=b^{c} \text { is equivalent to } a^{1 / c}=b
\end{aligned}
$$
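
As a quick check of the last four identities, here is a small worked instance. The numbers are chosen only for illustration and do not come from any question on the exam.

$$
\begin{aligned}
& \log _{2} 8=\frac{\log 8}{\log 2}=3 \\
& \left(2^{3}\right)^{2}=2^{(3 \times 2)}=64 \\
& \left(2^{3}\right)\left(2^{2}\right)=2^{(3+2)}=32 \\
& 8=2^{3} \text { is equivalent to } 8^{1 / 3}=2
\end{aligned}
$$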


## Q0 (1 Mark) !!Almost Free!!

(estimated time: 0 minutes)

Ten years from now, what, if anything, will you remember about this course, other than TimBits?
$\qquad$

## Q1 (20 Marks) DFD

(estimated time: 25 minutes)
Your task is to design a dataflow diagram for the expression: $a+b \times(c+d)+c \times(e+f)$.

## NOTES:

1. Outputs shall be registered
2. Optimization goals, in order of decreasing importance:
(a) Maximize throughput
(b) Minimize number of multipliers
(c) Minimize number of adders
(d) Minimize number of registers
(e) Minimize clock period
(f) Minimize latency
(g) Minimize number of inputs
3. Description of the multiplier:

- Latency=2.
- Throughput=0.5.
- Combinational inputs and registered outputs. An internal register is used for the output and to store the intermediate value between the two clock cycles. This internal register does not count toward the registers used in the design.
- You shall draw a multiplier as shown below.


4. You may schedule the input values to arrive in any order, but you may read each input value only once.
5. You do not need to do any allocation.
6. The only algebraic optimizations you may use are commutativity and associativity.

## Next page is for DFD and analysis

$\qquad$

## Analysis:

|  |  |
| :--- | :--- |
| Throughput: | $\square$ |
| Latency: | $\square$ |
| Clock period: |  |
|  |  |


| Number of mults: |  |
| :--- | :--- |
| Number of adds: |  |
| Number of regs: |  |
| Number of inputs: |  |
|  |  |

## Q2 (25 Marks) The New, the Old, and the Midterm Leftovers

(estimated time: 20 minutes)
Of the four fragments of VHDL code below, one is synthesizable and good; each of the other three is illegal, unsynthesizable, or synthesizable but bad coding practice.
For each of the code fragments Q2a-Q2d:

1. Answer whether the code is legal.
2. If the code is illegal: explain why, and proceed to the next code fragment.
3. Answer whether the code is synthesizable.
4. If the code is unsynthesizable: explain why, and proceed to the next code fragment.
5. Answer whether the code adheres to good coding practices, according to the guidelines for ECE 327.
6. If the code does not follow good coding practices: explain why.
7. For the one fragment that is synthesizable and good practice, in Q2e, you will calculate the minimum number of FPGA cells needed to implement the circuit.

## NOTES:

1. The signal declarations are:
```
signal clk : std_logic;
signal st : std_logic_vector( 2 downto 0 ); -- one-hot state
signal a, b, c, d, y : unsigned( 15 downto 0 );
signal z : unsigned( 7 downto 0 );
```

2. When calculating the number of FPGA cells for the good synthesizable fragment:

- Optimizations are allowed, so long as the externally visible input-to-output behaviour of the system does not change.
- For full marks, you must justify your answer with a drawing and/or text.

## Q2a

```
y <= a + b + c when st(0) = '1'
    else a + c + d when st(1) = '1'
    else b + c + d when st(2) = '1'
    else (others => '0');
process begin
    wait until rising_edge(clk);
    z <= y( 7 downto 0 );
end process;
```

Explanation if illegal, unsynthesizable, or bad practice:

This problem continues on the next page

## Q2b

```
y <= a + b + c when st(0) = '1'
    else a + c + d when st(1) = '1'
    else b + c + d when st(2) = '1';
process begin
    wait until rising_edge(clk);
    z <= y( 7 downto 0 );
end process;
```

Explanation if illegal, unsynthesizable, or bad practice:

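|  | Yes | No |
| :--- | :--- | :--- |
| Legal | $\square$ | $\square$ |
| Synthesizable | $\square$ | $\square$ |
| Good Practice | $\square$ | $\square$ |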

$\qquad$
$\qquad$

## Q2c

```
process begin
    wait until rising_edge(clk);
    y <= a + b + c when st(0) = '1'
        else a + c + d when st(1) = '1'
        else b + c + d when st(2) = '1';
end process;
z <= y( 7 downto 0 );
```

Explanation if illegal, unsynthesizable, or bad practice:
$\qquad$
$\qquad$

## Q2d

```
process begin
    wait until rising_edge(clk);
    case st is
        when "001" => y <= a + b + c;
        when "010" => y <= a + c + d;
        when "100" => y <= b + c + d;
    end case;
end process;
z <= y( 7 downto 0 );
```

Explanation if illegal, unsynthesizable, or bad practice:
$\qquad$
$\qquad$


## Q2e (10 Marks) Area Analysis for the Synthesizable and Good Code Fragment

The synthesizable and good fragment is: $\square$ (Q2a, Q2b, Q2c, or Q2d).

Minimum number of FPGA cells: $\square$
$\qquad$
$\qquad$

## Q3 (12 Marks) Latch Design

(estimated time: 15 minutes)
For each of the circuits below, answer whether it is a correct latch. If it is a correct latch, analyze the timing parameters. If it is not a correct latch, explain how the behaviour is incorrect or how to fix the design.

## NOTES:

1. All gates have a delay of 1.
2. There are extra copies of the circuits on the next page.


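1

|  | Yes | No |
| :--- | :--- | :--- |
| Correct | $\square$ | $\square$ |

| If correct, timing parameters |  |
| :--- | :--- |
| Active Hi/Lo | $\square$ |
| Clock-to-Q | $\square$ |
| Setup | $\square$ |
| Hold | $\square$ |

If incorrect, explanation:

2

|  | Yes | No |
| :--- | :--- | :--- |
| Correct | $\square$ | $\square$ |

| If correct, timing parameters |  |
| :--- | :--- |
| Active Hi/Lo | $\square$ |
| Clock-to-Q | $\square$ |
| Setup | $\square$ |
| Hold | $\square$ |

If incorrect, explanation:

3

|  | Yes | No |
| :--- | :--- | :--- |
| Correct | $\square$ | $\square$ |

| If correct, timing parameters |  |
| :--- | :--- |
| Active Hi/Lo | $\square$ |
| Clock-to-Q | $\square$ |
| Setup | $\square$ |
| Hold | $\square$ |

If incorrect, explanation: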


## Extra copies of latch circuits


$\qquad$

## Q4 (8 Marks) Latch Usage

(estimated time: 15 minutes)
Some high-speed pipelines in ASICs use latches rather than flip-flops. Latches must be used in an alternating pattern of active-high and active-low, as shown below.


[Figure: pipeline of alternating latches, with a legend for the active-high latch symbol (input d, output q).]


## Q4a (5 Marks) Behaviour

Draw the execution-trace/waveform for the signals b, c, and d.

## NOTES:

1. Use zero-delay simulation semantics.


## Q4b (3 Marks) Timing Parameters

Answer which one of the timing parameters below causes much more difficulty in latch-based pipelines than in flop-based pipelines. For full marks, you must justify your answer.

|  | Much more difficult |
| :--- | :--- |
| Setup | $\square$ |
| Hold | $\square$ |
| Clock skew | $\square$ |
| Clock jitter | $\square$ |

## Q5 (15 Marks) Elmore

(estimated time: 20 minutes)
In this question, you will analyze the circuits below (A, B, C, and D) with respect to the Elmore delay from G0 to G2.
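
For reference, here is a sketch of the standard Elmore delay calculation. It assumes a simple unbranched RC chain in which each resistor $R_{j}$ is followed by a capacitor $C_{j}$ to ground. The actual topologies of circuits A to D are not reproduced here, so the chain and the index $n$ are illustrative assumptions rather than a description of any of the four circuits.

$$
\tau=\sum_{i=1}^{n} C_{i} \sum_{j=1}^{i} R_{j}
$$

For example, with $n=2$ this gives $\tau=R_{1} C_{1}+\left(R_{1}+R_{2}\right) C_{2}$. In a branching circuit, each capacitor is instead multiplied by the resistance that the path from the source to that capacitor shares with the path from the source to the destination.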

[Figure: RC circuits A, B, C, and D (not reproduced)]


## Q5a (8 Marks) Ranking Circuits

Rank the circuits (A, B, C, and D) in terms of the delay from G0 to G2, from smallest delay (fastest) to largest delay (slowest).

## NOTES:

1. If multiple circuits have the same delay, write the identifiers for the circuits (A, B, C, or D) on the same line in the ranking.
2. Each resistor $\mathrm{R}_{i}$ and capacitor $\mathrm{C}_{i}$ has the same value in each circuit.
For example, the $\mathrm{R}_{1}$ resistors have the same value in each circuit, and the $\mathrm{R}_{2}$ resistors have the same value in each circuit; but the value of $\mathrm{R}_{1}$ might be different from the value of $\mathrm{R}_{2}$.

| Rank | Circuit |
| :--- | :--- |
| 1 (Fastest) | $\qquad$ |
| 2 | $\qquad$ |
| 3 | $\qquad$ |
| 4 (Slowest) | $\qquad$ |

$\qquad$
$\qquad$

## Q5b (7 Marks) Resistance and Capacitance

You are given one each of a $10 \mathrm{k} \Omega, 20 \mathrm{k} \Omega, 30 \mathrm{k} \Omega$, and $40 \mathrm{k} \Omega$ resistor and one each of a $10 \mathrm{pF}, 20 \mathrm{pF}, 30 \mathrm{pF}$, and 40 pF capacitor. Your task is to choose the location $\left(\mathrm{R}_{1} \ldots \mathrm{R}_{4}\right)$ for each resistor and ( $\mathrm{C}_{1} \ldots \mathrm{C}_{4}$ ) for each capacitor so that you minimize the delay from G0 to G2 for your fastest circuit.
On the copy of your fastest circuit below, label each resistor and capacitor with the value (e.g., $10 \mathrm{k} \Omega$ ) that would minimize the delay.

## NOTES:

1. Annotate only your fastest circuit. Leave the other three circuits blank. If multiple circuits are equally fast, arbitrarily choose one of the fastest to annotate.
2. Use each value (e.g., $10 \mathrm{k} \Omega$ ) exactly once.
3. If multiple locations of a resistor or capacitor will have an equal effect on the delay, list those equivalent locations below.

[Figure: blank copies of circuits A, B, C, and D for annotation (not reproduced)]


Equivalent locations: R1, R3 (Example)
Equivalent locations: $\qquad$
Equivalent locations: $\qquad$

Equivalent locations: $\qquad$
Equivalent locations: $\qquad$
Equivalent locations: $\qquad$

## Q6 (20 Marks) Power and Performance

(estimated time: 25 minutes)
You have been promoted to be the technical leader of the F4 group for the next-generation Waterluvian filter. The F4 module is the most important module in the Waterluvian filter and its design is guarded with the highest levels of security.
Each of your engineers (Aarti, Bob, and Clio) has drawn a dataflow diagram. It is your task to evaluate their dataflow diagrams in terms of MPPS/Watt, where MPPS = mega-pixels per second. Using Aarti's design as a baseline, estimate the relative MPPS/Watt of Bob's and Clio's designs (e.g., "Bob's MPPS/Watt will be $e^{2 \pi}$ times the MPPS/Watt of Aarti's"). For full marks, you must justify your answer.


Aarti's original (A)


Bob's option (B)


Clio's option (C)

## NOTES:

1. All designs will be run with the same supply voltage.
2. Each design will be run at its maximum clock speed.
$\qquad$
$\qquad$
$\qquad$
$\qquad$
$\qquad$
$\qquad$
$\qquad$
$\qquad$
$\qquad$
$\qquad$
|  |  |
| :--- | :--- |
| Relative MPPS/Watt of Bob's design compared to Aarti's: | $\square$ |
| Relative MPPS/Watt of Clio's design compared to Aarti's: | $\square$ |
