# Power Dissipation Analysis and Optimization of Deep Submicron CMOS Digital Circuits

Richard X. Gu and Mohamed I. Elmasry, Fellow, IEEE

Abstract: This paper introduces a simple analytical model for estimating standby and switching power dissipation in deep submicron CMOS digital circuits. The model is based on the Berkeley Short-Channel IGFET model and fits HSPICE simulation results well. Static and dynamic power analysis for various threshold voltages is addressed. A design methodology to minimize the power-delay product by selecting the lower and upper bounds of the supply and threshold voltages is presented. The effects of the supply voltage, the threshold voltage, and η, which reflects the drain-induced barrier lowering, are also addressed.

## I. Introduction

VLSI fabrication technology for deep submicron CMOS devices has made great strides over the last several years [1], [2], [3]. Room temperature 0.1 $\mu$m CMOS technology on bulk silicon with an 11.8 ps gate delay was reported in [4]. The performance improvements at room temperature due to device miniaturization provide very high speed logic operation. It is predicted that room temperature deep submicron CMOS technology will be favored over bipolar ECL in high speed computing [5].

As VLSI technology is miniaturized further, the supply voltage must be scaled down to avoid hot-carrier effects in CMOS logic circuits. The speed of the circuits decreases if the ratio $V_{dd}/V_{th}$ falls below five because the current driving capability decreases. In order to maintain and increase the speed of CMOS circuits, the threshold voltage is scaled down. However, threshold voltage scaling causes an exponential increase in the standby current. As a result, treating the dynamic power of CMOS circuits as the only dominant power component is no longer valid. A design methodology that optimizes speed and low power by choice of supply and threshold voltages was reported in [6].
However, due to the existence of standby currents in deep submicron CMOS devices, a detailed analysis and understanding of the standby current and its related power consumption in digital circuits is very important. In this paper, a simplified analytical model for power analysis of deep submicron CMOS circuits is introduced. The model is based on power calculations of both the switching power and the standby power. Section II presents the DC standby current model for deep submicron CMOS circuits. The total power consumption of some conventional CMOS gates using this model is described in Section III. Section IV deals with a design methodology to minimize the power-delay product by selecting the appropriate supply and threshold voltages.

Manuscript received January 19, 1995; revised September 15, 1995. This work is supported in part by NSERC.

## II. Standby Current of Deep Submicron Digital CMOS Circuits

When deep submicron MOS transistors operate in the subthreshold region, the standby drain current is exponentially dependent on the gate-source voltage. Therefore, in CMOS logic circuits, a DC leakage current still exists even when $V_{gs} = 0$. Most CMOS logic circuits are composed of series-parallel combination networks of MOS transistors. The DC standby current of parallel connected MOS transistors is the sum of the currents of each transistor. Thus, the analysis of the standby current of stacked MOS transistors with $V_{gs} = 0$ is essential for estimating the DC power of deep submicron CMOS circuits. In the following, a model of the standby current of stacked MOS circuits as shown in Fig. 1 is introduced. We only calculate the DC current of stacked NMOS transistors; the same method applies to stacked PMOS transistors. The Berkeley Short-Channel IGFET model (BSIM) [7] is used for model calculation.
The threshold voltage is expressed as

$$V_{th} = V_{FB} + \phi_s + k_1 \sqrt{\phi_s} - k_2 \phi_s - \eta V_{dd} \tag{1}$$

where $V_{FB}$ is the flatband voltage, $\phi_s$ is twice the Fermi potential, the $k_1$ and $k_2$ terms represent the nonuniform doping effect, and $\eta$ models the drain-induced barrier lowering (DIBL) effect, which leads to an undesirable punchthrough current flowing between the source and drain below the surface of the channel. The drain current in the subthreshold region depends exponentially on $V_{th}$ and $V_{dd}$ and is expressed as

$$I_s = \frac{I_{\text{sub}}I_{\text{limit}}}{I_{\text{sub}} + I_{\text{limit}}} \tag{2}$$

where

$$I_{\text{sub}} = \mu_0 C_{\text{ox}} \frac{W_{\text{eff}}}{L_{\text{eff}}} V_T^2 e^{1.8} \exp[(V_{gs} - V_{th})/nV_T] \cdot (1 - \exp(-V_{ds}/V_T)) \tag{3}$$

$$I_{\text{limit}} = 4.5\mu_0 C_{\text{ox}} \frac{W_{\text{eff}}}{L_{\text{eff}}} V_T^2. \tag{4}$$

Fig. 1. (a) One NMOS, (b) two stacked NMOS transistors, and (c) three stacked NMOS transistors with $V_{gs}=0$.

R. X. Gu was with the Department of Electrical & Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1. He is now with the DSP R&D Center, Texas Instruments Incorporated, Dallas. M. I. Elmasry is with the Department of Electrical & Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1. Publisher Item Identifier S 0018-9200(96)03403-8.

NMOS transistors with $V_{gs}=0$ operate in the weak-inversion region. Under this condition $I_{\text{limit}}$ is much larger than $I_{\text{sub}}$, so (2) reduces to $I_s \approx I_{\text{sub}}$ and the weak inversion current is given by

$$I_s = I_o \exp((V_{gs} - V_{th})/nV_T) (1 - \exp(-V_{ds}/V_T)) \tag{5}$$

where $V_T = kT/q$ is the thermal voltage and $I_o = \mu_0 C_{\rm ox}(W_{\rm eff}/L_{\rm eff}) V_T^2 e^{1.8}$. This equation shows that the drain current is almost independent of the drain-source voltage $V_{ds}$ when the ratio $V_{ds}/V_T$ is larger than 2.
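The behavior of (2) to (5) can be checked numerically. The following is a minimal sketch, not the authors' implementation: the lumped factor `beta` (standing for $\mu_0 C_{\rm ox} W_{\rm eff}/L_{\rm eff}$) and the values $V_{th} = 0.3$ V and $n = 1.45$ are illustrative assumptions, not parameters taken from the paper.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_k=298.15):
    """Thermal voltage V_T = kT/q in volts."""
    return K_OVER_Q * temp_k


def subthreshold_current(vgs, vds, vth, beta, n=1.45, temp_k=298.15):
    """Drain current I_s from (2)-(4).

    beta stands for mu_0 * C_ox * W_eff / L_eff (A/V^2). It is a
    hypothetical placeholder, not a value from the paper.
    """
    vt = thermal_voltage(temp_k)
    i_sub = (beta * vt**2 * math.exp(1.8)
             * math.exp((vgs - vth) / (n * vt))
             * (1.0 - math.exp(-vds / vt)))          # equation (3)
    i_limit = 4.5 * beta * vt**2                     # equation (4)
    return i_sub * i_limit / (i_sub + i_limit)       # equation (2)


if __name__ == "__main__":
    beta = 1e-4  # assumed value, for illustration only
    for vds in (0.01, 0.05, 0.1, 0.5, 1.0):
        i_s = subthreshold_current(0.0, vds, 0.3, beta)
        print(f"V_ds = {vds:4.2f} V   I_s = {i_s:.3e} A")
```

With these assumed values the printed current changes by only a few percent once $V_{ds}$ exceeds a few multiples of $V_T$, and falls off sharply when $V_{ds}$ is comparable to $V_T$.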
This is true only when $V_{gs} = 0$ for at most two NMOS transistors and $V_{gs} = V_{dd}$ for the other transistors in a stacked network (see Cases 1 and 2 below). However, when $V_{gs} = 0$ for more than two NMOS transistors (see Case 3), $V_{ds}/V_T$ of at least the lowest transistor drops below one. The standby current is then a function of $V_{ds}$. In the following, the weak inversion currents of three frequently used patterns are derived: one transistor, two stacked transistors, and three stacked transistors with $V_{gs} = 0$. The case of three stacked transistors with $V_{gs}$ of two transistors equal to zero is the same as Case 2.

### A. Case 1: One Transistor

In Fig. 1(a), the current $I_{s1}$ passing through NMOS N1 is given by

$$I_{s1} = I_o \exp(-V_{th}/nV_T) = I_o \exp[-(k_1\sqrt{\phi_s} - k_2\phi_s - \eta V_{dd})/nV_T]. \tag{6}$$

### B. Case 2: Two Stacked Transistors

In Fig. 1(b), the current $I_{s2}$ passing through NMOS transistors N1 and N2 is given by

$$I_{s2} = I_o \exp[-(k_1 \sqrt{\phi_s} - k_2 \phi_s - \eta V_{ds2})/nV_T] \tag{7}$$

$$= I_o \exp[(-V_{ds2} - (k_1 \sqrt{\phi_s + V_{ds2}} - k_2(\phi_s + V_{ds2}) - \eta V_{dd}))/nV_T]. \tag{8}$$

Solving the above equations with $\sqrt{1+x} \approx 1 + 0.5x$ $(x < 1)$, $V_{ds2}$ is given by

$$V_{ds2} = \frac{\eta V_{dd}}{C} \tag{9}$$

where $C = 1 + \eta + k_1/(2\sqrt{\phi_s}) - k_2$.

### C. Case 3: Three Stacked Transistors

In Fig. 1(c), the current $I_{s3}$ passing through NMOS transistors N1, N2, and N3 is given by

$$I_{s3} = I_o \exp[-(k_1\sqrt{\phi_s} - k_2\phi_s - \eta V_{ds3})/nV_T] \cdot (1 - \exp(-V_{ds3}/V_T)) \tag{10}$$

$$= I_o \exp[(-V_{ds3} - (k_1\sqrt{\phi_s + V_{ds3}} - k_2(\phi_s + V_{ds3}) - \eta V_{ds2}))/nV_T] \tag{11}$$

$$= I_o \exp[(-V_{ds2} - V_{ds3} - (k_1\sqrt{\phi_s + V_{ds2} + V_{ds3}} - k_2(\phi_s + V_{ds2} + V_{ds3}) - \eta V_{dd}))/nV_T]. \tag{12}$$

Solving (11) and (12), $V_{ds2}$ is given by

$$V_{ds2} = \frac{\eta V_{dd}}{C} \tag{13}$$

where, as in (9), $C = 1 + \eta + k_1/(2\sqrt{\phi_s}) - k_2$. HSPICE simulations show that $V_{ds3}$ is smaller than $V_T$. Thus, the exponential term in (10) can be approximated as $\exp(-V_{ds3}/V_T) \approx 1 - V_{ds3}/V_T$ for $V_{ds3}/V_T < 1$. Substituting this approximation into (10) and solving (10) and (11) results in the following:

$$\ln\left(\frac{V_{ds3}}{V_{T}}\right) + \frac{C}{n} \frac{V_{ds3}}{V_{T}} = \frac{\eta^{2} V_{dd}}{C n V_{T}}. \tag{14}$$

For a typical deep submicron CMOS technology, $k_1 = 0.6$ to 1 V$^{0.5}$, $k_2 = 0$ to 0.1, $\eta = 0.04$ to 1, $n = 1.4$ to 1.5, and $\phi_s = 0.9$ V. If we choose $k_1 = 0.8$ V$^{0.5}$, $k_2 = 0.1$, $\eta = 0.08$, $n = 1.45$, and $\phi_s = 0.9$ V, then according to (13) we get $C \approx 1.4$. Thus, $C$ and $n$ are very close to each other. Assuming $V_{dd} = 0.9$ to 1.5 V, the $\eta^2 V_{dd}/CnV_T$ term is of the order of 0.1 and is therefore neglected. Then (14) becomes

$$\ln\left(\frac{V_{ds3}}{V_T}\right) + \frac{V_{ds3}}{V_T} = 0. \tag{15}$$

$V_{ds3}$ is thus given by

$$V_{ds3} \approx 0.6V_T \approx 16\ \text{mV}. \tag{16}$$

The above equation shows that $V_{ds3}$ is independent of $V_{dd}$. Equation (1) can be rewritten as

$$V_{th} = V_{th0} - \eta V_{dd}. \tag{17}$$

According to the above mathematical manipulation, the following results are obtained:

$$I_{s1} = I_o \exp(CV_{ds3}/nV_T) \exp(-V_{th0}/nV_T) \exp(\eta V_{dd}/nV_T) = 1.8 I_o \exp(-V_{th0}/nV_T) \exp(\eta V_{dd}/nV_T) \tag{18}$$

$$I_{s2} = 1.8 I_o \exp(-V_{th0}/nV_T) \tag{19}$$

$$I_{s3} = I_o \exp(-V_{th0}/nV_T). \tag{20}$$

Thus

$$I_{s1}:I_{s2}:I_{s3} = 1.8 \exp(\eta V_{dd}/nV_T):1.8:1. \tag{21}$$

The above equation shows that the drain-induced barrier lowering effect plays an important role in the DC leakage current. Reducing the DIBL effect is the key issue for low power applications. Reducing the oxide thickness and junction depth reduces the DIBL effect.
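The numerical steps behind (13) to (21) can be reproduced with a short script. This is a sketch under stated assumptions: the bisection solver, the function names, and the choices $V_{dd} = 1$ V and $V_T = 25.7$ mV (room temperature) are illustrative, while $k_1$, $k_2$, $\eta$, $n$, and $\phi_s$ are the example values quoted in the text.

```python
import math


def solve_vds3_ratio(tol=1e-12):
    """Root of ln(x) + x = 0, i.e. (15) with x = V_ds3 / V_T, by bisection."""
    lo, hi = 0.1, 1.0  # ln(x) + x is negative at 0.1 and positive at 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.log(mid) + mid > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def stack_constant(k1, k2, eta, phi_s):
    """C = 1 + eta + k1 / (2 sqrt(phi_s)) - k2, as defined after (9)."""
    return 1.0 + eta + k1 / (2.0 * math.sqrt(phi_s)) - k2


def leakage_ratios(eta, vdd, n, vt):
    """I_s1 : I_s2 : I_s3 from (21), normalised so that I_s3 = 1."""
    return 1.8 * math.exp(eta * vdd / (n * vt)), 1.8, 1.0


if __name__ == "__main__":
    k1, k2, eta, n, phi_s = 0.8, 0.1, 0.08, 1.45, 0.9  # values from the text
    vdd, vt = 1.0, 0.0257                              # assumed for this sketch
    c = stack_constant(k1, k2, eta, phi_s)
    x = solve_vds3_ratio()
    print(f"C           = {c:.3f}")
    print(f"V_ds2       = {1e3 * eta * vdd / c:.1f} mV   (equation (13))")
    print(f"V_ds3 / V_T = {x:.3f}")
    print("I_s1 : I_s2 : I_s3 = %.1f : %.1f : %.1f" % leakage_ratios(eta, vdd, n, vt))
```

The root of (15) is about 0.567, which the text rounds to 0.6. With the rounded value, $\exp(CV_{ds3}/nV_T)$ is about 1.8, the prefactor that appears in (18) and (19).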
The DC standby current of an MOS network can be expressed as a function of the standby current of a single MOS transistor. If the number of stacked MOS transistors is more than three, the standby current is very small and can be neglected.

## III. Power Analysis of Deep Submicron CMOS Gates

In digital CMOS circuits, the total power is given by

$$P_{t} = a_{t}(C_{L}V_{dd}^{2}f) + I_{s}V_{dd}. \tag{22}$$

The first term is the switching component of the power, where $a_t$ is the activity factor, i.e., the probability that a power-consuming transition occurs, $C_L$ is the loading capacitance, and $f$ is the clock frequency. The second term is the power caused by the leakage current. In this section, the power dissipation of some conventional CMOS gates is analyzed. In the following, we assume that NMOS and PMOS transistors have the same standby current in an MOS logic network.

Fig. 2. (a) Inverter, (b) NAND gate, (c) NOR gate, (d) six-transistor XOR, and (e) eight-transistor XOR.

### A. CMOS Inverter

The standby current $I_s$ of the CMOS inverter [Fig. 2(a)] is given by Case 1:

$$I_{s} = I_{s1}. \tag{23}$$

The average DC power is

$$P_{s} = I_{s1}V_{dd}. \tag{24}$$

If the fanout number is $m$, $C_g$ is the total parasitic capacitance of a loaded gate, and $C_d$ is the drain-substrate capacitance, the average dynamic power is

$$P_{d} = a_{t}(2C_{d} + mC_{g})V_{dd}^{2}f. \tag{25}$$

### B. Two-Input CMOS NAND Gates

The standby current $I_s$ of the two-input CMOS NAND gate [Fig. 2(b)] depends on the input state:

$$A = 0,\ B = 0,\quad I_s = I_{s2} \tag{26}$$

$$A = 1,\ B = 0,\quad I_s = I_{s1} \tag{27}$$

$$A = 0,\ B = 1,\quad I_s = I_{s1} \tag{28}$$

$$A = 1,\ B = 1,\quad I_s = 2I_{s1}. \tag{29}$$

If all four states are equally probable, the average DC power is

$$P_{s} = 1.04 I_{s1} V_{dd}. \tag{30}$$

The average dynamic power is

$$P_d = 0.25a_t(3C_d + mC_g)V_{dd}^2 f. \tag{31}$$

The average power dissipation of a two-input NOR gate is equal to that of a two-input NAND gate.

### C. Three-Input CMOS NAND Gates

The power calculation of the three-input CMOS NAND gates [Fig. 2(c)] is similar to that of the two-input CMOS NAND gates. The three-input CMOS NAND gates have eight logic states: one with $I_{s3}$, three with $I_{s2}$, three with $I_{s1}$, and one with $3I_{s1}$. Thus, the average DC power is expressed as

$$P_s = 0.81 I_{s1} V_{dd}. \tag{32}$$

The average dynamic power is given by

$$P_d = 0.125a_t(4C_d + mC_g)V_{dd}^2f. \tag{33}$$

The average power dissipation of a three-input NOR gate is equal to that of a three-input NAND gate.

### D. Six-Transistor CMOS XOR Gates

The standby currents of six-transistor CMOS XOR gates [Fig. 2(d)] have only two values: $I_{s1}$ and $4I_{s1}$. The average DC power is given by

$$P_{s} = 2.5I_{s1}V_{dd}. \tag{34}$$

The average dynamic power is expressed as

$$P_d = 0.25a_t(4C_d + 2mC_g)V_{dd}^2f. \tag{35}$$

Fig. 3. DC and switching power of two-input CMOS gates.

Fig. 4. DC and switching power of CMOS inverters.

Fig. 5. DC and switching power of three-input CMOS gates.

Fig. 6. DC and switching power of six- and eight-transistor CMOS XOR.

### E. Eight-Transistor CMOS XOR Gates

The standby current $I_s$ of eight-transistor CMOS XOR gates [Fig. 2(e)] is $I_s = 4I_{s1}$ for all combinations of the inputs. The average DC power is expressed as

$$P_s = 4I_{s1}V_{dd}. \tag{36}$$

The average dynamic power is given by

$$P_d = 0.25a_t(4C_d + 2mC_g)V_{dd}^2f. \tag{37}$$

We choose $C_g=26$ fF, $C_d=3.5$ fF, $m=4$, and $a_t=30\%$ for a 0.15 $\mu$m CMOS technology. Fig. 3 shows the DC and dynamic power dissipation of a two-input gate versus the supply voltage, calculated with both the model and the HSPICE simulator. The model fits HSPICE simulation results well. Therefore, we may use the above models as a presimulator to estimate power dissipation. Examples of the power dissipation of the inverters, three-input gates, and XOR gates are shown in Figs. 4, 5, and 6, respectively.
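The state-averaged leakage coefficients in (30), (32), (34), and (36) follow from averaging the per-state currents with the ratios of (21). The sketch below illustrates that bookkeeping. The function names and the operating point ($\eta = 0.08$, $V_{dd} = 1$ V, $n = 1.45$, $V_T = 25.7$ mV) are assumptions made for illustration, so the NAND coefficients come out close to, but not exactly equal to, the rounded values in the text.

```python
import math


def average_leakage(states):
    """Mean standby current over equally likely input states."""
    return sum(states) / len(states)


def gate_leakage_coefficients(eta, vdd, n=1.45, vt=0.0257):
    """Average standby current of each gate, in units of I_s1."""
    i_s1 = 1.0
    i_s2 = i_s1 / math.exp(eta * vdd / (n * vt))  # I_s1 : I_s2 from (21)
    i_s3 = i_s2 / 1.8                             # I_s2 : I_s3 from (21)
    return {
        "inverter": average_leakage([i_s1]),
        "nand2": average_leakage([i_s2, i_s1, i_s1, 2 * i_s1]),
        "nand3": average_leakage([i_s3] + 3 * [i_s2] + 3 * [i_s1] + [3 * i_s1]),
        "xor6": average_leakage([i_s1, 4 * i_s1]),
        "xor8": average_leakage([4 * i_s1]),
    }


def inverter_dynamic_power(vdd, f, a_t=0.3, c_d=3.5e-15, c_g=26e-15, m=4):
    """Inverter dynamic power from (25), in watts, with the quoted C_g, C_d, m, a_t."""
    return a_t * (2 * c_d + m * c_g) * vdd**2 * f


if __name__ == "__main__":
    for gate, coeff in gate_leakage_coefficients(eta=0.08, vdd=1.0).items():
        print(f"{gate:8s} P_s = {coeff:.2f} * I_s1 * V_dd")
    print(f"inverter P_d at 1 V, 100 MHz = {inverter_dynamic_power(1.0, 100e6):.2e} W")
```

Because $I_{s2}$ and $I_{s3}$ are much smaller than $I_{s1}$, the averages are dominated by the states in which only a single off transistor blocks the current path.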
Figs. 3 to 6 show that the DC power dissipation with $V_{th} = 0.2$ V is of about the same order as the switching power. For low power applications, the DC power should be much smaller than the switching power. Thus, a higher threshold voltage is required; for example, $V_{th}$ could be chosen between 0.3 V and 0.4 V. A comparison of Figs. 3 and 5 suggests that the three-input CMOS gates consume slightly less DC power than the two-input CMOS gates. However, the three-input CMOS gates are slower than the two-input CMOS gates.

## IV. Minimizing the Power-Delay Product by the Selection of the Supply Voltage

In deep submicron CMOS digital circuits, the analytical study of circuit performance, such as the power dissipation and delay, is an important issue. Speed and power consumption trade off against each other. In general, if speed has top priority, the threshold voltage is reduced and the supply voltage is increased; if power dissipation has top priority, the threshold voltage is increased and the supply voltage is decreased. In this section, we study the power-delay product (PDP), which is a useful figure of merit for digital circuits. Based on the power analysis in the previous sections, we are able to minimize the power-delay product by selecting the supply and threshold voltages. A design methodology to find the lower and upper bounds of the supply and threshold voltages is discussed.

The delay time between the input and the output waveforms measured at the $V_{dd}/2$ points for a chain of CMOS inverters is given by [8]

$$\tau_{\text{inv}} = k_1 \frac{1}{V_{dd} \left(1 - \frac{V_{th0}}{V_{dd}}\right)^2} \tag{38}$$

where $k_1$ is a constant (distinct from the doping coefficient $k_1$ in (1)). In the above equation, we assume that $V_{th}$ is equal to $V_{th0}$. The delay time for series MOSFETs is assumed to increase linearly with the number of series MOSFETs [9].
The delay time of the two-input and three-input NAND and NOR gates can be expressed as

$$\tau_{\text{NAND2, NAND3, NOR2, NOR3}} = k_{\text{NAND2, NAND3, NOR2, NOR3}}\, k_1 \frac{1}{V_{dd} \left(1 - \frac{V_{th0}}{V_{dd}}\right)^2}. \tag{39}$$

For a small output load, $k_{\text{NAND2,NOR2}}$ is close to two and $k_{\text{NAND3,NOR3}}$ is close to three. The power-delay product of an MOS logic network is expressed as

$$PD_{\text{total}} = (I_{\text{total}}V_{dd} + P_d)t_D = k_{s1}PD_{s1} + k_{s2}PD_{s2} + k_{s3}PD_{s3} + PD_d \tag{40}$$

where $PD_{s1}$, $PD_{s2}$, and $PD_{s3}$ are the DC components of the PDP of the inverters, the NAND2 and NOR2 gates, and the NAND3 and NOR3 gates, respectively; $k_{s1}$, $k_{s2}$, and $k_{s3}$ are the total numbers of inverters, NAND2 or NOR2 gates, and NAND3 or NOR3 gates in a VLSI system, respectively; and $PD_d$ is the dynamic component of the PDP. The equations of $PD_{s1}$, $PD_{s2}$, $PD_{s3}$, and $PD_d$ are given by

$$PD_{s1} = 1.8k \exp(-V_{th0}/nV_{T}) \exp(\eta V_{dd}/nV_{T}) \cdot \frac{1}{V_{dd} \left(1 - \frac{V_{th0}}{V_{dd}}\right)^{2}} \tag{41}$$

$$PD_{s2} = 1.8\, k_{\text{NAND2, NOR2}}\, k \exp(-V_{th0}/nV_{T}) \cdot \frac{1}{V_{dd} \left(1 - \frac{V_{th0}}{V_{dd}}\right)^{2}} \tag{42}$$

$$PD_{s3} = k_{\text{NAND3, NOR3}}\, k \exp(-V_{th0}/nV_T) \cdot \frac{1}{V_{dd} \left(1 - \frac{V_{th0}}{V_{dd}}\right)^2} \tag{43}$$

and

$$PD_d = \frac{CV_{dd}f}{\left(1 - \frac{V_{th0}}{V_{dd}}\right)^2}. \tag{44}$$

In (44), $C$ appears to denote the switched capacitance of (22) with constant factors absorbed, rather than the constant $C$ of (9). From the above equations, we find that $PD_{s2}$ and $PD_{s3}$ have almost the same expressions. Because the DC standby currents $I_{s2}$ and $I_{s3}$ are independent of $V_{dd}$ and the delay decreases as $V_{dd}$ increases, $PD_{s2}$ and $PD_{s3}$ are monotonically decreasing functions of $V_{dd}$. From (41), (42), and (43), $PD_{s1}$ dominates the PDP if we assume that the probability of each logic state of NAND2, NOR2, NAND3, and NOR3 is identical. Thus, the PDP of an inverter determines the total PDP of a VLSI system.
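The claim that $PD_{s1}$ dominates, and that $PD_{s2}$ and $PD_{s3}$ fall as $V_{dd}$ rises, can be illustrated by evaluating (41) to (43) directly. This is a minimal sketch: the constant $k$ is set to 1, and the operating point ($V_{th0} = 0.3$ V, $\eta = 0.08$, $n = 1.45$, $V_T = 25.7$ mV) and the gate factors 2 and 3 are assumed values, so only the relative sizes are meaningful.

```python
import math


def delay_shape(vdd, vth0):
    """Common factor 1 / (V_dd (1 - V_th0/V_dd)^2) shared by (38)-(43)."""
    return 1.0 / (vdd * (1.0 - vth0 / vdd) ** 2)


def dc_pdp_components(vdd, vth0, eta, n=1.45, vt=0.0257, k=1.0,
                      k_nand2=2.0, k_nand3=3.0):
    """PD_s1, PD_s2, PD_s3 from (41)-(43), up to the arbitrary constant k."""
    base = k * math.exp(-vth0 / (n * vt)) * delay_shape(vdd, vth0)
    pd_s1 = 1.8 * math.exp(eta * vdd / (n * vt)) * base
    pd_s2 = 1.8 * k_nand2 * base
    pd_s3 = k_nand3 * base
    return pd_s1, pd_s2, pd_s3


if __name__ == "__main__":
    for vdd in (0.8, 1.0, 1.2, 1.5):
        s1, s2, s3 = dc_pdp_components(vdd, vth0=0.3, eta=0.08)
        print(f"V_dd = {vdd:.1f} V   PD_s1 : PD_s2 : PD_s3 = "
              f"{s1 / s3:.2f} : {s2 / s3:.2f} : 1")
```

Under these assumptions the inverter term is several times larger than the NAND/NOR terms, and the gap widens with $V_{dd}$ because only $PD_{s1}$ carries the $\exp(\eta V_{dd}/nV_T)$ factor.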
To minimize the PDP of an inverter, we set the derivatives of $PD_d$ and $PD_{s1}$ with respect to $V_{dd}$ to zero. Thus, the relations between $V_{dd}$, $\eta$, and $V_{th}$ are expressed as

$$V_{dd} = 3V_{th} \tag{45}$$

and

$$V_{dd}^{2} - \left(V_{th0} + \frac{nV_{T}}{\eta}\right)V_{dd} - \frac{nV_{T}V_{th0}}{\eta} = 0. \tag{46}$$

Taking the positive root of (46), we get

$$V_{dd} = 0.5 \left( V_{th0} + \frac{nV_T}{\eta} + \sqrt{\left( V_{th0} + \frac{nV_T}{\eta} \right)^2 + \frac{4nV_T V_{th0}}{\eta}} \right). \tag{47}$$

The figures of $V_{dd}$ versus $V_{th}$ and $V_{dd}$ versus $\eta$ are plotted using the above two equations with $n=1.4$ and $T=25\,^{\circ}$C. These figures show that the optimized lower and upper bounds of the supply voltage can be found from given technology parameters such as the threshold voltage and $\eta$, which is extracted from the drain-induced barrier lowering.

Fig. 7 shows that $V_{dd}$ is a function of $V_{th}$ for minimizing the dynamic PDP and a function of both $V_{th}$ and $\eta$ for minimizing the DC PDP. Two intersection points, x = 0.88 V and y = 1.26 V, are formed by the lines " $PD_d$ " and " $PD_{s1}$ $\eta=0.06$." The point x = 0.88 V means that both the static and dynamic components of the PDP are minimized for $V_{th}=0.29$ V and $\eta=0.08$ when $V_{dd}$ is 0.88 V. If $V_{th}$ varies, the optimum supply voltage is found in the area bounded by the lines " $PD_d$ " and " $PD_{s1}$ $\eta=0.08$ ". Thus, the lines " $PD_{d}$ " and " $PD_{s1}$ $\eta=0.08$ " set the lower and upper bounds for the supply voltage.

Fig. 7. Supply voltage vs. threshold voltage with the minimized power-delay product.

In general, $V_{dd}$ is chosen between the two values set by (45) and (47). The selection of the supply voltage depends on the component dominating the PDP. If the dynamic component of the PDP is dominant, $V_{dd}$ should move closer to the optimum $V_{dd}$ given by (45), and vice versa.
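The two optimum conditions can be evaluated and cross-checked numerically. The sketch below computes (45) and the positive root of (46) for $V_{th0} = 0.29$ V, $\eta = 0.08$, $n = 1.4$, and $T = 25\,^{\circ}$C, then confirms by a brute-force grid search that these values minimize the $V_{dd}$-dependent parts of (44) and (41). The function names and the grid search are illustrative assumptions, not the authors' procedure.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def vdd_dynamic_optimum(vth0):
    """Supply voltage minimising the dynamic PDP, equation (45)."""
    return 3.0 * vth0


def vdd_dc_optimum(vth0, eta, n=1.4, temp_k=298.15):
    """Positive root of (46), i.e. equation (47)."""
    a = n * K_OVER_Q * temp_k / eta   # n V_T / eta
    b = vth0 + a
    return 0.5 * (b + math.sqrt(b * b + 4.0 * a * vth0))


def supply_bounds(vth0, eta, n=1.4, temp_k=298.15):
    """Lower and upper bounds of V_dd set by (45) and (47)."""
    v1 = vdd_dynamic_optimum(vth0)
    v2 = vdd_dc_optimum(vth0, eta, n, temp_k)
    return min(v1, v2), max(v1, v2)


def pd_d_shape(vdd, vth0):
    """V_dd-dependent part of (44)."""
    return vdd / (1.0 - vth0 / vdd) ** 2


def pd_s1_shape(vdd, vth0, eta, n=1.4, temp_k=298.15):
    """V_dd-dependent part of (41)."""
    vt = K_OVER_Q * temp_k
    return math.exp(eta * vdd / (n * vt)) / (vdd * (1.0 - vth0 / vdd) ** 2)


def argmin_on_grid(func, lo, hi, steps=20000):
    """Grid point in [lo, hi] at which func is smallest."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=func)


if __name__ == "__main__":
    vth0, eta = 0.29, 0.08
    print("bounds from (45) and (47): %.3f V to %.3f V" % supply_bounds(vth0, eta))
    print("grid minimum of PD_d : %.3f V"
          % argmin_on_grid(lambda v: pd_d_shape(v, vth0), 0.34, 3.0))
    print("grid minimum of PD_s1: %.3f V"
          % argmin_on_grid(lambda v: pd_s1_shape(v, vth0, eta), 0.34, 3.0))
```

For this parameter set the two optima nearly coincide (about 0.87 V and 0.89 V), which is consistent with the intersection point x = 0.88 V described for Fig. 7.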
Comparing the lines " $PD_{s1}$ $\eta=0.06$ " and " $PD_{s1}$ $\eta=0.08$," we find that as $\eta$ decreases, the DC current decreases, so that $V_{dd}$ can be kept high to increase the speed of circuits; as $\eta$ increases, the DC current increases, and $V_{dd}$ should be kept low to reduce power consumption, resulting in a longer delay. Fig. 8 suggests that $V_{dd}$ has to increase with $V_{th}$ to satisfy the speed requirement, and $V_{dd}$ has to decrease as $\eta$ increases to reduce the DC power. When $V_{th}$ is low, the upper bound of the supply voltage is mainly determined by the DC PDP and the lower bound is set by the switching PDP. On the other hand, when $V_{th}$ is high, the upper bound of the supply voltage is mainly determined by the dynamic PDP and the lower bound is set by the DC PDP. This is because the DC PDP is an exponential function of $V_{dd}$; thus, the derivative of $V_{dd}$ with respect to $V_{th}$ is smaller for the DC PDP than for the dynamic PDP. The physical explanation is that, while the threshold voltage is low, the DC PDP determines the upper bound of the supply voltage because of its high DC power consumption and the dynamic PDP determines the lower bound because of the speed consideration, and vice versa. Both Figs. 7 and 8 show that $\eta$ should be kept as small as possible for high-performance operation, so as to suppress the drain-induced barrier lowering effect.

Fig. 8. Supply voltage vs. $\eta$ with the minimized power-delay product.

For applications with a given supply voltage, (45) and (47) can be used to find the lower and upper bounds of the threshold voltage. The threshold voltage versus $\eta$ with $V_{dd}=1$ V is shown in Fig. 9. When $\eta=0.07$, the two lines meet and the optimum threshold voltage is 0.33 V. Otherwise, the optimized threshold voltage is found between the two lines.

Fig. 9. Threshold voltage vs. $\eta$ with the minimized power-delay product.

Combining (24) and (25) with the delay (38), we are able to calculate the PDP of an inverter. Parameters such as $\eta=0.06$, activity $a_t=30\%$, $C_g=26$ fF, $C_d=3.5$ fF, and $m=3$ for a 0.15 $\mu$m CMOS technology are used. The power-delay product versus the supply voltage for the CMOS inverter is shown in Figs. 10 and 11. Figs. 10 and 11 show that all the PDP minima lie within the region determined by (45) and (47). The curve with $V_{th} = 0.3$ V has the best PDP when the operating frequency is 100 MHz. Carefully choosing $V_{th}$ is one of the major factors in increasing circuit performance. As expected, the PDP increases with the operating frequency.

Fig. 10. Power-delay product vs. supply voltage with $f=100~\mathrm{MHz}$ and $\eta=0.06$.

Fig. 11. Power-delay product vs. supply voltage with $V_{th} = 0.3 \text{ V}$.

## V. Conclusions

We have introduced a simple analytical model for the power analysis of deep submicron CMOS circuits. The model is a simplified version of the Berkeley Short-Channel IGFET model and can be used as a power dissipation estimator. The presented analysis shows that the leakage current in the MOS logic network can be expressed as a function of the leakage current of a single MOS transistor. Power calculations of both the switching power and standby power fit HSPICE simulation results well. The power-delay product is optimized by appropriately selecting the supply and threshold voltages. To that end, we have introduced a design methodology to find the lower and upper bounds of the supply and threshold voltages. The effects of $\eta$, the supply voltage, and the threshold voltage on the circuit performance were presented.

## References

- [1] B. Davari et al., "A high performance 0.25 μm CMOS technology," in Proc. IEEE Int. Elect. Dev. Meeting, 1988, pp. 56-59.
- [2] G. G. Shahidi et al., "A room temperature 0.1 μm CMOS on SOI," in Proc. IEEE Symp. VLSI Tech., 1993, pp. 27-28.
- [3] M.
Fujishima et al., "Low-power 1/2 frequency dividers using 0.1 μm CMOS circuit built with ultrathin SIMOX substrates," *IEEE J. Solid-State Circuits*, vol. 28, pp. 510-512, Apr. 1993.
- [4] K. F. Lee et al., "Room temperature 0.1 μm CMOS technology with 11.8 ps gate delay," in Proc. IEEE Int. Elec. Dev. Meeting, 1994, pp. 131-134.
- [5] A. Masaki, "Possibilities of deep-submicrometer CMOS for very-high-speed computer logic," *Proc. IEEE*, vol. 81, pp. 1311-1324, Sept. 1993.
- [6] D. Liu and C. Svensson, "Trading speed for low power by choice of supply and threshold voltages," *IEEE J. Solid-State Circuits*, vol. 28, pp. 10-17, Jan. 1993.
- [7] B. J. Sheu et al., "BSIM: Berkeley short-channel IGFET model for MOS transistors," *IEEE J. Solid-State Circuits*, vol. SC-22, pp. 558-566, Aug. 1987.
- [8] J. R. Burns, "Switching response of complementary symmetry MOS transistor logic circuits," *RCA Review*, pp. 627-661, 1964.
- [9] M. Shoji, *CMOS Digital Circuit Technology*. New York: Prentice-Hall, 1988.

Richard X. Gu was born in Shanghai, China, in December 1962. He received the B.Sc. degree in electronics from Fudan University, Shanghai, China, and the M.Sc. degree in physics from the University of Western Ontario, London, Ontario, Canada, in 1984 and 1990, respectively. He received the Ph.D. degree from the Department of Electrical and Computer Engineering, University of Waterloo, Canada, in 1995. From 1984 to 1988, he was with the Microelectronics Lab of the Shanghai Institute of Testing Technology. From 1990 to 1995, he was a Research Assistant and then a Research Associate in the VLSI Research Group at the University of Waterloo. He is currently with the DSP R&D Center, Texas Instruments Incorporated, as a Member of the Technical Staff. He is a co-author of the book *High-Performance Digital VLSI Circuit Design*.
His research interests include device and circuit modeling, high speed CMOS and BiCMOS digital circuits and systems, deep submicron low power circuits and systems, and high speed CMOS telecom circuits.

Mohamed I. Elmasry (F'88) was born in Cairo, Egypt, on December 24, 1943. He received the B.Sc. degree from Cairo University, Cairo, Egypt, and the M.A.Sc. and Ph.D. degrees from the University of Ottawa, Ottawa, Ontario, Canada, all in electrical engineering, in 1965, 1970, and 1974, respectively. He has worked in the area of digital integrated circuits and system design for the last 30 years. He worked for Cairo University from 1965 to 1968 and for Bell-Northern Research, Ottawa, Canada, from 1972 to 1974. He has been with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada, since 1974, where he is a Professor and founding Director of the VLSI Research Group. He has a cross appointment with the Department of Computer Science, where he is a Professor. He has held the NSERC/BNR Research Chair in VLSI design at the same university since 1986. He has served as a consultant to research laboratories in Canada, Japan, and the United States, including AT&T Bell Labs, GE, CDC, Ford Microelectronics, IBM, Actel, Hitachi, Rockwell, Linear Technology, Xerox, and BNR, in the area of LSI/VLSI digital circuit/subsystem design. During sabbatical leaves from Waterloo he was at the Micro Components Organization, Burroughs Corporation (Unisys), San Diego, California; Kuwait University, Kuwait; and the Swiss Federal Institute of Technology, Lausanne, Switzerland. He has authored and co-authored over 200 papers on integrated circuit design and design automation. He has several patents to his credit. He is editor of the IEEE Press books Digital MOS Integrated Circuits, 1981; Digital VLSI Systems, 1985; Digital MOS Integrated Circuits II, 1991; and Analysis and Design of BiCMOS Integrated Circuits, 1993.
He is also author of the book Digital Bipolar Integrated Circuits (John Wiley, 1983), a co-author of the books Digital BiCMOS Integrated Circuits (Kluwer, 1992), Optimal VLSI Architectural Synthesis (Kluwer, 1993), Low-Power VLSI Digital Design: Circuits and Systems (Kluwer, 1995), and Digital High-Performance VLSI Circuit Design (Kluwer, 1995), and editor of the book Artificial Neural Networks Engineering (Kluwer, 1994). Dr. Elmasry has served in many professional organizations in different positions and received many Canadian and International Awards. He is a founding member of the Canadian Conference on VLSI, the Canadian Microelectronics Corporation (CMC), the International Conference on Microelectronics (ICM), and the founding president of Pico Electronics Inc. Dr. Elmasry is a member of the Association of Professional Engineers of Ontario and is a Fellow of the IEEE for his contributions to "digital integrated circuits."