# Statistical Design of the 6T SRAM Bit-Cell

Vasudha Gupta and Mohab Anis

Department of Electrical and Computer Engineering, University of Waterloo, ON, Canada N2L3G1 {vgupta, manis}@vlsi.uwaterloo.ca

Abstract—In this paper, a method for the statistical design of the SRAM bit-cell is proposed to ensure a high memory yield, while meeting design specifications for performance, stability, area and leakage. The method generates the nominal design parameters; i.e., the widths and lengths of the bit-cell transistors, which provide maximum immunity to the variations in a transistor's dimensions and intrinsic threshold voltage fluctuations. Moreover, the need to deviate from the conventional bit-cell sizing strategy to obtain a high-yield, low-leakage design in the nanometer regime is demonstrated.

Index Terms—SRAM (Static random access memory) chips, circuit optimization, design methodology.

### I. INTRODUCTION

CURRENTLY, more than 50% of the area of System-on-Chip (SoC) designs is occupied by embedded memory [1]. This is due to the increasing integration of functional blocks that require large memory for data manipulation and storage. The use of minimum-size transistors in the SRAM, along with technology scaling, increases the intrinsic variability, causing a large deviation in transistor properties such as the threshold voltage $(V_{th})$. This and the varying transistor dimensions adversely impact the yield of SRAMs. Memory yield is the percentage of bit-cells that meet all the functionality and performance requirements, even under variability. For an SRAM, the bit-cell is functional if it allows for a non-destructive read and a successful write. Traditionally, these have been evaluated as Static Noise Margin (SNM) and write trip voltage, respectively. The performance constraints are the desired memory speed and leakage.

It should be noted that the SRAM design involves evaluation of several bit-cell architectures and layout topologies for process-layout interactions. It also requires choice of SRAM specific physical design rules and assessment of their robustness. At this level, process developers are involved. However, this phase of the bit-cell development is not analysed in our work. Our work is of significance to the circuit designer, who is involved in the 'electrical' design phase. This entails optimal selection of the transistor sizes to avoid parametric failures such as the destructive read, write and access failures and excessive leakage, which can occur due to variations in the transistor parameters. From now on, 'bit-cell design' refers to this electrical design phase.

In large measure, the variability in the bit-cell metrics such as the SNM, write trip voltage, speed and leakage is caused by intrinsic $V_{th}$ variations, which in turn are related to the bit-cell transistor dimensions. Therefore, to meet the specifications for all the bit-cell design metrics, some of which are conflicting, in the minimum possible area, the widths and lengths of the bit-cell transistors must be chosen optimally [12]. The impact of variability on the design metrics should be considered up front during the design phase, to maximize yield.

Copyright (c) 2008 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.


However, the current industrial practice is to first develop a database, by simulations, which characterizes the design metrics for various transistor sizes. This is used to carefully choose the sizes of the bit-cell transistors. MC (Monte-Carlo) simulations are run for the chosen design to verify whether variations in the design metrics such as the SNM are within the desired bounds. The chosen design is updated and MC simulations are rerun iteratively until all the design metrics meet the specifications. Another approach is to develop analytical functions for the design metrics, as a function of the device sizes. These are then deployed to choose nominal transistor sizes. These procedures have many drawbacks. The characterization, through simulations, is quite time consuming. These methods are iterative and rely on manual transistor size selection. Moreover, the chosen design may not be optimal; it may be an over-design with a larger area. In this work, a systematic, statistical design method to determine the optimal size of the bit-cell transistors is proposed. The proposed framework is capable of eliminating the characterization step completely. Currently, the electrical bit-cell design in the industry takes about 2-3 weeks or more. With the proposed method, the optimal design can be obtained in a day or two.

The causes of variability in transistor dimensions include sub-wavelength lithography and proximity effects [2][3]. The intrinsic variability is caused by atomistic-level differences between devices, even though they might have identical layouts and environments. This manifests primarily as variations in the device $V_{th}$. The main reasons for $V_{th}$ variations are fluctuations in the number and location of the dopant atoms in the channel (Random Dopant Fluctuation or RDF), line edge roughness and oxide thickness variations [4]. Of these, the most significant factor is observed to be the RDF [4][5][6][8]. In addition, the $V_{th}$ distribution due to RDF is normal and the variance ($\sigma_{Vth}^2$) is inversely proportional to the channel area. In [7], the channel area component of the $\sigma_{Vth}$ for fabricated MOSFETs is observed to be dominant.

Because of the intrinsic  $V_{th}$  variations, the SNM, read current and write trip voltage are observed to have normal distributions [9]-[11]. Technology scaling has a two-fold impact on the SRAM. First, an increase in the  $\sigma_{Vth}$  due to scaling SRAM transistors causes the distributions of SNM, read current and write trip voltage to take a larger sigma value. Secondly, the increase of the memory density at each successive technology node requires the bit-cell to tolerate a larger number of sigma variations (e.g.,  $4\sigma$  to  $5\sigma$ ) in the design characteristics [13] to ensure a satisfactory memory yield.



Fig. 1. (a) 6T SRAM bit-cell. (b) Sample SRAM bit-cell layout, from [19].

However, little work has been done to propose a systematic framework to design the bit-cell while incorporating the impact of the statistical variations in the design metrics up front. Methods to optimize one of the design metrics, e.g. read access or leakage, have been proposed [41]-[43], but we could only find [12], which attempts to optimize the yield considering all design metrics. The method in [12] uses the concept of failure probability to choose the sizes of bit-cell transistors. However, it ignores the joint failure probabilities (e.g. read and speed failure occurring together) because of computational complexity. Therefore, the obtained design solution need not be optimal. On the other hand, in our work, the bit-cell failures are formulated as problem constraints. Because the solution of the optimization problem requires that all the constraints be satisfied simultaneously, the failure of the bit-cell due to the simultaneous occurrence of two or more failure mechanisms is also accounted for. Additionally, [12] ignores inter-die variations. It also uses semi-analytical models at three stages: 1. for the transistor characteristics (such as the saturation current and leakage), 2. for the design metrics such as the write time, and 3. for variability and yield. Modeling of the transistor characteristics and design metrics not only induces approximation errors, but also has limited usage in the industry, because designers prefer available SPICE models. The method proposed in this work does not use analytical modeling for either the transistor characteristics or the design metrics. Modeling is done only for variability and yield (i.e. stage 3). This reduces approximation and makes the proposed method attractive and practical for industrial usage.

In our work, the proposed method provides nominal transistor dimensions that provide the maximum immunity to the intrinsic  $V_{th}$  fluctuations due to RDF, and the variability in the transistor dimensions. It involves a minimal infrastructure for model building and mathematical computations and uses readily available models and tools in the industry for simulation. Also, the proposed formulation imparts the necessary flexibility to tune the design as per the specifications, as demonstrated in Section IV. High performance-moderate leakage and low leakage-moderate performance bit-cells in the 45nm CMOS technology are designed and analysed. It is shown that the conventional sizing is no longer sufficient to ensure a high yield for a low leakage bit-cell design.

In Section II, the bit-cell design constraints are described. Section III formulates the statistical design problem and describes the yield optimization. The results and observations are discussed in Section IV. Section V concludes the paper.

### II. PRELIMINARIES

#### A. Design Constraints

A typical 6T SRAM bit-cell is depicted in Fig.1 (a). M1 and M2 are called drivers, M3 and M4 are load transistors and M5 and M6 denote the access transistors. The output nodes of inverters 1 (M1 and M3) and 2 (M2 and M4) are called VL and VR, respectively. For subsequent discussions, assume VL at logic "0" and VR at logic "1". A brief overview of the bit-cell design metrics such as the SNM, write trip voltage, read current, leakage and area follows.

1) Static Noise Margin (SNM): SNM is the maximum static noise that the bit-cell can tolerate while still maintaining reliable operation [13],[16],[17]. The bit-cell is the most unstable (the least SNM) when the word line (WL) is turned ON for *read*. VL rises to an intermediate voltage level due to the voltage divider action between M1 and M5 and can flip the bit-cell. Generally, the driver should be stronger than the access transistor [12]-[14] so that VL remains close to ground, which ensures a non-destructive read. In this work, SNM is measured by DC simulation when WL=1.

2) Write Trip Voltage (Vtrip): This is measured by DC simulation. To determine if the bit-cell is writable, a DC sweep is applied on  $\overline{BL}$  in Fig.1 (a), and the maximum bit-line voltage at which the bit-cell flips is noted as the write trip voltage [13]-[14] or Vtrip. The bit-cell should have a reasonable value for the Vtrip [13] to guarantee that no unintended write occurs during the read cycle and that write is not too difficult. Typically, the access transistors should be stronger than the load devices.

3) Read Current: The bit-line current through the access and driver stack, M5 and M1 in Fig.1(a), during the read cycle is the performance metric for the bit-cell [13]-[15]. To a large extent, the read access time is governed by the time required to develop a certain differential voltage between the bit-lines as one of them discharges during *read*. This is a function of the bit-cell read current, which can also be measured by DC simulation. During the read operation, the $\overline{BL}$ leaks through the M6 transistor of idle bit-cells (when VR = "0" in the idle bit-cell) in the column. If the $\overline{BL}$ voltage dips too much, it may cause erroneous sensing. This can be prevented if the leakage is kept within reasonable limits. Therefore, to ensure a successful read, the designer also needs to carefully calculate the maximum tolerable leakage limit.

4) Bit-Cell Leakage: The leakage current in the bit-cell is the primary contributor to the overall memory leakage. It is the sum of various components, e.g. the sub-threshold leakage, gate leakage and band-to-band tunneling. In this paper, the total bit-cell leakage, measured by DC simulation, is considered.

5) Area: The bit-cell can have different types of layout [12],[18]-[19]. Researchers at IBM [21], Intel [22] and TI [23] have proposed Restrictive Design Rules (RDRs) such as single-orientation poly-silicon gates, resulting in geometries that are more regular with enhanced manufacturability [20]. Some of these have already been adopted as best practices for memory [23]. The layout topology in 45nm technology from [19] is used in this work and reproduced in Fig.1 (b). The corresponding tight design rules are also mentioned in [19]. The x and y dimensions of the bit-cell layout are calculated as a function of the layout rules as follows:

$$\begin{aligned}
Area &= x_{dim} \times y_{dim}, \\
x_{dim} &= 2 \times \max(x1, x2), \quad y_{dim} = \max(y1, y2), \\
x1 &= \tfrac{1}{2}(PP) + W_{ld} + PN + W_{drv} + PoG + \tfrac{1}{2}(PoPo), \\
x2 &= \tfrac{1}{2}(PP) + W_{ld} + PN + W_{ax} + PoG + \tfrac{1}{2}(CW), \\
y1 &= 2\left[\tfrac{1}{2}(CW) + 2(GC) + L_{ld}\right] + CW, \\
y2 &= 2\left[\tfrac{1}{2}(CW) + 2(GC)\right] + L_{drv} + L_{ax} + CW.
\end{aligned} \tag{1}$$

In Fig. 1(b), all diffusion contacts have a diffusion layer underneath (not visible), but there is no diffusion layer overhang around the contact. This is because the design rules for SRAM are scaled beyond those of standard logic-process design rules and several design rules are violated within the array to achieve competitive area [38]. Post-layout lithographic correction is used to ensure a robust layout [39]. The rectangular contacts are the coupled contacts, which are used to strap poly and diffusion for cross-coupling without using metal [38]. It should be noted that Eq. (1) can be easily modified to formulate the x and y dimensions of any different layout topology.
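As an illustration of how Eq. (1) can be evaluated for a candidate sizing, a minimal Python sketch follows. The rule symbols (PP, PN, PoG, PoPo, CW, GC) are taken from Eq. (1); the numerical rule values and transistor sizes are placeholders chosen only for the example and are not the design rules of [19]. The function name `bitcell_dimensions` is likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutRules:
    """Layout-rule terms appearing in Eq. (1), all in nm."""
    PP: float
    PN: float
    PoG: float
    PoPo: float
    CW: float
    GC: float


def bitcell_dimensions(r, w_drv, w_ax, w_ld, l_drv, l_ax, l_ld):
    """Return (x_dim, y_dim, area) of the bit-cell following Eq. (1)."""
    x1 = 0.5 * r.PP + w_ld + r.PN + w_drv + r.PoG + 0.5 * r.PoPo
    x2 = 0.5 * r.PP + w_ld + r.PN + w_ax + r.PoG + 0.5 * r.CW
    y1 = 2 * (0.5 * r.CW + 2 * r.GC + l_ld) + r.CW
    y2 = 2 * (0.5 * r.CW + 2 * r.GC) + l_drv + l_ax + r.CW
    x_dim = 2 * max(x1, x2)
    y_dim = max(y1, y2)
    return x_dim, y_dim, x_dim * y_dim


# Placeholder rule values (nm); these are NOT the rules of [19].
rules = LayoutRules(PP=100, PN=100, PoG=50, PoPo=80, CW=60, GC=40)

# Placeholder transistor sizes (nm), chosen only to exercise the formula.
x_dim, y_dim, area = bitcell_dimensions(
    rules, w_drv=120, w_ax=90, w_ld=60, l_drv=50, l_ax=50, l_ld=50
)
print(f"x_dim = {x_dim} nm, y_dim = {y_dim} nm, area = {area / 1e6:.4f} um^2")
```

Because the area depends on the sizes only through the two `max` terms, widening the access transistor costs no area until x2 exceeds x1; a different layout topology would only require replacing the four expressions for x1, x2, y1 and y2.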

The bit-cell area is very important from the economic perspective. For a good SNM, the driver needs to be stronger than the access transistor. However, the access transistor cannot be made too small since this degrades the read current. Additionally, the access transistor needs to be reasonably strong to enable a successful write. The strength of the load can be reduced to improve the Vtrip, but a very weak load deteriorates the SNM, although the impact is small. The lengths of the driver and access can be reduced to improve the read performance, but this adversely impacts the leakage, which has become a serious concern these days. The problem is further compounded because of process variations. The above discussion outlines the bit-cell design problem.

#### B. Preparatory Work

The bit-cell design is constrained by the specifications for SNM, Vtrip, read current and leakage. Each of these should be satisfied for a range of operating parameters (voltage, temperature), design parameters (transistor width/length) and statistical parameters (process fluctuations such as $V_{th}$).

1) Operating parameters: These are often more critical and can be accounted for by evaluating the design metrics at their respective worst-case operating conditions. For example, the leakage is the worst at high temperature (sub-threshold leakage being the primary component) and high supply voltage. Similarly, read current should be simulated at the performance corner to meet the timing goal. The number of performance corners and the voltage and temperature for each performance corner are determined by the intended set of applications. E.g., for mobile applications, the operating temperature is lower than that for high-performance applications. The performance corner in this work is low voltage and high temperature (for worst read current). Table I documents the worst-case operating conditions (low or high) for all the design metrics.

 TABLE I

 WORST-CASE OPERATING CONDITIONS FOR DESIGN CONSTRAINTS

| Operating<br>Condition | SNM | Write<br>Trip | Read<br>Current | Leakage |
|------------------------|-----|---------------|-----------------|---------|
| Voltage                | L/H | L             | L               | H       |
| Temperature            | H   | L             | H               | H       |

Most of the behavior in Table I should remain the same for all technologies. The worst-case voltage condition for SNM (when WL=1) is interesting for the 45nm technology. At high voltage and high temperature, the SNM begins to degrade as VR storing "1" leaks excessively through M2. This behavior can be arrested, if the cell ratio  $\beta$  (the ratio of the W/L of the driver to that of access) is increased as shown in Fig.2. With a higher  $\beta$ , VL storing "0" remains closer to ground (stronger "0") and the gate voltage of M2 reduces, thereby, shutting it off more effectively. However, since the proposed solution explores the entire space of the allowable transistor sizes, SNM is simulated at both high and low voltages. A nominal supply voltage of 1V is assumed and Predictive Technology Models are employed in 45nm technology [28].

Fig. 2. SNM variation with voltage and temperature: (a) $\beta = 1$, (b) $\beta = 1.3$.

2) Design Parameters: These are the widths and lengths of the driver, access and load transistors. In this work, the inter-die variations in the widths and lengths of transistors are considered. Since the gate length impacts $V_{th}$ significantly, inter-die threshold voltage variations are also accounted for implicitly [24]. According to the *ITRS* [32], the gate dimension variations have a $3\sigma$ value of $\pm 12\%$ of the physical gate length. With a physical gate length of 25nm in 45nm technology [12],[32], the $3\sigma$ variation in the gate dimension is selected as 3nm (a different transistor length can be chosen for the design, but the $3\sigma$ variation remains fixed at 3nm).

3) Statistical Parameters: $V_{th}$ is the most significant statistical parameter. Because of the small area of the SRAM bit-cell, the close proximity of the transistors, the use of restricted design rules, a highly regular layout and a fairly controlled process for the array fabrication, the effect of intra-die variations in the channel length and width is negligible [12]. Therefore, in this paper, the intrinsic $V_{th}$ variation due to RDF is considered as the main source of intra-die variation.

The $V_{th}$ variations of the six transistors are considered to be six independent and uncorrelated Gaussian random variables [10], [12]. Also, $\sigma_{Vth0}$, the standard deviation of the $V_{th}$ distribution of a minimum-sized transistor, is an input parameter. $\sigma_{Vth0}$ is usually available in the process development kits of vendors. Then, the $\sigma_{Vth}$ for a transistor of width W and length L is related to the transistor size as follows [12],[4]:

$$\sigma_{Vth} = \sigma_{Vth0} \sqrt{\frac{W_{min} L_{min}}{W L}}. \tag{2}$$

At the circuit design level, the designer can specify only the nominal values of the geometrical transistor dimensions (design parameters), and has little control over statistical parameters such as the  $V_{th}$  variations due to mismatch. However, as shown by (2), the choice of design parameters is used to control the extent of device mismatch.
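The dependence in (2) can be made concrete with a short Python sketch. The value of $\sigma_{Vth0}$ and the minimum dimensions below are hypothetical placeholders (in practice they would come from the process development kit), and the function name `sigma_vth` is introduced only for this example.

```python
import math


def sigma_vth(sigma_vth0, w, l, w_min, l_min):
    """Standard deviation of V_th for a W x L transistor, following Eq. (2)."""
    if w <= 0 or l <= 0:
        raise ValueError("transistor dimensions must be positive")
    return sigma_vth0 * math.sqrt((w_min * l_min) / (w * l))


# Hypothetical inputs: sigma_Vth0 in mV, dimensions in nm.
SIGMA_VTH0_MV = 30.0
W_MIN_NM, L_MIN_NM = 100.0, 50.0

for area_scale in (1, 2, 4):
    # Scale the width only; any W, L pair with the same product gives the same result.
    s = sigma_vth(SIGMA_VTH0_MV, area_scale * W_MIN_NM, L_MIN_NM, W_MIN_NM, L_MIN_NM)
    print(f"channel area x{area_scale}: sigma_Vth = {s:.2f} mV")
```

The sketch shows the inverse square-root law: quadrupling the channel area halves $\sigma_{Vth}$, which is the lever the designer uses to control mismatch through the nominal sizes.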

### III. PROBLEM FORMULATION

#### A. Intra-Die Variation

To consider the impact of intrinsic  $V_{th}$  fluctuation due to RDF, the conservative method is to calculate the worst-case  $V_{th}$  change for each of the six transistors (e.g.,  $\pm N$  sigma) and evaluate the design constraint for this case [34]. However, this approach over-estimates variability and gives poor results in terms of bit-cell area. Therefore, the statistical approach is applied in this paper to model the intra-die variations.

Because of the intrinsic $V_{th}$ variations, the design metrics SNM, Vtrip and read current exhibit Gaussian distributions [13]. To statistically model the impact of the intrinsic $V_{th}$ variations due to RDF on the design metrics, the statistical average and variance of the Gaussian distributions of the design metrics must be estimated. This can be done by using the Taylor series expansion as follows [40],[11],[12]:

$$SNM_{avg} = SNM_0 + \frac{1}{2} \sum_{i=1}^{6} \left( \frac{\partial^2 SNM}{\partial V_{th_i}^2} \right) \sigma_i^2$$

and $\sigma_1 = \sigma_2 = \sigma_{drv}$, $\sigma_3 = \sigma_4 = \sigma_{ld}$, $\sigma_5 = \sigma_6 = \sigma_{ax}$

$$\Rightarrow SNM_{avg} = SNM_0 + \frac{1}{2} \left\{ \left( \frac{\partial^2 SNM}{\partial V_{th_1}^2} + \frac{\partial^2 SNM}{\partial V_{th_2}^2} \right) \sigma_{drv}^2 + \left( \frac{\partial^2 SNM}{\partial V_{th_3}^2} + \frac{\partial^2 SNM}{\partial V_{th_4}^2} \right) \sigma_{ld}^2 + \left( \frac{\partial^2 SNM}{\partial V_{th_5}^2} + \frac{\partial^2 SNM}{\partial V_{th_6}^2} \right) \sigma_{ax}^2 \right\} \tag{3}$$

$$\begin{split} \sigma_{SNM}^2 &= \sum_{i=1}^6 \sigma_i^2 \left(\frac{\partial SNM}{\partial V_{th_i}}\right)^2 = \left\{ \left(\frac{\partial SNM}{\partial V_{th_1}}\right)^2 + \left(\frac{\partial SNM}{\partial V_{th_2}}\right)^2 \right\} \sigma_{drv}^2 \\ &+ \left\{ \left(\frac{\partial SNM}{\partial V_{th_3}}\right)^2 + \left(\frac{\partial SNM}{\partial V_{th_4}}\right)^2 \right\} \sigma_{ld}^2 + \left\{ \left(\frac{\partial SNM}{\partial V_{th_5}}\right)^2 + \left(\frac{\partial SNM}{\partial V_{th_6}}\right)^2 \right\} \sigma_{ax}^2 \end{split} \tag{4}$$

Equations (3) and (4) contain the $\sigma_{Vth}$ of the driver, access and load transistors, which can be calculated by (2). $SNM_0$ is the simulated SNM at mean $V_{th}$ values. The partial derivative terms are also computed at mean $V_{th}$ values, numerically by simulation. In the above two equations, the term 'SNM' can be replaced by Vtrip, Iread or leakage.
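The estimation in (3) and (4) can be sketched in Python as follows. In the actual flow each metric evaluation is a circuit simulation; here a closed-form function `toy_metric` stands in for the simulator and is purely hypothetical, as are the $\sigma$ values and the finite-difference step `h`. The derivatives are approximated by central differences, which is one possible way to compute them numerically and is an assumption of this sketch.

```python
import math


def taylor_moments(metric, vth_mean, sigmas, h=1e-3):
    """Estimate mean and sigma of a metric via the expansions of Eq. (3) and (4).

    metric   : callable taking a list of six V_th values (stand-in for a simulation)
    vth_mean : mean V_th of the six transistors (V)
    sigmas   : sigma_Vth of the six transistors (V)
    h        : finite-difference step (V), an assumption of this sketch
    """
    f0 = metric(vth_mean)
    mean = f0
    variance = 0.0
    for i, s in enumerate(sigmas):
        up = list(vth_mean)
        dn = list(vth_mean)
        up[i] += h
        dn[i] -= h
        f_up, f_dn = metric(up), metric(dn)
        d1 = (f_up - f_dn) / (2 * h)            # first derivative
        d2 = (f_up - 2 * f0 + f_dn) / (h * h)   # second derivative
        mean += 0.5 * d2 * s * s                # term of Eq. (3)
        variance += (d1 * s) ** 2               # term of Eq. (4)
    return mean, math.sqrt(variance)


def toy_metric(v, v0=0.4):
    """Hypothetical smooth surrogate for a metric (V); NOT a real SNM model."""
    return 0.13 + 0.4 * (v[0] - v0) - 0.3 * (v[4] - v0) - 1.5 * (v[0] - v0) ** 2


# Hypothetical sigmas (V): M1, M2 drivers; M3, M4 loads; M5, M6 access.
sigmas = [0.03, 0.03, 0.04, 0.04, 0.035, 0.035]
mean_est, sigma_est = taylor_moments(toy_metric, [0.4] * 6, sigmas)
print(f"estimated mean = {mean_est * 1e3:.2f} mV, sigma = {sigma_est * 1e3:.2f} mV")
```

With central differences, the cost is 13 metric evaluations per design point (one nominal plus two per transistor), independent of the number of sigmas targeted.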

The results of modeling have been verified by Monte-Carlo simulations. For example, the SNM average and standard deviation from MC simulations are 131.8mV and 22.1mV, respectively (Fig. 3). The corresponding estimated values from modeling are 132.1mV and 21.6mV, respectively. Similarly, Fig.4(a) shows that the average leakage and standard deviation, from the 45nm bit-cell leakage MC results, are 77.2nA and 8.9nA, respectively. The estimated values from modeling, are 77.7nA and 8.57nA, respectively. For our purpose of modeling, this level of accuracy is sufficient.
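The kind of Monte-Carlo cross-check described above can be illustrated with a self-contained Python sketch. The function `toy_metric` is again a hypothetical stand-in for a simulated metric, the $\sigma$ values are placeholders, and the sample count and seed are arbitrary choices for the example. For this particular toy function, evaluating (3) and (4) by hand gives a mean of 128.65 mV and a sigma of about 15.95 mV, against which the sampled statistics can be compared.

```python
import random
import statistics


def toy_metric(v, v0=0.4):
    """Hypothetical smooth surrogate for a metric (V); NOT a real SNM model."""
    return 0.13 + 0.4 * (v[0] - v0) - 0.3 * (v[4] - v0) - 1.5 * (v[0] - v0) ** 2


def monte_carlo_moments(metric, vth_mean, sigmas, n_samples, seed=0):
    """Sample six independent Gaussian V_th values and return (mean, stdev) of the metric."""
    rng = random.Random(seed)
    samples = [
        metric([rng.gauss(m, s) for m, s in zip(vth_mean, sigmas)])
        for _ in range(n_samples)
    ]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical sigmas (V): M1, M2 drivers; M3, M4 loads; M5, M6 access.
sigmas = [0.03, 0.03, 0.04, 0.04, 0.035, 0.035]
mc_mean, mc_sigma = monte_carlo_moments(toy_metric, [0.4] * 6, sigmas, n_samples=20000)
print(f"MC mean = {mc_mean * 1e3:.2f} mV, MC sigma = {mc_sigma * 1e3:.2f} mV")
```

The sampled sigma is expected to be slightly larger than the first-order estimate, because (4) neglects the contribution of the quadratic term to the variance.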

The coefficient of $\sigma_{drv}^2$ on the right-hand side (RHS) of (4) contains the slopes of SNM vs. $V_{th1}$ and $V_{th2}$. These will be interchanged if the opposite data value (1 instead of 0) is stored in the bit-cell, but the overall coefficient of $\sigma_{drv}^2$, and hence $\sigma_{SNM}$, will remain the same, irrespective of the data value stored in the bit-cell.

Fig. 3. SNM (Volts) distribution as obtained from MC simulations.

Fig. 4. Leakage $(\mu A)$ distribution histogram and normal probability plot for (a) single memory bit-cell and (b) sum of the leakage of 16 bit-cells.

Within-die variations cause each memory bit-cell of millions in an array to differ slightly from all the others in the SNM, read current, Vtrip and leakage. The number of identical bit-cells and the expected electrical yield to the specifications determine the number of sigma $(N_{\sigma})$ over which the bit-cell must operate properly [13],[11]. Typically, the number of sigmas required for the SNM, Vtrip, read current and bit-cell leakage ranges from 4 to 5. This is used in the proposed problem formulation to incorporate the effect of intra-die variations. The constraints of the optimization problem take the following form:



$$\frac{SNM_{avg} - SNM_{residual}}{\sigma_{SNM}} \ge N_{\sigma}. \tag{5}$$

Fig. 5. Pictorial representation of the SNM design constraint.

In (5), a constraint is put on the SNM yield instead of the SNM value. The value of $N_{\sigma}$ is selected according to the required yield and redundancy. Both $SNM_{avg}$ and $\sigma_{SNM}$ are impacted by the choice of the nominal design parameters, as observed from (2)-(4). The constraints for Vtrip and read current are formulated in a similar way.

The constraint bounds, also called residuals, such as the  $SNM_{residual}$  in (5) impart the necessary flexibility to design different types of the bit-cell. Such bounds on the read current (lower limit) and leakage (upper limit) enable the design of a bit-cell with a high or moderate performance, and a low or ultra-low leakage. The residuals on SNM and Vtrip are used to build margin for reliability. The constraint is shown in Fig.5.
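A minimal Python sketch of the constraint check in (5) is given below. The average and sigma are the modeled SNM values quoted earlier (132.1mV and 21.6mV); the two residual values and the choice $N_{\sigma} = 5$ are hypothetical and serve only to show how the residual tightens or relaxes the constraint. The function names are introduced for this example.

```python
def sigma_margin(avg, residual, sigma):
    """Number of sigmas separating the average from the residual, as in Eq. (5)."""
    return (avg - residual) / sigma


def meets_constraint(avg, residual, sigma, n_sigma):
    """True if the metric satisfies the N-sigma constraint of Eq. (5)."""
    return sigma_margin(avg, residual, sigma) >= n_sigma


SNM_AVG_MV = 132.1    # modeled average quoted in the text
SNM_SIGMA_MV = 21.6   # modeled sigma quoted in the text
N_SIGMA = 5           # hypothetical target within the 4 to 5 range

for residual_mv in (20.0, 40.0):  # hypothetical residuals
    margin = sigma_margin(SNM_AVG_MV, residual_mv, SNM_SIGMA_MV)
    ok = meets_constraint(SNM_AVG_MV, residual_mv, SNM_SIGMA_MV, N_SIGMA)
    print(f"residual = {residual_mv} mV: margin = {margin:.2f} sigma, pass = {ok}")
```

The same function applies to a lower-bounded metric such as the read current; for an upper-bounded metric such as leakage, the numerator is reversed (limit minus average).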

However, the bit-cell leakage does not have a normal distribution with the $V_{th}$ variation, and (5) cannot be used directly. Leakage has a lognormal distribution because of the exponential relationship of sub-threshold leakage with the threshold voltage (sub-threshold leakage is the primary component). By the central limit theorem [40], the sum of the leakage of a sufficiently large number of bit-cells can be modeled as normally distributed [12]. Fig. 4 shows that the sum of the leakage of 16 bit-cells ($V_{th}$ of every transistor in each of the 16 bit-cells is a random variable) displays a normal distribution. The use of 16 bit-cells is reasonable, since the deployment of memories as storage elements is justified for only a certain minimum number of bits, because of the associated overhead of the peripherals in memories. The mean and sigma values of the sum of the leakage of 16 bit-cells are given as follows [40]:

$$\begin{aligned} Leakage\_mean_{16bit-cells} &= 16 \times Leakage\_mean_{1bit-cell}, \\ \sigma^2_{16bit-cells} &= 16 \times \sigma^2_{1bit-cell}. \end{aligned} \tag{6}$$

Now, the total leakage of 16 bit-cells can be treated as a design metric, and the corresponding constraint can be expressed as:  $(Ileak\_max_{16bit-cells} - Ileak\_avg_{16bit-cells})/\sigma_{16bit-cells} \ge N_{\sigma}$ , where  $N_{\sigma}$  can range from 4 to 5, depending on the desired yield. Also,  $Ileak\_max_{16bit-cells} = 16 \times Ileak\_max_{1bit-cell}$ .
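The leakage constraint for the sum of 16 bit-cells can be sketched as follows. The single-cell mean and sigma are the modeled values quoted earlier (77.7nA and 8.57nA); the per-cell leakage limit of 90nA is a hypothetical placeholder, and the function names are introduced for this example.

```python
import math


def summed_leakage_stats(mean_1, sigma_1, n_cells=16):
    """Mean and sigma of the summed leakage of n independent bit-cells, per Eq. (6)."""
    return n_cells * mean_1, math.sqrt(n_cells) * sigma_1


def leakage_margin(ileak_max_1, mean_1, sigma_1, n_cells=16):
    """Sigma margin of the summed-leakage constraint (limit minus average, over sigma)."""
    mean_n, sigma_n = summed_leakage_stats(mean_1, sigma_1, n_cells)
    return (n_cells * ileak_max_1 - mean_n) / sigma_n


MEAN_1_NA = 77.7        # modeled single-cell mean quoted in the text
SIGMA_1_NA = 8.57       # modeled single-cell sigma quoted in the text
ILEAK_MAX_1_NA = 90.0   # hypothetical per-cell leakage limit

mean_16, sigma_16 = summed_leakage_stats(MEAN_1_NA, SIGMA_1_NA)
margin_16 = leakage_margin(ILEAK_MAX_1_NA, MEAN_1_NA, SIGMA_1_NA)
print(f"16 cells: mean = {mean_16:.1f} nA, sigma = {sigma_16:.2f} nA")
print(f"margin = {margin_16:.2f} sigma")
```

Since the mean and the limit scale with the number of cells while the sigma scales with its square root, the margin of the 16-cell sum is $\sqrt{16} = 4$ times the ratio (limit minus mean) over sigma computed for a single cell.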

#### B. Yield Maximization

The within-die threshold voltage variations, and the design constraints are a function of the transistor sizes, as shown by equations (2)-(4). The transistor sizes (design parameters) - $\{W_{drv}, W_{ax}, W_{ld}, L_{drv}, L_{ax}, L_{ld}\}$  define a six-dimensional parameter space. Within this parameter space, the design constraints (5) modeling within-die variations, define a feasible region. The nominal design should be chosen somewhere in this feasible region, so as to best satisfy the design constraints. At the same time, the variations in the transistor dimensions should also be considered. This can be done as follows.

If the spread of the design parameters is known (e.g. $\pm 3\sigma$ value of normally distributed widths and lengths), the nominal design can be specified within a certain imaginary box, called the tolerance box. This is depicted graphically in 2-D (for illustration) in Fig.6 for an arbitrary feasible region defined by arbitrary constraints. The dimensions of the tolerance box are determined by the spread of the design parameters. The center of the box is the nominal design (since the design parameters are assumed to have a normal distribution; hence, symmetrical). The smaller dots within this box represent all the possible design realizations. The fraction of the area of the tolerance box that overlaps the feasible region determines the yield. The tolerance box should be moved (the nominal design moves with it) so as to ensure the maximum overlap of the box and the feasible region (region of widths and lengths, which satisfy all the design constraints of within-die variations (equation 5)), thus maximizing the overall yield. The inner box in Fig.6 captures the maximum rectangular overlap that can be attained between the feasible region and the tolerance box, and can be used to estimate the yield directly [37]. In six dimensions (which is the case of bit-cell design problem), the six-dimensional volume of the inner box, henceforth called the yield box, defines the yield.

Fig. 6. Simplified yield maximization method in 2-dimensions.

In this work, the feasible region is determined implicitly; that is, whether or not a design point lies in the feasible region is determined by simulations and by checking if all the design constraints are met. The typical design centering approach needs an explicit feasible region; i.e., the design constraints should be expressed as analytical expressions. This is not always possible, and requires either a polytope approximation or simulations for curve-fitting the data in the region of interest, which becomes difficult with the increase in the number of dimensions [25].

Qualitatively, the problem is reduced to finding  $x^l$  and  $x^h$ , the coordinates of the yield box in Fig.6 such that the following conditions are satisfied: Condition (a) the maximum difference between  $x^l$  and  $x^h$  is not more than the maximum spread in design parameters (the yield box should lie within the tolerance box) and, Condition (b) the yield box lies in the feasible region (all the points lying within the box satisfy all the design constraints that cover within-die variations). If the above mentioned conditions are met, then for the nominal design placed at the center of the yield box, the probability (yield) that the design constraints are satisfied in the presence of parameter variations is estimated in 2-D (for illustrative purpose) as follows:

$$\begin{aligned} P_{2-D} &= P\left( (W_{drv}^{l} \le W_{drv} \le W_{drv}^{h}) \ \text{and} \ (W_{ax}^{l} \le W_{ax} \le W_{ax}^{h}) \right) \\ &= P(W_{drv}^{l} \le W_{drv} \le W_{drv}^{h}) \times P(W_{ax}^{l} \le W_{ax} \le W_{ax}^{h}) \\ &= (CDF(W_{drv}^{h}) - CDF(W_{drv}^{l})) \times (CDF(W_{ax}^{h}) - CDF(W_{ax}^{l})), \end{aligned}$$

where  $W_{drv}^l$ ,  $W_{drv}^h$ ,  $W_{ax}^l$  and  $W_{ax}^h$  form the coordinates of the yield box (in 2-D as in Fig. 6) and CDF(x) represents the cumulative distribution function of x [33]. In six dimensions (which is the case of our problem),

$$Yield(x^{l}, x^{h}) = \prod_{i=1}^{6} P(x_{i}^{l} \le x_{i} \le x_{i}^{h})$$
$$= \prod_{i=1}^{6} (CDF(x_{i}^{h}) - CDF(x_{i}^{l})), \qquad (7)$$

where  $x_i$  is the  $i^{th}$  design parameter. The solution of (7) involves the evaluation of a multi-dimensional probability integral by quadrature or Monte-Carlo based methods, which is computationally expensive [25]. However, the problem is simplified if a closed form expression for CDF can be used. Because a Gaussian distribution does not have a closed form CDF, a double-bounded probability distribution function (DB-PDF), proposed by Kumaraswamy [26] is employed. This is appropriate for physically bounded dimensions. With this model, the pdf f(z) is given by

$$f(z) = abz^{a-1}(1-z^{a})^{b-1}, \quad \text{where } z = \frac{x-x^{lb}}{x^{ub}-x^{lb}}, \ x^{lb} \le x \le x^{ub}. \tag{8}$$

In (8), z is the normalized value of x, and $x^{lb}$ and $x^{ub}$ are the lower and upper bounds, respectively, of the double-bounded random variable x. a and b are the shape parameters; distributions such as uniform, triangular and Gaussian can be obtained by using different values of a and b. A truncated Gaussian shape is used in this work with $\pm 3\sigma$ variation as the design spread around the nominal design $x^n$. Therefore,

$$x^{ub} = x^n + 3\sigma_x , \ x^{lb} = x^n - 3\sigma_x , \ t = x^{ub} - x^{lb} = 6\sigma_x. \tag{9}$$

In (9), t represents the maximum spread on the design parameter x. The closed form DB-CDF can be obtained by integrating f(z) and is given by:

$$F(z) = 1 - (1 - z^{a})^{b}. \tag{10}$$
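The relation between (8) and (10) can be checked numerically with a short Python sketch. The shape parameters a = 2, b = 2 used below are placeholders chosen for the illustration; the values that give the truncated Gaussian shape used in this work are not stated in this section. The helper `normalize` applies the bounds of (9), and `integrate_pdf` is a simple midpoint-rule integrator introduced only for the check.

```python
def kumaraswamy_pdf(z, a, b):
    """Double-bounded pdf f(z) of Eq. (8), for 0 <= z <= 1."""
    return a * b * z ** (a - 1) * (1 - z ** a) ** (b - 1)


def kumaraswamy_cdf(z, a, b):
    """Closed-form DB-CDF F(z) of Eq. (10)."""
    return 1.0 - (1.0 - z ** a) ** b


def normalize(x, x_n, sigma_x):
    """Map x to z in [0, 1] using the +/- 3 sigma bounds of Eq. (9)."""
    x_lb = x_n - 3 * sigma_x
    x_ub = x_n + 3 * sigma_x
    return (x - x_lb) / (x_ub - x_lb)


def integrate_pdf(z_end, a, b, steps=10000):
    """Midpoint-rule integral of the pdf from 0 to z_end."""
    h = z_end / steps
    return h * sum(kumaraswamy_pdf((k + 0.5) * h, a, b) for k in range(steps))


A, B = 2.0, 2.0  # placeholder shape parameters, not the values used in this work
z = normalize(50.0, x_n=50.0, sigma_x=1.0)  # nominal value maps to z = 0.5
print(f"z = {z}")
print(f"closed-form F(z)   = {kumaraswamy_cdf(z, A, B):.6f}")
print(f"numerical integral = {integrate_pdf(z, A, B):.6f}")
```

The benefit of the closed form is that each CDF evaluation is a single arithmetic expression, so the yield can be evaluated inside an optimization loop without quadrature.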

Due to the symmetrical nature of the distribution of the design parameters, the final optimized design solution is the center of the yield box and is computed as

$$x^n = \frac{x^l + x^h}{2}. \tag{11}$$

Therefore, by using DB-CDF and (7), (10) and (11),

$$\begin{aligned} Yield(x^{l}, x^{h}) &= \prod_{i=1}^{6} \left( F(z_{i}^{h}) - F(z_{i}^{l}) \right) \\ &= \prod_{i=1}^{6} \left( F\left(\frac{x_{i}^{h} - x_{i}^{lb}}{x_{i}^{ub} - x_{i}^{lb}}\right) - F\left(\frac{x_{i}^{l} - x_{i}^{lb}}{x_{i}^{ub} - x_{i}^{lb}}\right) \right) \\ &= \prod_{i=1}^{6} \left( F\left(\frac{x_{i}^{h} - (x_{i}^{n} - 0.5t)}{t}\right) - F\left(\frac{x_{i}^{l} - (x_{i}^{n} - 0.5t)}{t}\right) \right) \\ &= \prod_{i=1}^{6} \left( F\left(\frac{x_{i}^{h} - x_{i}^{l} + t}{2t}\right) - F\left(\frac{x_{i}^{l} - x_{i}^{h} + t}{2t}\right) \right). \end{aligned} \tag{12}$$

Equation (12) gives the probability of finding a design solution in the six-dimensional yield box, given the probability distributions of the widths and lengths of the transistors. Maximizing this probability maximizes the yield.
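The last line of (12) can be sketched in a few lines of Python. This is a minimal illustration rather than the optimization code used in this work: the function name `box_yield`, the shape parameters and the numerical box coordinates are assumptions, and the range check mirrors the conditions stated in (13) and (14).

```python
def db_cdf(z, a, b):
    """Closed-form double-bounded CDF F(z) of (10)."""
    return 1.0 - (1.0 - z ** a) ** b


def box_yield(x_low, x_high, sigmas, a, b):
    """Yield of (12): product over the design parameters of F(z_h) - F(z_l).

    x_low, x_high: yield-box coordinates x^l and x^h, one entry per parameter.
    sigmas: standard deviation of each parameter, so that t = 6 * sigma as in (9).
    """
    total = 1.0
    for xl, xh, sigma in zip(x_low, x_high, sigmas):
        t = 6.0 * sigma
        if not (xl < xh and xh - xl <= t):
            raise ValueError("yield box must satisfy x_l < x_h and x_h - x_l <= t")
        z_h = (xh - xl + t) / (2.0 * t)
        z_l = (xl - xh + t) / (2.0 * t)
        total *= db_cdf(z_h, a, b) - db_cdf(z_l, a, b)
    return total


# Assumed, illustrative numbers: six parameters, each with sigma = 5 (t = 30).
sigmas = [5.0] * 6
full_box = box_yield([80.0] * 6, [110.0] * 6, sigmas, a=1.0, b=1.0)
half_box = box_yield([87.5] * 6, [102.5] * 6, sigmas, a=1.0, b=1.0)
print(full_box, half_box)
```

When the yield box coincides with the tolerance box, every factor equals $F(1) - F(0) = 1$, so the yield is 1; shrinking the box in any dimension lowers the product, which is why the optimizer tries to widen it.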

Equation (12) expresses the yield as a function of $x^{l}$ and $x^h$, the coordinates of the yield box. Our intent is to widen the yield box so that it approaches the tolerance box, while keeping the yield box entirely within the feasible region. To achieve this, as a first-order condition, it is sufficient to check for constraint violations at the extreme corners of the yield box, which are given by $\{x^l, x^h\}$. For example, in two dimensions there are $2^2 = 4$ corner points for the yield box, as illustrated in Fig. 7 (b). Here, $d_0$ and $m_0$ are possible nominal design solutions (centres of their yield boxes), and $d_1$ - $d_4$ are the corners of the yield box around $d_0$. In six dimensions, for each choice of the nominal design, it is required to simulate at $2^6 = 64$ corners for each design constraint to ensure that the yield box lies in the feasible region. However, this number can be reduced significantly by applying design understanding. For example, the SNM improves as the driver transistor is made stronger, but degrades as the strength of the access transistor increases, as shown in Fig. 7 (a). Clearly, in 2-D the SNM needs to be checked only at $\{W_{drv}^l, W_{ax}^h\}$, as depicted in Fig. 7 (b); at the other three corners, the SNM can only be better. Using this approach, the number of constraint evaluation corners can be minimized. By using constraint minimization in six dimensions, the total number of constraint evaluations for every choice of the nominal design is reduced to six. Finally, the following additional constraints are added.

$$x^l < x^h \tag{13}$$

and

$$x^h - x^l \le t. \tag{14}$$
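The corner-reduction argument can be illustrated with a short Python sketch. It enumerates all corners of a yield box and, for a metric that is monotonic in each parameter, picks the single worst-case corner directly. The box coordinates and the `toy_snm` function are hypothetical stand-ins for circuit simulation, not device models from this work.

```python
from itertools import product


def all_corners(x_low, x_high):
    """Generate every corner of the yield box: 2**n corners for n parameters."""
    names = list(x_low)
    for picks in product((0, 1), repeat=len(names)):
        yield {n: (x_high[n] if p else x_low[n]) for n, p in zip(names, picks)}


def worst_corner(x_low, x_high, improves_with):
    """Corner where a monotonic metric is lowest.

    improves_with[n] is +1 if the metric rises with parameter n, -1 if it falls.
    """
    return {n: (x_low[n] if improves_with[n] > 0 else x_high[n]) for n in x_low}


def toy_snm(p):
    """Hypothetical monotonic stand-in for an SNM simulation (mV)."""
    return 100.0 + 0.3 * p["W_drv"] - 0.2 * p["W_ax"]


# Assumed 2-D yield box (nm), mirroring the W_drv / W_ax example in the text.
low_2d = {"W_drv": 140.0, "W_ax": 165.0}
high_2d = {"W_drv": 158.0, "W_ax": 183.0}
snm_trend = {"W_drv": +1, "W_ax": -1}  # SNM improves with W_drv, degrades with W_ax

corner = worst_corner(low_2d, high_2d, snm_trend)
brute_force_min = min(toy_snm(c) for c in all_corners(low_2d, high_2d))

# Corner count in six dimensions (hypothetical bounds, only the count matters).
names_6d = ("W_drv", "W_ax", "W_ld", "L_drv", "L_ax", "L_ld")
low_6d = {n: 0.0 for n in names_6d}
high_6d = {n: 1.0 for n in names_6d}
num_corners_6d = sum(1 for _ in all_corners(low_6d, high_6d))

print(corner, brute_force_min, num_corners_6d)
```

For a monotonic metric, the selected corner gives the same value as the brute-force minimum over all corners, so one simulation per constraint replaces an exhaustive corner sweep.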

Consider Fig. 7 (b) again. For the chosen nominal design  $d_0$ , it is considered sufficient to evaluate  $SNM_{avg}$  at  $d_2$ , because the SNM is worst at this point.  $d_2$  and all other design points on or within the yield box occur because of the inter-die variation in the transistor dimensions. Since  $\sigma_{SNM}$  also varies with the transistor dimensions, it is not clear whether the constraint in (5) can be checked only at  $d_2$ . To check this, 15 000 points are generated with variation in the transistor dimensions around a nominal design. Around a few of these design variants, a population of 5 000 points with random  $V_{th}$  variation in all six transistors is generated. The  $\sigma_{SNM}$  value obtained from SNM simulations at these 5 000 points matches very well with the  $\sigma_{SNM}$  computed using (2) and (4). Therefore, equations (2) and (4) can be reliably used to evaluate  $\sigma_{SNM}$  at each of the 15 000 design variants. The results, shown in Fig. 8, indicate that 99.3% of the points have the SNM sigma within  $\pm 1mV$ of the nominal  $\sigma_{SNM}$  value. This implies that for the inter-die variants around  $d_0$  ( $d_1$ ,  $d_2$  and so on),  $\sigma_{SNM}$  can be assumed to be the same, but if a different nominal design such as  $m_0$ is chosen (Fig. 7 (b)), then  $\sigma_{SNM}$  can change appreciably and should be re-evaluated.

Fig. 7. (a) SNM variation with  $W_{drv}$  and  $W_{ax}$  (b) Constraint Minimization



Fig. 8. CDF plot of deviation in  $\sigma_{SNM}$  across dies (x-axis: deviation of  $\sigma_{SNM}$  from the nominal value, in mV).  $\sigma_{SNM}$  is a measure of the within-die variation. It is observed that nearly 99.3% of the dies have their  $\sigma_{SNM}$  within  $\pm 1mV$  of the nominal value.

In other words, the within-die variation ( $\sigma_{SNM}$ ) remains the same for all dies. Thus, the constraint for within-die variation, in (5), can be evaluated only at the die which constitutes the global worst-case corner (worst  $SNM_{avg}$ ) such as the point  $d_2$  in Fig.7 (b) in 2-D. The same has been observed in [35].


#### C. Final Optimization problem

Fig. 9 shows the bit-cell design procedure in detail, with all the inputs, the objective function (yield), the design constraints, and the steps to compute the design constraints.



Fig. 9. The proposed bit-cell design procedure

A Sequential Quadratic Programming based optimizer [29] is used to solve the constrained optimization problem in six dimensions. A BSIM4 based simulator with predictive 45nm technology models [30],[31] has been used. However, the method proposed in this work can be applied to any technology.

## IV. RESULTS AND DISCUSSION

HSPICE template files were prepared to simulate the bit-cell for the SNM, Vtrip, read current and leakage. The optimization engine dynamically provides the lengths and widths of the load, driver and access transistors to these templates for simulation to obtain updated values for the design constraints. This continues until the optimizer arrives at an optimal set of sizes for the bit-cell transistors. The following set-up has been used in this work:

1)  $Area_{max} = 0.63 \mu m^2$ , and  $\sigma_{Vth0} = 55 mV$  [11]. Since different design constraints are evaluated at different worst-case global corners, the designer can apply a different value of  $\sigma_{Vth0}$  for different constraints.

2) SNM, Vtrip, read current and leakage cause failure only on one side of the statistical variation. Therefore, 1.35 bit-cells per 1000 fall outside the  $\pm 3\sigma$  range on the failing side [13]. The chosen  $N_{\sigma} = 4.763$  corresponds, mathematically, to a single bit-cell failure in an array of  $1024 \times 1024$ . Typically, the largest size of an embedded SRAM block is 256 Kbits [33]; for larger blocks, performance begins to degrade. Therefore, the chosen value of  $N_{\sigma}$  covers a wide range of embedded memory sizes. With redundancy, the designer can reduce the value of  $N_{\sigma}$ .

3) The transistor dimension is altered in steps of 1nm. This is the step size available for altering the layout geometries in 45nm technology. The ranges for transistor width and length are set to 90nm-500nm and 45nm-80nm, respectively.

4) Usually, the required residuals for the SNM and Vtrip are set to 0 and the designers attempt to obtain a value of 4 to 5 for  $N_{\sigma}$  [11]. However, in this work, the SNM and Vtrip residuals are selected as 15mV and 25mV, respectively, to set a higher reliability margin. Note that these are the desired bounds for the worst-case voltage-temperature corners, as in Table I.
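The numbers quoted in item 2 of the set-up can be reproduced with a short Python sketch, assuming a Gaussian distribution for each constraint and counting failures on one tail only. The function names are illustrative and not part of the proposed flow.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit standard deviation


def one_sided_tail(n_sigma):
    """Fraction of cells beyond n_sigma on the failing side only."""
    return 1.0 - STD_NORMAL.cdf(n_sigma)


def n_sigma_for_array(num_cells, allowed_failures=1.0):
    """Sigma multiple at which about `allowed_failures` cells out of
    `num_cells` fall in the one-sided failing tail."""
    tail = allowed_failures / num_cells
    return STD_NORMAL.inv_cdf(1.0 - tail)


fails_per_1000_at_3_sigma = 1000.0 * one_sided_tail(3.0)
n_sigma_1m = n_sigma_for_array(1024 * 1024)
n_sigma_256k = n_sigma_for_array(256 * 1024)

print(round(fails_per_1000_at_3_sigma, 2), round(n_sigma_1m, 3), round(n_sigma_256k, 3))
```

The first two values agree with the 1.35 per 1000 and $N_{\sigma} = 4.763$ figures in the text, and the smaller requirement for a 256 Kbit block shows why the chosen $N_{\sigma}$ leaves margin for typical embedded block sizes.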

#### A. General purpose-high performance design

With the previous fixed inputs, different performance (read current) and leakage targets are used to design a general purpose bit-cell on the high performance-moderate leakage side. The results obtained in Table II and the associated trends are analysed to understand how the proposed method works. For a maximum bit-cell leakage target of 60nA (the conventional leakage values mentioned in [36]), the read current residual is increased from  $30\mu A$  to  $45\mu A$ . The corresponding transistor sizes and bit-cell areas are shown in columns 2 and 3 of Table II, respectively. As explained in Section III-A,  $N_{\sigma}$  is a measure of the maximum intra-die (within-die) variation that the design can tolerate. This number, measured at the worst-case inter-die variant (global corner), is used as a typical figure of merit in the industry for the electrical yield (e.g., SNM yield, Vtrip yield). The output  $N_{\sigma}$  values for the respective design constraints are shown in columns 4 to 8. The yield, mentioned in the last column, is obtained by running Monte-Carlo simulations with 10 000 points, where Gaussian variation is applied to the widths, lengths and threshold voltages of all six transistors. The yield is equal to the percentage of bit-cells that satisfy the desired bounds for all the design constraints.
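The Monte-Carlo yield definition above (percentage of sampled bit-cells that satisfy every constraint) can be sketched as follows. This is only an outline of the counting procedure: in this work each sample is evaluated by circuit simulation, whereas the constraint checks, parameter names and sigma values below are hypothetical placeholders.

```python
import random


def monte_carlo_yield(nominal, sigmas, constraints, n_points=10_000, seed=0):
    """Percentage of Gaussian-perturbed samples that pass all constraints.

    nominal, sigmas: dicts keyed by parameter name.
    constraints: list of functions sample -> bool (True means the bound is met).
    """
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_points):
        sample = {k: rng.gauss(nominal[k], sigmas[k]) for k in nominal}
        if all(check(sample) for check in constraints):
            passed += 1
    return 100.0 * passed / n_points


# Hypothetical parameters and sigmas, used only to exercise the function.
nominal = {"W_drv": 150.0, "W_ax": 170.0, "Vth_drv": 0.40}
sigmas = {"W_drv": 3.0, "W_ax": 3.0, "Vth_drv": 0.03}

always_pass = monte_carlo_yield(nominal, sigmas, [lambda s: True])
half_pass = monte_carlo_yield(nominal, sigmas, [lambda s: s["W_drv"] >= 150.0])
print(always_pass, half_pass)
```

A bound placed exactly at the mean of one parameter passes roughly half of the samples, which is a convenient check that the sampling and counting behave as expected before real simulations are plugged in.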

Several interesting observations can be made from these results. Fig. 10 (a) shows the read current and leakage values at the obtained nominal designs. Any gain in performance is accompanied by an increase in the nominal leakage. The initial target of  $Iread_{residual} = 30\mu A$  is relaxed and is achieved with a small area. This target allows for a low nominal leakage value and the  $Ileak_{max} = 60nA$  bound is not violated for up to 14.536 sigmas (column 5 of Table II) of within-die variation due to RDF at the worst-case global corner. However, as the read current residual target is increased, the nominal bit-cell leakage also increases (approaches  $Ileak_{max}$ ) and fewer sigmas of within-die leakage variation can be tolerated by the design. A similar trend is observed for the read current  $N_{\sigma}$  values (column 4). For a small  $Iread_{residual}$ , the chosen nominal design can afford to have both the  $Iread_{nominal}$ and  $Ileak_{nominal}$  at a safe  $N_{\sigma}$  distance from their respective bounds. However, with an increase in the  $Iread_{residual}$ , the  $Iread_{nominal}$  has to be designed to be closer to  $Iread_{residual}$ , so that the  $Ileak_{nominal}$  value does not increase too much. Therefore, the  $N_{\sigma}$  constraint value for the read current also reduces progressively. The  $\sigma_{Iread}$  values for the 4 cases in Table II are as follows: 5.25, 5.18, 5.42 and 5.77  $\mu A$ .

TABLE II. GENERAL PURPOSE-HIGH PERFORMANCE DESIGN, OPTIMIZATION RESULTS FOR VARYING READ CURRENT BOUNDS WITH  $Ileak_{max} = 60nA$ .

Fig. 10. For varying read current residuals: (a) nominal read current ( $\mu A$ ) and bit-cell leakage (nA) (b) cell ratio and  $W_{ld}/L_{ld}$ , and (c) nominal SNM (mV)

Fig. 11. Variation of  $\sigma_{Vth}$  of driver, access and load transistors (normalized) with increasing read current requirement

Fig. 10 (b) shows that the cell ratio  $\beta$  reduces with the increasing read current residuals. A higher read current can be achieved by sizing up the driver and the access transistor. But driver transistors contribute heavily to total leakage. As a result, the optimizer increases the strength of the access transistor more than that of the driver. Another reason for this is the chosen layout topology, in which driver and access are parallel to each other. Therefore, for a chosen large driver, the width of the access transistor can be increased to some extent without impacting bit-cell area (see (1) for x1 and x2).


It is expected that reducing the cell ratio would have a detrimental effect on the nominal SNM value. However, Fig.10 (c) indicates that the SNM for the obtained design solutions degrades by only a few mV with the falling  $\beta$ . The plot of W/L of the load transistor in Fig.10 (b) explains this observation. A stronger PMOS not only results in a stronger "1" at VR, but also increases the gate drive of M1 for a stronger "0". This improves SNM as well as read current.

TABLE III. LOW LEAKAGE BIT-CELL DESIGN RESULTS,  $Iread_{Residual} = 10 \mu A$ ,  $Ileak_{max} = 25 nA$ .

| Parameter                           | Value         |
|-------------------------------------|---------------|
| $W_{drv}, W_{ax}, W_{ld}$ (nm)      | 149, 174, 175 |
| $L_{drv}, L_{ax}, L_{ld}$ (nm)      | 75, 79, 54    |
| Area ( $\mu m^2$ )                  | 0.4405        |
| Yield (%)                           | 96.245        |

TABLE IV. LOW LEAKAGE BIT-CELL DESIGN RESULTS,  $Iread_{Residual} = 10 \mu A$ ,  $Ileak_{max} = 25 nA$ .

| Design Constr. | Nominal Value | $N_{\sigma}$ |
|----------------|---------------|--------------|
| Ileak          | 15 nA         | 4.7773       |
| Iread          | 22.021 µA     | 4.9708       |
| SNM(LV)        | 152.2 mV      | 4.8280       |
| SNM(HV)        | 177.2 mV      | 4.9351       |
| Vtrip          | 226.7 mV      | 6.6629       |

Fig. 11 shows the variation in  $\sigma_{Vth}$  of the bit-cell transistors. Because of the smaller channel area of the load transistor,  $\sigma_{Vth}$  of the load has a much higher absolute value than that of the driver or access transistors. Therefore, even though the SNM sensitivity to the  $V_{th}$  variation in the load transistor is relatively small, a reduction in the  $\sigma_{Vth}$  of the load helps to reduce  $\sigma_{SNM}$  (from (4)). Hence, increasing the strength of the load transistor helps to mitigate the decline in the average SNM value as well as the SNM  $N_{\sigma}$  (column 6). These results indicate that the proposed problem formulation provides robust design solutions.

Fig. 12. Yield and average values of SNM, Vtrip, read current and bit-cell leakage obtained by Monte-Carlo simulations for the low leakage bit-cell. In every figure, only one of the bit-cell dimensions is varied while the other dimensions are kept equal to the obtained optimal value.

#### B. General purpose-low leakage design

A general purpose bit-cell on the moderate performance-low leakage side is also designed, and Tables III and IV summarise the results. The resultant transistor sizes defy the conventional sizing strategy for the bit-cell. The low leakage requirement  $(Ileak_{max} = 25nA)$  drives the optimizer to reduce the size of the driver significantly. This, however, considerably increases  $\sigma_{Vth}$  of the driver transistor and thus  $\sigma_{SNM}$ , which can potentially degrade the SNM yield unless the average SNM value is also raised. Consequently, the load transistor is sized to be the largest to achieve a nominal SNM value of 152.2mV (Table IV), which is larger than those depicted in Fig. 10 (c). It is difficult to analyse all the trade-offs and manually arrive at these transistor sizes. These results establish the benefits of the proposed method.

A generic formulation of the bit-cell design optimization problem is presented here. To accommodate specific foundry guidelines that minimize layout-induced variations, different variables and additional constraints can be introduced. For example, for the horizontal poly-silicon of the driver and load transistor (Fig. 1 (b)), the designer can keep  $L_{drv} = L_{ld} = L_{inv}$  in order to avoid the awkward L-shaped poly that results from different lengths for the driver and the load. Such modifications in the problem formulation would depend, to a great extent, on the chosen bit-cell layout topology and the extent of the available advanced lithography correction mechanisms.

Fig. 12 shows the variation of the Monte-Carlo yield in the neighborhood of the obtained low leakage design solution. Here, snml and snmh refer to the SNM at low voltage, high temperature and high voltage, high temperature conditions, respectively, as per Table I. Running the Monte-Carlo simulations to explore the entire search space in order to find the globally maximum yield point is infeasible. Therefore, Fig. 12 is used to verify that the proposed optimization approach leads to, at least, a locally maximum yield. In each panel of Fig. 12, one design parameter is varied, whereas the others are kept constant at the values mentioned in Table III. For example, Fig. 12 (a) shows the Monte-Carlo yield and the average values of the design constraints when  $W_{drv}$  is varied from 143nm to 155nm, while  $W_{ax} = 174nm, W_{ld} = 175nm, L_{drv} = 75nm, L_{ax} = 79nm$  and  $L_{ld} = 54nm$ . It is evident that the yield degrades as the design point is moved away from the obtained optimum.
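The one-parameter-at-a-time check described above can be outlined in Python. The nominal sizes are those of Table III, but `toy_yield` is a hypothetical smooth surrogate standing in for the Monte-Carlo yield evaluation; it does not reproduce the simulated values in Fig. 12.

```python
# Nominal low leakage design from Table III (all dimensions in nm).
OPTIMUM = {"W_drv": 149, "W_ax": 174, "W_ld": 175,
           "L_drv": 75, "L_ax": 79, "L_ld": 54}


def toy_yield(point):
    """Hypothetical surrogate peaked at OPTIMUM (normalized score, not data)."""
    return 1.0 - 0.001 * sum((point[k] - OPTIMUM[k]) ** 2 for k in OPTIMUM)


def one_at_a_time_sweep(optimum, name, offsets, yield_fn):
    """Vary one parameter around the optimum while holding the others fixed."""
    results = []
    for delta in offsets:
        point = dict(optimum)
        point[name] = optimum[name] + delta
        results.append((point[name], yield_fn(point)))
    return results


# Sweep W_drv from 143 nm to 155 nm in 1 nm steps, as in Fig. 12 (a).
sweep = one_at_a_time_sweep(OPTIMUM, "W_drv", range(-6, 7), toy_yield)
best_value, best_score = max(sweep, key=lambda item: item[1])
print(best_value, best_score)
```

If the optimizer has found at least a local maximum, the best point of each such sweep coincides with the nominal value of the swept parameter; repeating the sweep for all six parameters gives the panels of Fig. 12.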

## V. CONCLUSION

In this paper, a statistical method to design the SRAM bit-cell is proposed. The method accounts for the manufacturing variability in the transistor dimensions, and the intrinsic  $V_{th}$ variations due to RDF. In addition, the widths and lengths of the transistors are chosen so as to satisfy the constraints of SNM, write trip voltage, read current, leakage and area. The developed method is flexible, involves a small infrastructure in terms of mathematical computations, and uses models and tools that are readily available in the industry, so that the extent of approximation in the proposed method is small. Robust bit-cell designs for high performance-moderate leakage and moderate performance-low leakage have been developed and analysed to demonstrate the working of the proposed method. The method can be extended to include the impact of other sources of variability such as oxide thickness fluctuation and line-edge roughness. To do so, it is necessary to evaluate how much these other sources impact the threshold voltage of the transistor and how they can be modeled mathematically.

## ACKNOWLEDGMENT

The authors would like to thank the Associate Editor and the reviewers who helped improve the quality of this paper. The authors would also like to thank Prof. P. Kumaraswamy for providing the preliminary optimization code.

## REFERENCES

- [1] B. Cheng, S. Roy, A. Asenov, "The Impact of Random Doping Effects on CMOS SRAM Cell," *ESSCIRC*, 2004, pp. 219-222.
- [2] A. B. Kahng, Y. C. Pati, "Sub wavelength Lithography and its potential impact on Design and EDA," *DAC*, 1999, pp. 799-804.
- [3] Mukong Choi, Linda Milor, "Impact on Circuit Performance of Deterministic Within-Die Variation in Nanoscale Semiconductor Manufacturing," *TCAD*, 2005, pp. 1350-1366.
- [4] K. Bernstein et al., "High Performance CMOS variability in the 65nm regime and beyond," *IBM Journal of Research and Development*, Sept. 2006, pp. 433-449.
- [5] Asen Asenov, "Random dopant induced threshold voltage lowering and fluctuations in sub 50nm MOSFETs: A statistical 3D atomistic simulation study," *Nanotechnology*, 1999, pp. 153-158.
- [6] Y. Taur and T. Ning, *Fundamentals of Modern VLSI Devices*, Cambridge, U.K.: CUP, 1998.
- [7] A. Keshwarzi, G. Schrom, "Measurements and Modeling of Intrinsic Fluctuations in MOSFET Threshold Voltage," *ISLPED*, 2005, pp. 26-29.
- [8] X. Tang, V. De, J. Meindl, "Intrinsic MOSFET Parameter Fluctuations due to Random Dopant Placement," *IEEE Trans. VLSI*, Dec. 1997, pp. 369-375.
- [9] D. Burnett, K. Erington et al., "Implications of Fundamental Threshold Voltage Variations for High Density SRAM and Logic Circuits," *VLSI Symp. Dig.*, 1994, pp. 15-16.
- [10] A. Bhavnagarwala et al., "The Impact of Intrinsic Device Fluctuations on CMOS SRAM Cell Stability," *IEEE JSSC*, April 2001, pp. 658-665.
- [11] F. Tachibana, T. Hiramoto, "Re-examination of Impact of Intrinsic Dopant Fluctuations on SRAM Static Noise Margin," *Japanese Journal of Applied Physics*, 2005, pp. 2147-2151.
- [12] S. Mukhopadhyay, H. Mahmoodi, Kaushik Roy, "Modeling of Failure Probability and Statistical Design of SRAM Array for Yield Enhancement in Nanoscaled CMOS," *TCAD*, 2005, pp. 1859-1879.
- [13] R. Heald, P. Wang, "Variability in Sub-100nm SRAM Designs," *IEEE/ACM ICCAD*, 2004, pp. 347-351.
- [14] D. Redwine, "SRAM cell with independent static noise margin, trip voltage and read current optimization," U.S. Patents, 2005.
- [15] K. Zhang, U. Bhattacharya et al., "SRAM Design on 65-nm CMOS technology with Dynamic Sleep Transistor for Leakage Reduction," *IEEE JSSC*, April 2005, pp. 895-900.
- [16] E. Seevinck et al., "Static Noise Margin Analysis of MOS SRAM Cells," *IEEE JSSC*, Oct. 1987, pp. 748-754.
- [17] Jan Lohstroh, E. Seevinck, "Worst-Case Static Noise Margin Criteria for Logic Circuits and their Mathematical Equivalence," *IEEE JSSC*, Dec. 1983, pp. 803-806.
- [18] R. W. Mann et al., "Ultralow-power SRAM technology," *IBM Journal of Research and Development*, Sept. 2003, pp. 553-563.
- [19] F. Boeuf et al., "0.248um2 and 0.334um2 Conventional Bulk 6T-SRAM bit-cells for 45nm node Low Cost - general Purpose Applications," *Sym. On VLSI Tech. Digest of Tech. Papers*, 2005, pp. 130-131.
- [20] M. Lavin et al., "Backend CAD Flows for Restrictive Design Rules," *IEEE*, 2004, pp. 739-746.
- [21] L. Liebmann, A. Barish et al., "High-performance circuit design for the RET-enabled 65-nm technology node," *Proc. SPIE Vol. 5379*, May 2004, pp. 20.
- [22] P. Gelsinger, Keynote address to 41st DAC, 2004.
- [23] Online reference, http://www.pimrc2006.org/Buss\_sLocosto.pdf.
- [24] A. Srivastava, D. Sylvester, "Statistical Optimization of Leakage Power considering process variations using Dual-Vth and sizing," *DAC*, Jun. 2004, pp. 773-778.
- [25] S. Director, P. Feldmann, "Optimization of Parametric Yield: A tutorial," *IEEE Custom Integrated Circuits Conference*, 1992, pp. 3.1.1-3.1.8.
- [26] P. Kumaraswamy, "A generalized probability density function for double-bounded random processes," *Journal of Hydrology* (46), 1980, pp. 79-88.
- [27] A. Seifi, P. Kumaraswamy et al., "A Unified Approach to Statistical Design Centering of Integrated Circuits with Correlated Parameters," *IEEE Trans. Circuits and Systems*, Jan. 1999, pp. 190-196.
- [28] Predictive Technology Models for 45nm, http://www.eas.asu.edu/ ptm.
- [29] T. F. Coleman, Y. Zhang, *Optimization Toolbox for use with Matlab*, The MathWorks Inc., 2005.
- [30] HSPICE Manual.
- [31] M. Dunga et al., BSIM4.6 MOSFET model, University of California, Berkeley, http://www-device.eecs.berkeley.edu/ bsim3/bsim4.html.
- [32] International Technology Roadmap for Semiconductors, http://www.itrs.net/, 2006.
- [33] J. M. Rabaey, A. Chandrakasan, *Digital Integrated Circuits, a design perspective*, Prentice Hall, 2003, pp. 628-630.
- [34] Y. Tsukamoto et al., "Worst-case Analysis to Obtain Stable Read/Write DC Margin of High-Density 6T SRAM Array with Local Vth Variability," *International Conference on Computer-Aided-Design*, 2005, pp. 398-405.
- [35] B. H. Calhoun and A. Chandrakasan, "Analysing Static Noise Margin for Sub-threshold SRAM in 65nm CMOS," *ESSCIRC*, 2005.
- [36] A. Goel and B. Mazhari, "Gate Leakage and its Reduction in Deep-Submicron SRAM," *Intl. Conf. VLSI Design, VLSID*, 2005, pp. 606-611.
- [37] J. Jaffari and M. Anis, "Variability Aware Device Optimization under Ion and leakage current constraints," *ISLPED*, 2006, pp. 119-122.
- [38] R. Venkatraman and R. Castagnetti, "The Design, Analysis and Development of Highly Manufacturable 6-T SRAM Bitcells for SoC applications," *IEEE Trans. Electron Devices*, 2005, pp. 218-224.
- [39] M. Craig and J. Petersen, "Robust Methodology for State of the Art Embedded SRAM BitCell Design," *Proc. SPIE*, vol. 4692, 2002, pp. 380-389.
- [40] A. Papoulis, *Probability, Random Variables and Stochastic Process*, New York, McGraw Hill, Third Edition.
- [41] H. Nho, S. S. Yoon et al., "Numerical Estimation of Yield in Sub-100-nm SRAM Design Using Monte Carlo Simulation," *IEEE TCAS*, vol. 55, 2008, pp. 907-911.
- [42] R. E. Aly and M. A. Bayoumi, "Low-Power Cache Design Using 7T SRAM Cell," *IEEE TCAS*, vol. 54, 2007, pp. 318-322.
- [43] S. Cerveny et al., "Locally switched and limited source-body bias and other leakage reduction techniques for a low-power embedded SRAM," *IEEE TCAS*, vol. 52, 2005, pp. 636-640.



**Vasudha Gupta** received the B.E. degree (with honors) in electronics and communication engineering from University of Delhi, Delhi, India, in 2002, and the MASc. degree in electrical and computer engineering from the University of Waterloo, Waterloo, ON, Canada, in 2008. She worked as a Senior Design Engineer with Texas Instruments, Bangalore, India from 2002 to 2006, to develop embedded single and multi-port memories, and memory flows. Her research interests include memory design, statistical design methodologies and OLED displays.



**Mohab Anis** (S'98-M'03) received the B.Sc. degree (with honors) in electronics and communication engineering from Cairo University, Cairo, Egypt, in 1997 and the M.A.Sc. and Ph.D. degrees in electrical engineering from the University of Waterloo, Waterloo, ON, Canada, in 1999 and 2003, respectively. He is currently an Assistant Professor and the Codirector of the VLSI Research Group, Department of Electrical and Computer Engineering, University of Waterloo. He has authored/coauthored over 60 papers in international journals and conferences and is the author of the book Multi-Threshold CMOS Digital Circuits: Managing Leakage Power (Kluwer, 2003). His research interests include integrated circuit design and design automation for VLSI systems in the deep submicrometer regime. He is the Cofounder of Spry Design Automation. He is an Associate Editor of the Journal of Circuits, Systems and Computers, ASP Journal of Low Power Electronics, and VLSI Design. Dr. Anis is an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II. He is also a member of the program committee for several IEEE conferences. He was awarded the 2004 Douglas R. Colton Medal for Research Excellence in recognition of excellence in research leading to new understanding and novel developments in Microsystems in Canada and the 2002 International Low-Power Design Contest.