# Impact of BTI and HCI on the reliability of a Majority Voter

A. Santana-Andreo\*, E. Roca, R. Castro-Lopez and F.V. Fernandez

Instituto de Microelectrónica de Sevilla (IMSE), Universidad de Sevilla - CSIC, Seville, Spain

\*Corresponding author: santana@imse-cnm.csic.es

Abstract—Triple Modular Redundancy is a commonly used hardware technique in mission- and safety-critical systems to ensure reliability. Although a simple circuit, the majority voter can be the weak link in this system and different designs have been proposed to increase its robustness to single event effects and permanent faults. However, no study has been performed to analyze the effect of Bias Temperature Instability (BTI) and Hot Carrier Injection (HCI) on a majority voter, which can lead to timing failures or exacerbate other failure mechanisms like single-event upsets (SEUs). This work uses a state-of-the-art aging simulator to estimate the effects of aging on a majority voter.

## Keywords—Majority voter, aging, BTI, HCI, SEU

## I. INTRODUCTION

Circuit reliability has become a growing concern in modern nanometer-scale CMOS technologies as circuits have become more error-prone because of a variety of factors, such as the decrease in power supply voltage, the push for increased performance, the transistor size reduction or the consequently increased complexity of routing [1][2].

In a system where the temporary or permanent fault of a module can be disastrous, a common approach to ensure reliability is the use of Triple Modular Redundancy (TMR) [3]-[6]. The premise is simple but effective: the critical module is triplicated and the outputs of the three copies are combined through a majority voter, as shown in Fig. 1. A majority voter decides the output based on the Boolean majority of its inputs. Accordingly, to ensure the correct operation of the system, at least two of the three modules must be error-free. This holds, of course, only if the majority voter itself works correctly. Any error originating in the majority voter defeats the purpose of using a TMR system, so the voter must be carefully evaluated and designed to ensure that it does not become a source of errors.
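The masking property described above can be illustrated with a minimal Python sketch. The function names `majority` and `masks_single_fault` are hypothetical and chosen only for this illustration; the sketch models the Boolean behavior of the voter, not its transistor-level implementation.

```python
def majority(a, b, c):
    """Boolean majority of three bits: (a AND b) OR (b AND c) OR (a AND c)."""
    return (a & b) | (b & c) | (a & c)


def masks_single_fault(correct):
    """Return True if flipping any single module output is masked by the voter."""
    outputs = [correct] * 3
    for i in range(3):
        faulty = list(outputs)
        faulty[i] ^= 1  # module i delivers the wrong bit
        if majority(*faulty) != correct:
            return False
    return True


if __name__ == "__main__":
    for bit in (0, 1):
        print(bit, masks_single_fault(bit))
```

The check passes for both logic values because two error-free modules always outvote the faulty one. It says nothing about faults inside the voter itself, which is the case this paper is concerned with.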

Fig. 1. Triple Modular Redundancy (TMR) scheme.

This work was supported by grant PID2019-103869RB-C31 funded by MCIN/AEI/10.13039/501100011033 and by Consejería de Economía, Conocimiento, Empresas y Universidad de la Junta de Andalucía and P.O. FEDER under project US-1380876. Andrés Santana Andreo was supported by grant PRE-2020-093167 funded by MCIN/AEI/10.13039/501100011033, and by "ESF Investing in your future".

Types of failures, defined as deviations from the expected service of a component, can generally be divided into four groups [7]:

- Parametric variations, which appear at the point of production. They involve the differences between instances of nominally identical components, caused by what is known as Time Zero Variability (TZV). This variability is extensively known, characterized and modeled, and its impact can be estimated via Monte-Carlo simulations during circuit design to, for instance, implement guardbands so that the circuit meets design specifications.

- Parametric degradations, which manifest over time as a gradual drift in transistor parameters (e.g., threshold voltage, Vth) that may eventually lead to a critical failure [2]. This type of failure is caused by what is known as Time Dependent Variability (TDV). It includes phenomena such as Bias Temperature Instability (BTI) or Hot Carrier Injection (HCI). BTI comes in two types: PBTI (Positive-Bias Temperature Instability), present in NMOS transistors, and NBTI (Negative-Bias Temperature Instability), present in PMOS transistors.
- Transient faults [3]-[5], which are computational, soft errors that do not cause permanent degradation of parameters. A clear example is a single-event upset (SEU): when an energetic particle hits a sensitive area of a circuit, it may produce a voltage glitch, a so-called single-event transient (SET). To produce a SET, the particle must deposit a charge above the so-called critical charge of a node. In a combinational circuit (e.g., the majority voter) this glitch may propagate through the logic and be sampled by a memory element, causing an SEU.
- Permanent faults [3][4][6], which involve lasting, sudden damage that completely compromises the operation of a transistor. They may be caused either by defects introduced during fabrication or by phenomena such as Electromigration (EM) or Time Dependent Dielectric Breakdown (TDDB).

Both transient and permanent faults have been studied in the literature to evaluate their impact on majority voters, and design solutions have been proposed to ensure reliability [1][4][6][8][9]. However, the effects of parametric degradation, i.e., BTI and HCI, have been, to the best of our knowledge, largely ignored. The reduction in transistor size has revealed the stochastic nature of BTI and HCI, which complicates modeling their effects and may explain why they have not been considered for majority voter architectures. The increase in V<sub>th</sub> caused by aging can lead to timing failures in digital circuits, as it leads to decreased drain current and, consequently, increased circuit delay. Additionally, it has been shown that it increases the circuit sensitivity to SEUs [10], as the critical charge strongly depends on the ability of the pull-up/pull-down transistors to restore the appropriate voltage of a node.

This paper studies the impact of BTI and HCI on a majority voter. To perform this study, a state-of-the-art TDV simulator named CASE [11] is employed. The rest of the paper is structured as follows: Section II introduces CASE, Section III shows the degradation present in the majority voter under a variety of conditions, Section IV shows the impact of this degradation on circuit delay and sensitivity to SETs, and Section V presents the conclusions.

## II. CASE: A TDV RELIABILITY SIMULATOR

Although some commercial tools offer TDV simulations, they generally do not consider the stochastic nature of TDV. Most are based on deterministic models, which have been proven inaccurate in modern technology nodes. Additionally, they do not accurately consider the correlation between TZV and TDV, which may play an important role in the final parametric degradation of the devices.

CASE is a powerful alternative to commercial tools, designed to tackle these deficiencies. TDV (BTI and HCI) degradation is simulated using the Probabilistic Defect Occupancy model [12], which considers the stochastic nature of these phenomena. CASE evaluates the combined effect of TZV and TDV in a computationally efficient manner [13] and employs a size-adaptive time-step algorithm [14] to accurately update bias conditions without incurring prohibitive computational costs.

## III. TDV DEGRADATION IN A MAJORITY VOTER

## A. Majority Voter Circuit

The majority voter design employed in this work is the AO222 complex gate in a 65-nm technology node operating at 1.2 V. The gate-level and transistor-level schematics of this design are shown in Fig. 2. The first stage includes five PMOS transistors (M1-M5), which will be referred to as the *pull-up network*, and five NMOS transistors (M6-M10), the *pull-down network*. The second stage is simply an *inverter* with its corresponding PMOS (M11) and NMOS (M12) transistors.

## B. Transistor degradation under nominal conditions

Since aging degradation depends strongly on bias conditions [2], it is important to carefully consider all factors that may affect circuit operation when performing an aging simulation. Ten years of operation is commonly used as a benchmark for long-term performance estimation, and it is also used here. To consider the combined effect of TDV and TZV, 1,000 Monte-Carlo instances of the circuit are aged in 1,000 time steps. To improve the accuracy of the performance estimations, parasitics are also included by performing a parasitic extraction of the cell's layout. Naturally, the workload of the majority voter and the resulting stress ratio of the transistors play a critical role in the final degradation. For a majority voter in a TMR scheme, it can be assumed that all three modules in Fig. 1 have the same output to simplify the aging scenario [3]. Accordingly, the only input combinations considered are ABC=111 and ABC=000. All other input combinations occur only when one of the modules fails, an event that may happen at some points during the lifetime of the circuit but very rarely persists for the entire lifetime. Fig. 3 shows the mean relative degradation of each transistor for a duty cycle of 50%. As expected, the degradation in PMOS transistors is considerably larger than in NMOS transistors, as the impact of NBTI is generally stronger than the impact of PBTI at this technology node [2].

## C. Transistor degradation and duty cycle

Fig. 2. Transistor-level (a) and Gate-level (b) schematic of the complex logic gate AO222. Internal nodes are indicated in italics, while input/outputs are indicated in bold.

Fig. 3. Change in threshold voltage after 10 years of aging for a duty cycle of 50%.

Changing the duty cycle of the aging simulation clearly shows how the duty cycle drives transistor degradation, as illustrated in Fig. 4. The average degradation of pull-up and pull-down transistors is grouped for clarity. For a duty cycle of 0% (all inputs are 0), only the pull-up network and the NMOS inverter transistor are active, and they are the ones that suffer degradation. As the duty cycle rises, these transistors degrade less while the rest degrade more, until, for a duty cycle of 100%, only the pull-down network and the PMOS inverter transistor degrade. Accordingly, one must consider the expected duty cycle of the majority voter for each specific implementation to properly assess aging degradation.
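The relationship between duty cycle and stressed transistors can be summarized in a small Python sketch. It assumes that the duty cycle is the fraction of time during which all inputs are 1, and that a transistor is under static stress whenever its gate bias turns it on. The function name and dictionary keys are hypothetical, and the sketch ignores switching-related HCI stress and series-stack effects, so it only reproduces the qualitative trend.

```python
def stress_fractions(duty_cycle):
    """Fraction of time each transistor group is biased on.

    duty_cycle: fraction of time with ABC=111 (0.0 to 1.0); the rest is ABC=000.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    inputs_high = duty_cycle
    inputs_low = 1.0 - duty_cycle
    return {
        # PMOS pull-up network conducts when the inputs are 0 (node K is 1).
        "pull_up_pmos_M1_M5": inputs_low,
        # NMOS pull-down network conducts when the inputs are 1 (node K is 0).
        "pull_down_nmos_M6_M10": inputs_high,
        # Inverter PMOS conducts when K is 0, i.e. when the inputs are 1.
        "inverter_pmos_M11": inputs_high,
        # Inverter NMOS conducts when K is 1, i.e. when the inputs are 0.
        "inverter_nmos_M12": inputs_low,
    }


if __name__ == "__main__":
    for dc in (0.0, 0.5, 1.0):
        print(dc, stress_fractions(dc))
```

At the two extremes the sketch reproduces the grouping stated in the text: the pull-up network and the NMOS inverter transistor at 0%, the pull-down network and the PMOS inverter transistor at 100%.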

## D. Transistor degradation and temperature

Temperature is another relevant factor to consider, as it has a significant impact on both BTI and HCI. This impact is modeled through the well-known Arrhenius equation [15]. Since TMR systems are employed in mission- and safety-critical scenarios (e.g., in military or aerospace applications [4]), it is important to consider the effects of extreme temperatures on TDV degradation. To this end, aging simulations have been performed at different temperatures, as shown in Fig. 5. Increasing the temperature clearly results in increased degradation: the degradation at 125 °C can be up to five times the degradation at 25 °C.
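As a rough illustration of the Arrhenius dependence, the following Python sketch computes the ratio between two temperatures for a given activation energy and, conversely, the apparent activation energy implied by a given ratio. The function names are hypothetical. Treating the fivefold degradation ratio as a pure Arrhenius factor is an assumption made only for this example; the simulator's internal model parameters are not given in the text.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def celsius_to_kelvin(t_celsius):
    return t_celsius + 273.15


def arrhenius_factor(ea_ev, t_ref_c, t_hot_c):
    """Ratio exp(Ea/k * (1/T_ref - 1/T_hot)) between two temperatures."""
    t_ref = celsius_to_kelvin(t_ref_c)
    t_hot = celsius_to_kelvin(t_hot_c)
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_ref - 1.0 / t_hot))


def apparent_activation_energy(factor, t_ref_c, t_hot_c):
    """Activation energy (eV) that would produce `factor` between two temperatures."""
    t_ref = celsius_to_kelvin(t_ref_c)
    t_hot = celsius_to_kelvin(t_hot_c)
    return BOLTZMANN_EV_PER_K * math.log(factor) / (1.0 / t_ref - 1.0 / t_hot)


if __name__ == "__main__":
    ea = apparent_activation_energy(5.0, 25.0, 125.0)
    print(f"apparent Ea = {ea:.3f} eV")
    print(f"factor check = {arrhenius_factor(ea, 25.0, 125.0):.3f}")
```

Under this simplification, a fivefold ratio between 25 °C and 125 °C corresponds to an apparent activation energy of roughly 0.16 eV. This is an effective value for the combined degradation, not a parameter reported by the authors.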

## IV. IMPACT OF AGING

## A. Delay

TDV degradation has a direct impact on the delay of combinational circuits, since an increase in threshold voltage translates into a decrease in drain current and delay is inversely proportional to this current. If this effect is not considered during design, it may lead to timing violations during the lifetime of the circuit [2]. Two delays are considered here: rise delay, when the output changes from 0 to 1, and fall delay, when the output changes from 1 to 0. Rise (fall) delay increases with the degradation of the pull-down (pull-up) network and the PMOS (NMOS) inverter transistor. Fig. 6 shows the delay before and after aging for 25 °C and a duty cycle of 50%. The initial TZV distribution is clearly shifted because of TDV. Since the pull-up network degrades the most, as shown before, it is logical that fall delay degradation is larger than rise delay degradation. Fig. 7 shows that delay degradation increases significantly at high temperatures and decreases at lower temperatures. Finally, Fig. 8 shows why it is important to consider duty cycles. The worst case for fall delay at nominal temperature is a 0% duty cycle, while the worst case for rise delay is a 100% duty cycle. This is consistent with the results presented in [16]. Looking at the histograms with a negative mean delay change, it is interesting to see that rise and fall delays can be reduced by TDV degradation: when only the transistors that oppose a given transition are degraded, that transition becomes faster.
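The link between a threshold-voltage shift and delay can be approximated with a first-order sketch. The example uses the textbook alpha-power law (drain current proportional to (Vdd - Vth)^alpha), which is a model swapped in for illustration and not the simulator used in this work. The supply voltage of 1.2 V comes from the text; the fresh threshold voltage, the shift and the exponent are hypothetical placeholder values.

```python
def relative_delay_change(vdd, vth_fresh, delta_vth, alpha=1.3):
    """Fractional delay change for a threshold-voltage shift.

    Delay is taken as proportional to 1 / I_on, with I_on ~ (vdd - vth)**alpha.
    Voltages are magnitudes, so the same expression serves NMOS and PMOS.
    """
    overdrive_fresh = vdd - vth_fresh
    overdrive_aged = overdrive_fresh - delta_vth
    if overdrive_fresh <= 0 or overdrive_aged <= 0:
        raise ValueError("transistor would not conduct at this supply voltage")
    return (overdrive_fresh / overdrive_aged) ** alpha - 1.0


if __name__ == "__main__":
    # Hypothetical values: 0.4 V fresh threshold voltage, 30 mV aging shift.
    change = relative_delay_change(vdd=1.2, vth_fresh=0.4, delta_vth=0.03)
    print(f"delay change: {100 * change:.1f} %")
```

The sketch only captures the direction and rough magnitude of the effect. It cannot reproduce the delay reductions mentioned above, which arise from the interaction between the degraded and non-degraded networks during a transition.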

Another important factor to consider regarding delay is the operating conditions, namely the slew of the input signals and the load capacitance [17]. The delay at nominal conditions for this cell is given for input slews ranging from 2 ps to 560 ps and output capacitances ranging from 1 fF to 39.21 fF. Table I shows the mean change in delay for different combinations of operating conditions. It appears that rise delay degradation stays roughly the same, while fall delay degradation decreases with higher capacitance or slew. In all cases, the delay is significantly affected by TDV degradation, which may represent a significant threat to the reliability of the circuit. To counteract the change in delay, the values after degradation could be integrated into the digital design flow to ensure that the majority voter meets timing constraints [16].

## B. Sensitivity to SETs

Since TMR systems are used for applications in harsh environments where circuits are more exposed to radiation, the sensitivity of majority voters to SETs has been a topic of interest in the literature [3]-[5]. The impact of an energetic particle on a circuit node can be modeled with the double-exponential current pulse of Eq. (1) [10].

Fig. 4. Mean change in threshold voltage after 10 years of aging applying different duty cycles.

Fig. 5. Mean change in threshold voltage after 10 years of aging applying different temperatures for a duty cycle of 50%.







Fig. 7. Delay before (green) and after (red) aging degradation for 125 °C and 55 °C.



Fig. 8. Delay before (green) and after (red) aging degradation for 0% duty cycle and 100% duty cycle.

TABLE I. MEAN CHANGE IN DELAY TIME FOR DIFFERENT OPERATING CONDITIONS.

| Change in Fall/Rise Delay (%)  | Input Slew<br>2 ps | Input Slew<br>219.38 ps | Input Slew<br>560 ps |
|--------------------------------|--------------------|-------------------------|----------------------|
| Output Capacitance<br>1 fF     | 9.43/3.87          | 9.22/4.24               | 6.38/3.68            |
| Output Capacitance<br>8.70777 fF | 8.01/3.68        | 8.68/3.95               | 6.31/3.89            |
| Output Capacitance<br>39.2119 fF | 5.48/3.84        | 5.83/4.00               | 5.03/3.85            |

$$I(t) = \frac{Q}{\tau_{\alpha} - \tau_{\beta}} \left( e^{-t/\tau_{\alpha}} - e^{-t/\tau_{\beta}} \right)$$
(1)

In Eq. (1), Q is the amount of charge collected by the node, $\tau_{\alpha}$ is the collection time constant of the junction and $\tau_{\beta}$ is the ion-track establishment time constant. The transient pulse can have a positive or a negative magnitude, depending on whether the charged particle hits a PMOS or an NMOS transistor.
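The double-exponential pulse of Eq. (1) can be evaluated numerically with a short Python sketch, written with one exponential decaying with the collection time constant and the other with the ion-track establishment time constant. The function names are hypothetical. The time constants used in the example (200 ps and 50 ps) are placeholder assumptions, since the values used by the authors are not available in this text. The integration step checks that the pulse delivers a total charge equal to Q.

```python
import math


def set_current(t, q, tau_alpha, tau_beta):
    """Double-exponential SET current (A) at time t (s) for a collected charge q (C)."""
    if t < 0:
        return 0.0
    scale = q / (tau_alpha - tau_beta)
    return scale * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))


def peak_time(tau_alpha, tau_beta):
    """Time at which the pulse reaches its maximum (from dI/dt = 0)."""
    return tau_alpha * tau_beta / (tau_alpha - tau_beta) * math.log(tau_alpha / tau_beta)


def collected_charge(q, tau_alpha, tau_beta, t_end, steps=20000):
    """Trapezoidal integral of the pulse from 0 to t_end."""
    dt = t_end / steps
    total = 0.0
    previous = set_current(0.0, q, tau_alpha, tau_beta)
    for i in range(1, steps + 1):
        current = set_current(i * dt, q, tau_alpha, tau_beta)
        total += 0.5 * (previous + current) * dt
        previous = current
    return total


if __name__ == "__main__":
    q = 22.9e-15        # charge in coulombs (22.9 fC)
    tau_alpha = 200e-12  # assumed collection time constant
    tau_beta = 50e-12    # assumed ion-track establishment time constant
    total = collected_charge(q, tau_alpha, tau_beta, t_end=20 * tau_alpha)
    print(f"integrated charge / Q = {total / q:.4f}")
    print(f"peak at {peak_time(tau_alpha, tau_beta) * 1e12:.1f} ps")
```

Because the integral of the pulse equals Q regardless of the time constants, sweeping Q in a transient simulation directly sweeps the injected charge.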

It is important to consider which nodes in the circuit are relevant, depending on whether there is a logically sensitized path between the node and the output [5]. Considering the nodes in the schematic of Fig. 2 and a situation where all three inputs are the same, a SET in nodes PU1, PU2, PD1 and PD2 does not propagate to the output, since the other branch of the network is still operational and forces the correct value on node K. That leaves only node K and the output node Out as sensitive to SETs. This sensitivity can be quantified by performing simulations while gradually increasing the charge Q that strikes the circuit. The first value of Q resulting in an upset at the output is defined as the critical charge [18], and it provides an estimation of the minimum amount of energy required to produce a SET.
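The sweep described above can be sketched as a simple search loop. In this minimal Python sketch, `causes_upset` is a hypothetical callable standing in for a transient circuit simulation that injects the current pulse with a given charge and reports whether the output flips; the toy predicate in the example is a placeholder and not a circuit model.

```python
def critical_charge(causes_upset, q_start, q_stop, q_step):
    """Return the first swept charge that produces an upset, or None if none does.

    causes_upset: callable taking a charge value and returning True on an upset.
    The sweep runs from q_start to q_stop (inclusive) in increments of q_step.
    """
    if q_step <= 0:
        raise ValueError("q_step must be positive")
    n_steps = int(round((q_stop - q_start) / q_step))
    for i in range(n_steps + 1):
        q = q_start + i * q_step
        if causes_upset(q):
            return q
    return None


if __name__ == "__main__":
    # Toy stand-in for the simulator: upset whenever the charge exceeds 22.85 fC.
    toy_upset = lambda q_fc: q_fc > 22.85
    print(critical_charge(toy_upset, q_start=20.0, q_stop=40.0, q_step=0.1))
```

The resolution of the result is set by the step size, so a coarse sweep followed by a finer one around the first upset is a common refinement when each simulation is expensive.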

To evaluate the impact of TDV on this sensitivity, the critical charge is measured for the fresh netlist and after degradation under different duty cycles, as shown in Table II. When the inputs are at 0, K is 1 and can be affected by a hit on an NMOS transistor, while Out is affected by a hit on the PMOS transistor. The opposite applies when the inputs are at 1. Analyzing the results, the K node for input ABC=000 is most sensitive when the PMOS inverter transistor degrades the most, after a duty cycle of 100%, since the glitch needs to make this transistor start conducting to change the value at the output (and this would be easier when its Vth value has increased due to aging). When the input is ABC=111, the node is most sensitive when the NMOS inverter transistor degrades the most, for a duty cycle of 0%. Considering the Out node, the sensitivity depends on the degradation of the inverter transistor that is driving current: for input ABC=000, it is the NMOS, so the lowest critical charge is for a duty cycle of 0%, and for input ABC=111, it is the PMOS, so the lowest critical charge corresponds to a duty cycle of 100%.

In any case, the degradation due to TDV has a noticeable effect on the critical charge. The lowest critical charge is 22.9 fC for the fresh circuit and 21.6 fC after aging with a duty cycle of 100%, a reduction of 5.7%.
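The reduction quoted above can be reproduced from the values reported in Table II. The following Python sketch is only a worked check of that arithmetic; the dictionary layout and function names are hypothetical, while the numbers are taken from the table.

```python
# Critical charge in fC from Table II, keyed by netlist and by (node, input) case.
CRITICAL_CHARGE_FC = {
    "Fresh":           {"K, ABC=000": 22.9, "Out, ABC=000": 39.6, "K, ABC=111": 30.8, "Out, ABC=111": 27.6},
    "Duty Cycle 0%":   {"K, ABC=000": 22.2, "Out, ABC=000": 34.4, "K, ABC=111": 28.7, "Out, ABC=111": 27.8},
    "Duty Cycle 50%":  {"K, ABC=000": 22.0, "Out, ABC=000": 35.7, "K, ABC=111": 28.9, "Out, ABC=111": 26.7},
    "Duty Cycle 100%": {"K, ABC=000": 21.6, "Out, ABC=000": 39.7, "K, ABC=111": 30.2, "Out, ABC=111": 25.7},
}


def lowest_critical_charge(netlist):
    """Smallest critical charge over all node/input cases for one netlist."""
    return min(CRITICAL_CHARGE_FC[netlist].values())


def reduction_percent(fresh, aged):
    """Relative reduction of the aged value with respect to the fresh one, in percent."""
    return 100.0 * (fresh - aged) / fresh


if __name__ == "__main__":
    fresh = lowest_critical_charge("Fresh")
    aged = lowest_critical_charge("Duty Cycle 100%")
    print(f"{fresh} fC -> {aged} fC: {reduction_percent(fresh, aged):.1f} % reduction")
```

The same helper can be applied column by column to compare how each node and input combination responds to the different duty cycles.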

TABLE II. CRITICAL CHARGE (fC) FOR DIFFERENT NODES, DEGRADATION AND INPUTS

| Netlist         | ABC=000<br>K NMOS | ABC=000<br>Out PMOS | ABC=111<br>K PMOS | ABC=111<br>Out NMOS |
|-----------------|---------|----------|---------|----------|
| Fresh           | 22.9    | 39.6     | 30.8    | 27.6     |
| Duty Cycle 0%   | 22.2    | 34.4     | 28.7    | 27.8     |
| Duty Cycle 50%  | 22.0    | 35.7     | 28.9    | 26.7     |
| Duty Cycle 100% | 21.6    | 39.7     | 30.2    | 25.7     |

## V. CONCLUSIONS

This work presents BTI and HCI simulation results for a majority voter and their impact on reliability. Multiple simulations with different parameters are performed with a state-of-the-art simulator, providing useful insights into the different possible aging scenarios. Degradation directly impacts the delay of the circuit, which must be considered during design to ensure that no timing violations occur during the lifetime of the circuit. It also affects the circuit sensitivity to SETs, increasing its vulnerability to these phenomena.

## REFERENCES

- [1] A. Mukherjee, "Defect tolerant approach for reliable majority voter design using quadded transistor logic," IEEE Reg. 10 Annu. Int. Conf. Proceedings/TENCON, pp. 165–169, 2020
- [2] B. Halak, Ed., "Ageing of Integrated Circuits: Causes, Effects and Mitigation Techniques", Springer, pp. 35-43, 2020
- [3] I. F. V. Oliveira, R. B. Schvittz, and P. F. Butzen, "Fault masking ratio analysis of majority voters topologies," 2018 IEEE 19th Latin-American Test Symposium (LATS), pp. 1–6, 2018
- [4] P. Balasubramanian and K. Prasad, "A Fault Tolerance Improved Majority Voter for TMR System Architectures," ArXiv, vol. abs/1605.03771, 2016
- [5] I. F. V. Oliveira, R. B. Schvittz, and P. F. Butzen, "Single event transient sensitivity analysis of different 32 nm CMOS majority voters designs," Microelectron. Reliab., vol. 100–101, p. 113369, 2019
- [6] R. V. Kshirsagar and R. M. Patrikar, "Design of a novel fault-tolerant voter circuit for TMR implementation to improve reliability in digital circuits," Microelectron. Reliab., vol. 49, no. 12, pp. 1573–1577, 2009
- [7] M. A. Alam, K. Roy, and C. Augustine, "Reliability- and Process-variation aware design of integrated circuits - A broader perspective," IEEE Int. Reliab. Phys. Symp. Proc., pp. 353–363, 2011
- [8] T. Ban and L. A. de Barros Naviner, "A simple fault-tolerant digital voter circuit in TMR nanoarchitectures," in Proceedings of the 8th IEEE International NEWCAS Conference, pp. 269–272, 2010
- [9] T. Arifeen, A. S. Hassan, and J. A. Lee, "A fault tolerant voter for approximate triple modular redundancy," Electron., vol. 8, no. 3, 2019
- [10] D. Rossi, M. Omaña, C. Metra, and A. Paccagnella, "Impact of Bias Temperature Instability on Soft Error Susceptibility," IEEE Trans. Very Large Scale Integr. Syst., vol. 23, no. 4, pp. 743–751, 2015
- [11] P. Martin-Lloret et al., "CASE: A reliability simulation tool for analog ICs," SMACD 2017 - 14th Int. Conf. Synth. Model. Anal. Simul. Methods Appl. to Circuit Des., pp. 42–45, 2017
- [12] J. Martin-Martinez et al., "Probabilistic defect occupancy model for NBTI," in 2011 International Reliability Physics Symposium, p. XT.4.1-XT.4.6, 2011
- [13] A. Toro-Frias et al., "Including a stochastic model of aging in a reliability simulation flow," SMACD 2017 - 14th Int. Conf. Synth. Model. Anal. Simul. Methods Appl. to Circuit Des., pp. 27–30, 2017
- [14] P. Martin-Lloret et al., "A size-adaptive time-step algorithm for accurate simulation of aging in analog ICs," Proc. - IEEE Int. Symp. Circuits Syst., pp. 12–15, 2017
- [15] JEP122G, "Failure Mechanisms and Models for Semiconductor Devices", Arlington: JEDEC Solid State Technology Association, 2011
- [16] V. M. Van Santen, H. Amrouch, and J. Henkel, "New worst-case timing for standard cells under aging effects," IEEE Trans. Device Mater. Reliab., vol. 19, no. 1, pp. 149–158, 2019
- [17] H. Amrouch, B. Khaleghi, A. Gerstlauer, and J. Henkel, "Reliability-aware design to suppress aging," Proc. - Des. Autom. Conf., vol. 05-09-June, 2016
- [18] H. Cha and J. H. Patel, "A logic-level model for alpha-particle hits in CMOS circuits," IEEE International Conference on Computer Design ICCD'93, pp. 538-542, 1993