RTD-CMOS Pipelined Networks for Reduced Power Consumption

Juan Núñez, María J. Avedillo and José M. Quintana

Abstract—The incorporation of Resonant Tunnel Diodes (RTDs) into III/V transistor technologies has shown an improved circuit performance, producing higher circuit speed, reduced component count, and/or lower power consumption. Currently, the incorporation of these devices into CMOS technologies (RTD-CMOS) is an area of active research. Although some studies have concentrated on evaluating the advantages of this incorporation, more work in this direction is required. In this paper, we compare RTD-CMOS and pure CMOS realizations of a logic gate network which can be operated in a gate-level pipeline. Significantly lower average power is obtained for RTD-CMOS implementations.

Index Terms—Resonant tunneling diode, Nano-pipeline, Emerging technologies, Logic circuits, Power efficiency.

I. INTRODUCTION

Resonant Tunneling Diodes (RTDs) exhibit a Negative Differential Resistance (NDR) characteristic. Many circuits taking advantage of this characteristic have been reported, covering different applications (memories, logic, oscillators, A/D converters) and addressing different goals (high speed, low power). Specifically, the NDR current-voltage (I-V) characteristic of RTDs can be exploited in logic design to increase the functionality implemented by a single gate (in comparison with CMOS and bipolar technologies) [1]. In addition, those gates can be directly pipelined, resulting in logic networks in which each gate level is a pipeline stage (nanopipeline), allowing very high throughput [1], [2]. RTDs fabricated in III-V materials are undoubtedly the most mature, and most reported circuits based on resonant tunneling combine them with different types of transistors. Since the currently dominant technologies use silicon, considerable effort is being devoted to developing devices with negative differential resistance in this material. The manufacture of tunnel diodes in silicon is currently a very active area of research in which great progress is expected. In fact, it has been suggested that the addition of RTDs to CMOS technology could extend the life of CMOS and even make investment in it more profitable [3].

Recent works have focused on evaluating the advantages of incorporating RTDs into CMOS technologies [4], [5]. However, in our opinion, additional work in this direction is required. In particular, in the field of logic circuits, the performance improvements obtained by adding RTDs have been estimated for a set of logic functionalities (combinational gates and flip-flops) [3], [6], [7], but without taking into account their usage in gate networks. This is a key point, because each gate is a pipeline stage and should thus be compared to CMOS logic styles operating in a similar way. Moreover, as far as we know, there is a lack of recent studies in this area. This paper addresses these issues and provides results on how RTD-CMOS realizations compare to pure CMOS gate-level pipelines when implemented in a commercial technology. The paper is organized as follows: Section II describes RTD logic networks based on the MOnostable-to-BIstable operation principle. In Section III, we present the RTD-CMOS network which has been evaluated. A brief description of the experiment is also given. In Section IV, a comparison in terms of average power consumption with pure CMOS realizations of these structures is described. Finally, some key conclusions are given in Section V.

II. RTD-BASED MOBILE LOGIC NETWORKS

Logic circuit applications of RTDs are mainly based on the MOnostable-to-BIstable Logic Element (MOBILE), which exploits the negative differential resistance of their I-V characteristic (Fig. 1a). The MOBILE (Fig. 1b) is a rising edge-triggered, current-controlled gate which consists of two RTDs connected in series and driven by a switching bias voltage, $V_{ck}$. When $V_{ck}$ is low, both RTDs are in the on-state (or low resistance state) and the circuit is monostable. Increasing $V_{ck}$ to an appropriate maximum value ensures that only the device with the lowest peak current switches (quenches) from the on-state to the off-state (high resistance state). The output is high if RTD$_{b}$ is the one which switches and low if it is RTD$_{a}$ that switches. Logic functionality can be achieved if the peak current of one of the RTDs is controlled by an input.
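The switching rule above can be illustrated with a purely behavioral sketch. It is not a circuit model: the function names, the relative peak currents, and the assumption that the input adds to the peak current of RTD$_{b}$ are all hypothetical choices made for illustration.

```python
def mobile_output(peak_a, peak_b):
    """Logic output of a MOBILE after the rising edge of V_ck.

    The RTD with the lower peak current switches to the off-state.
    The output is high (1) if RTD_b switches and low (0) if RTD_a switches.
    """
    if peak_a == peak_b:
        raise ValueError("equal peak currents: outcome is undetermined")
    return 1 if peak_b < peak_a else 0


def inverter(bit, peak_a=1.5, peak_b=1.0, peak_input=1.0):
    """Hypothetical inverter: the input adds peak current to RTD_b.

    The relative values (1.5, 1.0, 1.0) are illustrative assumptions.
    """
    effective_b = peak_b + (peak_input if bit else 0.0)
    return mobile_output(peak_a, effective_b)


for bit in (0, 1):
    print(bit, "->", inverter(bit))
```

With these illustrative values, input 0 leaves RTD$_{b}$ with the lower peak current (output high), whereas input 1 raises it above that of RTD$_{a}$ (output low).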

In the configuration of the rising (falling) edge-triggered MOBILE inverter shown in Fig. 1c (Fig. 1d), the peak current of the driver (load) RTD can be modulated using the external input signal. RTD peak currents are selected in such a manner that the value of the output depends on whether the external input signal is “1” or “0”. Assuming the same peak current density, $J_{p}$, for all the RTDs, the peak current of each RTD is proportional to its area. The figures depict the required area relationships. For $V_{ck}$ high (low), the output node of the rising (falling) edge-triggered MOBILE maintains its value even if the input changes. That is to say, this circuit structure is self-latching, which allows pipelining at the gate level. In other words, network operation speed is independent of logic depth and is determined by the clock frequency at which single gates can be operated. Cascaded rising edge-triggered MOBILE gates operated in a pipelined fashion use a four-phase clocking scheme. It has been demonstrated that a network of MOBILE-based gates can be operated with a single clocked bias signal [8]. To do so, rising edge-triggered gates and falling edge-triggered gates are alternated and latches are added at each stage to remove the return-to-reset behavior.

However, we have realized that it is not necessary to remove the return-to-reset behavior to ensure the correct operation of a MOBILE gate network which alternates rising and falling edge-triggered stages. It is enough merely to maintain the output of each MOBILE stage until it has been evaluated by the next one. Thus, each latch is replaced by a static inverter.

The operation of this simplified single-phase architecture is illustrated by connecting four binary inverters, as depicted in Fig. 2a. The first and the third inverters are rising edge-triggered, whereas the second and the fourth are falling edge-triggered. Fig. 2b shows HSPICE simulation results.
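The data flow through such a chain can be sketched at a purely behavioral level. The sketch below is a simplification under stated assumptions: each stage is modeled as an ideal inverting element that samples its input at its active clock edge, and the return-to-reset phase and the inter-stage static inverters are abstracted away. The function name and the initial state are hypothetical.

```python
def simulate_chain(inputs, n_stages=4):
    """Behavioral model of alternating rising/falling edge-triggered inverters.

    inputs: one input bit per clock cycle, stable around the rising edge.
    Returns the last-stage output observed after each falling edge.
    """
    state = [0] * n_stages          # assumed initial stage outputs
    outputs = []
    for bit in inputs:
        for edge in ("rise", "fall"):
            # Input seen by each stage just before this edge.
            stage_inputs = [bit] + state[:-1]
            for i in range(n_stages):
                active_edge = "rise" if i % 2 == 0 else "fall"
                if active_edge == edge:
                    state[i] = 1 - stage_inputs[i]   # inverting stage
        outputs.append(state[-1])
    return outputs


print(simulate_chain([1, 0, 1, 1]))
```

In this idealized model the data advances one stage per clock edge, so the four-inverter chain delivers the bit applied in one cycle at the end of the following cycle, and a new bit can be accepted every cycle regardless of chain depth.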

III. STUDY DESCRIPTION

The power performance of a network of inverters implemented with RTDs and commercial CMOS transistors has been evaluated. We compared it with True Single Phase Clock (TSPC) CMOS realizations, since they too implement gate-level pipelining. Variations of TSPC have been proposed and widely applied in the design of high-speed applications. The study used transistors from a standard 130 nm CMOS process. For the RTDs, we started the experiments using a model from LOCOM [9]. This model corresponds to a III-V RTD which has been experimentally validated \( (J_p = 21\,KA/cm^2,\ V_p = 0.21\,V,\ C = 4\,fF/\mu m^2) \). The two implementations were compared using HSPICE parametric simulations. 4-stage chains of inverters with fan-out 1 (LOAD1), 2 (LOAD2) and 3 (LOAD3) at each stage were simulated at three normalized frequencies, \( f_{norm} \in \{0.16, 0.20, 0.24\} \). This frequency range was selected on the basis of reported pipelined circuits in similar CMOS technologies. For each architecture we varied the design parameters, taking a discrete number of samples of each one in a given range. Of all the simulated circuits, we selected the one with the lowest average power among those whose Monte Carlo simulation (modeling both transistor and RTD intra-chip variations) showed correct operation. The parameters included in the design space exploration for each logic style (RTD-CMOS and TSPC) are described below, together with the simulation conditions.
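The selection rule can be summarized in a few lines. This is only a sketch of the criterion: the record fields (`avg_power`, `mc_pass`) are hypothetical names, and the candidate values are placeholders in arbitrary units, not simulation results.

```python
def select_design(candidates):
    """Return the lowest-power candidate that passed Monte Carlo, or None."""
    passing = [c for c in candidates if c["mc_pass"]]
    if not passing:
        return None
    return min(passing, key=lambda c: c["avg_power"])


# Placeholder records (arbitrary units), NOT results from the study.
candidates = [
    {"name": "A", "avg_power": 1.0, "mc_pass": False},  # fails under variations
    {"name": "B", "avg_power": 1.4, "mc_pass": True},
    {"name": "C", "avg_power": 2.0, "mc_pass": True},
]

best = select_design(candidates)
print(best["name"])  # B
```

Candidate A has the lowest nominal power but is discarded because it does not operate correctly under variations, which is why robustness is checked before power is minimized.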

A. RTD-based circuits

Supply voltage was explored in the range from 0.6 V to 0.8 V. Transistor lengths and widths were set to the minimum values associated with the technology \((L_{min} = 0.12\,\mu m \text{ and } W_{min} = 0.16\,\mu m)\). PMOS transistor width was \( K \) times the NMOS transistor width \((K = 3.5 \text{ for this technology})\). We varied the RTD areas \( f_{xR} \) and \( f_{xL} \), keeping the area relationships of Fig. 1c for the rising edge-triggered inverter and of Fig. 1d for the falling edge-triggered inverter, with a factor of 1.5 between the related RTD areas in each case. \( f_{xR} \) and \( f_{xL} \) were varied from 0.04 \( \mu m^2 \) to 0.4 \( \mu m^2 \).

B. TSPC circuits

For the TSPC network, \( V_{DD} \) was varied by taking nine equispaced points from 0.6V to 1.4V. Transistor lengths were fixed to the minimum. A typical CMOS sizing scheme was assumed. A design parameter \( W \) was defined which corresponded to the width of a basic NMOS transistor. When \( m \) transistors are connected in series their widths are multiplied by \( m \). \( W \) was varied from 0.16 \( \mu m \) to 1.6 \( \mu m \).
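As a concrete illustration of this parameter grid, the sketch below enumerates the nine supply voltages and applies the series-stacking rule. The number of samples taken for \( W \) is not stated in the text, so the value of 10 used here is an assumption, as are the function names.

```python
def equispaced(start, stop, n):
    """Return n equispaced points from start to stop, inclusive."""
    step = (stop - start) / (n - 1)
    return [round(start + i * step, 6) for i in range(n)]


def series_width(w, m):
    """Width of each of m series-connected transistors for base width w."""
    return m * w


vdd_points = equispaced(0.6, 1.4, 9)      # nine points, 0.1 V apart
w_points = equispaced(0.16, 1.6, 10)      # sample count is an assumption

print(vdd_points)
print(series_width(0.16, 2))              # two stacked transistors
```

Each TSPC candidate in the exploration is then a pair drawn from `vdd_points` and `w_points`.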

C. Simulation Setup

Ideal clock waveforms for each structure were applied. For the RTD-based circuits, we considered a clock in which the rising, falling, hold and reset times were the same. In TSPC structures a pulse train clock with a duty cycle of 50% was used. Standard mismatch models from the technology were used with the MOS transistors. Since there are no mismatch models available for the RTDs, Gaussian distributions (relative error of ±10%) were assigned to the peak voltage, intrinsic capacitance and peak current density of each device. Variations of the supply voltage of ±10% around its nominal value were also considered.

Notes: (1) The normalized frequency is \( f_{norm} = f \cdot FO4 \), where \( f \) is the clock frequency and \( FO4 \) is the FO4-inverter delay of the technology. (2) The lower limit of the RTD area range was fixed assuming that the minimum RTD that could be fabricated would be 0.2 \( \mu m \) × 0.2 \( \mu m \) (0.04 \( \mu m^2 \)), to match the technology node we are using for transistors.
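The Gaussian variability assumed for the RTD parameters can be sketched as follows. The text gives only a relative error of ±10%, so interpreting that bound as three standard deviations is an assumption of this sketch. The parameter names and the normalized nominal values of 1.0 are likewise hypothetical.

```python
import random


def sample_rtd_params(nominal, rng, rel_tol=0.10, sigmas=3.0):
    """Draw one Monte Carlo sample of the RTD parameters.

    Each parameter is scaled by an independent Gaussian factor with mean 1
    and standard deviation rel_tol / sigmas (the 3-sigma reading is an
    assumption, not taken from the text).
    """
    sigma_rel = rel_tol / sigmas
    return {name: value * rng.gauss(1.0, sigma_rel)
            for name, value in nominal.items()}


# Normalized nominal values (hypothetical placeholders).
nominal = {"peak_voltage": 1.0, "capacitance": 1.0, "peak_current_density": 1.0}

rng = random.Random(0)  # fixed seed for repeatability
samples = [sample_rtd_params(nominal, rng) for _ in range(2000)]
mean_jp = sum(s["peak_current_density"] for s in samples) / len(samples)
print(round(mean_jp, 2))
```

Each sample would parameterize one run of the circuit simulation, and a candidate design is accepted only if every run operates correctly.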

IV. EVALUATION

The circuits derived from the design exploration experiment described above were evaluated and compared. Table I shows simulation results corresponding to the ratio between the power consumptions of the TSPC and the RTD-CMOS chains. For each case, we have included (in parentheses) the value of the supply voltage $V_{DD}$ for the optimum solution, in terms of power consumption, of the TSPC network. For the RTD-based structure, $V_{DD}$=0.7V was selected, which is the minimum value that ensures correct operation at all frequencies and under all load conditions (an output logic swing of 0.66V is able to switch the next stage). The power advantage of the RTD-CMOS networks grew with clock frequency and was even more pronounced when load conditions were increased.

The RTD-CMOS inverter chains were also evaluated after modifying the peak-to-valley current ratio (PVCR) in the LOCOM RTD model. A PVCR value greater than 2.5 is required for correct operation. However, as we did not modify peak currents, power increased as the PVCR was reduced. To give a quantitative idea, at $f_{NORM}$=0.24 and $V_{DD}$=0.8V, with a PVCR=3.5 (half the nominal value of the LOCOM RTD) and LOAD1, the power improvement factor fell from 2.50 to 1.94.

RTD-CMOS chains using an RTD model exhibiting the characteristics of a silicon RTD ($J_p$=218KA/cm², $C$=60fF/µm²) [10] were also designed and evaluated. The sizing experiments did not yield any solution using minimum transistors. This is because $J_p$ is now one order of magnitude larger and a wider transistor is therefore required. Experiments using a transistor three times wider yielded solutions with the minimum allowed RTD area (0.04µm²), unlike those obtained for LOCOM chains. This is also explained by the higher peak current density, which means that the peak current differences between the load and the driver required to operate at a given frequency are achieved with smaller RTD areas. This is in agreement with previous works on MOBILE operation speed [11]. Power advantages with respect to TSPC were reduced, as shown in Table II. The circuits operate with higher current levels since RTD areas were not reduced enough to compensate for the increase in $J_p$ (this would have required an RTD smaller than the minimum RTD assumed in this experiment). However, maximum operating frequency did not increase with respect to their LOCOM counterparts, due to higher transistor parasitic capacitances which impact the complex dynamic MOBILE behavior. These results suggest that the LOCOM RTD, in spite of having a smaller $J_p$, better matches the transistor technology used. A trade-off value of $J_p$ (60KA/cm²) improves power performance even with minimum-sized transistors and RTDs (Table II, values in parentheses).
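Since the peak current of an RTD is its peak current density times its area, the current levels involved can be checked with a short calculation. The sketch below evaluates the minimum RTD area (0.04 µm²) for the two $J_p$ values quoted above; the function name is a hypothetical helper.

```python
UM2_TO_CM2 = 1e-8  # 1 square micrometer expressed in square centimeters


def peak_current_uA(jp_kA_per_cm2, area_um2):
    """Peak current in microamperes for a given J_p (kA/cm^2) and area (um^2)."""
    amps = jp_kA_per_cm2 * 1e3 * area_um2 * UM2_TO_CM2
    return amps * 1e6


for jp in (218, 60):
    print(f"J_p = {jp} kA/cm^2 -> I_p = {peak_current_uA(jp, 0.04):.1f} uA")
# J_p = 218 kA/cm^2 -> I_p = 87.2 uA
# J_p = 60 kA/cm^2 -> I_p = 24.0 uA
```

Even at the minimum area, the silicon RTD carries a peak current of roughly 87 µA, which is consistent with the observation that wider transistors are needed and that current levels remain high.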

V. CONCLUSIONS

Realizations of RTD-CMOS logic networks working on the basis of the MOBILE operating principle have been introduced. They have been compared with transistor-only implementations using the TSPC logic style, which, like the proposed RTD structures, is well suited to gate-level pipelines. The operation of RTD-CMOS realizations is static, overcoming one of the main drawbacks of TSPC CMOS. Very significant power savings were obtained for the RTD-CMOS networks at the target frequencies, and this compares favorably with other TSPC-based logic styles (alternative ways of parameterizing the circuits could provide better results). In spite of the limitations of this preliminary study, the results obtained suggest that exploration of architectures using RTDs for power-efficient pipelined structures is worthwhile.

REFERENCES


### Table I. Network Using the LOCOM RTD

<table>
<thead>
<tr>
<th>$f_{NORM}$</th>
<th>Power ratio TSPC / RTD-CMOS (optimum TSPC $V_{DD}$)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.16</td>
<td>2.36 (1.0V)</td>
</tr>
<tr>
<td>0.20</td>
<td>2.65 (1.1V)</td>
</tr>
<tr>
<td>0.24</td>
<td>2.94 (1.2V)</td>
</tr>
</tbody>
</table>

### Table II. Network using the RTD reported in [10] [RTD with $J_P=60KA/cm²$]

<table>
<thead>
<tr>
<th>$f_{NORM}$</th>
<th>Power ratio TSPC / RTD-CMOS (value for $J_p$ = 60KA/cm²)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.16</td>
<td>0.96 (2.24)</td>
</tr>
<tr>
<td>0.20</td>
<td>1.18 (2.51)</td>
</tr>
<tr>
<td>0.24</td>
<td>1.28 (2.77)</td>
</tr>
</tbody>
</table>