

## Impact of Low-k ILD and Cu on Circuit Performance

Sam Nakagawa, Khalid Rahmat, Soo-Young Oh, Gary Ray, Ultra Large Scale Integration Laboratory

Rajendra Kumar, Computer Systems Laboratory

HPL-95-74, July 1995

Keywords: low-k ILD, Cu interconnect, circuit performance

The low dielectric constant of organic interlevel dielectrics (ILDs) and the low resistivity of copper can provide substantial improvement in the RC time constant of on-chip interconnect. However, the ultimate impact of these materials must be evaluated in terms of real circuit performance, not just in terms of the interconnect RC constant. In this paper we first delimit the design space of driver size and load device capacitance where these new materials make the most impact on circuit performance. The performance advantage varies from no impact at all in short, local routing to 30~40% improvement in a long, interconnect-dominated circuit. Then, using numerical simulation, we evaluate the performance improvement in two circuits that are important in the design of high performance microprocessors: cache random access memory (RAM) and the clock distribution circuit. The improvement varies from about 10% in cache RAM to about 40% in the clock distribution circuit.


© Copyright Hewlett-Packard Company 1995

## Impact of Low-k ILD and Cu on Circuit Performance

O. S. Nakagawa<sup>†</sup>, K. Rahmat, S.-Y. Oh, G. Ray, R. Kumar<sup>‡</sup>

Hewlett Packard Laboratories, ULSI Research Laboratory / Computer Systems Laboratory<sup>‡</sup>, 3500 Deer Creek Road, Palo Alto, CA 94304

Voice: (415) 857-8632, FAX: (415) 857-6684<sup>†</sup>, e-mail: nakagawa@saiph.hpl.hp.com<sup>†</sup>

## ABSTRACT

In recent years, demand for higher frequency and larger chip size in ULSI circuits has prompted extensive studies of new interconnect materials, such as organic interlevel dielectrics (ILDs) [1] and Cu metallization [2]. The low dielectric constant (k = 2-3) of organic ILDs and the low resistivity of Cu ($\rho_{bulk} = 1.6 \times 10^{-6}$ $\Omega$ cm) can provide substantial improvement in the RC time constant of interconnect. However, the ultimate impact of these materials must be evaluated in terms of real circuit performance, not just in terms of the interconnect RC constant. In this paper we first delimit the design space of driver size and load device capacitance (Fig. 1) where these new materials make the most impact on circuit performance. Then we present our assessment of the performance improvement in two realistic circuits, cache random access memory (RAM) and the clock distribution circuit, both of which are important in the design of high performance microprocessors.

For low-k ILD and Cu to effectively improve circuit performance, the interconnect length $L_m$ must be longer than the critical lengths $L_c^*$ and $L_r^*$, respectively (Fig. 2 and 3). $L_c^*$ is the interconnect length at which the interconnect capacitance equals the load capacitance $C_L$. Similarly, $L_r^*$ is the interconnect length at which the interconnect resistance equals the effective driver source resistance $R_s$ [3]. For the 0.35 µm technology generation, $L_c^*$ for k = 2.5 is typically on the order of 10~100 µm (Fig. 4). This makes low-k ILD attractive in global and semi-global routing, where interconnect performance improves linearly with k (Fig. 5). Low-k ILD, however, is not very effective in local routing because the total load capacitance is dominated by $C_L$. In contrast, $L_r^*$ for the 0.35 µm technology generation is typically longer, 1~10 mm for a reasonable driver (Fig. 6), making Cu ineffective for local routing as well as for some semi-global routing. Although both $L_c^*$ and $L_r^*$ decrease as metal pitch is reduced in future technology generations (Fig. 4 and 7), low-k ILD and Cu continue to be ineffective in reducing delay in local routing. It is interesting to note that low-k ILD is not effective in reducing cross-talk noise when the interconnect length is much longer than $L_c^*$ (Fig. 8 and 9). This is because the maximum cross-talk noise is a function of $C_{interlayer} / C_{ground}$ [4], and reducing k does not change this ratio.
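The two critical lengths follow directly from the definitions above. The following is a minimal sketch, assuming a uniform wire described by a capacitance and a resistance per unit length. The function names and all numerical values are hypothetical, chosen only to fall in the ranges quoted in the text, and are not taken from the paper's simulations.

```python
def critical_length_c(c_load, c_per_um):
    """Wire length (um) at which wire capacitance equals the load capacitance."""
    return c_load / c_per_um


def critical_length_r(r_source, r_per_um):
    """Wire length (um) at which wire resistance equals the driver source resistance."""
    return r_source / r_per_um


def effective_materials(length_um, lc_um, lr_um):
    """List the materials that can noticeably help at a given wire length."""
    helps = []
    if length_um > lc_um:
        helps.append("low-k")
    if length_um > lr_um:
        helps.append("Cu")
    return helps


# Hypothetical values for illustration only (not from the paper).
C_LOAD = 10e-15       # load device capacitance, F
C_PER_UM = 0.12e-15   # wire capacitance per micron with low-k ILD, F/um
R_SOURCE = 150.0      # effective driver source resistance, ohm
R_PER_UM = 0.1        # wire resistance per micron, ohm/um

lc = critical_length_c(C_LOAD, C_PER_UM)
lr = critical_length_r(R_SOURCE, R_PER_UM)

for length in (20.0, 500.0, 5000.0):
    print(length, effective_materials(length, lc, lr))
```

With these assumed values the three sample lengths fall into the three regimes described in the text: neither material helps in local routing, low-k ILD alone helps at intermediate lengths, and both help in long global routing.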

To gauge the impact of low-k ILD and Cu on realistic circuits, we studied two examples, a cache RAM and a clock distribution circuit, using numerical simulation tools. Because the cache RAM is fairly small (1-2 mm on a side), the introduction of low-k ILD and Cu improves its performance by only about 4~13% (Table 1). This improvement is modest compared with the improvement in the interconnect RC constant provided by the materials; however, it still rivals the 15% improvement obtained by BiCMOS without the new materials [5]. Also, the improvement is likely to be better in the DRAM case, where the role of interconnect is more dominant. Note that the use of low-k dielectric has a much larger impact than copper since $L_r^* > L_c^*$ in the cache RAM. For the clock distribution circuit we simulated a clocking scheme similar to that used in the Alpha chip [6] [7], where a distributed large buffer driver is used to drive a single clock wire to the whole chip. The effective load capacitance on the distributed driver is 3.2 nF, which is much larger than the capacitance of the clock wire itself, while the resistance of the clock wire (> 7 mm in length) is much larger than the driver resistance. For such a circuit, the use of copper significantly reduces worst-case clock skew, while low-k dielectric has a much smaller impact (Table 2).
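The contrast between the two circuits can be illustrated with a first-order Elmore delay estimate for the circuit of Fig. 1. This is a sketch only, not the numerical simulation used in the paper. It treats the wire as a distributed RC line with a lumped load, scales wire capacitance by k/k_oxide = 2.5/4.0, and assumes Cu lowers wire resistance by the 33% quoted in the Fig. 3 caption. Apart from the 3.2 nF load, all component values are hypothetical, and the model estimates delay rather than skew.

```python
def elmore_delay(r_s, r_w, c_w, c_l):
    """Elmore delay of a driver (r_s) driving a distributed RC wire
    (total r_w, c_w) terminated by a lumped load capacitance c_l."""
    return r_s * (c_w + c_l) + r_w * (0.5 * c_w + c_l)


K_RATIO = 2.5 / 4.0      # low-k ILD relative to oxide: scales wire capacitance
RHO_RATIO = 1.0 - 0.33   # assumed Cu wire resistance relative to Al


def improvements(r_s, r_w, c_w, c_l):
    """Return fractional delay improvement as (low-k only, Cu only)."""
    base = elmore_delay(r_s, r_w, c_w, c_l)
    low_k = elmore_delay(r_s, r_w, c_w * K_RATIO, c_l)
    cu = elmore_delay(r_s, r_w * RHO_RATIO, c_w, c_l)
    return 1.0 - low_k / base, 1.0 - cu / base


# Hypothetical "cache-like" path: driver resistance dominates, and wire
# capacitance is comparable to the load capacitance.
cache_like = improvements(r_s=1000.0, r_w=100.0, c_w=200e-15, c_l=100e-15)

# Hypothetical "clock-like" path: wire resistance dominates the driver, and the
# 3.2 nF load dominates the (assumed) 0.3 nF wire capacitance.
clock_like = improvements(r_s=0.05, r_w=0.5, c_w=0.3e-9, c_l=3.2e-9)

print("cache-like (low-k, Cu):", cache_like)
print("clock-like (low-k, Cu):", clock_like)
```

Under these assumptions the low-k term dominates in the cache-like path and the Cu term dominates in the clock-like path, which is the same ordering as in Tables 1 and 2.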

In summary, we defined the space of driver size and load device capacitance where the improvement of the interconnect RC time constant by low-k ILD and Cu makes an impact on circuit performance. The benefit that a circuit derives from the new materials depends on the relative importance, in its critical path, of the interconnect as compared to driver size and load device capacitance. The performance advantage varies from no impact at all in short, local routing to 30~40% improvement in a long, interconnect-dominated circuit.
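The two limits of the low-k advantage can be recovered from a simple lumped estimate. The following is a sketch under stated assumptions, not the paper's numerical model: the wire resistance is neglected ($L \ll L_r^*$), the wire capacitance per unit length $c$ scales linearly with k, and $L_{c,ox}^* = C_L / c_{ox}$ is an auxiliary symbol introduced here for the critical length evaluated with oxide. With switching time $t \propto R_s (cL + C_L)$,

$$
\frac{t_{ox} - t_{low\text{-}k}}{t_{ox}}
= \frac{R_s\, c_{ox} L \left(1 - k/k_{ox}\right)}{R_s \left(c_{ox} L + C_L\right)}
= \left(1 - \frac{k}{k_{ox}}\right) \frac{L}{L + L_{c,ox}^*}
$$

The improvement vanishes as $L \to 0$ and saturates at $1 - 2.5/4.0 = 37.5\%$ as $L \to \infty$, which matches the 37.5% limit quoted in the Fig. 2 caption.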

## REFERENCES

- [1] C. H. Ting, Conference Program and Abstracts, Advanced Metallization for ULSI Applications Conference, 1994.
- [2] J. Li, T. E. Seidel, and J. W. Mayer, MRS Bulletin, p. 15, August 1994.
- [3] J. T. Watt and J. D. Plummer, IEEE Trans. Electron Devices, vol. 36, no. 8, p. 1510, 1989.
- [4] T. Sakurai, IEEE Trans. Electron Devices, vol. 40, no. 1, p. 118, 1993.
- [5] R. P. Colwell and R. L. Steck, ISSCC Proceedings, February 1995.
- [6] M. Horowitz, 1992 Symposium on VLSI Circuits, Digest of Technical Papers, p. 50, 1992.
- [7] D. W. Dobberpuhl et al., IEEE Journal of Solid-State Circuits, vol. 27, no. 11, p. 1555, 1992.



Fig. 1 Schematic of interconnect circuit. Effective source resistance of the driver (Rs) is in series with interconnect resistance (R), and load device capacitance is in parallel with interconnect capacitance  $C_{interlayer} + C_{interlevel}$ .



Fig. 2 Switching time improvement with low-k ILD (k=2.5) over oxide (k=4.0) as a function of normalized length. Switching time is the sum of rise time and fall time. $L_c^*$ = 84 µm. The improvement limit is $1 - k/k_{oxide} = 37.5\%$.



Fig. 4 Critical length $L_c^*$ for various normalized load device capacitances. Cgo is the gate capacitance per micron of gate width.



Fig. 3 Switching time improvement with Cu over Al as a function of normalized length. $L_r^*$ = 1500 µm. The resistivity improvement used in this numerical simulation is $1 - \rho_{Cu} / \rho_{Al} = 33\%$.







Fig. 6 Critical length $L_r^*$ for various driver gate widths. The plot shows that Cu will not improve the switching time when the interconnect length is less than 100 µm.



Fig. 8 Cross-talk noise as a function of normalized interconnect length. Cross-talk is improved significantly when $L < L_c^*$, but the magnitude of cross-talk is small there.







Fig. 9 Cross-talk noise improvement as a function of normalized interconnect length. k=2.5. No significant improvement is seen when the circuit is interconnect dominated ($L \gg L_c^*$).
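The behavior in Fig. 8 and 9 can be reproduced qualitatively with a simple capacitive-divider estimate. This is an assumed charge-sharing model, not the cross-talk model of [4]: the coupling and ground capacitances of the wire both scale with k and with length, while the load device capacitance is fixed. All names and values below are hypothetical.

```python
K_OXIDE = 4.0


def noise_fraction(length_um, k, c_load, cc_per_um_ox, cg_per_um_ox):
    """Peak victim noise as a fraction of the aggressor swing, estimated as
    coupling capacitance over total victim capacitance."""
    scale = k / K_OXIDE                      # wire capacitances scale with k
    cc = cc_per_um_ox * scale * length_um    # coupling to the aggressor
    cg = cg_per_um_ox * scale * length_um    # wire capacitance to ground
    return cc / (cc + cg + c_load)


def noise_improvement(length_um, k, **wire):
    """Fractional noise reduction of low-k ILD relative to oxide."""
    base = noise_fraction(length_um, K_OXIDE, **wire)
    low_k = noise_fraction(length_um, k, **wire)
    return 1.0 - low_k / base


# Hypothetical wire and load values for illustration only.
WIRE = dict(c_load=10e-15, cc_per_um_ox=0.1e-15, cg_per_um_ox=0.1e-15)

short = noise_improvement(1.0, 2.5, **WIRE)     # load dominated
long_ = noise_improvement(1.0e5, 2.5, **WIRE)   # interconnect dominated
print(f"short wire: {short:.3f}, long wire: {long_:.5f}")
```

In this sketch the short wire sees a large relative improvement on a very small noise level, while the long wire sees almost none because k cancels out of the capacitance ratio.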

Table 1 Cache RAM access time for various sizes with low-k ILD and Cu. The numbers in parentheses represent the improvement over the nominal case.

| Size     | Nominal | Low-k, k=2.5   | Cu            |
|----------|---------|----------------|---------------|
| 128 x 64 | 1.02ns  | 0.94ns (8.2%)  | 0.98ns (3.8%) |
| 256 x 64 | 1.48ns  | 1.33ns (10.2%) | 1.41ns (4.9%) |
| 512 x 64 | 2.57ns  | 2.24ns (12.7%) | 2.38ns (7.3%) |

Table 2 Worst-case skew in the clock circuit with large buffer drivers. The numbers in parentheses represent the improvement over the nominal case.

| Nominal | Low-k, k=2.5 | Cu          |
|---------|--------------|-------------|
| 270ps   | 247ps (8%)   | 169ps (37%) |
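The percentages in Tables 1 and 2 are relative reductions from the nominal case. The sketch below recomputes them from the rounded table entries, so the results can differ from the printed percentages by a few tenths of a point. The helper name is hypothetical.

```python
def improvement_pct(nominal, improved):
    """Percentage reduction relative to the nominal case."""
    return 100.0 * (nominal - improved) / nominal


# (nominal, low-k, Cu) access times in ns, copied from Table 1.
cache_access_ns = {
    "128 x 64": (1.02, 0.94, 0.98),
    "256 x 64": (1.48, 1.33, 1.41),
    "512 x 64": (2.57, 2.24, 2.38),
}

# (nominal, low-k, Cu) worst-case skew in ps, copied from Table 2.
clock_skew_ps = (270.0, 247.0, 169.0)

for size, (nominal, low_k, cu) in cache_access_ns.items():
    print(f"{size}: low-k {improvement_pct(nominal, low_k):.1f}%, "
          f"Cu {improvement_pct(nominal, cu):.1f}%")

nominal, low_k, cu = clock_skew_ps
print(f"clock skew: low-k {improvement_pct(nominal, low_k):.1f}%, "
      f"Cu {improvement_pct(nominal, cu):.1f}%")
```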