# A 0.011 mm<sup>2</sup> PVT-Robust Fully-Synthesizable CDR with a Data Rate of 10.05 Gb/s in 28nm FD SOI

Aravind Tharayil Narayanan, Wei Deng, Dongsheng Yang, Rui Wu, Kenichi Okada and Akira Matsuzawa

Department of Physical Electronics, Tokyo Institute of Technology

2-12-1-S3-27, Ookayama, Meguro-ku, Tokyo 152-8552, Japan

Email: aravind-tn@ssc.pe.titech.ac.jp

Abstract—This paper presents a fully-synthesizable clock and data recovery (CDR) circuit using an injection-locking technique. The challenges presented by automated place and route for high-speed applications are overcome using a background calibration mechanism. The fully-synthesizable all-digital architecture presented in this work is fabricated in 28nm FDSOI technology. The system has a top data rate of 10.05Gb/s while consuming 16mW from a 1.0V supply.

# I. INTRODUCTION

Clock and data recovery (CDR) plays a crucial role in a wide range of communication applications, including chip-to-chip interconnects and optical communications such as synchronous optical networks (SONET). Recent years have witnessed explosive growth in the data rates of SerDes devices. Many chip-to-chip interconnects have already crossed the 10Gbps mark, and in optical networks 40Gbps data rates are now common. Thanks to advancements in CMOS technology, digital circuits can keep up with such high data rates. Recent publications show a trend towards all-digital fully-synthesized systems [1]. Implementing a CDR using only digital components can save both design time and cost, but the choice of a suitable architecture for an all-digital implementation is of high importance. Even though a wide variety of topologies have been proposed for implementing CDRs [2], the cost involved and the technical limitations render many of those topologies unusable at very high data rates. Recently, CDRs based on the injection-locking technique have gained a lot of attention from the research community. The main attractions of the injection-lock CDR are the ability to lock on to the data stream without any delay and the complete absence of jitter peaking.

Fig. 1 shows the phase-locked-loop CDR (PLL-CDR) and the proposed injection-lock CDR (IL-CDR). The advantages of the IL-CDR over the PLL-CDR can be identified by considering the jitter transfer function. To make the system stable, it is common practice to add zeros when designing phase-locked loops. The presence of zeros results in an amplification of the jitter present in the incoming data stream [4]. This jitter peaking has adverse effects in systems such as SONET, where many CDRs are cascaded for long-distance communication. Another issue with the PLL-based CDR is the trade-off between fast settling and low jitter transfer. A large bandwidth is desirable for faster locking, but a large



Fig. 1. Comparison between different CDR topologies: (a) PLL-based CDR, (b) proposed IL-CDR with PVT calibration.

bandwidth will result in a larger amount of jitter transfer. The inherent nature of the IL-CDR makes it immune to jitter peaking. This is because the IL-CDR does not depend on feedback for achieving phase lock; it relies instead on a feed-forward mechanism. Although the IL-CDR proposed in the past [3] has many advantages over other CDR topologies, it also suffers from some critical issues. One of the major issues of the conventional IL-CDR topology becomes apparent when PVT variations are considered. Since the oscillator used in the frequency-locked loop (FLL) and the oscillator used in the injection-lock path are different, there can be an offset in their operating frequencies because of PVT variations. Voltage and temperature variations during circuit operation might worsen this offset further. The effects of PVT variations become more severe in a fully-synthesized system, which is the current trend for minimizing cost and reducing design time [1]. The reason is the limited control over the placement and routing of components in a fully-synthesizable system.
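The jitter-peaking argument made above for the PLL-CDR can be illustrated numerically. The sketch below is not taken from this paper: it assumes the textbook second-order closed-loop transfer function with one zero, H(s) = (2ζω<sub>n</sub>s + ω<sub>n</sub><sup>2</sup>) / (s<sup>2</sup> + 2ζω<sub>n</sub>s + ω<sub>n</sub><sup>2</sup>), and the natural frequency and damping values are arbitrary illustrative choices.

```python
import math

def jitter_transfer_mag(freq_hz, natural_freq_hz, damping):
    """|H(j*2*pi*f)| of the assumed second-order loop with one zero."""
    s = complex(0.0, 2.0 * math.pi * freq_hz)
    wn = 2.0 * math.pi * natural_freq_hz
    num = 2.0 * damping * wn * s + wn ** 2
    den = s * s + 2.0 * damping * wn * s + wn ** 2
    return abs(num / den)

def peaking_db(natural_freq_hz, damping, points=2001):
    """Peak of the jitter transfer in dB, swept two decades around f_n."""
    peak = 0.0
    for i in range(points):
        freq = natural_freq_hz * 10.0 ** (-2.0 + 4.0 * i / (points - 1))
        peak = max(peak, jitter_transfer_mag(freq, natural_freq_hz, damping))
    return 20.0 * math.log10(peak)

# Hypothetical loop: 1 MHz natural frequency, several damping factors.
for zeta in (0.5, 1.0, 2.0, 5.0):
    single = peaking_db(1e6, zeta)
    # Peaking in dB adds up when identical CDRs are cascaded.
    print(f"zeta={zeta}: {single:.3f} dB per CDR, "
          f"{10 * single:.3f} dB after 10 cascaded CDRs")
```

In this model the peak stays above 0 dB for any finite damping factor, so increasing the damping reduces the peaking without removing it, which is why cascaded links are sensitive to it.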

This work presents a PVT-robust fully-synthesizable CDR using an injection-locking technique. The proposed CDR topology with calibration compensates for process variations and also tracks variations in voltage and temperature during circuit operation.

This paper is organized as follows: Section II presents the proposed CDR architecture, briefly explains its advantages over other topologies, and describes its implementation in detail. Section III presents the measurement results. Finally, the conclusion is given in Section IV.

# II. PROPOSED INJECTION LOCKED CDR

## A. Architecture of CDR

Fig. 1(b) shows the basic architecture of the CDR presented in this paper. A frequency-locked loop (FLL) is used to ensure frequency lock to the incoming data stream. The control code of the calibration oscillator in the FLL is used to generate the control codes for the primary and secondary oscillators in the injection-locked path. The injection-locked path is responsible for achieving phase lock to the incoming data. The purpose of the secondary oscillator is to filter out the momentary amplitude variations caused by injection and to provide a constant amplitude during operation [3]. Also, by adjusting the strength of injection from the primary oscillator to the secondary oscillator, large cycle-to-cycle jitter resulting from duty cycle distortion [8] can be filtered out. Because several oscillators are involved in the architecture, it is crucial to ensure the correct operating frequency of each of them. This requirement is reinforced by the feed-forward nature of injection-locked systems, which are inherently incapable of ensuring frequency lock. Any mismatch between the free-running frequencies of the primary, secondary and calibration oscillators will result in injection at an improper time, which will have adverse effects on the recovered data stream. The calibration mechanism used in this architecture ensures that all three oscillators run at the same frequency. It also tracks and compensates for local voltage and temperature differences, thus creating a system that is robust over wide PVT variations.
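The sensitivity to frequency mismatch described above can be made concrete with a short worked estimate. This is a simplified sketch rather than an analysis from the paper: it assumes an oscillator free-running at $f_{osc}$, a data rate $f_b$, and no injection during a run of $N$ consecutive identical bits. The symbols $f_{osc}$, $f_b$, $N$ and $\Delta\phi$ are introduced here for illustration only. With $T_{osc} = 1/f_{osc}$ and $T_b = 1/f_b$, the timing error accumulated before the next data edge, expressed in unit intervals (UI), is

$$
\Delta\phi = \frac{N\,(T_{osc} - T_b)}{T_b} = N\,\frac{f_b - f_{osc}}{f_{osc}} \quad [\mathrm{UI}]
$$

For example, an oscillator running 1% slow ($f_{osc} = 0.99 f_b$) over $N = 7$ bits, the longest run in a 2<sup>7</sup>-1 PRBS pattern, drifts by approximately 0.07 UI before the next edge can realign it.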

## B. Details of Implementation

The detailed implementation of the CDR is shown in Fig. 2. As explained in the previous section, the system ensures frequency lock with the help of an FLL. The three feedback loops shown in the block diagram calibrate the primary DCO and the secondary DCO to work in unison with the calibration DCO. The details of the calibration technique are presented later in this section. Phase lock is achieved by injection locking the primary DCO to the incoming data stream. The phase filtering technique for filtering out large cycle-to-cycle jitter is implemented as a multi-phase injection. The system comprises three feedback loops and one feed-forward path. The three feedback loops ensure that all the oscillators in the system work at the same free-running frequency, so that mistimed injections are avoided. The feed-forward injection path then recovers the clock by phase locking to the incoming data stream. The digitally controlled oscillator (DCO) used in the system is implemented as three three-stage ring oscillators coupled to a nine-stage interpolative ring [1]. The phase-interpolative outer ring acts as a coupler for the three independent ring oscillators. This architecture helps provide better phase balance and hence is a good choice for fully-synthesized systems, where the components are automatically placed and routed. The placement and routing of the DCOs deserve particular mention. Since the CDR system presented in this paper is fully synthesized, the control over the placement and routing is very limited. Any difference in the placement and routing



Fig. 2. Detailed block diagram of injection-locked CDR with PVT calibration.

between the primary DCO, secondary DCO and the calibration DCO in the system could result in tuning characteristics that vary from DCO to DCO. Although the mismatch between DCOs can be calibrated, the same layout is applied to all DCOs so that their tuning characteristics match. The DCO layout is first generated individually, and all blocks are then combined using this DCO layout.

Fig. 3 shows the two main features incorporated in the design, namely phase filtering [8] and edge injection [1]. When used together, these features can help reduce the duty cycle distortion caused by the injection signal in conventional injection-lock-based CDRs. The detailed analysis done by the authors of [8] found that using a pulse for injection has the disadvantage of creating distorted pulses. This is the result of the injection signal trying to correct the jitter accumulated over some period of time. This instantaneous phase correction results in duty cycle distortion, which appears as a shortened pulse width and tightens the requirements for the downstream circuitry. The working principle of edge injection can be understood by studying Fig. 3(b) [1]. When a data transition occurs, the injection control detects the event and creates a window pulse. The DCO is momentarily suspended while the window pulse is asserted. A delayed version of the data edge is then used for resuming the normal



Fig. 3. (a) Conceptual block diagram of phase-filtering using multi-path injection. (b) Basic working principle of edge injection mechanism.



Fig. 4. Implementation details of the DCO and DAC using standard cells.



Fig. 5. Conceptual diagram of the calibration mechanism.

functioning of the DCO. This operation forces the edge of the DCO to be aligned to the edge of the incoming data signal thus achieving phase lock. The sudden disturbance in the DCO signals while using pulse injection can be minimized using this technique.
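The phase realignment described above can be sketched with a small behavioural model. This is an idealized illustration, not the authors' simulation: it assumes that every data transition realigns the DCO edge perfectly, that the only error source is a constant frequency offset, and it uses the common x<sup>7</sup> + x<sup>6</sup> + 1 polynomial for the test pattern. All function names and the 1% offset are hypothetical.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS7 pattern (x^7 + x^6 + 1 LFSR)."""
    state = seed & 0x7F
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits

def peak_phase_error_ui(bits, freq_offset, edge_injection):
    """Largest phase error (in UI) reached while receiving `bits`.

    freq_offset is the fractional mismatch between the DCO free-running
    frequency and the data rate, so the error grows by that amount per bit.
    """
    error = 0.0
    peak = 0.0
    prev = bits[0]
    for bit in bits[1:]:
        error += freq_offset           # drift over one bit period
        peak = max(peak, abs(error))   # error just before any correction
        if edge_injection and bit != prev:
            error = 0.0                # assumed ideal realignment at the edge
        prev = bit
    return peak

pattern = prbs7(254)  # two periods of the 127-bit sequence
free_run = peak_phase_error_ui(pattern, 0.01, edge_injection=False)
locked = peak_phase_error_ui(pattern, 0.01, edge_injection=True)
print(f"peak error without injection: {free_run:.2f} UI")
print(f"peak error with edge injection: {locked:.2f} UI")
```

In this model the error with injection is bounded by the longest run of identical bits multiplied by the frequency offset, whereas the free-running error grows without bound. A real DCO would also show jitter and a finite realignment strength, which are not modelled here.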

A standard-cell-based DAC is also implemented to achieve a wide operating range for the CDR. Fig. 4 shows the implementation details of the DAC [1]. As the figure shows, the current delivered to the load by the DAC is set by the control code D0 through D4. The feedback-connected NAND gate acts as a current mirror, mirroring the current generated by the driving NAND gate cells.

The working principle of the calibration technique used in the system can be understood from Fig. 5. When the system is initialized, counters in the feedback paths determine the oscillation frequencies of the calibration DCO, primary DCO and secondary DCO. The count values are compared with the frequency control word (FCW) to achieve frequency lock. The count values are then stored in the corresponding accumulators. An add-subtract module then compares the instantaneous count values of the primary-DCO counter and secondary-DCO counter with that of the calibration-DCO counter. This lets the system determine the offset values to be added or subtracted in order to compensate for the frequency difference caused by PVT variations. It can be noticed from Fig. 5 that the acquisition phase takes a considerable amount of time to achieve frequency lock, but this is a one-time process and does not slow the system in any way during its operation phase.
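The calibration flow described above can be sketched as a behavioural model. This is a minimal sketch under stated assumptions and not the fabricated logic: it assumes linear DCO tuning, a bang-bang (one LSB per step) update, and hypothetical values for the reference frequency, FCW, and the per-DCO base frequencies and gains. The names `Dco`, `calibrate` and `sign` are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class Dco:
    base_hz: int   # frequency at code 0 (hypothetical value)
    gain_hz: int   # frequency step per code LSB (hypothetical value)

    def count(self, code, ref_hz):
        """Cycles counted by the feedback counter in one reference period."""
        return (self.base_hz + self.gain_hz * code) // ref_hz

def sign(x):
    return (x > 0) - (x < 0)

def calibrate(cal, followers, fcw, ref_hz, steps=2000):
    """Lock the calibration DCO to fcw, then trim each follower to match it."""
    cal_code = 0
    offsets = {name: 0 for name in followers}
    for _ in range(steps):
        cal_count = cal.count(cal_code, ref_hz)
        counts = {name: dco.count(cal_code + offsets[name], ref_hz)
                  for name, dco in followers.items()}
        # FLL: compare the calibration count with the FCW.
        cal_code += sign(fcw - cal_count)
        # Add-subtract step: trim each follower towards the calibration count.
        for name in followers:
            offsets[name] += sign(cal_count - counts[name])
    return cal_code, offsets

REF_HZ = 50_000_000   # assumed reference frequency
FCW = 201             # 201 x 50 MHz = 10.05 GHz
cal_dco = Dco(9_000_000_000, 10_000_000)
dcos = {
    "primary": Dco(8_900_000_000, 11_000_000),    # mismatched on purpose
    "secondary": Dco(9_200_000_000, 9_000_000),
}
cal_code, offsets = calibrate(cal_dco, dcos, FCW, REF_HZ)
print("calibration code:", cal_code)
print("offsets:", offsets)
```

After convergence all three counters read the FCW even though the three modelled DCOs have different tuning curves. Because the loop updates on every step, running it continuously in this model would also follow a slow drift in any `base_hz`. Code-range clamping and counter noise are omitted.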

# III. MEASUREMENT RESULTS

The fully-synthesized CDR presented in this paper is fabricated in a 28nm FDSOI process. The chip micrograph is shown in Fig. 7. The area occupied by the core is $140\,\mu\mathrm{m} \times 80\,\mu\mathrm{m}$. Fig. 7 also shows the layout view of the presented fully-synthesized CDR. The use of an identical layout for the DCOs can be seen in this figure; it helps maintain identical tuning characteristics, as discussed in the previous sections. The CDR functionality at the designed data rate of 10.05Gbps was validated by observing the eye pattern of the recovered data, which is shown in Fig. 6. A pulse pattern



Fig. 6. Measured eye-pattern of the recovered data stream at 10.05Gbps.



Fig. 7. Chip micrograph and the layout. The DCOs in the system are identical in placement and routing.

TABLE I. COMPARISON OF PRESENTED IL-CDR WITH STATE-OF-THE-ART HIGH DATA RATE CDRS.

|           | Technology  | Data Rate [Gb/s] | Feature                   | Power Consumption [mW]       | Locking Time [bits]    | Area [mm<sup>2</sup>]   |
|-----------|-------------|------------------|---------------------------|------------------------------|------------------------|-------------------------|
| [3]       | 90nm CMOS   | 20               | Analog                    | 175                          | 1                      | 0.96                    |
| [5]       | 180nm CMOS  | 10               | Analog                    | 200                          | 32                     | 3.4                     |
| [6]       | SiGe BiCMOS | 10.3             | Analog                    | 405                          | N/A                    | 1.31                    |
| [7]       | 130nm CMOS  | 10               | Analog                    | 1200                         | 5                      | 6.25                    |
| [9]       | 130nm CMOS  | 5.0              | Digital                   | 120                          | N/A                    | 0.56                    |
| [10]      | 130nm CMOS  | 2.5              | Analog/Digital hybrid     | 7.0                          | N/A                    | 0.2                     |
| This work | 28nm FDSOI  | 10.05            | Digital Fully-Synthesized | 16                           | 1                      | 0.011                   |

generator was used to generate a 2<sup>7</sup>-1 PRBS data stream, which is used as the input to the system. The CDR achieved a phase noise that is well below -110dBc/Hz from a 10kHz offset. The calculated integrated jitter from 10kHz to 40MHz is 0.02ps when an ideal data stream is used. The measured phase noise under free-run and locked conditions is shown in Fig. 8. Table I compares the performance of the presented fully-synthesizable CDR with other state-of-the-art CDRs. The advantage of a fully-synthesizable system over conventional custom-designed CDRs is evident from the drastic reduction in chip area. The high efficiency and simplicity of the injection-locking technique can be appreciated by comparing the power consumption.
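The power and area comparison can be made explicit by normalizing the Table I entries. The short script below is an illustration added for convenience: energy per bit (power divided by data rate) is a metric introduced here and is not reported in the paper, and the dictionary simply copies the table values. The core area is recomputed from the 140 µm × 80 µm dimensions given above.

```python
# (data rate [Gb/s], power [mW], area [mm^2]) copied from Table I
TABLE_I = {
    "[3]": (20.0, 175.0, 0.96),
    "[5]": (10.0, 200.0, 3.4),
    "[6]": (10.3, 405.0, 1.31),
    "[7]": (10.0, 1200.0, 6.25),
    "[9]": (5.0, 120.0, 0.56),
    "[10]": (2.5, 7.0, 0.2),
    "This work": (10.05, 16.0, 0.011),
}

def energy_per_bit_pj(rate_gbps, power_mw):
    """mW / (Gb/s) = 1e-3 W / 1e9 bit/s = pJ per bit."""
    return power_mw / rate_gbps

energy = {name: energy_per_bit_pj(rate, power)
          for name, (rate, power, _) in TABLE_I.items()}
for name, value in sorted(energy.items(), key=lambda item: item[1]):
    print(f"{name:>9}: {value:7.2f} pJ/bit, {TABLE_I[name][2]} mm^2")

# Core area from the reported dimensions: 140 um x 80 um, converted to mm^2.
core_area_mm2 = (140e-3) * (80e-3)
print(f"core area: {core_area_mm2:.4f} mm^2")
```

Such a normalization ignores differences in technology node, data rate and feature set, so it is only a rough indicator.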

# IV. CONCLUSION

This paper presents a fully-synthesizable PVT-robust CDR using an injection-locking technique. The challenges to be overcome while implementing a fully-synthesizable all-digital system are briefly discussed, and a solution is presented to overcome them. A CDR incorporating these techniques is implemented in silicon to verify the claims. The system achieves top-of-the-class area and power efficiency



Fig. 8. Measured phase noise under free-run (grey line) and injection-locked (black line) conditions. Injection is done from a continuously toggling data stream.

by using the fully-synthesizable all-digital architecture. The calibration scheme used in the system makes it robust over wide PVT variations by tracking and compensating for process, voltage and temperature variations.

#### ACKNOWLEDGMENT

This work was partially supported by MIC, SCOPE, MEXT, STARC, and VDEC in collaboration with Cadence Design Systems, Inc., and Synopsys, Inc. The authors thank Prof. Nishiyama for his measurement support.

#### REFERENCES

- [1] W. Deng, et al., "A 0.0066mm<sup>2</sup> 780μW Fully Synthesizable PLL with a Current-Output DAC and an Interpolative Phase-Coupled Oscillator Using Edge-Injection Technique," *IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 266-267, Feb. 2014.
- [2] M.-T. Hsieh and G. Sobelman, "Architectures for Multi-Gigabit Wire-Linked Clock and Data Recovery," *IEEE Circuits Syst. Mag.*, vol.8, pp. 45-57, Aug. 2008.
- [3] J. Lee and M. Liu, "A 20-Gb/s Burst-Mode Clock and Data Recovery Circuit Using Injection-Locking Technique," *IEEE Journal of Solid-State Circuits*, vol.43, no.3, pp. 619-630, Mar. 2008.
- [4] F. Samii, et al., "Jitter Peaking Investigation in Charge Pump Based Clock and Data Recovery Circuits," *AFRICON 2007*, Sep. 2007.
- [5] C. F. Liang, et al., "A 10Gbps Burst-Mode CDR in 0.18μm CMOS," *IEEE Custom Integrated Circuits Conference (CICC)*, pp. 559-602, Sep. 2006.
- [6] J. Zhan, et al., "A Full-Rate Injection-Locked 10.3Gb/s Clock and Data Recovery Circuit in a 45GHz-fT SiGe Process," *IEEE Custom Integrated Circuits Conference (CICC)*, pp. 557-560, Sep. 2005.
- [7] M. Nogawa, et al., "A 10Gb/s Burst-Mode CDR IC in 0.13μm CMOS," *IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 228-229, Feb. 2005.
- [8] H.-T. Ng, et al., "A Second-Order Semidigital Clock Recovery Circuit Based on Injection Locking," *IEEE Journal of Solid-State Circuits*, vol.38, no.12, pp. 2101-2110, Dec. 2003.
- [9] J. L. Sonntag, et al., "A Digital Clock and Data Recovery Architecture for Multi-Gigabit/s Binary Links," *IEEE Journal of Solid-State Circuits*, vol.41, no.8, pp. 1867-1875, Aug. 2006.
- [10] W. Yin, et al., "A TDC-less 7mW 2.5Gb/s Digital CDR with Linear Loop Dynamics and Offset-Free Data Recovery," *IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 440-442, Feb. 2011.