DLL-based clock recovery cranks up net bandwidths
Ransom Stephens - July 2, 2012
High-speed serial (HSS) technology relies on its ability to recover a data-rate clock from logic transitions. Not only does this “embedded clocking” eliminate the need for an extra lane to carry a clock signal, it also makes receivers more tolerant of jitter. Because the clock is recovered from the data transitions, jitter at frequencies below the bandwidth of the clock-recovery circuit appears both on the data itself and on the recovered clock that positions the sampling point. When the data and the sampling point jitter together, that jitter doesn’t cause errors.
The idea of clock recovery is to synchronize a local oscillator to incoming data transitions. It follows that clock-recovery circuits require sufficient transitions to lock. To this end, HSS standards use data-encoding techniques, sometimes combined with scrambling. Early generations of HSS standards like PCIe Gen 2, SAS/SATA, and so forth used 8B/10B encoding, which guarantees that runs of identical bits never exceed five. That’s 25% overhead (10 transmitted bits for every 8 information bits), which reduces a 2.5 Gb/s line rate to a net 2 Gb/s.
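The overhead arithmetic is easy to check. Here is a minimal sketch; the function names are mine for illustration and don't come from any standard.

```python
def encoding_overhead(payload_bits, coded_bits):
    """Extra transmitted bits as a fraction of the payload bits."""
    return (coded_bits - payload_bits) / payload_bits

def net_rate(line_rate_gbps, payload_bits, coded_bits):
    """Payload throughput that survives the encoding, in Gb/s."""
    return line_rate_gbps * payload_bits / coded_bits

# 8B/10B: 10 bits on the wire for every 8 information bits.
print(encoding_overhead(8, 10))   # 0.25
print(net_rate(2.5, 8, 10))       # 2.0
```

Note that the 25% figure is measured against the payload; measured against the line rate, the same two extra bits cost 20%.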
Standard clock-recovery schemes like PLLs (phase-locked loops) and their digital cousins, PIs (phase interpolators), lose lock without a transition every several bits. As data rates climb, every signal impairment (noise, jitter, crosstalk, and inter-symbol interference) becomes more egregious: eye diagrams close, and logic transitions lose their sharp edges. Rather than a tidy digital rise or fall, we’re lucky if transitions surmount half the voltage swing.
To meet bandwidth demands, ever more robust clock-recovery schemes are necessary. But 25% overhead is too high.
Enter DLLs (delay-locked loops), whose reduced power, size, and complexity combine with simpler transfer functions and improved stability to enable clock-recovery circuits that remain locked after dozens of identical bits. A DLL is like a PLL with a delay line replacing the VCO (voltage-controlled oscillator). DLLs compare the timing of logic transitions to a set of fixed phase intervals, determining whether the clock phase is ahead of or behind the transition. A running sum of leads and lags accumulates over a time interval, effectively the reciprocal of the clock-recovery bandwidth, and then the phase of the recovered clock is advanced or delayed accordingly. The beauty of DLL-based clock recovery is its tolerance of long sequences without a transition.
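The lead/lag accumulation can be sketched as a small behavioral model. This is only an illustration under my own assumptions, not a circuit description: the function name `track_phase`, the 16-bit `window` (standing in for the averaging interval), and the 0.01 UI `step` are all hypothetical.

```python
def track_phase(edge_offsets, window=16, step=0.01, phase=0.0):
    """Bang-bang phase tracking with one correction per window of bits.

    edge_offsets: one entry per bit; the data edge position in unit
    intervals (UI), or None when that bit has no transition.
    Returns the clock phase (in UI) after every bit.
    """
    votes = 0      # running sum of lead/lag decisions in this window
    count = 0      # bits seen in this window
    history = []
    for edge in edge_offsets:
        if edge is not None:
            # Edge later than the clock phase: the clock is ahead, vote to delay it.
            votes += 1 if edge > phase else -1
        count += 1
        if count == window:
            if votes > 0:
                phase += step
            elif votes < 0:
                phase -= step
            votes = 0
            count = 0
        history.append(phase)
    return history

# Four edges followed by a 40-bit run with no transitions, repeated.
pattern = [0.1] * 4 + [None] * 40
trace = track_phase(pattern * 50)
print(round(trace[-1], 3))
```

In this model the phase walks toward the 0.1 UI edge position and then dithers by about one step around it. Windows that contain no transitions cast no votes, so the phase simply holds still through a long run instead of drifting.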
Now combine DLL clock recovery with self-synchronized data scrambling, which reduces the probability of extremely long runs of consecutive identical bits to negligible levels, and the overhead of requiring high local transition densities is knocked way back. (Or so it is said in the literature. I’m looking for a source that has the cumulative distribution functions of runs of consecutive identical bits; if you have one, please post it as a comment!) PCIe Gen 3 combines 128B/130B encoding and self-synched scrambling to beat the overhead down to about 1.5% (2 extra bits for every 128).
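For a rough yardstick on run lengths, one can assume the scrambled bits behave like independent fair coin flips. That is an idealization of mine, not a property any standard guarantees, and real scrambler output is deterministic, so particular payloads can do worse. Under that assumption run lengths are geometric, and the sketch below (function names are hypothetical) gives the distribution.

```python
from fractions import Fraction

def run_at_least(n):
    """P(a run of identical bits has length >= n), assuming i.i.d. fair bits."""
    return Fraction(1, 2 ** (n - 1))

def run_cdf(n):
    """P(run length <= n) under the same assumption."""
    return 1 - Fraction(1, 2 ** n)

for n in (5, 10, 20, 40, 80):
    print(n, float(run_at_least(n)))
```

Each additional bit of run length halves the probability, which is why runs of many dozens of bits are treated as negligible for random-looking data.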