March 02, 2026 | 20 Min Read
Timing 201 #15: The Case of the Suicide Clock
Author: Kevin G. Smith
Introduction
A typical first step in architecting a system level clock tree is to identify all the clocks in a block diagram and list them in a spreadsheet. Such a spreadsheet will generally itemize the clock names, frequencies, signal formats, and jitter requirements. This process helps guide the selection of appropriate clock components.
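As a minimal sketch, the same spreadsheet can also be kept as a small data structure so that it can be sorted and queried. The clock names other than SYSTEM_CLK, and all of the frequencies, formats, and jitter limits below, are illustrative placeholders and are not taken from any real design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockEntry:
    """One row of the clock inventory spreadsheet."""
    name: str
    frequency_mhz: float
    signal_format: str
    max_rms_jitter_fs: float  # jitter requirement, femtoseconds RMS


# Hypothetical entries for illustration only.
CLOCK_INVENTORY = [
    ClockEntry("SYSTEM_CLK", 100.0, "LVCMOS", 1000.0),
    ClockEntry("SERDES_REFCLK", 156.25, "LVDS", 150.0),
    ClockEntry("PCIE_REFCLK", 100.0, "HCSL", 500.0),
]


def tightest_jitter_first(clocks):
    """Order clocks so the most demanding jitter requirement comes first."""
    return sorted(clocks, key=lambda c: c.max_rms_jitter_fs)


def formats_needed(clocks):
    """List the distinct output formats the clock components must support."""
    return sorted({c.signal_format for c in clocks})


if __name__ == "__main__":
    for clock in tightest_jitter_first(CLOCK_INVENTORY):
        print(clock)
    print(formats_needed(CLOCK_INVENTORY))
```

Sorting by the jitter requirement puts the clocks that constrain component selection at the top of the list.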
It is also worth stepping back early on to identify which clocks are critical to the system and when they need to be operational. This system-level view helps minimize the risk of their interruption. In particular, we need to be on the lookout for any “circular dependencies” that may inadvertently create a “suicide clock” which can take down the system. How this can happen, and what to do about it, is the subject of this post, “The Case of the Suicide Clock”.
A Simple Example
Consider the simplified block diagram below. There is a single Clock Device on the right-hand side (a clock generator or jitter attenuator) which provides a SYSTEM_CLK to the system FPGA/MCU/SoC on the left-hand side for basic operation after power-up. For the sake of this example, assume that if the SYSTEM_CLK is lost for some reason, then the FPGA/MCU/SoC cannot function. This makes SYSTEM_CLK a critical clock and its loss is a single point of failure for the system. Further, once the FPGA/MCU/SoC stops functioning, input clock INCLK2 to the Clock Device will be lost even if the clock device is otherwise still functioning, and all INCLK2-contingent clocks will go into free-run or holdover.
If the clock device’s output SYSTEM_CLK is independent and powers up straight away, what is the issue? In general, there is none. However, what if we place the clock device under serial bus control by the FPGA/MCU/SoC master device itself?
Now the system has a vulnerability due to the I2C/SPI bus and the resulting potential feedback loop. The master device may program the clock device in such a way that SYSTEM_CLK is disabled or interrupted, bringing down the master device itself. This arrangement is an example of a “circular dependency”, after the language used in the Fifth Generation DSPLL® Wireline Jitter Attenuator Reference Manual (https://www.skyworksinc.com/-/media/SSi5361-62-63_SKY63104-05-06_RM.pdf). See the highlighted text below from Section 9.5.
When the RSTb pin is held low or when the RESTART command is issued, all output clocks are disabled. Ensure avoidance of the circular dependencies of supplying clocks to the FPGA/SoC/MCU that programs the clock device.
The primary way a critical clock may be interrupted is if the master device intentionally writes a new volatile frequency plan to the clock device supplying its own critical clock. All the output clocks will be disabled during re-configuration since the device has to go through the initialization process similar to a hard reset. The difference is that dynamic programming bypasses the NVM download.
A secondary way critical clock loss may occur is through “extracurricular” activity on the control bus that is not directly associated with dynamically programming the clock device. For example, the man page for the Linux command-line tool i2cdetect warns of the risk of bus confusion, data loss, or worse when scanning the I2C bus for devices. Whenever a circular dependency is present and the serial bus is connected, there is some risk to the critical clock.
The result of this “circular dependency” vulnerability is a “suicide clock”, which arises when a critical system component (FPGA/MCU/SoC) effectively controls its own critical clock, the absence of which can bring the system down. Recovery may require resetting or power-cycling both devices. In the specific example above, SYSTEM_CLK is a “suicide clock” since the master device is able to commit functional “suicide” by disabling or temporarily interrupting its own SYSTEM_CLK.
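One way to make this check routine is to describe the clock tree and the control bus as a small dependency graph and search it for loops. The sketch below is only an illustration under stated assumptions: the device names "FPGA", "CLKDEV", and "XO", the AUX_CLK entry, and the function names are all hypothetical, and a real system would have many more devices and links.

```python
from collections import defaultdict


def build_dependency_graph(clock_links, control_links):
    """Build edges that point from a device to the devices it depends on."""
    graph = defaultdict(set)
    for source, sink, _name in clock_links:
        graph[sink].add(source)    # the sink needs the source's clock
    for master, target in control_links:
        graph[target].add(master)  # the target's configuration depends on the master
    return graph


def reaches(graph, start, goal):
    """Depth-first search: does start depend, directly or indirectly, on goal?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def find_suicide_clocks(clock_links, control_links, critical):
    """Return critical clocks whose source depends on the device they clock."""
    graph = build_dependency_graph(clock_links, control_links)
    return [name for source, sink, name in clock_links
            if name in critical and reaches(graph, source, sink)]


if __name__ == "__main__":
    control = [("FPGA", "CLKDEV")]  # the FPGA programs the clock device
    # Original topology: the clock device supplies SYSTEM_CLK to the FPGA.
    risky = [("CLKDEV", "FPGA", "SYSTEM_CLK")]
    print(find_suicide_clocks(risky, control, {"SYSTEM_CLK"}))
    # Same check with SYSTEM_CLK sourced from a device the FPGA does not control.
    safe = [("XO", "FPGA", "SYSTEM_CLK"), ("CLKDEV", "FPGA", "AUX_CLK")]
    print(find_suicide_clocks(safe, control, {"SYSTEM_CLK"}))
```

Only clocks listed as critical are reported, so a non-critical clock that the master can afford to lose during reprogramming does not raise a flag.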
A Simple Solution
So how can we rid a system of its “suicide clock”? We have to break the risky “circular dependency”. If a clock is critical, and general clock device programming flexibility is required, then the preferred approach is to supply it from a dedicated stand-alone non-programmable oscillator such as the XO (Crystal Oscillator) shown in the amended figure below.
This eliminates the problem entirely. The master device can program the clock device as needed without interrupting its own critical SYSTEM_CLK. Further, even if inadvertent bus writes interfere with the clock device, they cannot “brick” the master device.
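The difference between the two topologies can be illustrated with a toy model of a reconfiguration sequence. This is a deliberately simplified sketch and not a model of any particular device: it assumes that the first write of a new frequency plan disables all clock device outputs until the final write completes, and the names "CLKDEV" and "XO" and the register writes are hypothetical.

```python
def reconfigure(system_clock_source, plan_writes):
    """Return how many register writes the master completes.

    Toy assumption: once the first write of a new plan lands, the clock
    device ("CLKDEV") keeps its outputs disabled until the last write.
    """
    completed = 0
    clkdev_outputs_enabled = True
    last_index = len(plan_writes) - 1
    for index, _write in enumerate(plan_writes):
        # The master only runs if its SYSTEM_CLK is present.
        master_has_clock = (system_clock_source != "CLKDEV"
                            or clkdev_outputs_enabled)
        if not master_has_clock:
            break  # the master has stopped; the plan is left half-written
        completed += 1
        clkdev_outputs_enabled = (index == last_index)
    return completed


if __name__ == "__main__":
    # Hypothetical (address, value) pairs standing in for a frequency plan.
    plan = [(0x10, 0x01), (0x11, 0x7F), (0x12, 0x00), (0x13, 0x3C), (0x14, 0x01)]
    print("SYSTEM_CLK from CLKDEV:", reconfigure("CLKDEV", plan), "of", len(plan))
    print("SYSTEM_CLK from XO:    ", reconfigure("XO", plan), "of", len(plan))
```

In this model the master clocked by the clock device stalls after its first write, while the master clocked by the independent XO finishes the whole plan.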
Normally, we encourage the application of flexible low jitter clock devices everywhere to support clock trees. However, a genuinely critical system clock is a legitimate exception and a good candidate for an independent XO.
Other Mitigation Options
Are there mitigation options other than using an independent XO? Yes, there are other approaches one can take in some cases, with varying degrees of flexibility and security. However, none of these options is as complete a solution as the independent XO.
- BOM Options Approach
If you only need programming flexibility during development, you can design and lay out the PCB using either 0 Ω resistors or jumpers to connect the I2C/SPI bus between the master and clock devices. Populate these components during development and no-pop them in production. These components can still be installed for troubleshooting purposes.
- Frequency-on-the-Fly Approach
Depending on the frequency changes you need to make, you may be able to implement some output clock changes without disturbing others, such as a critical system clock. See, for example, the relevant application notes for your device.
However, this approach will not support all frequency plan changes in general and cannot prevent risk due to serial bus mischief.
- Si5332 Approach
Finally, the Skyworks Si5332 clock generator uniquely supports disabling the I2C bus via NVM when it is time to create an orderable part number (OPN). This feature is selected as noted below on the Host Interface page in ClockBuilder Pro (CBPro). After the frequency plan has been settled during development with a base or other programmable Si5332, you can create a secure Si5332 OPN with I2C access permanently disabled, even if the bus is connected in hardware.
Summary
This post has briefly covered how to identify and eliminate or mitigate “suicide clocks”. I hope you found it useful.
As always, if you have topic suggestions or questions appropriate for this blog, please send them to kevin.smith@skyworksinc.com with the words Timing 201 in the subject line. I will give them consideration and see if I can fit them in. Thanks for reading.
Cheers,
Kevin
By Kevin G. Smith
Sr. Principal Applications Engineer