Metastability on signals that cross between clock domains (CDC: Clock Domain Crossing) has become a serious issue in today's FPGA designs. Traditional structure-based verification alone is not effective for verifying CDC signals. This column explains the CDC problem and how to verify it in four parts.
Part 1: What are setup time and hold time?
Part 2: What is metastability?
Part 3: General metastability countermeasures and their problems
Part 4: How to verify metastability
Part 2: What is metastability?
Figure 2-1 shows data crossing between two clocks that are asynchronous to each other. In such a case, metastability may occur in the receiving (second-stage) FF (flip-flop).
Figure 2-1 Circuit example where metastability occurs
As explained in the previous column, an FF generally holds data in a feedback loop made of two inverters. If the input is cut off before the new data has completed one trip around the loop, the output signal may sit at an intermediate potential (see Figure 2-2). This is the metastable state.
Figure 2-2 Waveform of a metastable output
If the output signal happens to balance at an intermediate potential between Low (0) and High (1), the metastable state can persist for a long time, like a surfer riding a good wave. Once the balance is lost, the output settles to either Low or High, because the FF is a CMOS circuit.
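The relationship between "how well balanced" and "how long" can be illustrated with a simplified first-order model, in which the offset from the balance point grows exponentially with a time constant tau. The sketch below is only an illustration under that assumption; the function name, the 50 ps time constant, and the 0.5 V "resolved" threshold are hypothetical values, not figures from this column or from any device datasheet.

```python
import math

def resolution_time(initial_offset_v, resolved_offset_v=0.5, tau_s=50e-12):
    """Time for the offset from the balance point to reach resolved_offset_v.

    Assumed first-order model: offset(t) = initial_offset_v * exp(t / tau_s).
    All default values are hypothetical and chosen only for illustration.
    """
    if initial_offset_v == 0:
        # A perfectly balanced node never leaves the balance point in this model.
        return math.inf
    return tau_s * math.log(resolved_offset_v / abs(initial_offset_v))

# The closer the output starts to the balance point, the longer it takes to settle.
for offset in (1e-1, 1e-3, 1e-6, 1e-9):
    t = resolution_time(offset)
    print(f"initial offset {offset:.0e} V -> settles after about {t * 1e12:.0f} ps")
```

In this model the settling time grows only logarithmically as the initial offset shrinks, but it has no upper bound, which is why a metastable state can occasionally last much longer than the normal FF delay.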
Problems with metastability
Metastability is problematic in many ways. In many cases, it is the real cause of non-reproducible errors that are assumed to be software bugs or device malfunctions. Here are some of the problems with metastability.
1. The metastable state can last considerably longer than the normal FF delay.
2. Even after the metastable state ends, it is not known whether the output will settle to Low or High.
3. It cannot be confirmed by simulation or by testing on the actual device.
4. It is not reproducible.
5. It is difficult to determine whether CDC countermeasures are effective.
Reasons for the increase in metastability
The reasons for the recent increase in metastability problems are as follows:
1. Increase in operating frequency ⇒ more clock edges per unit time, which increases the probability of metastability
2. Increase in the number of clock domains
3. Increase in the number of registers
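The three factors above can be related through the commonly used synchronizer MTBF (mean time between failures) estimate, MTBF = exp(t_r / tau) / (T_w * f_clk * f_data). The sketch below is a rough illustration of that formula, not a method taken from this column. Every constant (tau, the window width T_w, the timing overhead, the data rate, and the crossing count) is a hypothetical value; real values depend on the device and must come from the vendor.

```python
import math

# Hypothetical device constants, chosen only for illustration.
TAU_S = 50e-12         # resolution time constant of the FF
T_WINDOW_S = 100e-12   # width of the vulnerable window around the clock edge
T_OVERHEAD_S = 1e-9    # part of each clock period not available for settling

def crossing_mtbf(f_clk_hz, f_data_hz):
    """MTBF in seconds of one crossing: exp(t_r / tau) / (T_w * f_clk * f_data)."""
    t_resolve = 1.0 / f_clk_hz - T_OVERHEAD_S  # time left for the output to settle
    if t_resolve <= 0:
        raise ValueError("clock period is shorter than the assumed overhead")
    return math.exp(t_resolve / TAU_S) / (T_WINDOW_S * f_clk_hz * f_data_hz)

def design_mtbf(per_crossing_mtbf_s, n_crossings):
    """Failure rates add, so n identical crossings divide the MTBF by n."""
    return per_crossing_mtbf_s / n_crossings

for f_clk in (100e6, 200e6, 400e6):
    one = crossing_mtbf(f_clk, f_data_hz=50e6)
    total = design_mtbf(one, n_crossings=1000)
    print(f"{f_clk / 1e6:.0f} MHz: one crossing {one:.3e} s, 1000 crossings {total:.3e} s")
```

Raising the clock frequency hurts twice in this estimate: it adds more sampling edges per second and shortens the time available for the output to settle. More clock domains and more registers mean more crossings, each of which contributes its own failure rate.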
Verifying metastability is very difficult
Metastability cannot be verified by logic or timing simulation, because it arises from a delicate balance of manufacturing process, temperature, and voltage. A circuit that fails about once a minute can be verified on actual hardware, but a circuit that fails only once every several months is difficult to verify even on actual hardware.
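The gap between "once a minute" and "once every several months" can be made concrete with a small calculation. The sketch below assumes failures arrive as a Poisson process, which is a modeling assumption and not something stated in this column. It also takes "several months" to mean 90 days purely for illustration, and the function name is hypothetical.

```python
import math

def prob_failure_observed(test_duration_s, mtbf_s):
    """Probability of seeing at least one failure during a test run.

    Assumes failures follow a Poisson process with mean interval mtbf_s.
    """
    return 1.0 - math.exp(-test_duration_s / mtbf_s)

HOUR = 3600.0
DAY = 24 * HOUR

# Hypothetical failure intervals matching the two cases described in the text.
cases = {"once a minute": 60.0, "once every 90 days": 90 * DAY}

for label, mtbf in cases.items():
    for test_name, duration in (("1 hour", HOUR), ("1 week", 7 * DAY)):
        p = prob_failure_observed(duration, mtbf)
        print(f"failure {label}, test of {test_name}: P(observed) = {p:.4f}")
```

Under these assumptions a short bench test almost certainly exposes the once-a-minute failure, while even a week-long test will usually pass on a circuit that fails once every few months. A passing test therefore says very little about whether a rare metastability problem exists or has been fixed.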
It is equally difficult to determine whether a countermeasure has been effective. As a result, analyzing failures caused by metastability is very time-consuming.
For example, an image processing IP created by our group company had only three clock domains, yet CDC problems caused frequent failures. It took us about a month and a half to identify CDC as the cause of these failures.
Generally, a tool is used to extract the locations where metastability may occur from the circuit structure, and more than several hundred locations are reported as errors or warnings.
It is very difficult to manually analyze each and every one of these reports without missing anything.
That's all for this issue. In the next issue, we will explain general countermeasures against metastability and their problems.