If you need to review any of the prerequisite material, please look at the existing TimeQuest timing analyzer and constraining source synchronous interface online trainings.
Let’s now go through a quick overview.
In many cases, people think of DDR as a memory interface. While this is true, DDR interfaces are not restricted to memory, and we are going to talk about them from a generic point of view. If you happen to use one of Altera’s DDR memory controller IPs, you don’t need to write your own SDC for those memory interfaces; you can continue using the generated .SDC files included with those IPs. If you choose to take a look inside them, you’ll find constructs very similar to those we’ll cover in this class.
Because data is launched and latched on both rising and falling edges of the clock there are additional clock edge transfers we need to analyze.
So setup and hold are analyzed on both the rising and falling edges of the clock.
We also need to place additional timing exceptions as you’ll see later to cut off analysis of unnecessary clock edge relationships.
And with regard to the PLL, most likely in DDR applications you’ll use a PLL on both the receiving and transmitting sides of the interface to improve margins on the timing path.
The transmit device provides data on both edges of the clock. Using a mux, data is presented on the transmit device pins from two IO registers: data high triggers on the positive edge of the clock and data low triggers on the negative edge of the clock.
On the receiving end there are also two registers being used to capture the data, one for the positive edge of the clock and one for the negative.
The clock for the interface may be transmitted either center aligned or edge aligned with respect to the data. If it’s center aligned, the transmit device needs to shift the clock by 90 degrees to achieve the alignment, and in the case of edge alignment, the receiving device shifts the phase of the clock by 90 degrees in order for data to be properly captured.
The first three steps are exactly the same as for single data rate source synchronous interfaces. You would first create a virtual clock that represents the clock on the upstream device launching the data for the interface. In our diagram this would be the ASSP, or application specific standard product, shown on the left.
The second step is to create a clock on the input clock IO of the FPGA. Depending on the alignment of the clock with respect to the data, you may have to specify a phase here.
Once the clocks are defined, define input delays relative to the virtual clock. You would use the appropriate formula to calculate the input delay depending on the type of specification available to you.
The last two steps are specific to DDR interfaces. Here you would need additional set input delay statements to represent the data being launched on the falling edge of the clock.
Then the last step is to specify DDR specific exceptions to cut off unnecessary clock edge transfers in a DDR interface.
Then, since most DDR interfaces would incorporate a PLL in source synchronous mode to preserve the alignment of the clock and the data at the input registers, we also have a generated clock command which creates the clock at the output tap of the PLL, based on the input of the PLL, with no phase shift.
These two lines represent what we have to do for the input clock in the center aligned case.
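As a rough sketch, the center aligned clock constraints described here might look like the following. The 10 ns period, the clock names, the clk_in port name, and the PLL pin paths are all assumptions made for illustration; they are not taken from the slides.

```tcl
# Virtual clock modeling the launch clock inside the upstream device
# (assumed 10 ns period)
create_clock -name virt_clk_in -period 10.0

# Input clock at the FPGA pin, center aligned with the data:
# a 90 degree shift puts the rise at 2.5 ns and the fall at 7.5 ns
create_clock -name clk_in -period 10.0 -waveform {2.5 7.5} [get_ports clk_in]

# Clock at the PLL output tap, with no additional phase shift.
# The pin paths below are hypothetical; use the names from your own design.
create_generated_clock -name rx_pll_clk \
    -source [get_pins {rx_pll|inclk[0]}] \
    [get_pins {rx_pll|clk[0]}]
```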
Here’s an example of the clock constraints you would use if the FPGA receives edge-aligned clock and data and then shifts the clock by 90 degrees in the PLL.
Edge alignment means a 0 degree phase shift at the interface, so first is the virtual clock declaration, then the input clock is created with no phase shift. Finally, there is a generated clock on the PLL output with a phase shift of 90 degrees.
This is because the clock and data are edge aligned outside the FPGA and then shifted by the PLL, so the 90 degree phase shift is on the PLL output, not the clock input.
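For comparison, an edge aligned version of the clock constraints could be sketched like this. Again the period, clock names, port name, and PLL pin paths are assumed for illustration only.

```tcl
# Virtual clock for the upstream device (assumed 10 ns period)
create_clock -name virt_clk_in -period 10.0

# Input clock at the FPGA pin, edge aligned with the data: no phase shift
create_clock -name clk_in -period 10.0 [get_ports clk_in]

# The 90 degree shift is applied on the PLL output, not on the input clock.
# Pin paths are hypothetical.
create_generated_clock -name rx_pll_clk \
    -source [get_pins {rx_pll|inclk[0]}] \
    -phase 90 \
    [get_pins {rx_pll|clk[0]}]
```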
The first two commands here create the max and min delays for data launched on the rising edge of the virtual clock, as that’s the default for the set input delay command. Notice here the target is the data_in IOs, the max and min delays are values we’ll calculate based on available information, and the reference clock is vir clk in.
The third and fourth commands listed here denote the same set of max and min delays but for data launched on the falling edge of the clock, as shown with the clock_fall option. Here, an add_delay option is also needed because this set of delays is on the same set of IOs and we don’t want to override the existing set of delays we’ve specified in the first two lines of the SDC.
Again, the actual values of the min and max delays can be calculated based on the formulas listed on the next slide. The values for the data launched on the rising edge vs the falling edge of the clock should be exactly the same since they use the exact same data trace.
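The four input delay commands just described could be written along these lines. The delay values are placeholders, and the clock name virt_clk_in and the data_in* port pattern are assumed names.

```tcl
# Placeholder delay values in ns; calculate real ones from your specs
set in_max  0.5
set in_min -0.5

# Data launched on the rising edge of the virtual clock (the default)
set_input_delay -clock virt_clk_in -max $in_max [get_ports {data_in*}]
set_input_delay -clock virt_clk_in -min $in_min [get_ports {data_in*}]

# Data launched on the falling edge; -add_delay keeps the two
# constraints above from being overridden on the same ports
set_input_delay -clock virt_clk_in -clock_fall -max $in_max \
    [get_ports {data_in*}] -add_delay
set_input_delay -clock virt_clk_in -clock_fall -min $in_min \
    [get_ports {data_in*}] -add_delay
```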
In the first method, the TCO spec of the upstream device along with the board trace delays are available to you. In this case you will simply add the TCO value of the data with respect to the output clock of the source device to the data trace delay and subtract the clock trace delay to get the max and min delays.
In the second method, setup and hold requirements for the FPGA are given. Here you’ll need to convert these numbers to data delays outside of the FPGA using the formula here. In this case you’ll also need to know the setup and hold latch and launch edges. For center aligned cases, the setup latch minus setup launch and hold launch minus hold latch are both the Time Unit Interval divided by 2, or the period divided by 4.
For the edge aligned case, the setup latch minus setup launch is 0 since it’s the same clock edge, while the hold launch minus hold latch is the clock period over two.
The last method of calculating input delays uses a maximum skew spec. Here the input delay max is simply the skew value while the input delay min is the negative skew value.
Again the decision to use any of these formulas is based on the information available to the FPGA designer.
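To make the first and last methods concrete, here is a small sketch of the arithmetic in Tcl, which can sit directly in an SDC file. Every number below is a made-up example value, and pairing max with min trace delays is the usual worst-case convention rather than something read off the slide.

```tcl
# Hypothetical upstream device and board numbers, in ns
set tco_max        1.2
set tco_min        0.4
set data_trace_max 0.60
set data_trace_min 0.55
set clk_trace_max  0.58
set clk_trace_min  0.52

# Method 1: TCO plus data trace minus clock trace.
# Worst case for max uses the longest data and shortest clock trace;
# worst case for min uses the shortest data and longest clock trace.
set in_max [expr {$tco_max + $data_trace_max - $clk_trace_min}]
set in_min [expr {$tco_min + $data_trace_min - $clk_trace_max}]

# Method 3: maximum skew spec. Max is the skew, min is the negative skew.
set skew        0.4
set in_max_skew $skew
set in_min_skew [expr {-$skew}]
```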
In the same edge transfer case, data launched on one type of clock edge is meant to be latched in on the same type of edge. In this case only same edge setup calculations need to be made, so opposite edge setup analysis can be cut. For hold analysis, we want to make sure the clock transfer does not corrupt the data on the previous clock edge, so for the same edge transfer case, hold analysis between same edges can be cut while hold analysis across opposite edges is preserved.
For opposite edge transfers, the exact opposite is true: setup between same edges can be cut and hold between opposite edges can be cut. We’ll examine this in detail in the next two slides.
First we cut off the setup calculation rising from the virtual clock and falling to the receiving PLL clock.
Then we cut off the setup calculation falling from the virtual clock and rising to the receiving PLL clock.
For hold, same edge transfers are cut, so rise from the virtual clock to rise at the rx PLL clock is cut, and the same is true for fall from the virtual clock to fall at the rx PLL clock.
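The same edge exceptions just walked through might be written as follows, assuming the clocks are named virt_clk_in and rx_pll_clk.

```tcl
# Same edge transfer: cut setup analysis across opposite edges
set_false_path -setup -rise_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]
set_false_path -setup -fall_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]

# Same edge transfer: cut hold analysis between same edges
set_false_path -hold -rise_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]
set_false_path -hold -fall_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]
```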
Since setup is now on the opposite edge, same edge setup analysis needs to be cut, and the first two lines of SDC here do that. We false path setup rising from the virtual clock and rising to the rx PLL clock, and we also false path setup falling from the virtual clock and falling to the rx PLL clock.
For hold, opposite edge analysis needs to be cut, so rise from the virtual clock to fall at the rx PLL clock and fall from the virtual clock to rise at the rx PLL clock are cut.
This is true for both center aligned and edge aligned interfaces.
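For the opposite edge case, the exceptions simply swap which edge pairs are cut. A sketch, again assuming the clock names virt_clk_in and rx_pll_clk:

```tcl
# Opposite edge transfer: cut setup analysis between same edges
set_false_path -setup -rise_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]
set_false_path -setup -fall_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]

# Opposite edge transfer: cut hold analysis across opposite edges
set_false_path -hold -rise_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]
set_false_path -hold -fall_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]
```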
At the top we create a virtual clock to represent the launch of the data on the upstream device.
The second SDC line creates the clock for the input clock. Here, since we have a center aligned relationship, the rise and fall times are listed to represent the 90 degree phase shift.
The third line here brings the clock across a PLL in source synchronous mode. Note that there’s no phase specified here, but the PLL compensation will be part of the clock path.
The derive clock uncertainty command is an Altera specific command that derives the different types of clock variation associated with the clocks on the FPGA.
With the clocks defined, the first set of set input delay commands derives the maximum and minimum delays associated with the data in IOs from the setup and hold requirements given, and relates them to the virtual clock launch on the positive edge of the clock, as that’s the default.
The second set of set input delay commands derives the same set of delays, but the clock fall option means that the data delay here is associated with data being launched on the falling edge of the virtual clock, and the add delay option prevents overriding the existing delays.
Finally, because this is assumed to be a same edge capture interface, opposite edge setup transfers and same edge hold transfers are false pathed because it’s not necessary to analyze those paths.
Follow the steps here to properly constrain the interface. Steps 1, 2, and 4 are the same as for single data rate interfaces.
Here we first create a generated clock on the output IO of the FPGA for the clock out signal. This is so the path to the clk out IO is incorporated in the clock path.
Then, you would specify output delays for the data out signals relative to the generated output clock, based on the formulas we’ll show in a little bit.
After that, to denote that the data out signal is being latched in on both the rising and falling edges of the clock, we need an additional set of output delays with the clock fall and add delay options.
Lastly, add the appropriate exceptions as necessary.
On this slide we see how to constrain a PLL generated clock output. In this example a clock comes into the PLL and the PLL generates two output taps, one going to the data registers and the other is to be used as clock out.
To constrain the clocks in this case, first use create clock for the clock coming into the FPGA. The next two create generated clock commands propagate the clock to the two output taps of the PLL; notice the source for both of these is the input of the PLL. Finally, the last create generated clock command generates a clock at the clock output IO with the second output tap of the PLL as the source.
Alternatively, the middle two commands could have been replaced by the derive pll clocks command, but you would lose the ability to name each of the clocks at the output taps of the PLL.
In this example our first create clock SDC command brings the clock into the FPGA, in our case a 10 ns clock at the clock in IO.
Then the second and third commands use the create generated clock command to propagate the clock to the output taps of the PLL. Notice the source is simply the input of the PLL and the targets are the two output taps of the PLL.
Then the last generated clock command creates a clock at the clock out IO of the FPGA.
In this case we’re using two taps of the PLL, which gives us the flexibility of controlling the exact phase of the output clock vs the tx data clock, which is needed for center aligned interfaces. If we were using an edge aligned interface, we could have used just one PLL tap to save PLL resources.
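A sketch of the four clock commands for the two tap case could look like this. The clock names, port names, and PLL pin paths are assumptions, and the 90 degree phase on the second tap is an assumed setting for a center aligned output.

```tcl
# 10 ns clock coming into the FPGA
create_clock -name clk_in -period 10.0 [get_ports clk_in]

# First PLL tap clocks the data output registers (hypothetical pin paths)
create_generated_clock -name tx_data_clk \
    -source [get_pins {tx_pll|inclk[0]}] \
    [get_pins {tx_pll|clk[0]}]

# Second PLL tap drives the forwarded clock; 90 degrees assumed here
# to center the clock in the data window
create_generated_clock -name tx_out_clk \
    -source [get_pins {tx_pll|inclk[0]}] \
    -phase 90 \
    [get_pins {tx_pll|clk[1]}]

# Clock at the clock out IO, sourced from the second PLL tap
create_generated_clock -name clk_out \
    -source [get_pins {tx_pll|clk[1]}] \
    [get_ports clk_out]
```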
The first two commands here create the max and min delays for data latched on the rising edge of the output clock, as that’s the default for the set output delay command. Notice here the target is the data_out IOs, the max and min delays are values we’ll calculate based on available information, and the reference clock is clock out.
The third and fourth commands listed here denote the same set of max and min delays but for data latched on the falling edge of the clock, as shown with the clock_fall option. Here, an add_delay option is also needed because this set of delays is on the same set of IOs and we don’t want to override the existing set of delays we’ve specified in the first two lines of the SDC.
Again, the actual values of the min and max delays can be calculated based on the formulas listed on the next slide. The values for the data latched on the rising edge vs the falling edge of the clock should be exactly the same since they use the exact same data trace.
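The four output delay commands could be sketched as below. The delay values are placeholders, and clk_out and the data_out* port pattern are assumed names.

```tcl
# Placeholder delay values in ns; calculate real ones from your specs
set out_max  0.8
set out_min -0.6

# Data latched on the rising edge of the output clock (the default)
set_output_delay -clock clk_out -max $out_max [get_ports {data_out*}]
set_output_delay -clock clk_out -min $out_min [get_ports {data_out*}]

# Data latched on the falling edge; -add_delay keeps the two
# constraints above from being overridden on the same ports
set_output_delay -clock clk_out -clock_fall -max $out_max \
    [get_ports {data_out*}] -add_delay
set_output_delay -clock clk_out -clock_fall -min $out_min \
    [get_ports {data_out*}] -add_delay
```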
In the first method, downstream device setup and hold specs along with the board traces are available to you. In this case use the equation to convert these system centric specifications into output delay max and min used by SDC constraints.
In the second method, the desired skew is provided at the FPGA output interface. Again, we would convert that to output maximum and minimum delays with respect to the latch clock.
These formulas are exactly the same as Single Data Rate interfaces except that the latch and launch edges are now different because we’re in a double data rate interface and Time Unit Interval is now cut in half due to the nature of the interface.
The decision to use any of these formulas is based on the information available to the FPGA designer.
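The slide formula for the first method is not reproduced in this narration, so here is a sketch using the usual conversion from downstream setup and hold specs and board traces to output delays. All numbers are made-up example values.

```tcl
# Hypothetical downstream device and board numbers, in ns
set tsu            0.9
set th             0.5
set data_trace_max 0.60
set data_trace_min 0.55
set clk_trace_max  0.58
set clk_trace_min  0.52

# Output delay max: longest data trace minus shortest clock trace,
# plus the downstream setup requirement
set out_max [expr {$data_trace_max - $clk_trace_min + $tsu}]

# Output delay min: shortest data trace minus longest clock trace,
# minus the downstream hold requirement
set out_min [expr {$data_trace_min - $clk_trace_max - $th}]
```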
First, as in the DDR input interface, you’ll need to cut off analysis for clock edge transfers that are not necessary. For same edge transfers, since only setup analysis for same edge transfers is needed, opposite edge setup and same edge hold analysis need to be cut. For opposite edge transfers, same edge setup and opposite edge hold need to be cut.
Just like single data rate interfaces, if your output interface is edge aligned, then you’ll also need a multicycle setup of 0. This is true for both same edge and opposite edge transfers.
Finally, because we have a clock being sent off chip, it does not need to be analyzed as data, so we’ll need to false path that.
First, in same edge transfers, setup needs to be analyzed on the same edge, but because the previous data is on the opposite edge, hold needs to be analyzed on the opposite edge. So we would false path setup rising from the data clock and falling to clock out, and falling from the data clock and rising to clock out. For hold, rising to rising and falling to falling edges are cut.
Finally, if your interface is edge aligned with no positive PLL offset adjustment, you’ll need a multicycle setup of 0 applied only to same edge transfers, since the default setup relationship of greater than 0 is not what we want; we want the same edge setup relationship. It’s important not to apply this to opposite edge transfers because that would move the default hold, and the default hold is already valid, so we do not want to move it for opposite edge transfers.
So in this example we apply a multicycle setup of 0 for rising to rising clock transfers and also to falling edge to falling edge transfers.
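Putting the same edge output exceptions together, a sketch might read as follows. The clock names tx_data_clk and clk_out and the clk_out port name are assumed; the multicycle lines apply only to the edge aligned case with no positive PLL offset.

```tcl
# Same edge transfer: cut opposite edge setup
set_false_path -setup -rise_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]
set_false_path -setup -fall_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]

# Same edge transfer: cut same edge hold
set_false_path -hold -rise_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]
set_false_path -hold -fall_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]

# Edge aligned only: setup multicycle of 0 on the same edge transfers
set_multicycle_path -setup -end 0 -rise_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]
set_multicycle_path -setup -end 0 -fall_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]

# The forwarded clock itself is not analyzed as data
set_false_path -to [get_ports clk_out]
```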
If this is an edge aligned interface with no positive PLL offset, we also need a setup multicycle of 0 on the rise to fall and fall to rise setup edges.
With these exceptions we will only analyze the paths we want at the clock edges we need.
First, we assume the clocks coming into the FPGA are configured. We need to use a generated clock command to bring the clock to the output IO so it shows up in the clock path of the output interface analysis.
Then we use the four set output delay commands to specify the max and min external delays for the data out IOs. Remember we need two sets of output delays, one for the rising edge latch of the data and one for the falling edge latch of the data.
Then we use the four false path commands to cut off the clock edge transfers that do not correspond to the same edge capture we want, so we cut opposite edge setup and same edge hold calculations.
Then, since this is an edge aligned interface, we’ll need a setup multicycle of 0 on the rising to rising and falling to falling edges.
And lastly we use a false path command to cut off the analysis of clk out as data.
In the last portion of this presentation we’ll briefly take a look at analyzing DDR source synchronous interfaces in TimeQuest.
When you’re running reports for DDR interfaces, remember you have the option of specifying specific edge transfers, for example rise from, rise to, fall from, or fall to a clock, when running the report timing command, so you’ll have absolute control over exactly what’s displayed.
Once you understand the exact edge to edge transfers, analyzing DDR interfaces is no different from analyzing any other timing path.
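As an illustration, edge specific reports for a same edge input interface might be requested like this. The clock names are assumed, and the edge options are written in the -rise_from_clock style as best recalled; it is worth confirming the exact option names in the report timing command help for your tool version.

```tcl
# Setup report limited to rise-to-rise transfers on the input interface
report_timing -setup \
    -rise_from_clock [get_clocks virt_clk_in] \
    -rise_to_clock   [get_clocks rx_pll_clk] \
    -npaths 10 -detail full_path \
    -panel_name {DDR input setup: rise to rise}

# Hold report limited to rise-to-fall transfers, which is where
# hold is checked for a same edge capture interface
report_timing -hold \
    -rise_from_clock [get_clocks virt_clk_in] \
    -fall_to_clock   [get_clocks rx_pll_clk] \
    -npaths 10 -detail full_path \
    -panel_name {DDR input hold: rise to fall}
```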
If you would like to learn more about DDR source synchronous interfaces and other timing analysis techniques from an in-person instructor please sign up for the advanced timing analysis instructor-led training.
If you would like to review the source synchronous interface concepts, see the constraining source synchronous interfaces online training.
If you would like to learn more about Tcl syntax, see our free introduction to Tcl online training class.
There are some additional resources for constraining DDR source synchronous interfaces on the Altera wiki site. Finally, Altera provides application note 433, which documents how to constrain and analyze source synchronous interfaces.