Constraining Double Data Rate Source Synchronous Interfaces

1. Constraining Double Data Rate Source Synchronous Interfaces
1.1. Objectives
1.2. Prerequisites
2. DDR Overview
2.1. Source Synchronous DDR Interfaces
2.2. Double Data Rate Complexities
2.3. DDR Input and Output Logic
3. Input Interface
3.1. DDR Input Interface Constraints
3.2. Input Clock – Center-Aligned Clock
3.3. Input Clock – Edge-Aligned Clock
3.4. Setting DDR Input Delay Constraints
3.5. DDR Input Delay Value (Summary Table)
3.6. Timing Exceptions for DDR Inputs
3.7. Same-Edge Transfer False Paths
3.8. Opposite-Edge Transfer False Paths
3.9. DDR Input Timing Example
4. Output Interface
4.1. DDR Output Interface Constraints
4.2. PLL Generated Clock Output
4.3. Toggling Clock Output Register
4.4. Setting DDR Output Delay Constraints
4.5. DDR Output Delay Value (Summary Table)
4.6. Timing Exceptions for DDR Outputs
4.7. Same-Edge Transfer False Paths
4.8. Opposite-Edge Transfer False Paths
4.9. Output Clock False Path
4.10. DDR Output Timing Example
5. Analysis
5.1. Output Rising-Edge Setup Timing Report
5.2. Output Rising-Edge Hold Timing Report
6. Additional Information
7. Learn More Through Technical Training
8. Give Us Your Feedback
9. Thank You

Welcome to Altera’s Constraining Double Data Rate Source Synchronous Interfaces online training. My name is Karl.

The objectives of this presentation are simple. At the end of the presentation, you will be able to constrain both the inputs and the outputs of a double data rate source synchronous interface. You will also be able to use the TimeQuest timing analyzer to report and analyze timing on double data rate source synchronous interfaces.

There is some basic information you should know, and some skills you should have, to get the most out of this presentation. You should understand source synchronous interface theory, including how data is transferred and the general structure of double data rate IO cells. You should know how to constrain single data rate source synchronous interfaces in SDC, and specifically how to derive input and output data delays from the specs given; in this course, we will not derive those in detail. You should understand static timing analysis concepts like slack and input and output delays. You should know how to create SDC constraints for clocks and IOs, including base and generated clocks and input and output delays. You should also have experience using the TimeQuest timing analyzer to analyze timing on designs.
If you need to review any of the prerequisite material, please look at the existing TimeQuest timing analyzer and Constraining Source Synchronous Interfaces online trainings.

This is the agenda for the presentation. I’ll start with a quick introduction to double data rate source synchronous interfaces and describe some of the challenges involved in constraining them. Then I’ll cover the various ways of constraining input and output interfaces. Finally, I’ll finish up with information about how to analyze source synchronous interface timing with the TimeQuest timing analyzer.

Let’s now go through a quick overview.

A double data rate source synchronous interface is a special type of source synchronous interface where data is sent on both the rising and falling edges of the clock. Like its single data rate counterpart, clock and data are sent together from the source device, and clock and data can be either edge aligned or center aligned. Whereas single data rate center-aligned interfaces have the clock shifted 180 degrees to center it with respect to the data, in the double data rate case data is launched on both rising and falling edges, so the data valid window is half as long, and DDR center-aligned interfaces have the clock shifted 90 degrees.

Many people think of DDR as a memory interface. While that is true, DDR interfaces are not restricted to memory, and we are going to talk about them from a generic point of view. If you happen to use one of Altera’s DDR memory controller IPs, you don’t need to write your own SDC for those memory interfaces; you can continue using the generated .SDC files included with those IPs. But if you choose to take a look inside them, you’ll find constructs very similar to those we’ll cover in this class.

In addition to everything you need to be aware of when constraining single data rate source synchronous interfaces, there are some additional complexities you need to be aware of when constraining double data rate interfaces.

Because data is launched and latched on both rising and falling edges of the clock, there are additional clock edge transfers we need to analyze.
So setup and hold are analyzed on both the rising and falling edges of the clock.
We also need to add timing exceptions, as you’ll see later, to cut off analysis of unnecessary clock edge relationships.

And with regard to the PLL, in DDR applications you’ll most likely use a PLL on both the receiving and transmitting sides of the interface to improve margins on the timing path.

Here’s a quick example of a DDR interface.

The transmit device provides data on both edges of the clock. Data is presented on the transmit device pins through a mux fed by two IO registers: data high triggers on the positive edge of the clock and data low triggers on the negative edge of the clock.

On the receiving end there are also two registers being used to capture the data, one for the positive edge of the clock and one for the negative.

The clock for the interface may be transmitted either center aligned or edge aligned with respect to the data. If it’s center aligned, the transmit device needs to shift the clock by 90 degrees to achieve the alignment. In the edge-aligned case, the receiving device shifts the phase of the clock by 90 degrees so that the data is properly captured.

Now that we understand DDR interface basics, let’s start talking about constraining DDR input interfaces to the FPGA.

Here’s an overview of the steps on how to constrain double data rate source synchronous interfaces.

The first three steps are exactly the same as for single data rate source synchronous interfaces. You first create a virtual clock that represents the clock on the upstream device launching the data for the interface. In our diagram this would be the ASSP, or application specific standard product, shown on the left.

The second step is to create a clock on the input clock IO of the FPGA. Depending on the alignment of the clock relative to the data, you may have to specify a phase here.

Once the clocks are defined, define input delays relative to the virtual clock. You would use the appropriate formula to calculate the input delay, depending on the type of specification available to you.

The last two steps are specific to DDR interfaces. Here you need additional set input delay statements to represent the data being launched on the falling edge of the clock.

Then the last step is to specify DDR-specific exceptions to cut off unnecessary clock edge transfers in a DDR interface.

The way you define an input clock for the FPGA depends on the type of clock-to-data alignment you have. In the center-aligned case, make sure the virtual clock does not have a phase, and incorporate the 90 degree phase shift through the create clock waveform option for the input clock. In the example presented here, the clock has a period of 10 ns, so a 90 degree phase shift puts the rising edge at 2.5 ns and the falling edge at 7.5 ns, as denoted in the first line of the SDC on this slide.

Then, since most DDR interfaces use a PLL in source synchronous mode to preserve the alignment of the clock and the data at the input registers, we also have a generated clock command, which creates the clock at the output tap of the PLL based on the input of the PLL, with no phase shift.

These two lines represent what we have to do for the input clock in the center aligned case.
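
As a rough sketch of the center-aligned case, the two input-clock constraints described above, together with the virtual clock they depend on, might look like the following. The 10 ns period and the 2.5/7.5 waveform come from the example; the clock, port, and PLL pin names (virt_clk_in, clk_in, rx_pll_clk, rx_pll|inclk[0], rx_pll|clk[0]) are placeholders chosen for illustration, not names taken from the slide.

```tcl
# Virtual clock modeling the upstream device's launch clock: no phase shift.
create_clock -name virt_clk_in -period 10.0

# FPGA input clock, center aligned to the data. A 90-degree shift of a
# 10 ns period puts the rising edge at 2.5 ns and the falling edge at 7.5 ns.
create_clock -name clk_in -period 10.0 -waveform {2.5 7.5} [get_ports clk_in]

# Clock at the PLL output tap, with no additional phase shift.
# The PLL pin paths are hypothetical and depend on the design hierarchy.
create_generated_clock -name rx_pll_clk \
    -source [get_pins {rx_pll|inclk[0]}] \
    [get_pins {rx_pll|clk[0]}]
```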

If your DDR input clock arrives edge aligned, you have to use the PLL to apply a phase shift to capture the data. In this case use the create generated clock phase option to specify that 90 degree phase shift amount. Again, do not specify the phase shift with the virtual clock.
Here’s an example of the clock constraints you would use if the FPGA receives edge-aligned clock and data and then shifts the clock by 90 degrees in the PLL.
Edge alignment means a 0 degree phase shift at the interface, so first is the virtual clock declaration, then the input clock is created with no phase shift. Finally, there is a generated clock on the PLL output with a phase shift of 90 degrees.
Because the clock and data are edge aligned outside the FPGA and then shifted by the PLL, the 90 degree phase shift is on the PLL output, not the clock input.
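
For comparison, a minimal sketch of the edge-aligned clock constraints could look like this. The 10 ns period is assumed to match the earlier example, and all clock, port, and PLL pin names are placeholders.

```tcl
# Virtual clock for the upstream device: no phase shift.
create_clock -name virt_clk_in -period 10.0

# Input clock arrives edge aligned with the data: default waveform, no shift.
create_clock -name clk_in -period 10.0 [get_ports clk_in]

# The 90-degree shift is applied by the PLL, so it is declared on the
# generated clock at the PLL output tap (hypothetical pin paths).
create_generated_clock -name rx_pll_clk -phase 90 \
    -source [get_pins {rx_pll|inclk[0]}] \
    [get_pins {rx_pll|clk[0]}]
```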

Once the virtual and input clocks are defined, we’ll have to set the input delay for the data IO with respect to the virtual clock. The commands to specify the maximum and minimum delays are stated here. We need two sets of max and min delays: the first set denotes data launched by the rising edge of the clock, and the second set denotes data launched by the falling edge of the clock.

The first two commands here create the max and min delays for data launched on the rising edge of the virtual clock, as that’s the default for the set input delay command. Notice that the target is the data_in IOs, the max and min delays are values we’ll calculate based on available information, and the reference clock is the virtual input clock.

The third and fourth commands listed here denote the same set of max and min delays, but for data launched on the falling edge of the clock, as shown by the clock_fall option. The add_delay option is also needed because this set of delays is on the same set of IOs, and we don’t want to override the delays we specified in the first two lines of the SDC.

Again the actual value of the min and max delays can be calculated based on the formula listed on the next slide. The values for the data launched on the rising edge vs falling edge of the clock should be exactly the same since they use the same exact data trace.
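
The four commands described above can be sketched as follows. The delay values are made-up placeholders, and the clock and port names (virt_clk_in, data_in[*]) are assumed for illustration.

```tcl
# Hypothetical delay values in ns; substitute the calculated numbers.
set in_max  0.5
set in_min -0.5

# Data launched on the rising edge of the virtual clock (the default).
set_input_delay -clock virt_clk_in -max $in_max [get_ports {data_in[*]}]
set_input_delay -clock virt_clk_in -min $in_min [get_ports {data_in[*]}]

# Same values for data launched on the falling edge. -add_delay keeps the
# rising-edge constraints above from being overridden.
set_input_delay -clock virt_clk_in -clock_fall -max $in_max \
    [get_ports {data_in[*]}] -add_delay
set_input_delay -clock virt_clk_in -clock_fall -min $in_min \
    [get_ports {data_in[*]}] -add_delay
```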

Here’s a summary table of the three different ways to calculate max and min delay based on specifications available. We won’t derive each of these as that’s done in the Constraining Source Synchronous Interfaces online training. Please refer to that training if you want to know how each of the values are derived.

In the first method, the TCO spec of the upstream device and the board trace delays are available to you. In this case you simply add the TCO value of the data, relative to the output clock of the source device, to the data trace delay and subtract the clock trace delay to get the max and min delays.

In the second method, the setup and hold requirements for the FPGA are given. Here you’ll need to convert these numbers to data delays outside of the FPGA using the formula here. In this case you’ll also need to know the setup and hold launch and latch edges. For the center-aligned case, setup latch minus setup launch and hold launch minus hold latch are both the unit interval divided by 2, or the period divided by 4.

For the edge-aligned case, setup latch minus setup launch is 0 since it’s the same clock edge, while hold launch minus hold latch is the clock period divided by 2.

The last method of calculating input delays uses the maximum skew spec. Here the input delay max is simply the skew value, while the input delay min is the negative of the skew value.

Again the decision to use any of these formulas is based on the information available to the FPGA designer.
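
To make the arithmetic concrete, here is a small Tcl sketch of the first and third methods. All numbers are invented for illustration. Pairing the maximum data trace with the minimum clock trace (and vice versa) is the usual worst-case convention and is an assumption here, since the narration only says to add the TCO to the data trace and subtract the clock trace.

```tcl
# Method 1: upstream tCO plus board traces (hypothetical values, in ns).
set tco_max        1.2
set tco_min        0.4
set data_trace_max 0.60
set data_trace_min 0.55
set clk_trace_max  0.60
set clk_trace_min  0.55

# Worst-case pairing: latest data against earliest clock, and the reverse.
set in_max [expr {$tco_max + $data_trace_max - $clk_trace_min}]
set in_min [expr {$tco_min + $data_trace_min - $clk_trace_max}]

# Method 3: maximum skew spec (hypothetical value, in ns).
set skew 0.3
set in_max_skew $skew
set in_min_skew [expr {-$skew}]
```

Either pair of results would then be passed to the -max and -min options of set_input_delay.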

For double data rate input interfaces, exceptions are also needed, because data is launched on both rising and falling edges and latched on both rising and falling edges across the same line. By default, TimeQuest analyzes all of the possible edge transfers, which are rising to falling, falling to rising, rising to rising, and falling to falling, for both setup and hold. But this is overkill, since a DDR interface uses either same-edge transfers or opposite-edge transfers, not both.

In the same-edge transfer case, data launched on one type of clock edge is meant to be latched on the same type of edge. Only same-edge setup calculations need to be made, so opposite-edge setup analysis can be cut. For hold analysis, we want to make sure the clock transfer does not corrupt the data from the previous clock edge, so for the same-edge transfer case, hold analysis between same edges can be cut while hold analysis across opposite edges is preserved.

For opposite-edge transfers, the exact opposite is true: setup between same edges can be cut and hold between opposite edges can be cut. We’ll examine this in detail on the next two slides.

Here we illustrate same-edge transfers in detail. Data launched by the rising edge of the clock is meant to be captured by the rising edge of the clock, while data launched on the falling edge of the clock is meant to be captured on the falling edge of the clock. Regardless of whether we have a center-aligned or edge-aligned interface, only rise-to-rise and fall-to-fall setup analysis needs to be done, and only rise-to-fall and fall-to-rise hold analysis needs to be done. So to set up the false paths we use the four commands here.

First we cut off setup calculation rising from the virtual clock and falling to the receiving pll clock.
Then we cut off setup calculation falling from the virtual clock going to the rising receiving pll clock.
For hold, same-edge transfers are cut, so rise from the virtual clock to rise at the rx PLL clock is cut, and the same is true for fall from the virtual clock to fall at the rx PLL clock.
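
Written out, the four same-edge exceptions might look like this sketch. The clock names virt_clk_in and rx_pll_clk are placeholders for the virtual clock and the receiving PLL clock.

```tcl
# Same-edge capture: cut opposite-edge setup checks.
set_false_path -setup -rise_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]
set_false_path -setup -fall_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]

# Same-edge capture: cut same-edge hold checks.
set_false_path -hold -rise_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]
set_false_path -hold -fall_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]
```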

For interfaces meant for opposite-edge transfers, meaning data launched on the rising edge of the clock is latched on the falling edge of the clock and vice versa, you need the false path statements listed here.

Since setup is now on opposite edges, same-edge setup analysis needs to be cut, and the first two lines of SDC here do that: we false path setup rise from the virtual clock to rise at the rx PLL clock, and we also false path setup fall from the virtual clock to fall at the rx PLL clock.

For hold, opposite-edge analysis needs to be cut, so rise from the virtual clock to fall at the rx PLL clock and fall from the virtual clock to rise at the rx PLL clock are cut.

This is true for both center aligned and edge aligned interfaces.
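
A sketch of the opposite-edge version, again with placeholder clock names, simply swaps which checks are cut:

```tcl
# Opposite-edge capture: cut same-edge setup checks.
set_false_path -setup -rise_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]
set_false_path -setup -fall_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]

# Opposite-edge capture: cut opposite-edge hold checks.
set_false_path -hold -rise_from [get_clocks virt_clk_in] \
    -fall_to [get_clocks rx_pll_clk]
set_false_path -hold -fall_from [get_clocks virt_clk_in] \
    -rise_to [get_clocks rx_pll_clk]
```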

Here we see the complete DDR input timing example, using a center-aligned interface with same-edge capture where the setup and hold requirements of the FPGA are given.

At the top we create a virtual clock to represent the launch of the data on the upstream device.
The second SDC line creates the clock for the input clock. Since we have a center-aligned relationship, the rise and fall times are listed to represent the 90 degree phase shift.
The third line here brings the clock across a PLL in source synchronous mode. Note that there’s no phase specified, but the PLL compensation will be part of the clock path.
The derive clock uncertainty command is an Altera-specific command that derives the different types of clock variation associated with the clocks on the FPGA.

With the clocks defined, the first set of set input delay commands specifies the maximum and minimum delays for the data-in IOs, derived from the setup and hold requirements given, and relates them to the virtual clock. The launch is on the positive edge of the clock, as that’s the default.

The second set of set input delay commands specifies the same delays, but the clock fall option means that the data delay is associated with data launched on the falling edge of the virtual clock, and the add delay option prevents overriding the existing delays.

Finally, because this is assumed to be a same-edge capture interface, opposite-edge setup transfers and same-edge hold transfers are false pathed because it’s not necessary to analyze those paths.

With an understanding of the input interfaces, let’s now look at constraining double data rate output interfaces.

On the output side of the interface, the FPGA is outputting both a data out signal and a clock out signal.

Follow the steps here to properly constrain the interface. Steps 1, 2, and 4 are the same as for single data rate interfaces.

Here we first create a generated clock on the output IO of the FPGA for the clock out signal, so that the path to the clk out IO is incorporated in the clock path.
Then you specify output delays for the data out signals relative to the generated output clock, based on the formulas we’ll show in a little bit.
After that, to denote that the data out signal is latched on both the rising and falling edges of the clock, we need an additional set of output delays with the clock fall and add delay options.

Lastly, add the appropriate exceptions as necessary.

For DDR source synchronous interfaces, you can generate the clock output in one of two ways: either with the PLL or with DDIO output registers.

On this slide we see how to constrain a PLL generated clock output. In this example a clock comes into the PLL and the PLL generates two output taps, one going to the data registers and the other is to be used as clock out.

To constrain the clocks in this case, first use create clock for the clock coming into the FPGA. The next two create generated clock commands propagate the clock to the two output taps of the PLL; notice that the source for both of these is the input of the PLL. Finally, the last create generated clock command generates a clock at the clock output IO with the second output tap of the PLL as the source.

Alternatively, the middle two commands could be replaced by the derive pll clocks command, but you would lose the ability to name each of the clocks at the output taps of the PLL.
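
The four clock commands for the PLL-generated clock output might be sketched as follows. The 10 ns period is assumed, and the clock names, port names, and PLL pin paths (tx_pll|inclk[0], tx_pll|clk[0], tx_pll|clk[1]) are hypothetical.

```tcl
# Clock coming into the FPGA (period assumed for illustration).
create_clock -name clk_in -period 10.0 [get_ports clk_in]

# First PLL tap: clocks the data output registers.
create_generated_clock -name tx_data_clk \
    -source [get_pins {tx_pll|inclk[0]}] \
    [get_pins {tx_pll|clk[0]}]

# Second PLL tap: drives the clock output.
create_generated_clock -name tx_out_clk \
    -source [get_pins {tx_pll|inclk[0]}] \
    [get_pins {tx_pll|clk[1]}]

# Clock at the output port, sourced from the second PLL tap.
create_generated_clock -name clk_out \
    -source [get_pins {tx_pll|clk[1]}] \
    [get_ports clk_out]
```

If naming the PLL taps is not important, the two middle commands could instead be a single derive_pll_clocks call.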

The second way to generate a clock output is to use the DDIO registers. Since DDIO registers are what launch the data, using the same type of circuitry for the clock ensures excellent timing between the clock and the data.

In this example, our first create clock SDC command brings the clock into the FPGA, in our case a 10 ns clock at the clock in IO.
Then the second and third commands use create generated clock to propagate the clock to the output taps of the PLL. Notice that the source is simply the input of the PLL and the targets are the two output taps of the PLL.

Then the last generated clock command creates a clock at the clock out IO of the FPGA.

In this case we’re using two taps of the PLL, which gives us the flexibility to control the exact phase of the output clock versus the tx data clock, which is needed for center-aligned interfaces. If we were using an edge-aligned interface, we could have used just one PLL tap to save PLL resources.

Much like the input side, once the output clock is defined, we’ll have to set the output delay for the data IO with respect to the output clock. The commands to specify the maximum and minimum delays are stated here. We need two sets of max and min delays: the first set denotes data latched by the rising edge of the clock, and the second set denotes data latched by the falling edge of the clock.

The first two commands here create the max and min delays for data latched on the rising edge of the output clock, as that’s the default for the set output delay command. Notice that the target is the data_out IOs, the max and min delays are values we’ll calculate based on available information, and the reference clock is clock out.

The third and fourth commands listed here denote the same set of max and min delays, but for data latched on the falling edge of the clock, as shown by the clock_fall option. The add_delay option is also needed because this set of delays is on the same set of IOs, and we don’t want to override the delays we specified in the first two lines of the SDC.

Again the actual value of the min and max delays can be calculated based on the formula listed on the next slide. The values for the data latched on the rising edge vs falling edge of the clock should be exactly the same since they use the same exact data trace.
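
A sketch of the four output delay commands, with invented delay values and placeholder names (clk_out, data_out[*]), could look like this:

```tcl
# Hypothetical delay values in ns; substitute the calculated numbers.
set out_max  0.5
set out_min -0.5

# Data latched on the rising edge of the output clock (the default).
set_output_delay -clock clk_out -max $out_max [get_ports {data_out[*]}]
set_output_delay -clock clk_out -min $out_min [get_ports {data_out[*]}]

# Same values for data latched on the falling edge. -add_delay keeps the
# rising-edge constraints above from being overridden.
set_output_delay -clock clk_out -clock_fall -max $out_max \
    [get_ports {data_out[*]}] -add_delay
set_output_delay -clock clk_out -clock_fall -min $out_min \
    [get_ports {data_out[*]}] -add_delay
```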

Here’s a summary table of the three different ways to calculate max and min output delay based on specifications available. We won’t derive each of these as that’s done in the Constraining Source Synchronous Interfaces online training. Please refer to that training if you want to know how each of the values are derived.

In the first method, downstream device setup and hold specs along with the board traces are available to you. In this case use the equation to convert these system centric specifications into output delay max and min used by SDC constraints.

In the second method, the desired skew is provided at the FPGA output interface, again we would convert that to output maximum and minimum delays with respect to the latch clock.

These formulas are exactly the same as Single Data Rate interfaces except that the latch and launch edges are now different because we’re in a double data rate interface and Time Unit Interval is now cut in half due to the nature of the interface.

The decision to use any of these formulas is based on the information available to the FPGA designer.
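
As an illustration of the first method only, the conversion from downstream setup and hold specs to output delays can be sketched in Tcl. The narration does not spell out the equation, so the form below is the commonly used one and should be read as an assumption; all numbers are invented.

```tcl
# Downstream device specs and board traces (hypothetical values, in ns).
set tsu            0.8
set th             0.4
set data_trace_max 0.60
set data_trace_min 0.55
set clk_trace_max  0.60
set clk_trace_min  0.55

# Assumed conversion: setup adds to the latest data arrival relative to the
# earliest clock; hold subtracts from the earliest data arrival relative to
# the latest clock.
set out_max [expr {$data_trace_max - $clk_trace_min + $tsu}]
set out_min [expr {$data_trace_min - $clk_trace_max - $th}]
```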

Once the delays are set, you’ll need to add several timing exceptions as listed here.

First, as in the DDR input interface, you’ll need to cut off analysis for clock edge transfers that are not necessary. For same-edge transfers, only same-edge setup analysis is needed, so opposite-edge setup and same-edge hold analysis need to be cut. For opposite-edge transfers, same-edge setup and opposite-edge hold need to be cut.

Just like single data rate interfaces, if your output interface is edge aligned, you’ll also need a multicycle setup of 0. This is true for both same-edge and opposite-edge transfers.

Finally, because we have a clock being sent off chip, it does not need to be analyzed as data, so we’ll need to false path that.

This slide illustrates the same-edge exceptions needed.

First, in same-edge transfers, setup needs to be analyzed on the same edge, but because the previous data is on the opposite edge, hold needs to be analyzed on the opposite edge. So we false path setup rising from the data clock to falling at clock out, and falling from the data clock to rising at clock out. For hold, rising-to-rising and falling-to-falling transfers are cut.

Finally, if your interface is edge aligned with no positive PLL offset adjustment, you’ll need a multicycle setup of 0 applied only to same-edge transfers, since the default setup relationship of greater than 0 is not what we want; we want the same-edge setup relationship. It’s important not to apply this to opposite-edge transfers, because that would move the default hold, and the default hold is already valid for opposite-edge transfers.

So in this example we apply a multicycle setup of 0 for rising to rising clock transfers and also to falling edge to falling edge transfers.
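
Put together, the same-edge output exceptions might be sketched as follows. The names tx_data_clk (the clock feeding the data output registers) and clk_out (the generated clock on the output port) are placeholders, and the multicycle lines apply only to the edge-aligned case.

```tcl
# Same-edge capture: cut opposite-edge setup checks.
set_false_path -setup -rise_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]
set_false_path -setup -fall_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]

# Same-edge capture: cut same-edge hold checks.
set_false_path -hold -rise_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]
set_false_path -hold -fall_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]

# Edge-aligned only: setup multicycle of 0 on the same-edge transfers.
set_multicycle_path -setup -end 0 -rise_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]
set_multicycle_path -setup -end 0 -fall_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]
```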

For opposite edge transfers, everything here is very similar to same edge transfers but because we’re doing opposite edge transfers, we only care about setup analysis of the opposite edge transfer and hold analysis of same edge transfers. So we use false path to cut off rise to rise and fall to fall setup calculations and we also cut rise to fall and fall to rise hold calculations.

If this is an edge-aligned interface with no positive PLL offset, we also need a setup multicycle of 0 on the rise-to-fall and fall-to-rise setup edges.

With these exceptions we will only analyze the paths we want at the clock edges we need.
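
The opposite-edge output exceptions follow the same pattern with the edges swapped. This is a sketch with placeholder clock names; the multicycle lines apply only to the edge-aligned case.

```tcl
# Opposite-edge capture: cut same-edge setup checks.
set_false_path -setup -rise_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]
set_false_path -setup -fall_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]

# Opposite-edge capture: cut opposite-edge hold checks.
set_false_path -hold -rise_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]
set_false_path -hold -fall_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]

# Edge-aligned only: setup multicycle of 0 on the opposite-edge transfers.
set_multicycle_path -setup -end 0 -rise_from [get_clocks tx_data_clk] \
    -fall_to [get_clocks clk_out]
set_multicycle_path -setup -end 0 -fall_from [get_clocks tx_data_clk] \
    -rise_to [get_clocks clk_out]
```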

Just like the corresponding single data rate interface outputs, in the DDR case we also have an output clock going to a device IO. TimeQuest thinks all outputs are going to registered logic and expects a delay constraint; otherwise it’s going to flag this as an unconstrained path. Since clock out is only used as a clock, we can cut off the data path analysis using the set false path command going to the clk out port, as shown here.
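
In SDC this is a single line. The port name clk_out is a placeholder.

```tcl
# The clock output port is used only as a clock, so cut it as a data path.
set_false_path -to [get_ports clk_out]
```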

Here’s a complete example of constraining an edge-aligned, same-edge capture interface with FPGA skew requirements given.

First, we assume the clocks coming into the FPGA are already constrained. We need to use a generated clock command to bring the clock to the output IO so it shows up in the clock path of the output interface analysis.

Then we use the four set output delay commands to specify the max and min external delays for the data out IOs. Remember, we need two sets of output delays, one for the rising-edge latch of the data and one for the falling-edge latch of the data.

Then we use the four false path commands to cut off the clock edge transfers that do not correspond to the same-edge capture we want, so we cut opposite-edge setup and same-edge hold calculations.

Then since this is an edge aligned interface, we’ll need a setup multicycle of 0 on the rising to rising and falling to falling edges.

And lastly we use a false path command to cut off the clk out as data analysis.

Now that we have covered both DDR input and output constraints, in the last portion of this presentation we’ll briefly take a look at analyzing DDR source synchronous interfaces in TimeQuest.

Here’s a setup timing report for a DDR interface output. Notice that this is a center-aligned interface, so the launch edge and latch edge are offset by 90 degrees. Also notice that the output delay specified shows up in the data required path, in both the data path and the waveform tabs.

When you’re running reports for DDR interfaces, remember that you have the option of specifying specific edge transfers when running the report timing command, for example rise from, rise to, fall from, or fall to a clock, so you’ll have absolute control over exactly what’s displayed.
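
As a sketch, a rise-to-rise setup report and a rise-to-fall hold report for the output interface might be requested like this in the TimeQuest console. The clock and port names and the panel names are placeholders, and the path count is arbitrary.

```tcl
# Setup report limited to rising launch edge and rising latch edge.
report_timing -setup -npaths 10 -detail full_path \
    -rise_from_clock [get_clocks tx_data_clk] \
    -rise_to_clock [get_clocks clk_out] \
    -to [get_ports {data_out[*]}] \
    -panel_name {DDR out setup: rise to rise}

# Hold report limited to rising launch edge and falling latch edge.
report_timing -hold -npaths 10 -detail full_path \
    -rise_from_clock [get_clocks tx_data_clk] \
    -fall_to_clock [get_clocks clk_out] \
    -to [get_ports {data_out[*]}] \
    -panel_name {DDR out hold: rise to fall}
```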

Here is an output interface hold timing report for a center-aligned interface. Notice again that the launch and latch relationship is exactly what we want, and the minimum output delay is shown in the data required path.

Once you understand the exact edge-to-edge transfers, analyzing DDR interfaces is no different from analyzing any other timing path.

Now we’ve completed our presentation. Here are some additional references that may be relevant to you.

If you would like to learn more about DDR source synchronous interfaces and other timing analysis techniques from an in-person instructor please sign up for the advanced timing analysis instructor-led training.

If you would like to review the source synchronous interface concepts, see the Constraining Source Synchronous Interfaces online training.
If you would like to learn more about Tcl syntax, see our free Introduction to Tcl online training class.

There are some additional resources for constraining DDR source synchronous interfaces on the Altera wiki site, and finally Altera provides app note 433, which documents how to constrain and analyze source synchronous interfaces.

If you would like to receive additional training on other topics involving FPGA design, please see altera.com/training for a complete list of our offerings. Altera offers instructor-led trainings, virtual classroom trainings with a live instructor over the internet, and over 150 free online trainings.

One more thing: when you registered for this on-line training, you should have received a link to a short survey where you can provide feedback on the course. We’d greatly appreciate it if you’d fill out that survey now. We’re constantly updating and improving our training materials, and your feedback helps us create the materials that you want!

Thank you very much for attending the Constraining Double Data Rate Source Synchronous Interfaces online training. My name is Karl, and best of luck with all of your designs.
