Adapted from the webinar How to Debug PCI Express Power Management and Dynamic Link Behaviors by Patrick Connally and Gordon Getty

Introduction

Power management is a key consideration for PCI Express® (PCIe®). Consequently, PCIe specifies the L1 low-power state. When a link is in L1, no data transfer takes place in either direction, so that a PCIe device in the L1 state consumes less power than when in the active L0 state.

L1 substates (designated L1.1 and L1.2, with the original L1 renamed L1.0) offer even deeper power savings than L1, an especially important feature for laptops, tablets, and other battery-powered devices. But all power-saving states impose an exit latency—the time required for a device to resume normal full-power operation when leaving the low-power state. Device designers must measure the latency between low-power states to evaluate the tradeoffs and optimize performance.

This is difficult to do using either a protocol analyzer or an oscilloscope alone. Protocol analyzers can trigger on sequences of events and take very long captures, but they cannot capture analog events. Oscilloscopes can capture analog events, but their acquisitions are very short and difficult to correlate to protocol events. Fortunately, the two instruments complement each other well when examining events, such as L1 substate transitions, that are triggered by higher-layer processes but affect the physical layer. The Teledyne LeCroy CrossSync™ PHY for PCIe software framework synchronizes triggering, acquisition and analysis on the two instruments to provide total link visibility.

Examples in this application note demonstrate how to perform timing measurements for L1 substates based on logic states in the data-link layer or the physical layer’s logical subblock and the corresponding waveforms at the physical layer’s electrical subblock.

Overview of L1 Substates

A device enters the L1 state through one of two mechanisms: Active State Power Management (ASPM) or PCI Power Management (PCI-PM). A device indicates its support for L1 substates and entry mechanisms in its configuration space, and it uses the clock-request signal (CLKREQ#, asserted when low) to enter and exit an L1 substate.

The data-link layer in the PCIe protocol stack handles link-management tasks such as the initialization of flow-control credits, the update of flow-control credits while the link is active in the L0 state, and the acknowledgement and negative-acknowledgement mechanisms that ensure packets maintain integrity across the link. The data-link layer also manages requests to enter L1 and its substates for low-power operation.

Figure 1: Configuration space indicating support for L1 substates and entry mechanisms.
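As a rough illustration of what the configuration-space view reports, the sketch below decodes the support bits of the L1 PM Substates Capabilities register. The bit positions follow the register layout in the PCI Express Base Specification as commonly documented (for example in the Linux kernel's register definitions); the function name and the sample register values are hypothetical, and the layout should be checked against the specification revision in use.

```python
# Support bits in the L1 PM Substates Capabilities register.
# Assumption: layout per the PCIe Base Specification; verify for your revision.
PCI_PM_L1_2_SUPPORTED = 1 << 0
PCI_PM_L1_1_SUPPORTED = 1 << 1
ASPM_L1_2_SUPPORTED = 1 << 2
ASPM_L1_1_SUPPORTED = 1 << 3
L1_PM_SUBSTATES_SUPPORTED = 1 << 4


def decode_l1ss_capabilities(reg: int) -> dict:
    """Return the L1 substate support flags found in a raw 32-bit register value."""
    return {
        "pci_pm_l1_2": bool(reg & PCI_PM_L1_2_SUPPORTED),
        "pci_pm_l1_1": bool(reg & PCI_PM_L1_1_SUPPORTED),
        "aspm_l1_2": bool(reg & ASPM_L1_2_SUPPORTED),
        "aspm_l1_1": bool(reg & ASPM_L1_1_SUPPORTED),
        "l1_pm_substates": bool(reg & L1_PM_SUBSTATES_SUPPORTED),
    }


# Hypothetical register value: every entry mechanism and substate supported.
all_supported = decode_l1ss_capabilities(0x0000001F)
```

A device that sets only some of these bits supports only the corresponding entry mechanism and substate combinations.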

Measuring L1 Substate Timing

To make L1 substate timing measurements, you employ a PCIe protocol analyzer such as the Teledyne LeCroy Summit T54 along with an oscilloscope such as the Teledyne LeCroy LabMaster 10Zi-A. In addition, an interposer monitors communications with the device under test and provides data to the protocol analyzer as well as the oscilloscope.

Figure 2: Test setup for L1 substate measurements using CrossSync PHY for PCIe software.

Specific L1 substate timing measurements we will make with this configuration include:

  • Time from CLKREQ# assertion to the time of valid system reference clock (System REFCLK)
  • Time from System REFCLK becoming valid until the beginning of TS1 training sequences
  • Time from System REFCLK becoming valid until the return to L0

Setting Up Trigger and Acquisition

Configure the oscilloscope to acquire four signals: CLKREQ# (C1), lane 0 upstream data (C2), lane 0 downstream data (C3) and System REFCLK (C4). Place all four signals in a time-locked multi-zoom group so that cursors placed on any one trace measure the same time interval on all four.

Because we care only about the time around the clock request assertion and deassertion, the oscilloscope can be set to sequence mode acquisition of two segments.

Set up the protocol analyzer to trigger the oscilloscope on entry to and exit from the L1 substates by monitoring for the clock-request (CLKREQ#) signal being deasserted and asserted. Entry into the L1 substates correlates with CLKREQ# being deasserted, at which point the protocol analyzer triggers the oscilloscope the first time. When it is time to exit the L1 substate, the protocol analyzer triggers the oscilloscope again, allowing it to capture CLKREQ# being asserted.

Figure 3: Trigger setup to capture entry to and exit from L1.
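The two-trigger scheme can be pictured as simple edge detection on an active-low signal. The sketch below is only an illustration of that logic, not the protocol analyzer's actual trigger configuration; the function name and the sampled-level input format are assumptions made for the example.

```python
def clkreq_trigger_events(samples):
    """Classify CLKREQ# transitions in a list of (time, level) samples.

    level is 0 when CLKREQ# is asserted (low) and 1 when it is deasserted (high).
    A 0 -> 1 transition marks entry to an L1 substate (first trigger);
    a 1 -> 0 transition marks the exit (second trigger).
    """
    events = []
    previous_level = samples[0][1]
    for time, level in samples[1:]:
        if previous_level == 0 and level == 1:
            events.append((time, "entry"))   # CLKREQ# deasserted
        elif previous_level == 1 and level == 0:
            events.append((time, "exit"))    # CLKREQ# asserted
        previous_level = level
    return events


# Hypothetical samples: asserted, then deasserted at t=2, asserted again at t=4.
example_events = clkreq_trigger_events([(0, 0), (1, 0), (2, 1), (3, 1), (4, 0)])
```

Each event corresponds to one segment of the two-segment sequence-mode acquisition described above.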

Probing

A Teledyne LeCroy PE210UIA-4PHY interposer, connected to a Summit T54 protocol analyzer, was used to monitor the communications between the PCIe 4.0 M.2 solid-state drive and host system. This interposer provides various probing points for the oscilloscope to capture analog signals. The CLKREQ# signal was probed with a standard 10:1 passive probe, the lane 0 upstream and lane 0 downstream signals were probed with DH series 30 GHz active differential probes, and the System REFCLK signal was connected directly to a 50-ohm coaxial input on the oscilloscope.

Figure 4: Oscilloscope acquiring CLKREQ#, lane 0 upstream data, lane 0 downstream data and System REFCLK.
Figure 5: Entry to the L1 substate correlated across both traces and occurring when CLKREQ# is deasserted.
Figure 6: Exit from the L1 substate correlated across both traces and occurring when CLKREQ# is asserted.

Measuring CLKREQ# Assertion to Valid REFCLK

One key L1 substate timing measurement is the time from the clock request being asserted (indicating the exit from L1.2 and moving from L1.2 idle to L1.2 exit) to the point when the reference clock starts.

Place oscilloscope horizontal cursors on the clock-request line and the reference clock line to make an accurate measurement of the time from the segment 1 trigger point on the CLKREQ# trace to the start of the REFCLK signal—in this case, 301 ms.

Figure 7: Using cursors, measure the time from trigger on CLKREQ# to start of valid REFCLK.
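The same cursor measurement could be approximated offline on exported waveform data. The following is a minimal sketch under stated assumptions: the waveforms are available as equal-length lists of samples on a shared time axis, CLKREQ# assertion is taken as the first falling crossing of a mid-level threshold, and REFCLK is treated as "started" at the first sample whose magnitude reaches a minimum amplitude. A real validity criterion would also consider frequency stability; the function and parameter names are hypothetical.

```python
def clkreq_to_refclk_delay(times, clkreq, refclk, clkreq_mid, refclk_min_amp):
    """Return the time from CLKREQ# assertion to the first REFCLK activity.

    Returns None if no falling CLKREQ# edge or no REFCLK activity is found.
    """
    assert_index = None
    for i in range(1, len(times)):
        # Falling crossing of the mid-level threshold: CLKREQ# goes low (asserted).
        if clkreq[i - 1] >= clkreq_mid and clkreq[i] < clkreq_mid:
            assert_index = i
            break
    if assert_index is None:
        return None

    for j in range(assert_index, len(times)):
        # Simplified assumption: REFCLK counts as started once it swings past
        # the minimum amplitude in either direction.
        if abs(refclk[j]) >= refclk_min_amp:
            return times[j] - times[assert_index]
    return None


# Hypothetical data: CLKREQ# falls at t=2, REFCLK starts toggling at t=4.
delay = clkreq_to_refclk_delay(
    times=[0, 1, 2, 3, 4, 5],
    clkreq=[3.3, 3.3, 0.0, 0.0, 0.0, 0.0],
    refclk=[0.0, 0.0, 0.0, 0.0, 0.4, -0.4],
    clkreq_mid=1.65,
    refclk_min_amp=0.2,
)
```

The result is expressed in whatever unit the time axis uses, so it can be compared directly with the cursor readout.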

Measuring Valid REFCLK to TS1 Packet

Another key measurement is the time from the reference clock becoming valid until the start of recovery, which initiates the training sequences that are observable on the protocol-analyzer trace.

By selecting the packet on the protocol analyzer display, we can see the time at which REFCLK becomes valid on the oscilloscope trace, and then we can also see exactly when the first training sequence (TS1) happened. Again, horizontal cursors are used to measure the time between the two—in this case 53 ms.

Figure 8: Using cursors, measure the time from valid REFCLK to TS1 packet (identified using protocol analyzer trace).

Measuring Valid REFCLK to L0

A third key timing measurement is the time from a valid REFCLK to the start of L0. This measurement requires a three-step process: find the start of the valid REFCLK, search for the SDS packet (which indicates the return to L0), then get the time of the SDS packet and compare it with the start of REFCLK.

The oscilloscope can show when the reference clock starts, but it cannot readily indicate when the link reaches L0. After training sequences, a device will send an SDS (Start of Data Stream), and at that point the link will reach L0. Use the protocol analyzer search capability to find the next SDS ordered set. This will provide the time stamp of when the link entered L0 to allow for an accurate cursor measurement from valid REFCLK to L0—in this case 384 ms.

Figure 9: Using cursors, measure time from valid REFCLK to start of L0 (identified by searching for SDS).
Figure 10: Using protocol analyzer search to find the next SDS ordered set.
Figure 11: Getting the time of the SDS packet and comparing it to the start of REFCLK.
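The three-step process reduces to a search followed by a subtraction. The sketch below assumes the protocol events have been exported as (timestamp, kind) pairs on the same time base as the oscilloscope's REFCLK timestamp; the event labels and the function name are hypothetical and do not represent an export format of either instrument.

```python
def time_to_first_event(events, kind, t_reference):
    """Return the time from t_reference to the first event of the given kind.

    events is an iterable of (timestamp, kind) pairs sharing a time base with
    t_reference. Events earlier than t_reference are ignored. Returns None if
    no matching event follows the reference time.
    """
    for timestamp, event_kind in sorted(events):
        if event_kind == kind and timestamp >= t_reference:
            return timestamp - t_reference
    return None


# Hypothetical trace: an SDS from before the low-power period, then the
# training sequences and the SDS that follow REFCLK becoming valid at t=90.
trace = [(50, "SDS"), (100, "TS1"), (400, "TS2"), (500, "SDS")]
refclk_to_l0 = time_to_first_event(trace, "SDS", t_reference=90)
refclk_to_ts1 = time_to_first_event(trace, "TS1", t_reference=90)
```

Searching forward from the REFCLK time matters: an SDS that precedes the low-power period must not be mistaken for the one that marks the return to L0. The same helper covers the earlier REFCLK-to-TS1 measurement by changing the event kind.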

Conclusion

The L1 substates can provide considerable power savings on PCIe links. The combination of the oscilloscope, protocol analyzer, interposer and CrossSync PHY for PCIe software provides an effective way of navigating to any point in link operation to conduct detailed analysis of the timing of functions such as the entry into or exit from the L1 substates.

More information on the Teledyne LeCroy CrossSync PHY software can be found on our website.