Introduction

In 2010, PCI Express 3.0 introduced the concept of Link Equalization (LEQ) to the PCI Express (PCIe) specification. At PCIe 1.0 and PCIe 2.0 data rates (2.5GT/s and 5GT/s respectively), signal integrity was a relatively straightforward consideration. Data could be sent and received across the channel between two link partners with minimal impact on the end-to-end system bit error ratio (BER). However, PCIe 3.0 and PCIe 4.0 transmit data over the same infrastructure as PCIe 1.0 and PCIe 2.0. At 8GT/s and 16GT/s, 1.6 and 3.2 times the PCIe 2.0 signaling rate, transmitting data over a channel that was not designed for these rates poses many signal integrity challenges, hence the need for Link Equalization.

Link Equalization involves a precisely timed dynamic negotiation between an Upstream Port and Downstream Port to optimally tune both transmitter (Tx) and receiver (Rx) equalization filters so that the link will operate at a BER of 10^-12 (1 bit error in 1,000,000,000,000 bits received) or better. This application note will henceforth refer to the Tx and Rx equalizers as TxEQ and RxEQ respectively.

The equalization negotiation occurs simultaneously at both the electrical and protocol levels. Viewing the PCIe bus activity on a protocol analyzer or an oscilloscope alone doesn’t tell the entire story. As we will see, the protocol analyzer trace may report that a device’s firmware did change the TxEQ filter settings to those requested by the link partner, while the electrical response on the oscilloscope shows the contrary. Having the ability to capture the entire electrical equalization link training sequence, and then to decode the waveform to view a comprehensive protocol trace in one time-synchronous application, provides the engineer with a detailed perspective on the behavior of the PCIe ecosystem. Teledyne LeCroy’s ProtoSync application software provides this capability.

Link Training Status State Machine (LTSSM) Overview – Speed and Equalization Negotiation

The PCIe 3.0 and PCIe 4.0 Link Equalization process occurs at run time. When a Downstream Port is partnered with an Upstream Port, the designer of the product has no prior knowledge of the channel length or the environment it will operate in. The Downstream Port could be used on a motherboard with short traces, or implemented somewhere on a 24” server blade, for example. Therefore, the equalization settings on both ends of the link must be configurable when the system is powered on to compensate for signal impairments due to channel effects. Once powered on, the link will transition through each state in the LTSSM to the Recovery State. In Recovery, the Upstream and Downstream Ports transmit and receive data using the link and lane established in the Configuration State. Recovery allows the link to adjust the data rate of operation, to re-establish symbol lock, bit lock, and block alignment, and to enter loopback.

Figure 1:

LTSSM Flow Diagram

The Recovery.Equalization sub-state is where the dynamic equalization tuning occurs in four phases for PCIe 3.0 implementations.

Figure 2:

Four Phases of Recovery.Equalization

For PCIe 4.0 implementations, the process of changing speeds to 16GT/s begins with a speed change from 2.5GT/s to 8GT/s as described by the four phases of the Recovery.Equalization procedure. If the Downstream Port enters Recovery with the intention of executing the 8GT/s LEQ procedure, it must not advertise 16GT/s speed capability while in Recovery. Instead, if the Downstream Port intends to operate at 16GT/s, then upon successful execution of the 8GT/s equalization procedure it must transition from Recovery to L0, then transition back to Recovery and advertise 16GT/s capability. During link negotiation, if both the Downstream and Upstream Ports advertise 16GT/s capability, the 16GT/s LEQ procedure is performed through Recovery.Equalization, bypassing Phase 0. Only a speed change from 8GT/s to 16GT/s is necessary; the link does not retrain down to 2.5GT/s.
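The two-step speed change described above can be summarized as a short sequence of steps. The sketch below is a simplified illustration only: the function name and the tuple layout are hypothetical, intermediate Recovery sub-states are omitted, and the phase lists simply follow the description in this note (all four phases at 8GT/s, Phase 0 bypassed at 16GT/s).

```python
def leq_sequence(max_rate_gt_s: float) -> list[tuple[str, float, list[int]]]:
    """Return (state, data rate in GT/s, equalization phases run) steps.

    Hypothetical helper: models only the high-level flow from the text.
    """
    # The link first comes up at 2.5 GT/s with no equalization.
    steps = [("L0", 2.5, [])]
    if max_rate_gt_s >= 8.0:
        # 8 GT/s equalization runs Phases 0 through 3.
        steps.append(("Recovery.Equalization", 8.0, [0, 1, 2, 3]))
        steps.append(("L0", 8.0, []))
    if max_rate_gt_s >= 16.0:
        # 16 GT/s is advertised only after returning to L0 at 8 GT/s,
        # and its equalization bypasses Phase 0.
        steps.append(("Recovery.Equalization", 16.0, [1, 2, 3]))
        steps.append(("L0", 16.0, []))
    return steps

for state, rate, phases in leq_sequence(16.0):
    print(f"{state:<22} {rate:>4} GT/s  phases={phases}")
```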

Figure 3:

Teledyne LeCroy PCIe Protocol Analysis 16GT/s LEQ Trace

Figure 4:

Teledyne LeCroy PCIe Protocol Analysis 16GT/s LEQ LTSSM State Machine Transition History

Link Training Characterization and Debug

The equalization negotiation occurs simultaneously at both the electrical and protocol levels. Teledyne LeCroy’s ProtoSync software allows the user to capture the electrical signal on the oscilloscope and decode the trace using Teledyne LeCroy’s PCIe Protocol Analysis software for an in-depth, low-level understanding of the protocol behavior. Using the oscilloscope application, a side-by-side electrical and protocol view of the PCIe traffic can be rendered simply by ticking the ‘Show Protocol Analysis’ box within the ‘Decode Setup’ dialog.

Figure 5:

Side-by-side view of the electrical and protocol response of a PCIe Gen 3 LEQ negotiation using ProtoSync within the oscilloscope application.

Figure 5 depicts the dynamic link negotiation between a Downstream and Upstream Port from an electrical and protocol perspective simultaneously. The Downstream Port requested 5 variations of cursor coefficients before settling on pre-cursor C-1 = 4, cursor C0 = 16, and post-cursor C+1 = 4. The variation of the signal amplitude on Trace F1 clearly illustrates the Upstream Port changing its TxEQ settings 5 times in response to the Downstream Port’s requests in Phase 2. In addition, it is clear by looking at the amplitude difference on Trace F3 that the Downstream Port changed its TxEQ setting from Preset 7 to Preset 1 in Phase 3. The protocol trace is dynamically linked to the electrical waveform captured on the oscilloscope. The row labeled 57 TS Packets 6004-6120 on the protocol trace depicts the instant when the Upstream Port physically changed its TxEQ settings to pre-cursor C-1 = 4, cursor C0 = 16, and post-cursor C+1 = 4 at the request of the Downstream Port as shown in Figure 6. Clicking on the collapsed view of the 57 TS Packets 6004-6120 row will automatically zoom the electrical waveform on the oscilloscope.
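To relate coefficient values such as these to the amplitude changes seen on the oscilloscope, the sketch below converts a coefficient triple into pre-shoot and de-emphasis in dB. It assumes the usual 3-tap FIR transmitter model in which the coefficients are given as magnitudes and the pre- and post-cursor taps subtract from the cursor; the function name and the returned keys are hypothetical and not part of any Teledyne LeCroy tool.

```python
import math

def txeq_levels(pre: int, cursor: int, post: int) -> dict[str, float]:
    """Pre-shoot and de-emphasis in dB from coefficient magnitudes.

    Assumes a 3-tap FIR model; requires cursor > pre + post.
    """
    vb = cursor - pre - post   # flat level during a long run of identical bits
    va = cursor - pre + post   # first bit after a transition
    vc = cursor + pre - post   # last bit before a transition
    return {
        "full_swing": pre + cursor + post,
        "preshoot_db": 20 * math.log10(vc / vb),
        "deemphasis_db": 20 * math.log10(vb / va),
    }

# Final setting from the negotiation described above.
levels = txeq_levels(pre=4, cursor=16, post=4)
print({name: round(value, 2) for name, value in levels.items()})
```

Under this model the final setting of 4, 16, 4 corresponds to about 6 dB of pre-shoot and about -6 dB of de-emphasis, which is consistent with a clearly visible amplitude change on the waveform.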

Figure 6:

Upstream Port changes its TxEQ to Pre-cursor C-1 = 4, Cursor C0 = 16, and Post-Cursor C+1 = 4

It is important to analyze both the protocol and electrical traffic to gain a complete understanding of what is happening on the bus. In some cases, device firmware that is programmed to request a specific TxEQ setting does not implement the request correctly at run time. For example, in order to operate at a BER of 10^-12 or better, a developer’s Downstream Port requires the Upstream Port to set its TxEQ to Pre-cursor C-1 = 3, Cursor C0 = 21, and Post-cursor C+1 = 0. Figure 7 depicts the Downstream Port requesting the correct TxEQ settings. Z1 is a zoom of the Upstream Port’s electrical waveform and clearly shows the signal being equalized. The link negotiation completed successfully. As a result, the system will operate with a low probability of bit errors.
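When reviewing requested coefficients in a decoded trace, it can help to sanity-check them before looking at the waveform. The sketch below applies three coefficient rules as they are commonly stated for PCIe 8GT/s transmitters (pre-cursor no larger than floor(FS/4), coefficients summing to FS, and flat level no lower than LF). The rules are quoted from memory and should be confirmed against the specification. FS = 24 is inferred from the sum of the coefficients in this example, LF = 8 is a hypothetical value, and the function name is illustrative.

```python
def coefficient_rule_violations(pre: int, cursor: int, post: int,
                                fs: int, lf: int) -> list[str]:
    """Return a list of rule violations for a requested coefficient triple.

    fs is the transmitter's Full Swing value and lf its Low Frequency value,
    both as advertised by the transmitter (values here are assumptions).
    """
    problems = []
    if pre > fs // 4:
        problems.append("pre-cursor exceeds floor(FS/4)")
    if pre + cursor + post != fs:
        problems.append("coefficients do not sum to FS")
    if cursor - pre - post < lf:
        problems.append("flat level falls below LF")
    return problems

# Request from the example above, with assumed FS = 24 and hypothetical LF = 8.
print(coefficient_rule_violations(pre=3, cursor=21, post=0, fs=24, lf=8))
```

An empty list means the request is self-consistent under these assumptions; it says nothing about whether the transmitter actually applied it, which is why the electrical view is still needed.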

Figure 7:

Upstream Port changes its TxEQ to Pre-cursor C-1 = 3, Cursor C0 = 21, and Post-Cursor C+1 = 0

The same Downstream Port can, at times, request incorrect equalization settings from the Upstream Port. This results in a BER of roughly 10^-10 when measured on Teledyne LeCroy’s Protocol Enabled Receiver Transmitter Tolerance Tester, PeRT3. This is an unacceptable BER for effective data transfer between both ends of the link. Figure 8 shows the same Downstream Port as in Figure 7 negotiating the link. In this case, however, the Downstream Port does not request the correct equalization settings from its link partner. In fact, no equalization was requested.

Figure 8:

Downstream Port requests no equalization resulting in an unacceptable system BER.

The analysis in the previous examples points to a problem with the Downstream Port’s firmware. For characterization purposes, having an oscilloscope with extremely long acquisition memory allows the user to capture and decode the entire link equalization training sequence. A 500 µs capture allows the engineer to view every TxEQ setting requested by the Downstream Port. In the case shown in Figure 9, the Downstream Port requested 42 different TxEQ combinations before settling on an optimal combination.
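The memory requirement behind a capture of this length is simple arithmetic, sketched below. The 500 µs duration and the 8GT/s rate come from this note; the 80 GS/s sample rate is a hypothetical value chosen only to make the calculation concrete, and the function name is illustrative.

```python
def capture_budget(duration_s: float, rate_gt_s: float,
                   sample_rate_gs_s: float) -> tuple[int, int]:
    """Return (unit intervals captured, samples per channel)."""
    unit_intervals = round(duration_s * rate_gt_s * 1e9)
    samples = round(duration_s * sample_rate_gs_s * 1e9)
    return unit_intervals, samples

# 500 us at 8 GT/s, sampled at an assumed 80 GS/s.
ui, samples = capture_budget(500e-6, 8.0, 80.0)
print(f"{ui:,} unit intervals, {samples:,} samples per channel")
```

Under these assumptions the capture spans 4 million unit intervals and needs 40 million samples per channel, which illustrates why long acquisition memory matters for this measurement.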

Figure 9:

Deep oscilloscope acquisition memory provides in-depth insight into electrical and protocol behavior. Every TxEQ setting combination requested by the link partner is captured and decoded within the oscilloscope application.

In another example of faulty Downstream Port firmware behavior, the protocol trace appears to behave as expected. However, the electrical signal shows no change in response to the Upstream Port’s request for the Downstream Port to change its TxEQ settings from Preset 7 to Preset 1.

Figure 10:

Protocol confirms TxEQ change, but the electrical response reflects no change on the signal.

Figure 11 depicts another example of how the protocol trace can indicate normal behavior when the physical layer is problematic. In this case, a request is sent for Preset 4, but the device erroneously attenuates the output to 20 mV. This device malfunction is diagnosed using the combination of synchronized protocol and physical layer views on the oscilloscope.

Figure 11:

Upstream Port requests Preset 4, but Downstream Port erroneously attenuates its output signal to 20 mV.

Many developers would like to understand which TxEQ settings result in optimal system performance for their design. Teledyne LeCroy offers a full range of debug and characterization tools to help the engineer understand the limitations and best operating conditions for their implementation. The first instrument of its kind, the PeRT3 is a protocol-aware BERT that can communicate using the PCIe protocol and interpret it. The PeRT3 can link train and negotiate with either an Upstream or Downstream Port at up to 8GT/s and is equipped with a full range of debug capabilities to aid the engineer with design and characterization efforts. The PeRT3 can negotiate a PCIe Gen 3 Upstream or Downstream Port into loopback, and can set and sweep signal Amplitude, De-Emphasis and Pre-Shoot, three bands of Random Jitter, three bands of Sinusoidal Jitter, Differential Mode Interference, and Common Mode Interference.

Figure 12:

PeRT3 user-adjustable signal sources enable an end user to sweep jitter frequencies to create a BER map under various operating conditions.

Figure 13 is an example of measuring BER as de-emphasis and pre-shoot are swept with the PeRT3. This allows the user to understand the best and worst case TxEQ operating conditions for their design.

Figure 13:

PeRT3 BER as a function of swept TxEQ settings for design optimization.

Summary

Teledyne LeCroy’s PCIe link equalization test and measurement tools provide visibility into the interplay between the protocol and electrical layers. Having access to this low-level understanding of the bus operation is crucial for characterization and debug purposes. Both Root Complex and Endpoint must work in tandem to effectively negotiate the optimal receiver and transmitter equalizer filter settings in order to minimize the probability of bit errors on either side of the link. Teledyne LeCroy’s oscilloscope toolset, ProtoSync software, and PeRT3 provide the ability to simultaneously view and analyze the protocol and electrical response of the PCIe bus, giving quick and valuable insight into complex PCIe system behavior.