AN 846: Intel Stratix 10 Forward Error Correction
Introduction
Forward error correction is a powerful method of correcting errors that can occur on a serial link. Although very useful, it can be costly in both area and power when implemented in soft logic. For this reason, E-Tile and H-Tile devices provide hardened FEC blocks to address many important applications, such as:
- 10 Gigabit Ethernet (GbE) (H-Tile)
- 25GbE (E-Tile)
- 100GbE (E-Tile)
- 24.3 Gbps Common Public Radio Interface (CPRI) (E-Tile)
- 128 gigabit fibre channel (GFC) (E-Tile)
H-Tile | E-Tile | E-Tile |
---|---|---|
Fire Code (NRZ) | Reed Solomon (RS) Code (NRZ) | Reed Solomon (RS) Code (PAM4) |
Necessity of Error Correction
Transmitting data introduces many challenges. Among these challenges is the noise in a communication channel, which can result in errors in the transmission of bits. There are many types of noise.
Noise Type | Errors |
---|---|
Random or shot | Uncorrelated errors |
Crosstalk | Correlated and uncorrelated errors |
Return loss | Mostly correlated errors |
Insertion loss | Uncorrelated errors |
Decision Feedback Equalizer (DFE) error propagation | Burst errors |
Solutions to Transmission Errors
Parity and Cyclic Redundancy Check
Error Correction Code (ECC) Block Coding
- Hamming code
- Low density parity check codes (LDPC)
- Convolutional codes
- Viterbi
- Various FEC codes
Error Correction Codes
In binary codes, the encoder and decoder operate on a bit basis. The 10GBASE-KR Fire Code FEC is an example of a binary code. In non-binary codes, the encoder and decoder operate on a byte or symbol basis. Symbols may be any number of bits. Galois finite field arithmetic is used and the Reed Solomon code is an example of a non-binary code.
Cyclic block codes are defined by a generator polynomial g(x). Encoding appends a set of parity bits or symbols to the data to create a code word, also called a block. The parity is the remainder of the polynomial division of the data bits (shifted up by the number of parity bits) by g(x). This is easily implemented using a linear feedback shift register (LFSR). For error detection and correction, the receiver calculates the syndrome of the received code word. The syndrome is the difference between the locally generated parity and the received parity. If the syndrome is zero, the code word is taken to be correct. If the syndrome is non-zero, the decoder uses it to determine the most likely error.
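The following minimal Python sketch illustrates the encode, syndrome, and correct steps described above. It uses the small generator polynomial x^3 + x + 1 of a (7,4) code purely for illustration; that polynomial, the helper names, and the single-bit syndrome lookup table are assumptions of this sketch and are not taken from the Intel hardware.

```python
GEN = 0b1011                      # g(x) = x^3 + x + 1 (illustrative only)
N = 7                             # block length in bits for this toy code
PARITY_BITS = GEN.bit_length() - 1


def poly_mod(value: int, gen: int) -> int:
    """Remainder of value(x) / gen(x) over GF(2); integer bits are coefficients."""
    gen_degree = gen.bit_length() - 1
    while value.bit_length() > gen_degree:
        # Cancel the current leading term, as an LFSR would do bit by bit.
        value ^= gen << (value.bit_length() - 1 - gen_degree)
    return value


def encode(data: int) -> int:
    """Systematic encoding: data bits followed by the division remainder."""
    shifted = data << PARITY_BITS
    return shifted | poly_mod(shifted, GEN)


# Map the syndrome of every single-bit error to that error pattern.
SYNDROME_TO_ERROR = {poly_mod(1 << bit, GEN): 1 << bit for bit in range(N)}


def correct(received: int) -> int:
    """Return the most likely code word, assuming at most one bit error."""
    syndrome = poly_mod(received, GEN)
    if syndrome == 0:
        return received
    return received ^ SYNDROME_TO_ERROR[syndrome]


codeword = encode(0b1101)
corrupted = codeword ^ (1 << 4)   # flip one bit in transit
repaired = correct(corrupted)
```

A zero syndrome leaves the word untouched; any single-bit error produces a distinct non-zero syndrome, which is what allows the lookup to locate it.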
What is FEC?
The receiver analyzes the check bit information to locate and correct errors. This correction allows systems to operate at higher bit error rates (BER).
While FEC provides a performance increase, it also introduces increased power consumption, increased latency, and an increased number of gates.
The extra data added to the real data lets the receiver recover the real data when some bits are corrupted in transmission.
FEC Definitions
- Reed Solomon
- Bose-Chaudhuri-Hocquenghem (BCH)
- Concatenated codes
The type you select depends on:
- The overhead your design permits
- Burst handling capability
- Gain versus complexity (number of gates, memory, power, and so on)
- Latency considerations
There are bit error and burst limits to each code. FEC complexity increases non-linearly as you approach the Shannon limit. The Shannon limit, sometimes called Shannon's theorem, establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data (digital information) nearly error-free up to a computable maximum rate through the channel.
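As a rough illustration of the "computable maximum rate", the sketch below evaluates the Shannon-Hartley capacity formula C = B × log2(1 + SNR). It assumes a band-limited channel with additive white Gaussian noise, which is a simplification of a real backplane channel; the function name and the example numbers are hypothetical.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley limit for an AWGN channel (assumed channel model)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


# Hypothetical example: 1 GHz of bandwidth at a linear SNR of 3 (about 4.8 dB).
capacity = shannon_capacity_bps(1e9, 3.0)
```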
FEC allows detection and correction of X bits or symbols in a block. There are limits to its correcting capability.
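Code Type | Parameter | Description |
---|---|---|
Binary | n | Block length |
Binary | k | Message length |
Non-binary (RS-FEC, for example) | n | Block length |
Non-binary (RS-FEC, for example) | k | Message length |
Non-binary (RS-FEC, for example) | t | Correctable symbols, (n-k)/2 |
Non-binary (RS-FEC, for example) | m | Symbol size |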
FEC Selection
This section also includes an Ethernet benchmarking example to help you select the best FEC mode for the 100GbE application in this application note.
Key Considerations when Choosing a FEC
The primary considerations when choosing a FEC include:
- Hardware complexity
- Coding gain
- Latency
- Power
Coding Gain
Generally, the performance of a transmission line is characterized by the BER, where BER is the ratio of bits that have errors with respect to the total number of bits received over a transmission line. Additionally, the performance of a data transmission code is characterized as a function of the average energy per data bit (Eb) to noise power spectral density (N0) of the waveform. Eb can be expressed as the signal power (S) times the bit time (Tb). N0 can be expressed as the noise power (N) divided by the bandwidth. Therefore, Eb/N0 is equal to the SNR multiplied by the ratio of bandwidth to bit rate: Eb/N0 = (S/N) × (bandwidth / bit rate).
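The relationship between SNR and Eb/N0 can be sketched in a few lines of Python. The function name and the example values are hypothetical; the formula is the one stated above, expressed in decibels.

```python
import math


def ebn0_db(snr_db: float, bandwidth_hz: float, bit_rate_bps: float) -> float:
    """Eb/N0 in dB from SNR in dB: Eb/N0 = SNR * (bandwidth / bit rate)."""
    return snr_db + 10.0 * math.log10(bandwidth_hz / bit_rate_bps)


# Hypothetical example: 10 dB SNR measured in twice the bit-rate bandwidth.
example = ebn0_db(10.0, 2e9, 1e9)   # roughly 13 dB
```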
The effectiveness of a FEC code is determined by the reduction in the Eb/N0 needed to ensure the specific BER. Coding gain is the reduction in the required Eb/N0 at the same BER for an uncoded versus a coded system. For example, suppose an uncoded communication system operates at a BER of 10^-5 at an Eb/N0 of 10 dB. Adding a strong FEC code to this system reduces the Eb/N0 required to hold the same BER, and that reduction, in dB, is the coding gain.
Net coding gain (NCG) also accounts for the bandwidth expansion that the FEC code requires, which is associated with increased noise at the receiver; plain coding gain does not account for this. The bandwidth expansion arises because the data rate must increase by a certain percentage to transmit both the real data and the extra data (FEC).
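One common convention, assumed here rather than taken from this application note, is to subtract a bandwidth penalty of 10 × log10(rate expansion) from the gross coding gain. The sketch below applies that convention to the 34/33 rate expansion this document gives for RS(544, 514); the 6.0 dB gross gain is a hypothetical input, not a measured value.

```python
import math


def bandwidth_penalty_db(rate_expansion: float) -> float:
    """Noise penalty from widening the receiver bandwidth (assumed convention)."""
    return 10.0 * math.log10(rate_expansion)


def net_coding_gain_db(gross_gain_db: float, rate_expansion: float) -> float:
    """Gross coding gain minus the bandwidth-expansion penalty."""
    return gross_gain_db - bandwidth_penalty_db(rate_expansion)


penalty_kp4 = bandwidth_penalty_db(34 / 33)        # about 0.13 dB
example_ncg = net_coding_gain_db(6.0, 34 / 33)     # 6.0 dB gross is hypothetical
```

A code that adds no rate expansion has a penalty of 0 dB under this convention, so its net and gross coding gains coincide.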
The lower the latency, the better it is from the application’s perspective. However, a small latency limits the block size of the FEC code, which in turn limits the performance of the code, and can also impact the decoder complexity.
The higher the clocking rate (the more redundancy you add, for example), the more coding gain you can achieve.
The larger the block size, the higher the coding gain, but also the higher the processing latency.
More parallelism reduces processing latency, but increases hardware complexity.
FEC in Intel Stratix 10 H-Tile Devices
Fire Code (802.3ap, 10GBASE-KR)
The code encodes 2080 bits of payload (or information symbols) and adds 32 bits of overhead (or parity symbols). The code is systematic—meaning that the information symbols are not disturbed in the encoder, and the parity symbols are added separately to the end of each block.
The (2112,2080) code is constructed by shortening the cyclic code (42987, 42955). The shortened cyclic code (2112,2080) is guaranteed to correct an error burst of up to 11 bits per block. It is a systematic code that is well suited for correction of the burst errors typical in a backplane channel resulting from error propagation in the receive equalizer.
FEC Block Format
At the end of each block there is 32-bit overhead or parity check bits. Transmission is from left to right within each row, and from top to bottom between rows. The payload bits carry the information symbols from the PCS layer.
T0 | 64-bit payload Word 0 | T1 | 64-bit payload Word 1 | T2 | 64-bit payload Word 2 | T3 | 64-bit payload Word 3 |
T4 | 64-bit payload Word 4 | T5 | 64-bit payload Word 5 | T6 | 64-bit payload Word 6 | T7 | 64-bit payload Word 7 |
T8 | 64-bit payload Word 8 | T9 | 64-bit payload Word 9 | T10 | 64-bit payload Word 10 | T11 | 64-bit payload Word 11 |
T12 | 64-bit payload Word 12 | T13 | 64-bit payload Word 13 | T14 | 64-bit payload Word 14 | T15 | 64-bit payload Word 15 |
T16 | 64-bit payload Word 16 | T17 | 64-bit payload Word 17 | T18 | 64-bit payload Word 18 | T19 | 64-bit payload Word 19 |
T20 | 64-bit payload Word 20 | T21 | 64-bit payload Word 21 | T22 | 64-bit payload Word 22 | T23 | 64-bit payload Word 23 |
T24 | 64-bit payload Word 24 | T25 | 64-bit payload Word 25 | T26 | 64-bit payload Word 26 | T27 | 64-bit payload Word 27 |
T28 | 64-bit payload Word 28 | T29 | 64-bit payload Word 29 | T30 | 64-bit payload Word 30 | T31 | 64-bit payload Word 31 |
32 parity bits |
Total FEC block length = (32 × 65) + 32 = 2112 bits.
FEC Block Composition
Rather than increasing the line rate, the FEC sublayer compresses the sync bits from the 64B/66B encoded data provided by the PCS to accommodate the addition of 32 parity check bits for every block of 2080 bits.
The BASE-R 64B/66B PCS maps 64 bits of scrambled payload and 2 bits of unscrambled synchronization header into 66-bit encoded blocks. The 2-bit synchronization header allows the PCS synchronization process to establish the 64B/66B block boundaries. The synchronization header is 01 for data blocks and 10 for control blocks. The synchronization header is the only position in the PCS block that always contains a transition, and this feature of the code establishes the 64B/66B block boundaries.
The FEC sublayer compresses the 2 bits of the synchronization header to one transcode bit. The transcode bit carries the state of BASE-R synchronization bits for the associated payload. This is achieved by eliminating the first bit in 64B/66B block, which is also the first synchronization bit, and preserving the second bit. The value of the second bit defines the value of the removed first bit uniquely, because it is always an inversion of the first bit. The transcode bits are further scrambled (as explained in IEEE 802.3ap Clause 74.7.4.2) to ensure DC balance.
The 32 sequential 64B/66B blocks are transcoded in this fashion, and then 32 bits of FEC parity are computed for them. The 32 transcoded words and the 32 FEC parity bits comprise a FEC block. The error detection property of the FEC cyclic code establishes block synchronization at FEC block boundaries at the receiver. If decoding passes successfully, the FEC decoder produces 32 65-bit words, the first decoded bit of each word being the transcode bit. Then, the inversion of the transcode bit constructs the first synchronization bit in the 64B/66B code, and the value of the second synchronization bit is equal to the transcode bit.
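The sync-header compression and its reversal can be sketched as follows. Blocks are modeled as bit strings in transmission order with the synchronization header first; that representation and the function names are assumptions of this sketch. The scrambling of the transcode bits for DC balance is deliberately omitted.

```python
def compress_sync(block66: str) -> str:
    """Drop the first sync bit; the second sync bit becomes the transcode bit."""
    assert len(block66) == 66 and block66[0] != block66[1]
    return block66[1:]


def restore_sync(word65: str) -> str:
    """Rebuild the 2-bit header: first bit is the inverse of the transcode bit."""
    assert len(word65) == 65
    first_sync = "0" if word65[0] == "1" else "1"
    return first_sync + word65


data_block = "01" + "0" * 64      # sync header 01 = data block
control_block = "10" + "1" * 64   # sync header 10 = control block

# Bit budget: 32 blocks of 66 bits occupy the same room as 32 x 65 bits + 32 parity.
bits_before = 32 * 66
bits_after = 32 * 65 + 32
```

Because the two budgets match, the 32 parity bits fit without changing the line rate.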
FEC Sublayer for BASE-R PHYs
L-Tile/H-Tile Implementation
The KR FEC blocks in the Enhanced PCS are designed in accordance with the 10GBASE-KR FEC and 40GBASE-KR FEC definitions in the IEEE 802.3 specification. The KR FEC implements the FEC as a sublayer between the PCS and PMA sublayers.
The FEC sublayer is optional and you can bypass it. When used, it provides additional margin to allow for variations in manufacturing and environmental conditions. FEC can achieve the following objectives:
- Support a forward error correction mechanism for the 10GBASE-R/KR and 40GBASE-R/KR protocols
- Support full duplex mode of the Ethernet MAC
- Support the PCS, PMA, and Physical Medium Dependent (PMD) sublayers defined for the 10GBASE-R/KR and 40GBASE-R/KR protocols
KR FEC improves the BER performance of the system.
Transcode Encoder
The transcode bit is generated from a combination of 66 bits after the 64B/66B encoder, which consist of a 2-bit synchronization header (S0 and S1) and a 64-bit payload (D0, D1,…, D63). To ensure a DC-balanced pattern, the transcode bit is generated by performing an XOR function on the second synchronization bit S1 and payload bit D8. The transcode bit becomes the LSB of the 65-bit pattern output of the transcode encoder.
KR FEC Encoder
The code is a shortened cyclic code (2112, 2080). For each block of 2080 message bits, the encoder generates another 32 parity checks to form a total of 2112 bits. The generator polynomial is:
g(x) = x^32 + x^23 + x^21 + x^11 + x^2 + 1
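As a sketch only, the parity computation for one block can be modeled with Python integers, using the generator polynomial with exponents 32, 23, 21, 11, 2, and 0. The bit ordering, the random test message, and the helper names are assumptions, and the additional scrambling that the KR FEC applies is not modeled.

```python
import random

# Generator polynomial g(x) = x^32 + x^23 + x^21 + x^11 + x^2 + 1 as a bit mask.
G = (1 << 32) | (1 << 23) | (1 << 21) | (1 << 11) | (1 << 2) | 1


def poly_mod(value: int, gen: int) -> int:
    """Remainder of value(x) / gen(x) over GF(2)."""
    gen_degree = gen.bit_length() - 1
    while value.bit_length() > gen_degree:
        value ^= gen << (value.bit_length() - 1 - gen_degree)
    return value


def kr_fec_parity(message: int) -> int:
    """32 parity bits for a 2080-bit message (bit ordering is an assumption)."""
    return poly_mod(message << 32, G)


message = random.Random(0).getrandbits(2080)     # stand-in for 32 transcoded words
block = (message << 32) | kr_fec_parity(message)

clean_syndrome = poly_mod(block, G)              # zero for an undamaged block
burst = 0b11111111111 << 500                     # an 11-bit error burst
burst_syndrome = poly_mod(block ^ burst, G)      # non-zero, so the burst is seen
```

The sketch only shows that an 11-bit burst yields a non-zero syndrome; locating and correcting the burst requires a decoder that is beyond this example.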
KR FEC Scrambler
KR FEC TX Gearbox
The KR FEC TX gearbox aligns with the FEC block. Because the encoder output (also the scrambler output) has its unique word size pattern, the gearbox is specially designed to handle that pattern.
KR FEC RX Gearbox
Transcode Decoder
FEC in Intel Stratix 10 E-Tile Devices
Types of RS-FEC
RS-FEC Parameter | Parameter Name | NRZ PHY | PAM4 PHY |
---|---|---|---|
FEC encoding | - | RS (528, 514, t=7, m=10) | RS (544, 514, t=15, m=10) |
Total symbols | n | 528 | 544 |
Message symbols | k | 514 | 514 |
Parity symbols | n-k | 14 | 30 |
Bits per symbol | m | 10 | 10 |
Correctable symbols | t | 7 | 15 |
Coding gain (DFE) | - | 4.9 dB @ 1E-15 | 5.4 dB @ 1E-15 |
Coding gain (random) | - | 5.3 dB @ 1E-12 | 6.5 dB @ 1E-12 |
RS (528, 514, t = 7, m = 10)
Whether 1 bit or all m bits of a symbol are corrupt, this counts as one symbol error. Symbol-based correction therefore maps well onto burst errors.
RS-FEC can correct any seven single bit errors.
RS (528, 514) can correct up to seven symbols. If all bits are error bits, for all seven symbols, then the total number of correctable bits is 70.
RS (544, 514, t = 15, m = 10)
Whether 1 bit or all m bits of a symbol are corrupt, this counts as one symbol error. Symbol-based correction therefore maps well onto burst errors.
RS-FEC can correct any 15 single bit errors.
RS (544, 514) can correct up to 15 symbols. If all bits are error bits, for all 15 symbols, then the total number of correctable bits is 150.
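The difference between bit errors and symbol errors can be illustrated with a short sketch. It assumes that symbol boundaries fall on multiples of m bits within the codeword and ignores how bits are distributed across physical lanes; the function names are hypothetical.

```python
def symbols_in_error(error_bit_positions, m: int = 10) -> int:
    """Number of distinct m-bit symbols touched by the given bit errors."""
    return len({position // m for position in error_bit_positions})


def is_correctable(error_bit_positions, t: int, m: int = 10) -> bool:
    """True when the number of damaged symbols does not exceed t."""
    return symbols_in_error(error_bit_positions, m) <= t


aligned_burst = range(0, 70)                 # 70 bad bits in exactly 7 symbols
offset_burst = range(5, 75)                  # same length, but spans 8 symbols
scattered = [i * 10 for i in range(8)]       # 8 single-bit errors in 8 symbols
```

With t = 7, the aligned 70-bit burst is correctable, while eight scattered single-bit errors are not; the same eight errors are within reach of t = 15.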
Supported RS-FEC Modes in E-Tile Devices
Client Type | FEC Code | Number of Physical Lanes | Marker Size (bits) | Synchronization Type |
---|---|---|---|---|
100GbE | RS528 | 4 | 1285 | AM |
100GbE with KP-FEC | RS544 | 2 | 1285 | AM |
128GFC | RS528 | 4 | 514 | AM |
25GbE | RS528 | 1 | 257 | CWM |
32GFC | RS528 | 1 | — | SnT |
Legend:
- RS528 = RS(528, 514)
- RS544 = RS(544, 514)
- AM = Alignment Markers
- CWM = Codeword Marker
- SnT = Scramble-and-Test
The RS-FEC core supports the following standards:
- 100GbE: IEEE 802.3 Clause 91
- 100GbE with KP-FEC: IEEE 802.3 Clause 91
- 128GFC: Fibre Channel Framing and Signaling - 4 (FC-FS-4) Clause 5.6
- 25GbE: IEEE 802.3 Clause 108
- 32GFC: Fibre Channel Framing and Signaling - 4 (FC-FS-4) Clause 5.4
100GbE with KP-FEC uses two physical PAM4-coded lanes, also called the 100 Gigabit Attachment Unit Interface (CAUI-2), and the RS(544,514) code. The two physical lanes are supported by bit-multiplexing the RS-FEC core's four PMA lanes pairwise outside of the RS-FEC core. The remaining defined clients use the RS(528,514) FEC.
For CPRI, the FEC is the one defined for 32GFC. CPRI is like 32GFC except for the line rate, which is 24.3 Gbps.
100GBASE-KR4
100GBASE-KR4 uses the non-binary RS code (528, 514, t = 7, m = 10). 100GBASE-KR4 features:
- 514 data symbols per codeword
- 528 data plus parity symbols per codeword
- Codeword size = 10 * 528 = 5280 bits
- Correcting capability up to seven symbols within a codeword
- 5 to 5.5 dB gain
- NRZ modulation
- 25.78125 Gbps bit rate
- BER of 10^-12 or better (after FEC correction)
100GBASE-KR4 Mapping (IEEE 802.3bj Clause 91)
- RS (528, 514) FEC
- Four lanes running at 25.78125 Gbps
- No rate expansion
- Data is striped per symbol across four lanes
100GBASE-KP4
100GBASE-KP4 uses the non-binary RS code (544, 514, t = 15, m = 10). 100GBASE-KP4 features:
- 514 data symbols per codeword
- 544 data plus parity symbols per codeword
- Codeword size = 10 * 544 = 5440 bits
- Correcting capability up to 15 symbols within a codeword
- 6 to 6.5 dB gain
- PAM4 modulation
- 26.5625 Gbps bit rate
- BER of 10^-12 or better (after FEC correction)
100GBASE-KP4 Mapping (IEEE 802.3bj Clause 91)
- RS (544, 514, 15, 10) FEC
- 26.5625 Gbps
- 3.03% rate expansion
- Data is striped per symbol across four lanes
RS(544, 514) requires additional room to accommodate 5440 bits per codeword instead of 5280 bits. After transcoding, the link must therefore make room for approximately 3% more bits of overhead. The precise overhead is 1/33, so new rate = old rate × 34/33. The result is an overspeed line rate for PAM4. For example:
- Payload data rate = 50 Gbps
- Encoding it to 66b encoding: 50*66/64 = 51.5625 Gbps
- Adding FEC expansion: 51.5625*(34/33) = 53.125 Gbps
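The rate arithmetic above can be checked with exact fractions. The sketch assumes that four 66-bit blocks (264 bits) are transcoded into one 257-bit block before RS encoding, as in the 64B/66B to 256B/257B transcoder; the variable names are hypothetical.

```python
from fractions import Fraction

PCS_64B66B = Fraction(66, 64)            # 64B/66B encoding overhead
TRANSCODE_256B257B = Fraction(257, 264)  # four 66-bit blocks become 257 bits


def line_rate_factor(n: int, k: int) -> Fraction:
    """Rate change relative to the 64B/66B stream after transcoding and RS(n, k)."""
    return TRANSCODE_256B257B * Fraction(n, k)


kr4_factor = line_rate_factor(528, 514)  # transcoding savings pay for the parity
kp4_factor = line_rate_factor(544, 514)  # 34/33, about 3.03% rate expansion

payload_gbps = Fraction(50)
kp4_lane_rate = payload_gbps * PCS_64B66B * kp4_factor
print(float(kp4_lane_rate))              # 53.125
```

For RS(528, 514) the factor is exactly 1, which is why that mode needs no rate expansion, whereas RS(544, 514) yields the 34/33 factor used in the example.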
Intel® Stratix® 10 TX devices do not support the 100GBASE-KP4 physical medium dependent (PMD).
FEC Decoders
Decoder Type | Description |
---|---|
Hard decision FEC | Makes exact decisions of 1s or 0s. Good gain versus complexity. Broadly used in most applications. 10GBASE-KR/KR4/KP4 are all examples of hard decision FECs. Used in Intel® Stratix® 10 TX devices. |
Soft decision FEC | Makes decisions based on probabilities of a 1 or 0. Provides higher gain and allows you to get closer to the Shannon limit. Complex design used in higher end optical transport networking (OTN) systems, specifically coherent systems. Normally used in cellular communications using the Viterbi algorithm. |
Specifications
The IEEE 802.3ap specification defines an insertion loss and return loss of 25 dB at 5.15625 GHz. The 1e-12 BER requirement is a system specification that is met with or without FEC.
The IEEE 802.3bj specification specifies 100GBASE-KR4 for 100 Gbps operation using NRZ over four differential pairs where the insertion loss does not exceed 35 dB at 12.9 GHz. 100GBASE-KR4 uses:
- The PCS defined in Clause 82
- The RS-FEC defined in Clause 91
- The PMA defined in Clause 83
- The PMD defined in Clause 93
IEEE 802.3bj also specifies 100GBASE-KP4 for 100 Gbps operation using PAM4 over two differential pairs where the insertion loss does not exceed 33 dB at 7 GHz. 100GBASE-KP4 uses:
- The PCS defined in Clause 82
- The RS-FEC defined in Clause 91
- The PMA and PMD defined in Clause 94
The CEI 56G long reach (LR) specification discusses multiple FECs, but the standard is KP4 FEC with PAM4. The 1e-15 BER requirement is a system specification met with FEC.
Functions Within the RS-FEC Sublayer
Lane Block Synchronization
The lane block synchronization function uses the synchronization headers to obtain lock to the 66-bit blocks in each bit stream, and then outputs 66-bit blocks.
Alignment Lock and Deskew
Alignment marker lock identifies the PCS lane number received on a particular lane of the service interface. After alignment marker lock is achieved on all 20 lanes, all inter-lane skew is removed. The RS-FEC transmit function supports a maximum skew of 49 ns between PCS lanes, and a maximum skew variation of 400 ps.
Lane Re-order
The RS-FEC transmit function orders the PCS lanes according to the PCS lane number.
Alignment Marker Removal
64B/66B to 256B/257B Transcoder
If all four incoming blocks are data blocks:
- Remove the 2-bit headers of all four 66-bit data blocks.
- Append a header bit of 1 to the four 64-bit data payloads.
If there is at least one control block among the four incoming blocks:
- Remove the 2-bit headers of all four incoming 66-bit blocks.
- Append a header bit of 0 to the four payloads of the four blocks.
RS-FEC deletes the second 4-bit nibble in the block type field (BTF) of the first control block in a transcoded block. RS-FEC retains the first 4-bit nibble in the BTF of the first control block (indicating the type).
- Add the 4-bit header x1, x2, x3, or x4 following the overall header bit 0, where:
- x1 = Data block
- x2, x3, and x4 = Control block
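The bit accounting of the transcoder can be sketched with bit strings. Several details here are assumptions rather than statements from this document: blocks are strings in transmission order with the sync header first, each of the four header bits is set to 1 for a data block and 0 for a control block, and the block type field occupies the first eight payload bits of a control block.

```python
def transcode_257(blocks):
    """Pack four 66-bit blocks into one 257-bit block (simplified sketch)."""
    assert len(blocks) == 4 and all(len(block) == 66 for block in blocks)
    is_data = [block[:2] == "01" for block in blocks]   # 01 = data, 10 = control
    payloads = [block[2:] for block in blocks]          # strip the 2-bit headers

    if all(is_data):
        return "1" + "".join(payloads)                  # 1 + 4 x 64 = 257 bits

    # Assumed flag convention: 1 marks a data block, 0 marks a control block.
    flags = "".join("1" if flag else "0" for flag in is_data)
    first_control = is_data.index(False)
    payload = payloads[first_control]
    # Keep the first nibble of the block type field, delete the second nibble.
    payloads[first_control] = payload[:4] + payload[8:]
    return "0" + flags + "".join(payloads)              # 1 + 4 + 252 = 257 bits


data = "01" + "0" * 64
control = "10" + "00011110" + "0" * 56                  # hypothetical block type
all_data = transcode_257([data] * 4)
mixed = transcode_257([data, control, data, data])
```

Deleting one nibble from the first control block is what makes room for the 4-bit header, so both cases produce exactly 257 bits.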
FEC Implementation Using the E-Tile Channel Placement Tool
The E-Tile Channel Placement Tool allows you to swiftly plan protocol placements in the product prior to reading comprehensive documentation and implementing designs in the Intel® Quartus® Prime software.
The Excel-based E-Tile Channel Placement Tool, supplemented with Instruction, Legend, Revision and Protocols tabs, is self-contained and available for download.
- 100GbE EHIP_CORE (25G * 4) MAC + PCS with RS (528, 514)
- 100GbE EHIP_CORE (50G * 2) MAC + PCS with RS (544, 514)
FEC in Practical Application
Datacenter Applications Scenario
Consider a typical datacenter topology.
The interconnection between the spine switches and the leaf switches is a 10G/40G/100G backplane.
25GbE is a proposed standard for Ethernet connectivity in a datacenter application space, and takes advantage of the technology defined for 100GbE as four 25 Gbps lanes running on four fibers or copper pairs.
Hardware Results
Test Design
FEC performance in the Intel® Stratix® 10 device was measured using a 25GbE design running RS (528, 514) FEC.
Test Setup
The test configuration included:
- An Intel® Stratix® 10 TX signal integrity development kit board using the E-Tile device
- FCI backplane (Megtron 6 material)
- Variable ISI box
The FCI backplane is connected to the E-Tile device on one lane, starting with 28 dB loss (error free even without FEC). Attenuation is increased on only one channel using the variable ISI box. This provides fine control over the insertion loss.
Insertion Loss Plots
FEC Statistics Tool
Hardware Data
Total IL (dB) [1] | Number of Corrected Bits | Number of Corrected Symbols | Number of Corrected Codewords | Number of Uncorrected Codewords | Pre-FEC BER [2] | Estimated Post-FEC BER [2] |
---|---|---|---|---|---|---|
38.7 | 0 | 0 | 0 | 0 | 0 | 0 |
44.3 | 2 | 2 | 2 | 0 | 6.48E-14 | 0 |
44.4 | 36 | 36 | 36 | 0 | 1.17E-12 | 0 |
44.8 | 91 | 91 | 91 | 0 | 2.95E-12 | 0 |
45.2 | 116 | 116 | 116 | 0 | 3.76E-12 | 0 |
45.8 | 418 | 418 | 418 | 0 | 1.35E-11 | 0 |
46.2 | 1131 | 1130 | 1130 | 0 | 3.66E-11 | 0 |
46.6 | 2469 | 2469 | 2469 | 0 | 7.99E-11 | 0 |
47 | 17984 | 17978 | 17978 | 0 | 5.83E-10 | 0 |
47.4 | 220808 | 220580 | 220519 | 0 | 7.13E-09 | 0 |
47.8 | 901459 | 899306 | 898544 | 0 | 2.91E-08 | 0 |
48.2 | 2567073 | 2557116 | 2551742 | 0 | 8.31E-08 | 0 |
48.6 | 6665926 | 6628124 | 6593734 | 0 | 2.15E-07 | 0 |
49 | 31511961 | 31252439 | 30527903 | 0 | 1.02E-06 | 0 |
49.2 | 113176637 | 111898208 | 103314714 | 0 | 3.66E-06 | 0 |
49.4 | 194993850 | 192151109 | 167763244 | 1 | 6.32E-06 | 2.72E-13 |
49.6 | 457728720 | 448886002 | 331518303 | 374 | 1.48E-05 | 1.02E-10 |
49.8 | 928669603 | 904545799 | 510475435 | 50083 | 3.00E-05 | 1.36E-08 |
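As a consistency check on the table, the sketch below assumes that the pre-FEC BER is approximately the number of corrected bits divided by the number of received bits. That relationship is an assumption of this sketch, not a statement from the measurement procedure. Under it, a few rows taken from the table should all imply roughly the same number of received bits.

```python
# (corrected bits, pre-FEC BER) pairs copied from rows of the table above.
ROWS = [
    (2, 6.48e-14),
    (36, 1.17e-12),
    (2469, 7.99e-11),
    (17984, 5.83e-10),
    (928669603, 3.00e-05),
]


def implied_total_bits(corrected_bits: int, pre_fec_ber: float) -> float:
    """Received bit count implied by BER = corrected bits / received bits (assumed)."""
    return corrected_bits / pre_fec_ber


totals = [implied_total_bits(bits, ber) for bits, ber in ROWS]
mean_total = sum(totals) / len(totals)
spread = max(abs(total - mean_total) / mean_total for total in totals)
```

The implied totals agree to well within 1%, which suggests each measurement ran over a similar number of received bits.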
Comparison to the Specification
Specification (802.3bj) | Hardware Measurement |
---|---|
4.9 to 5.3 dB | 5.1 dB 3 |
Note the following:
- Post FEC BER is an estimate from uncorrectable code words.
- Received bits at the PRBS are normalized to account for PRBS payload + MAC padding (preamble, start codeword delimiter, and so on).
- Total IL = SI development kit loss + backplane loss + cable loss + variable ISI box loss.
- Total IL is a first order loss calculated by summing all the individual losses.
These hardware results demonstrate that the Intel FEC solution complies with the specification, making it a compelling solution for your Ethernet, CPRI, or Fibre Channel designs.
References
For more information about forward error correction, refer to the following resources:
- J. Schrum, YouTube video : "Error Detection and Correction 3: Forward Error Correction," 2016. [Online].
- A. Davis, EE Times : "Design How-To Forward Error Correction," 1998. [Online].
- N. R. Wagner, "The Laws of Cryptography: The Hamming Code for Error Correction," [Online].
- Optical Transport Network (OTN) Tutorial. ITU. [Online].
Document Revision History for AN 846: Intel Stratix 10 Forward Error Correction
Document Version | Changes |
---|---|
2018.07.02 | Initial release. |