AN 851: Incremental Block-Based Compilation Tutorial for Intel Arria 10 FPGA Development Board
Version Information
Updated for: |
---|
Intel® Quartus® Prime Design Suite 19.2 |
1. AN 851: Incremental Block-Based Compilation Tutorial for Intel Arria 10 FPGA Development Board
Incremental block-based compilation enables you to preserve satisfactory compilation results for specific FPGA core logic design blocks (or logic that comprises a hierarchical design instance), and then reuse those results in subsequent compilations. You assign the hierarchical instance as a design partition, which you can then preserve following successful compilation. The preserved design partition must only include core resources (such as LUTs, registers, memory blocks, and DSP blocks), and cannot include any periphery resources.
This tutorial uses an Intel® Arria® 10 design example to show you how to improve the predictability of results and reduce design iterations by:
- Preserving a design partition after synthesis or final compilation, and reusing the preserved results in subsequent compilations.
- Targeting only specific design partitions for optimization, while leaving other design partitions unchanged.
1.1. Tutorial Design Overview
This tutorial includes a prepared design example to demonstrate incremental block-based compilation. You can download the design example to follow along with the tutorial steps in the Intel® Quartus® Prime Pro Edition software, as Downloading Tutorial Design Files describes.
The example top-level design instantiates a PLL that generates a 550 MHz fast clock (CLK1), and a 100 MHz slow clock (CLK2). The top-level design also instantiates 4 blinking LED modules that drive LED[3:0] every 2, 4, 8, and 16 seconds, respectively.
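As a quick sanity check on the two clock frequencies above, the following minimal Python sketch converts each frequency to its period. The helper name `period_ns` is illustrative and is not part of the tutorial files.

```python
def period_ns(frequency_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / frequency_mhz


# Clock frequencies taken from the design overview above.
CLOCKS_MHZ = {"CLK1": 550.0, "CLK2": 100.0}

for name, mhz in CLOCKS_MHZ.items():
    print(f"{name}: {mhz:g} MHz -> {period_ns(mhz):.3f} ns")
```

The fast clock period rounds to 1.818 ns, which is the cycle time that the timing analysis steps refer to.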
To increase the design size in the Intel® FPGA, the design example also instantiates 20 duplicate instances of an OpenCores* design.
The duplicate OpenCores* design instances have the following characteristics:
- The design implements each instance in parallel.
- I/O wrapper logic is present to reduce the number of I/O pins that the larger design requires.
- No timing-critical paths exist between the instances and the wrapper logic.
1.2. Downloading Tutorial Design Files
- Download and extract the tutorial design files at:
- View the extracted tutorial design file directory structure. The completed directory contains the final versions of all the files for the tutorial. You can use the files in the completed directory for comparison to confirm successful completion of the tutorial steps. The scripts folder contains the original files.
The tutorial directory includes the following files:
File Name | Description |
---|---|
top.sv | Top-level file that instantiates the iopll, big_partition1_top, blinking_led_2s, blinking_led_4s, blinking_led_8s, and blinking_led_16s instances. The file also includes logic to drive LED[4:7] as a single, shifting bit. |
top.qpf | Intel® Quartus® Prime project file that stores project name and revisions. |
top.qsf | Intel® Quartus® Prime settings file containing the project assignments and settings. |
big_partition1_top.v | Design file that instantiates 20 instances of an OpenCores* design. |
blinking_led_2s.sv | Design file that includes logic to drive LED[0] every two seconds. |
blinking_led_4s.sv | Design file that includes logic to drive LED[1] every four seconds. |
blinking_led_8s.sv | Design file that includes logic to drive LED[2] every eight seconds. |
blinking_led_16s.sv | Design file that includes logic to drive LED[3] every 16 seconds. |
blinking_led.sdc | A Synopsys Design Constraints file that creates the 50 MHz clock. |
iopll.ip | The IOPLL Intel® FPGA IP instantiated in top. The IP uses 50 MHz as the reference clock frequency, and generates 100 MHz and 550 MHz clocks. |
tx_dcfifo.ip | The FIFO Intel® FPGA IP instantiated in blinking_led_2s, blinking_led_4s, blinking_led_8s, and blinking_led_16s instances. This is a dual clock FIFO with a write clock of 550 MHz and read clock of 100 MHz. |
compile.tcl | A Tcl script that compiles the tutorial design at the command line. |
partitions.tcl | A Tcl script that includes the assignments to create the partitions that the tutorial describes. Running the script writes the assignments to the Intel® Quartus® Prime Settings File (.qsf). |
preserve.tcl | A Tcl script that includes the assignments to preserve the partitions that the tutorial describes. Running the script writes the assignments to the .qsf. |
report_timing.tcl | A Tcl script that includes Intel® Quartus® Prime Timing Analyzer commands that report the paths with the least positive or worst slack in each partition, along with commands that report timing for two specific nodes in the three partitions that meet timing requirements. |
- To restore all of the tutorial files to their original state, run scripts/restore.tcl from the a10_pcie_devkit_ibbc/tutorial directory.
- To compile the tutorial design from the command line, run compile.tcl from the a10_pcie_devkit_ibbc/tutorial directory.
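One way to script the command-line compile is to launch the Tcl script through the Quartus Prime shell. The sketch below assumes that `quartus_sh` is on your PATH and that compile.tcl is meant to run with `quartus_sh -t`; the tutorial text does not state the exact invocation, so treat both as assumptions. The function names are illustrative.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_compile_command(script: str = "compile.tcl") -> list[str]:
    """Return the argument list for running a Tcl script with quartus_sh."""
    # Assumption: `quartus_sh -t <script>` is the intended way to run compile.tcl.
    return ["quartus_sh", "-t", script]


def run_compile(tutorial_dir: str) -> int:
    """Run the compile script from the tutorial directory; return the exit code."""
    if shutil.which("quartus_sh") is None:
        raise FileNotFoundError("quartus_sh not found on PATH")
    result = subprocess.run(build_compile_command(), cwd=Path(tutorial_dir))
    return result.returncode
```

For example, `run_compile("a10_pcie_devkit_ibbc/tutorial")` would start the compile from the tutorial directory and return a nonzero code if the script fails.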
1.3. Incremental Block-Based Compilation Tutorial
Process Description
You determine which design blocks might be suitable for preservation and optimization by running a flat compilation and timing analysis to identify the most timing-critical blocks. You then preserve the partitions for blocks that meet timing, so that the Compiler can reuse the successful results for those partitions in subsequent compilations. When you preserve a partition at the final snapshot, the Compiler preserves the final device resource utilization, placement, routing, and hold time fix-up.
After optimizing the timing-critical design blocks, you can preserve those partitions and focus optimization on other parts of the design.
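The selection rule in this process can be sketched as a simple classification on worst-case slack: blocks with non-negative slack are candidates for preservation, and blocks with negative slack are candidates for optimization. The block names and slack values below are hypothetical placeholders chosen only to illustrate the rule; they are not results from the tutorial design.

```python
from __future__ import annotations


def classify_blocks(worst_slack_ns: dict[str, float]) -> tuple[list[str], list[str]]:
    """Split blocks into (preserve, optimize) lists by the sign of worst slack."""
    preserve = sorted(b for b, slack in worst_slack_ns.items() if slack >= 0.0)
    optimize = sorted(b for b, slack in worst_slack_ns.items() if slack < 0.0)
    return preserve, optimize


# Hypothetical values for illustration only.
example_slack_ns = {"block_a": 0.25, "block_b": -0.10, "block_c": 0.05}

preserve, optimize = classify_blocks(example_slack_ns)
print("preserve:", preserve)
print("optimize:", optimize)
```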
Tutorial Steps
This tutorial includes the following steps:
- Step 1: Compile the Flat Design
- Step 2: Identify Timing-Critical Design Blocks
- Step 3: Create Design Partitions
- Step 4: Analyze Timing of the Partitioned Design
- Step 5: Preserve Timing-Closed Partitions
- Step 6: Optimize Timing-Critical Design Blocks
- Step 7: Verify Preservation and Optimized Results
- (Optional) Step 8: Device Programming
- (Optional) Step 9: Verify Results in Hardware
1.3.1. Step 1: Compile the Flat Design
- In the Intel® Quartus® Prime Pro Edition software, click File > Open Project and open the /tutorial/top.qpf project file.
- To compile the flat design, click Compile Design on the Compilation Dashboard. A check mark appears as each stage completes. The compilation may require 30 minutes or more, depending on your system.
Figure 3. Compilation Dashboard
1.3.2. Step 2: Identify Timing-Critical Design Blocks
- To open the Timing Analyzer, click Tools > Timing Analyzer.
- In the Timing Analyzer, on the Tasks pane, double-click Update Timing Netlist to load the final timing netlist generated during the compilation.
Figure 4. Timing Analyzer Tasks Pane
- To identify any failing paths in the timing-critical design blocks, run the report_timing.tcl script by typing the following command in the Console window. If the Console is not already visible, click View > Console in the Timing Analyzer.
source report_timing.tcl
The Tcl script runs the report_timing command, capturing timing for the top 100 paths with the worst slack. The script is also preconfigured to capture timing between specific nodes for some of the design blocks. You analyze timing for these nodes later in this tutorial.
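Figure 5. Timing Analyzer Report Folders
Table 2. Timing Analysis Reports that report_timing.tcl Generates
Timing Analysis Folder | Generated For | Timing Reports Show |
---|---|---|
inst_big | u_big_partition1_top | Analysis of top 100 paths with worst slack |
inst_i1 | u_blinking_led_i1 | Analysis of top 100 paths with worst slack |
inst_i2 | u_blinking_led_i2 | Analysis of top 100 paths with worst slack |
inst_i3 | u_blinking_led_i3 | Analysis of top 100 paths with worst slack |
inst_i4 | u_blinking_led_i4 | Analysis of top 100 paths with worst slack |
inst_big_path1 | u_big_partition1_top | Analysis of timing between specific nodes for some partitions |
inst_i1_path1 | u_blinking_led_i1 | Analysis of timing between specific nodes for some partitions |
inst_i2_path1 | u_blinking_led_i2 | Analysis of timing between specific nodes for some partitions |
- In the inst_big folder, right-click the Slow 900 mV 100C Model report, and then click Generate in All Corners. Repeat this step for the inst_i1, inst_i2, inst_i3, and inst_i4 folders.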
- View the Multi Corner Summary report that generates under each folder in the Report pane. Reports in red text in the inst_i3 and inst_i4 folders indicate timing-critical design blocks with failing paths.
- Open the Multi Corner Summary report in the inst_i3 folder. Check the values in the From Node and To Node fields. Analysis indicates that the failing paths in u_blinking_led_i3 are in the 64-bit counter. This counter counts the number of clock cycles equivalent to 8 seconds, where each cycle is 1.818 ns.
Figure 6. Multi Corner Summary for u_blinking_led_i3
Note: Due to processor, memory, or OS variations, the slack values in this tutorial are only for reference and may vary from the actual values you observe.
- Open the Multi Corner Summary report in the inst_i4 folder. Check the values in the From Node and To Node fields. Analysis indicates that the failing paths in u_blinking_led_i4 are in the 64-bit counter. This counter counts the number of clock cycles equivalent to 16 seconds, where each cycle is 1.818 ns.
Figure 7. Multi Corner Summary for u_blinking_led_i4
The timing analysis identifies u_blinking_led_i3 and u_blinking_led_i4 as timing-critical design blocks for optimization.
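The relationship between the blink interval and the counter limit can be checked with a little arithmetic. The sketch below assumes that the limit is the blink interval divided by the rounded 1.818 ns clock period, truncated to a whole number of cycles. Under that assumption, the 8-second and 16-second intervals work out to 4400440044 and 8800880088 cycles, which match the constants in the blinking_led_8s.sv and blinking_led_16s.sv counters. The function name is illustrative.

```python
CLK1_PERIOD_PS = 1818  # 1.818 ns, the rounded period of the 550 MHz clock


def cycles_for_interval(seconds: int, period_ps: int = CLK1_PERIOD_PS) -> int:
    """Return the whole number of clock cycles that fit in the interval."""
    # Integer arithmetic in picoseconds avoids floating-point rounding.
    return (seconds * 10**12) // period_ps


for seconds in (8, 16):
    print(f"{seconds} s -> {cycles_for_interval(seconds)} cycles")
```

Both values exceed 2**32 (4294967296), which is consistent with the counter in the failing paths being wider than 32 bits.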
1.3.3. Step 3: Create Design Partitions
- In the Project Navigator, right-click u_blinking_led_i1 in the Hierarchy tab, point to Design Partition, and select the Default partition Type. A design partition icon appears next to each instance you assign.
Figure 8. Create Design Partitions
- Repeat step 1 to create partitions for the u_big_partition1_top, u_blinking_led_i2, u_blinking_led_i3, and u_blinking_led_i4 instances.
- If the Design Partitions Window is not already open, click Assignments > Design Partitions Window. The Design Partitions Window lists the partitions you define, along with the root partition (|) the Compiler automatically creates for each project.
Figure 9. Design Partitions Window
- To compile the partitioned design, click Compile Design on the Compilation Dashboard.
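The partitions that you create in the GUI are stored as assignments in the .qsf. As a rough sketch, the following Python helper prints assignment lines in the `set_instance_assignment -name PARTITION` form for the five instances in this step. The partition names used here (the instance names without the `u_` prefix) are an assumption for illustration; the names that the GUI or partitions.tcl actually writes may differ.

```python
# Instance paths taken from the steps above.
INSTANCES = [
    "u_big_partition1_top",
    "u_blinking_led_i1",
    "u_blinking_led_i2",
    "u_blinking_led_i3",
    "u_blinking_led_i4",
]


def partition_assignment(instance_path: str) -> str:
    """Format one PARTITION assignment line for a hierarchical instance."""
    # Assumption: name the partition after the instance, minus the "u_" prefix.
    if instance_path.startswith("u_"):
        partition_name = instance_path[2:]
    else:
        partition_name = instance_path
    return (
        f"set_instance_assignment -name PARTITION {partition_name} "
        f"-to {instance_path}"
    )


for path in INSTANCES:
    print(partition_assignment(path))
```

Comparing lines like these against top.qsf after you complete the step is one way to confirm that all five partitions were recorded.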
1.3.4. Step 4: Analyze Timing of the Partitioned Design
- Click Tools > Timing Analyzer, and then double-click Update Timing Netlist.
- Run the report_timing.tcl script to regenerate the timing analysis reports for failing paths:
source report_timing.tcl
The timing analysis reports in the inst_i3 and inst_i4 folders remain red, indicating that u_blinking_led_i3 and u_blinking_led_i4 still do not meet timing requirements in the partitioned design. Later in this tutorial you optimize these design blocks to ensure that they meet timing requirements in the flat design.
Figure 10. u_blinking_led_i3 and u_blinking_led_i4 Violate Timing Requirements
- In the inst_big folder, right-click the Slow 900 mV 100C Model report, and then click Generate in All Corners. Repeat this step for the inst_big1_path1, inst_i1_path1, and inst_i2_path1 folders.
- View the Multi Corner Summary reports in the inst_big1_path1, inst_i1_path1, and inst_i2_path1 timing analysis folders. The report_timing.tcl script includes commands to generate these reports for pre-selected nodes. Note the slack and placement results for the paths in the three partitions, as the following figure shows. Later in the tutorial, you compare these results with the results after compilation of the final snapshot.
Figure 11. Multi Corner Summary for u_big_partition1_top
1.3.5. Step 5: Preserve Timing-Closed Partitions
The Compiler preserves the final device utilization, placement, routing, and hold time fix-up for the partitions that you preserve. The preserved partition becomes the source for each subsequent compilation.
- Click Assignments > Design Partitions Window.
- Select final as the Preservation Level for the blinking_led_2s, big_partition1_top, and blinking_led_4s partitions.
Figure 12. Setting Partition Preservation Levels
1.3.6. Step 6: Optimize Timing-Critical Design Blocks
- Open blinking_led_8s.sv in a text editor and uncomment the following lines:
    //reg [31:0] count_msb;
    //reg [31:0] count_lsb;
    //reg [1:0] state=2'b00;
    //always_ff @(posedge fast_clock) begin
    //    fifo_wreq <= 1'b0;
    //    case (state)
    //        2'b00: begin
    //            count_lsb <= count_lsb + 1;
    //            if (count_lsb[31:0]==32'hFFFFFFFF) begin
    //                state <= 2'b01;
    //            end
    //        end
    //
    //        2'b01: begin
    //            count_lsb <= count_lsb + 1;
    //            if (count_lsb[31:0]==32'h064962EC) begin
    //                count_lsb <= 1;
    //                value_in <= !value_in;
    //                fifo_wreq <= 1'b1;
    //                state <= 2'b00;
    //            end
    //        end
    //
    //        default: begin
    //            count_msb <= 0;
    //            count_lsb <= 0;
    //            state <= 2'b00;
    //        end
    //    endcase
    //end
- In blinking_led_8s.sv, comment the following lines and save the changes:
    reg [63:0] count_in;
    always_ff @(posedge fast_clock) begin
        count_in <= count_in + 1;
        fifo_wreq <= 1'b0;
        if (count_in==64'd4400440044) begin
            count_in <= 0;
            value_in <= !value_in;
            fifo_wreq <= 1'b1;
        end
    end
- Open blinking_led_16s.sv and uncomment the following lines:
    //reg [31:0] count_msb;
    //reg [31:0] count_lsb;
    //reg [1:0] state=2'b00;
    //
    //always_ff @(posedge fast_clock) begin
    //    fifo_wreq <= 1'b0;
    //
    //    case (state)
    //        2'b00: begin
    //            count_lsb <= count_lsb + 1;
    //            if (count_lsb[31:0]==32'hFFFFFFFF) begin
    //                state <= 2'b01;
    //            end
    //        end
    //
    //        2'b01: begin
    //            count_lsb <= count_lsb + 1;
    //            if (count_lsb[31:0]==32'hFFFFFFFF) begin
    //                state <= 2'b10;
    //            end
    //        end
    //
    //        2'b10: begin
    //            count_lsb <= count_lsb + 1;
    //            if (count_lsb[31:0]==32'h0C92C5D8) begin
    //                count_lsb <= 1;
    //                value_in <= !value_in;
    //                fifo_wreq <= 1'b1;
    //                state <= 2'b00;
    //            end
    //        end
    //
    //        default: begin
    //            count_msb <= 0;
    //            count_lsb <= 0;
    //            state <= 2'b00;
    //        end
    //
    //    endcase
    //end
- In blinking_led_16s.sv, comment the following lines and save the changes:
    reg [63:0] count_in;
    always_ff @(posedge fast_clock) begin
        count_in <= count_in + 1;
        fifo_wreq <= 1'b0;
        if (count_in==64'd8800880088) begin
            count_in <= 0;
            value_in <= !value_in;
            fifo_wreq <= 1'b1;
        end
    end
- To compile the design with these changes, click Compile Design on the Compilation Dashboard.
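To see why the replacement logic keeps the same blink interval, you can model the split counter in software. The following Python sketch is a simplified behavioral model, not a substitute for RTL simulation. It steps a narrow version of the state machine one clock at a time, starting just after a toggle, and checks that the number of cycles until the next toggle equals `wrap_states * 2**width + terminal`. Applying that formula to the 32-bit constants in the uncommented code gives 4400440044 and 8800880088, the same values as the 64-bit comparisons that this step comments out. The function and parameter names are illustrative.

```python
def cycles_between_toggles(width: int, wrap_states: int, terminal: int) -> int:
    """Count clock edges from just after one toggle up to the next toggle."""
    mask = (1 << width) - 1
    count, state = 1, 0  # values the RTL loads when it toggles value_in
    cycles = 0
    while True:
        cycles += 1
        old_count = count                # nonblocking: compare the old value
        count = (old_count + 1) & mask   # count_lsb <= count_lsb + 1
        if state < wrap_states:
            if old_count == mask:        # counter is about to wrap
                state += 1
        elif old_count == terminal:      # final state reached its limit
            return cycles


def expected_cycles(width: int, wrap_states: int, terminal: int) -> int:
    """Closed-form cycle count between toggles."""
    return wrap_states * (1 << width) + terminal


# Check the model against the formula on small, fast-to-simulate widths.
assert cycles_between_toggles(4, 1, 5) == expected_cycles(4, 1, 5)
assert cycles_between_toggles(4, 2, 3) == expected_cycles(4, 2, 3)

# Apply the formula to the 32-bit constants from the two design files.
print(expected_cycles(32, 1, 0x064962EC))  # 4400440044
print(expected_cycles(32, 2, 0x0C92C5D8))  # 8800880088
```

The model covers only the steady-state behavior after a toggle. It does not model the `default` branch or the power-up values of the registers.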
1.3.7. Step 7: Verify Preservation and Optimized Results
- In the Compilation Report (Processing > Compilation Report), under the Fitter folder, expand the Preserved Assignments folder. The reports indicate use of the preserved partitions.
Figure 13. Preserved Partitions Report
- Click Tools > Timing Analyzer, and then double-click Update Timing Netlist.
- Run the report_timing.tcl script to regenerate the timing analysis reports:
source report_timing.tcl
Timing analysis data in the inst_i3 and inst_i4 report folders now indicate that the blinking_led_i3 and blinking_led_i4 partitions meet timing requirements.
Figure 14. Optimized u_blinking_led_i3 and u_blinking_led_i4 Meet Timing
- In the Timing Analyzer reports, right-click the Slow 900 mV 100C Model report in each folder, and then click Generate in All Corners.
- Open the Multi Corner Summary report to check the slack and placement results for the big_partition1_top partition. The slack value is similar to the value at the time of preservation, and the placement results are the same as at the time of preservation. You can compare these preserved results with Figure 11.
Figure 15. Preserved Slack and Placement
- Repeat step 4 to verify the slack and placement results for the blinking_led_i1 and blinking_led_i2 partitions.
1.3.8. (Optional) Step 8: Device Programming
Follow these steps to configure the FPGA on the Intel® Arria® 10 GX Development Kit:
- To open the Intel® Quartus® Prime Programmer, click Tools > Programmer.
- Connect the board cables:
  - JTAG USB cable to board
  - Power cable attached to board and power source
- Turn on power to the board.
- In the Intel® Quartus® Prime Programmer, click Hardware Setup.
Figure 16. Hardware Setup
- In the Hardware list, select USB-BlasterII, and then click Close. The device chain appears.
Note: If the device chain does not appear, verify the board connections.
- Click Auto-Detect. The device chain populates.
- In the Found Devices list, select the device that matches your design and click OK. For this tutorial, select the 10AX115S2 device that matches the 10AX115S2F45I1SG FPGA on the Intel® Arria® 10 GX Development Kit.
Figure 17. Select Device
- Right-click the 10AX115S2 row in the file list, and then click Change File.
Figure 18. Programmer Window
- Browse to select the top.sof file from the appropriate tutorial/output_files/ directory.
- Enable the Program/Configure option for the 10AX115S2 row.
Figure 19. Program/Configure Option
- Click Start. The progress bar reaches 100% when device configuration is complete. The device is now fully configured and in operation.
Figure 20. Programming Successful
Note: If device configuration fails, make sure the device you select for configuration matches the device you specify during .sof file generation.
1.3.9. (Optional) Step 9: Verify Results in Hardware
After device programming you can verify the results of this tutorial in hardware. After completing this tutorial, LEDs D6-D3 map to the blinking_led_top instance, and LEDs D10-D7 map to the top-level design. After you configure the FPGA with the SRAM Object File (.sof), blinking_led flashes red LEDs in the following order:
- D3 blinks every two seconds
- D4 blinks every four seconds
- D5 blinks every eight seconds
- D6 blinks every 16 seconds
The top-level design illuminates LEDs D10-D7 as a shifting bit in green.
1.4. Incremental Block-Based Compilation Tutorial Revision History
Document Version | Software Version | Changes |
---|---|---|
2019.07.15 | 19.2 | |
2018.11.10 | 18.1 | |
2018.06.27 | 18.0 | |
2018.06.26 | 18.0 | Corrected link to design example files. |
2018.06.22 | 18.0 | Initial release of the document. |