Introduction

Hello! I'm Nami. I have finally become a second-year employee, and I'm starting to feel the pressure.
Last time, I wrote an article about Hyper-Retiming, a new feature that uses Hyper-Registers.
This time, I will introduce another feature that uses Hyper-Registers.
Just in case, let's review the Hyper-XX terms first.

[HyperFlex]
The overall architecture, introduced with Stratix® 10, for increasing FPGA performance

[Hyper-Retiming, Hyper-Pipelining, Hyper-Optimization, etc.]
Device structures and features that actually work to improve performance

[Hyper-Register]
Registers introduced with Stratix 10 to enable the features above

This time, I will introduce Hyper-Pipelining, which is one of these features!
If you haven't read the previous articles yet, please read "Stratix 10 New Features ~The Mystery of Hyper-XX (What is Hyper-Register?)~" and
"Stratix 10 New Features ~The Mystery of Hyper-XX (What is Hyper-Retiming?)~" before continuing with this article.
First, let's talk a little bit about pipelining.

What is pipelining?

Pipelining is a technique that improves performance by inserting registers into critical paths to balance delays.

[Example]
As before, let's say we want the FPGA to run at 300 MHz (a clock period of 3.3 ns). In that case, data must propagate from one ALM register to the next within 3.3 ns. Now suppose the ALMs are connected as shown in Fig.1.

Fig.1 Placement and routing that does not satisfy the desired operating frequency

The delays between the registers are 1.5 ns and 3.5 ns. The 3.5 ns path limits the operating frequency to about 286 MHz (1 / 3.5 ns), which does not satisfy the desired operating frequency of 300 MHz. So far, this is the same as last time.
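
If you want to check this arithmetic yourself, here is a minimal Python sketch. The function names are my own (hypothetical) and have nothing to do with Quartus; the sketch only assumes that the slowest register-to-register delay sets the maximum frequency.

```python
def fmax_mhz(stage_delays_ns):
    """Return the maximum clock frequency (MHz) allowed by the slowest stage."""
    critical_ns = max(stage_delays_ns)  # the critical path is the longest delay
    return 1000.0 / critical_ns         # 1 / ns = GHz, so multiply by 1000 for MHz


def meets_target(stage_delays_ns, target_mhz):
    """True if every stage is fast enough for the target frequency."""
    return fmax_mhz(stage_delays_ns) >= target_mhz


delays = [1.5, 3.5]                 # register-to-register delays from Fig.1
print(round(fmax_mhz(delays)))      # 286
print(meets_target(delays, 300.0))  # False
```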

This is where pipelining comes in.
When you compile with Quartus®, pipeline registers are automatically placed in unused ALM registers at locations where they can improve timing on critical paths. By adding a pipeline register, we were able to satisfy the timing as shown in Fig.2.

Fig.2 Placement and routing that satisfies the desired operating frequency by pipelining
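
To make the idea concrete, the sketch below splits the 3.5 ns stage in two by inserting one register. Please note that the split point (2.5 ns + 1.0 ns) is my own assumption for illustration, since the text does not give the actual delays in Fig.2, and the helper name is hypothetical.

```python
def insert_pipeline_register(stage_delays_ns, stage_index, first_part_ns):
    """Split one stage into two by inserting a register inside it."""
    total = stage_delays_ns[stage_index]
    if not 0.0 < first_part_ns < total:
        raise ValueError("the split point must lie inside the stage")
    return (stage_delays_ns[:stage_index]
            + [first_part_ns, total - first_part_ns]
            + stage_delays_ns[stage_index + 1:])


before = [1.5, 3.5]                                  # delays from Fig.1
after = insert_pipeline_register(before, 1, 2.5)     # assumed split point
print(after)                                         # [1.5, 2.5, 1.0]
print(1000.0 / max(after) >= 300.0)                  # True: 300 MHz is now met
```

The path now has one more stage than before, which is exactly why the latency goes up by one cycle.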

This is what I thought when I first learned about pipelining.
"How is this different from Retiming...?"
I'm sure I'm not the only one who has wondered about this.

But there is a big difference between the two.
Retiming only changes the position of existing registers, so latency does not change, whereas pipelining inserts registers, so latency increases accordingly. In other words, the calculation result that was output after 3 cycles in Fig. 1 is output after 4 cycles in Fig. 2.
In exchange, the advantage is that pipelining improves performance more than Retiming does!
On the other hand, when resource utilization is high and many ALMs are already in use, it may not be possible to add pipeline registers.
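
The latency difference can be seen with a tiny cycle-by-cycle model. This is only a sketch with a hypothetical function name: it treats the path as a simple chain of registers and counts the clock cycles until a value reaches the last one. Retiming keeps the register count (3), while pipelining adds one (4).

```python
def cycles_until_output(num_registers):
    """Clock a chain of registers and count cycles until the input reaches the end."""
    if num_registers < 1:
        raise ValueError("the chain needs at least one register")
    regs = [None] * num_registers
    value_in = "data"
    cycles = 0
    while regs[-1] != "data":
        # On each clock edge, every register captures its predecessor's value.
        regs = [value_in] + regs[:-1]
        value_in = None
        cycles += 1
    return cycles


print(cycles_until_output(3))  # 3 (Fig. 1, and also after Retiming)
print(cycles_until_output(4))  # 4 (Fig. 2, after pipelining)
```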

Let's take a look at Stratix 10's Hyper-Pipelining, which is the main topic of this article!

Hyper-Pipelining

Hyper-Pipelining is a feature that eliminates critical paths and improves operating frequency (performance). It balances routing delays between registers by using the Hyper-Retimer (a function that uses Hyper-Registers instead of ALM registers) to relocate pipeline registers to Hyper-Registers at optimal locations.

At this point, you may be wondering where you should add the pipeline registers.
Please don't worry! With Hyper-Pipelining, the tool automatically determines where the registers should go. Thank you, Quartus.

To do this, you first run a flow called Fast Forward Compile, in which Quartus estimates how many register stages you need to add to improve performance. You then add the desired number of pipeline registers in your HDL description, at the clock domain boundaries and at the inputs and outputs of the design, and the tool automatically moves them to the optimal positions.
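
To get a feel for what "moving registers to the optimal position" means, here is a small Python sketch. It is not the algorithm Quartus uses; it simply tries every way of placing a given number of registers at the boundaries between routing segments and keeps the placement with the smallest worst-case stage delay. The segment delays and the function name are hypothetical values I made up for illustration.

```python
from itertools import combinations


def best_register_placement(segment_delays_ns, num_registers):
    """Brute-force search: return (worst_stage_ns, cut_positions), or None if impossible."""
    n = len(segment_delays_ns)
    best = None
    # A cut at position i places a register between segment i-1 and segment i.
    for cuts in combinations(range(1, n), num_registers):
        bounds = (0,) + cuts + (n,)
        stages = [sum(segment_delays_ns[a:b]) for a, b in zip(bounds, bounds[1:])]
        worst = max(stages)
        if best is None or worst < best[0]:
            best = (worst, cuts)
    return best


segments = [0.5, 1.0, 0.75, 0.5, 0.75]          # hypothetical delays, 3.5 ns in total
print(best_register_placement(segments, 1))     # (2.0, (2,))
print(best_register_placement(segments, 2)[0])  # 1.5
```

The more candidate positions there are along a path, the more evenly the delay can be divided, and the more registers you add, the shorter the worst stage becomes.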

The added registers then move to the optimal positions on the path, as shown in Fig.5.

Fig.4 Before register move
Fig.5 After register move

The relocated Hyper-Registers work as shown in Fig.6 and improve performance.

Fig.6 Hyper-Pipelining

With conventional pipelining, the operating frequency after the improvement was 400 MHz, but with Hyper-Pipelining, we were able to increase the operating frequency to 572 MHz!
Latency does increase, but the overall timing improvement is greater than with Hyper-Retiming.
This is the flow for using Hyper-Pipelining to improve performance!
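
For reference, the sketch below converts these frequencies into clock periods and improvement ratios. It assumes the 286 MHz from the Fig.1 example is the starting point for both results, which the text does not state explicitly, and the function names are my own.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def speedup(new_mhz, base_mhz):
    """How many times faster the new clock is than the baseline."""
    return new_mhz / base_mhz


base = 286.0  # assumed baseline: the unimproved design from Fig.1
print(round(period_ns(400.0), 2), round(speedup(400.0, base), 2))  # 2.5 1.4
print(round(period_ns(572.0), 2), round(speedup(572.0, base), 2))  # 1.75 2.0
```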

Summary

Benefits of Hyper-Pipelining
1. Higher timing improvement, because Hyper-Registers can be placed more flexibly than ALM registers
2. Timing improvements that do not depend on ALM utilization, because Hyper-Registers are used instead of ALM registers
3. The tool automatically decides where registers should go and moves them.

Use Hyper-Pipelining when Hyper-Retiming alone does not meet the desired operating frequency!