Nios® II Processor Reference Guide

ID 683836
Date 8/28/2023
Public

5.2.8. Instruction Performance

All instructions take one or more cycles to execute, and some carry additional penalties. A late result instruction has two cycles placed between it and any instruction that uses its result. An instruction that flushes the pipeline causes up to three of the instructions that follow it to be cancelled, which creates a three-cycle penalty and an execution time of four cycles. Instructions that require Avalon®-MM transfers stall until all required transfers (up to one write and one read) are complete.
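
The late result rule can be made concrete with a small model. The sketch below is illustrative only: the name `late_result_stall` is hypothetical, and it assumes that independent single-cycle instructions issued between the producer and the consumer count toward the two-cycle gap, which this section does not state explicitly.

```python
# Cycles placed between a late result instruction and an instruction
# that uses its result (value taken from the paragraph above).
LATE_RESULT_GAP = 2


def late_result_stall(independent_between: int) -> int:
    """Return the stall cycles seen by the first consumer of a late result.

    independent_between: number of single-cycle instructions issued between
    the late result instruction and its consumer. ASSUMPTION: each of these
    fills one cycle of the gap.
    """
    if independent_between < 0:
        raise ValueError("independent_between must be non-negative")
    return max(0, LATE_RESULT_GAP - independent_between)


# Consumer immediately follows the producer: both gap cycles become stalls.
print(late_result_stall(0))  # 2
# Two unrelated instructions in between hide the gap entirely.
print(late_result_stall(2))  # 0
```

Under this assumption, scheduling two independent instructions after a load or multiply would be enough to avoid the stall.
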
Table 65. Instruction Execution Performance for Nios II/f Core (4-byte/line data cache)

| Instruction | Cycles | Penalties |
|---|---|---|
| Normal ALU instructions (e.g., add, cmplt) | 1 | |
| Combinatorial custom instructions | 1 | |
| Multicycle custom instructions | > 1 | Late result |
| Branch (correctly predicted, taken) | 2 | |
| Branch (correctly predicted, not taken) | 1 | |
| Branch (mispredicted) | 4 | Pipeline flush |
| trap, break, eret, bret, flushp, wrctl, wrprs; illegal and unimplemented instructions | 4 or 5 | Pipeline flush |
| call, jmpi, rdprs | 2 | |
| jmp, ret, callr | 3 | |
| rdctl | 1 | Late result |
| load (without Avalon®-MM transfer) | 1 | Late result |
| load (with Avalon®-MM transfer) | > 1 | Late result |
| store (without Avalon®-MM transfer) | 1 | |
| store (with Avalon®-MM transfer) | > 1 | |
| flushd, flushda (without Avalon®-MM transfer) | 2 | |
| flushd, flushda (with Avalon®-MM transfer) | > 2 | |
| initd, initda | 2 | |
| flushi, initi | 4 | |
| Multiply | | Late result |
| Divide | | Late result |
| Shift/rotate (with hardware multiply using embedded multipliers) | 1 | Late result |
| Shift/rotate (with hardware multiply using LE-based multipliers) | 2 | Late result |
| Shift/rotate (without hardware multiply present) | 1 to 32 | Late result |
| All other instructions | 1 | |

For Multiply and Divide, the number of cycles depends on the hardware multiply or divide option. Refer to the "Arithmetic Logic Unit" and "Instruction and Data Caches" sections for details.
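
As a rough illustration of how Table 65 might be used for a back-of-the-envelope estimate, the sketch below sums the fixed cycle counts for a short instruction trace. The dictionary keys and the `estimate_cycles` helper are hypothetical names chosen for this example. Only rows with a single fixed cycle count are included, and late result stalls, Avalon®-MM transfers, multiply, and divide are deliberately left out.

```python
# Fixed cycle counts copied from Table 65 (rows with one fixed value only).
# The key names are illustrative labels, not identifiers from the guide.
FIXED_CYCLES = {
    "alu": 1,                  # normal ALU instructions, e.g. add, cmplt
    "branch_taken": 2,         # correctly predicted, taken
    "branch_not_taken": 1,     # correctly predicted, not taken
    "branch_mispredicted": 4,  # pipeline flush
    "call": 2,
    "jmpi": 2,
    "rdprs": 2,
    "jmp": 3,
    "ret": 3,
    "callr": 3,
    "load_no_transfer": 1,     # load without Avalon-MM transfer
    "store_no_transfer": 1,    # store without Avalon-MM transfer
}


def estimate_cycles(trace):
    """Sum the table cycle counts for a sequence of instruction kinds.

    Raises KeyError for kinds that have no fixed count in FIXED_CYCLES.
    """
    return sum(FIXED_CYCLES[kind] for kind in trace)


# A hypothetical loop body: load, two ALU operations, store, taken branch.
loop_body = ["load_no_transfer", "alu", "alu", "store_no_transfer", "branch_taken"]
print(estimate_cycles(loop_body))  # 6
```

The result is a lower bound under these simplifications, since any late result stall or memory transfer would add cycles.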

In the default Nios II/f configuration, the instructions trap, break, eret, bret, flushp, wrctl, and wrprs require four clock cycles. If any of the following options is present, they require five clock cycles:

  • MMU
  • MPU
  • Division exception
  • Misaligned load/store address exception
  • EIC port
  • Shadow register sets
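
The four-versus-five-cycle rule above can be expressed as a simple lookup. This is a minimal sketch: the option strings and the `flush_class_cycles` function are hypothetical names for illustration and do not correspond to actual configuration parameters of the core.

```python
# Options that raise the cost from four to five cycles, per the list above.
# The string labels are illustrative, not real parameter names.
FIVE_CYCLE_OPTIONS = frozenset({
    "mmu",
    "mpu",
    "division_exception",
    "misaligned_load_store_exception",
    "eic_port",
    "shadow_register_sets",
})


def flush_class_cycles(enabled_options):
    """Cycles for trap, break, eret, bret, flushp, wrctl, and wrprs.

    Returns 5 if any listed option is enabled, otherwise 4. Options that
    are not in the list are ignored because they do not affect this rule.
    """
    if FIVE_CYCLE_OPTIONS & set(enabled_options):
        return 5
    return 4


print(flush_class_cycles([]))                   # 4 (default configuration)
print(flush_class_cycles(["mpu", "eic_port"]))  # 5
```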