Board Management Controller User Guide: Intel FPGA Programmable Acceleration Card N3000-N

ID 683186
Date 9/08/2020
Public

3. Board Monitoring through I2C SMBus

The standard I2C slave to Avalon-MM interface (read-only) shares the PCIe* SMBus between the host BMC and the Intel® MAX® 10 RoT. The Intel® FPGA PAC N3000-N supports a standard I2C slave interface, used only for out-of-band access, with a default slave address of 0xBC. Register offsets use 2-byte offset addressing mode.

Refer to the telemetry data register memory map below when accessing telemetry data through I2C. The Description column gives the equations for converting register values to actual temperature and hysteresis values. Depending on the sensor, the units are degrees Celsius (°C), mA, mV, or mW.

Table 2.  Telemetry Data Register Memory Map
Register Offset Width Access Field Default Value Description
Board Temperature 0x100 32 RO [31:0] 32'h00000000 TMP411. Register value is signed integer. Temperature = register value * 0.5
Board Temperature High Warning 0x104 32 RO [31:0] 32'h00000000 TMP411. Register value is signed integer. High Limit = register value * 0.5
Board Temperature High Fatal 0x108 32 RO [31:0] 32'h00000000 TMP411. Register value is signed integer. High Fatal = register value * 0.5
Hysteresis 0x10C 32 RO [31:0] 32'h00000000 TMP411. Register value is signed integer. Hysteresis = register value * 0.5
FPGA Core Temperature 0x110 32 RO [31:0] 32'h00000000 TMP411. Register value is signed integer. Temperature = register value * 0.5
FPGA Core Temperature High Warning 0x114 32 RO [31:0] 32'h00000000 TMP411. Register value is signed integer. High Limit = register value * 0.5
FPGA Core Temperature High Fatal 0x118 32 RO [31:0] 32'h00000000 TMP411. Register value is signed integer. High Fatal = register value * 0.5

QSFP A Temperature 0x11C 32 RO [31:0] 32'h00000000 QSFP A. Register value is signed integer. Temperature = register value * 0.5
QSFP A Temperature High Fatal 0x120 32 RO [31:0] 32'h00000000 QSFP A. Register value is signed integer. High Alarm = register value * 0.5
QSFP A Temperature High Warning 0x124 32 RO [31:0] 32'h00000000 QSFP A. Register value is signed integer. High Warning = register value * 0.5
QSFP A Voltage 0x128 32 RO [31:0] 32'h00000000 QSFP A. Voltage(mV) = register value
QSFP B Temperature 0x12C 32 RO [31:0] 32'h00000000 QSFP B. Register value is signed integer. Temperature = register value * 0.5
QSFP B Temperature High Fatal 0x130 32 RO [31:0] 32'h00000000 QSFP B. Register value is signed integer. High Alarm = register value * 0.5
QSFP B Temperature High Warning 0x134 32 RO [31:0] 32'h00000000 QSFP B. Register value is signed integer. High Warning = register value * 0.5
QSFP B Voltage 0x138 32 RO [31:0] 32'h00000000 QSFP B. Voltage(mV) = register value
FPGA Core Voltage 0x13C 32 RO [31:0] 32'h00000000 LTC3884. Voltage(mV) = register value
FPGA Core Current 0x140 32 RO [31:0] 32'h00000000 LTC3884. Current(mA) = register value

12v Backplane Voltage 0x144 32 RO [31:0] 32'h00000000 Voltage(mV) = register value
12v Backplane Current 0x148 32 RO [31:0] 32'h00000000 Current(mA) = register value
1.2v Voltage 0x14C 32 RO [31:0] 32'h00000000 Voltage(mV) = register value
12v Aux Voltage 0x150 32 RO [31:0] 32'h00000000 Voltage(mV) = register value
12v Aux Current 0x154 32 RO [31:0] 32'h00000000 Current(mA) = register value
1.8v Voltage 0x158 32 RO [31:0] 32'h00000000 Voltage(mV) = register value
3.3v Voltage 0x15C 32 RO [31:0] 32'h00000000 Voltage(mV) = register value
Board Power 0x160 32 RO [31:0] 32'h00000000 Power(mW) = register value
Retimer Link Status 0x164 32 RO [0] 1'b0 Retimer A Port 0 link status
[1] 1'b0 Retimer A Port 1 link status
[2] 1'b0 Retimer A Port 2 link status
[3] 1'b0 Retimer A Port 3 link status
[4] 1'b0 Retimer B Port 0 link status
[5] 1'b0 Retimer B Port 1 link status
[6] 1'b0 Retimer B Port 2 link status
[7] 1'b0 Retimer B Port 3 link status
[31:8] 24'h000000 Reserved
Retimer A Core Temperature 0x168 32 RO [31:0] 32'h00000000 Retimer A. Register value is signed integer. Temperature = register value * 0.5
Retimer A Serdes Temperature 0x16C 32 RO [31:0] 32'h00000000 Retimer A. Register value is signed integer. Temperature = register value * 0.5
Retimer B Core Temperature 0x170 32 RO [31:0] 32'h00000000 Retimer B. Register value is signed integer. Temperature = register value * 0.5
Retimer B Serdes Temperature 0x174 32 RO [31:0] 32'h00000000 Retimer B. Register value is signed integer. Temperature = register value * 0.5
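
The Retimer Link Status register (offset 0x164) in Table 2 packs eight single-bit flags into one 32-bit value. The following Python sketch shows one way to unpack them. The function name and the returned dictionary layout are illustrative assumptions, and because the table does not state which bit value means "link up", the sketch reports only whether each bit is set.

```python
# Bit layout taken from the Retimer Link Status row (offset 0x164) in Table 2:
# bits [3:0] are Retimer A ports 0-3, bits [7:4] are Retimer B ports 0-3.
RETIMER_LINK_BITS = {
    retimer_index * 4 + port: f"Retimer {retimer} Port {port}"
    for retimer_index, retimer in enumerate("AB")
    for port in range(4)
}


def decode_retimer_link_status(reg_value):
    """Return {flag name: True if the bit is set} for the eight documented bits.

    Bits [31:8] are reserved and are ignored here.
    """
    return {
        name: bool((reg_value >> bit) & 1)
        for bit, name in RETIMER_LINK_BITS.items()
    }


if __name__ == "__main__":
    # Hypothetical register value with bits 0 and 4 set.
    for name, is_set in decode_retimer_link_status(0x11).items():
        print(f"{name}: bit {'set' if is_set else 'clear'}")
```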

The Intel® MAX® 10 BMC populates the QSFP voltage, temperature, high fatal temperature, and high warning temperature values by polling the QSFP module and copying the values it reads into the corresponding telemetry data registers. The BMC supports QSFP modules that comply with the SFF-8436 or SFF-8636 standard and SFP modules that comply with the SFF-8472 standard. If the QSFP module does not support Digital Diagnostics Monitoring, or if no QSFP module is installed, you may see the following values for QSFP voltage, temperature, temperature high fatal, and temperature high warning:
  • deadbeef in the telemetry data registers
  • N/A while running the fpgainfo bmc command
  • Device or resource busy in the sysfs entries
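
Software that reads the QSFP registers directly may want to filter out the placeholder value before applying any scaling. The sketch below assumes the full 32-bit register reads back as 0xDEADBEEF when no valid data is available; the constant and function names are hypothetical.

```python
# Assumption: the "deadbeef" placeholder described above fills the whole
# 32-bit telemetry register.
QSFP_UNAVAILABLE = 0xDEADBEEF


def qsfp_reading_or_none(raw_value):
    """Return the raw register value, or None if it is the placeholder."""
    if (raw_value & 0xFFFFFFFF) == QSFP_UNAVAILABLE:
        return None
    return raw_value


if __name__ == "__main__":
    print(qsfp_reading_or_none(0xDEADBEEF))  # None
    print(qsfp_reading_or_none(3300))        # 3300
```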
The sysfs node entries for QSFP sensor data can be found at:
  • For QSFPA:
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor14/value
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor15/value
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor15/high_warn
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor15/high_fatal
  • For QSFPB:
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor37/value
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor38/value
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor38/high_warn
    /sys/class/fpga/intel-fpga-dev.0/intel-fpga-fme.0/spi-altera.0.auto/spi_master/spi0/spi0.0/sensor38/high_fatal
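
A script that reads these nodes should expect the "Device or resource busy" condition described above. The following Python sketch treats that error (errno EBUSY) as "no data" and returns the node contents as unparsed text, since the value format is not specified here. The function name is an illustrative assumption.

```python
import errno


def read_sensor_node(path):
    """Return the stripped text of a sysfs sensor node.

    Returns None if the read fails with EBUSY ("Device or resource busy"),
    which the text above associates with a missing QSFP module or one
    without Digital Diagnostics Monitoring. Other errors are re-raised.
    """
    try:
        with open(path, "r") as node:
            return node.read().strip()
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            return None
        raise
```

For example, calling read_sensor_node() with one of the sensor15 or sensor38 paths listed above returns either the node text or None.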

Use the Intelligent Platform Management Interface (IPMI) tool to read the telemetry data through the I2C bus.

The following example reads the board temperature register at offset 0x100.

In the command below:
  • 0x20 is the I2C master bus address of your server that can access PCIe slots directly. This address varies by server; refer to your server datasheet for the correct value.
  • 0xBC is the I2C slave address of the Intel® MAX® 10 BMC.
  • 4 is the number of data bytes to read.
  • 0x01 0x00 is the 2-byte register offset of the board temperature (0x100), as listed in Table 2.
Command:
$ sudo ipmitool i2c bus=0x20 0xBC 4 0x01 0x00
Output:
01110010 00000000 00000000 00000000

The output is in little-endian byte order, that is, the least significant byte is transferred first. The returned value in hexadecimal is therefore 0x00000072.

0x72 is 114 in decimal.

To calculate the temperature in Celsius multiply by 0.5: 114 x 0.5 = 57 °C
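
The same conversion can be sketched in Python. The example assumes the four returned bytes are already available as a bytes object, least significant byte first, and that the register holds a signed 32-bit integer as Table 2 states. The function name is hypothetical.

```python
import struct


def decode_temperature(raw):
    """Convert 4 register bytes (least significant byte first) to degrees C.

    The bytes are interpreted as a signed 32-bit integer and scaled by 0.5,
    following the equations in Table 2.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    (value,) = struct.unpack("<i", raw)
    return value * 0.5


if __name__ == "__main__":
    # Bytes from the example output above: 0x72 0x00 0x00 0x00
    print(decode_temperature(bytes([0x72, 0x00, 0x00, 0x00])))  # 57.0
```

Using a signed format ("<i") matters for the temperature registers, because a value such as 0xFFFFFFFE must decode as -2 (that is, -1.0 °C) rather than as a large positive number.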

Note: Please check with your server vendor if you encounter the following error when running the ipmitool command:
I2C Master Write-Read command failed: Bus Error
Unable to perform I2C Master Write-Read
The bus address and command could be different depending on the server vendor. Here are some IPMI tool commands for different servers to read board temperature:
  • Dell PowerEdge R740 server:
    $ sudo ipmitool i2c bus=0 0xBC 4 0x01 0x00
  • Intel Neoncity server:
    $ ipmitool i2c bus=2 0xBC 4 0x01 0x00
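
Because only the bus value and the register offset change between these commands, a small helper can assemble the argument list for any register in Table 2. This is a sketch only: the function name and defaults are assumptions, the bus value is passed through unchanged as text, and the helper only builds the arguments without running ipmitool. Prepend sudo where your system requires it.

```python
def build_read_command(bus, offset, slave_addr=0xBC, length=4):
    """Build the ipmitool argument list for reading one telemetry register.

    bus        -- bus value as text, for example "0x20", "0", or "2"
    offset     -- register offset from Table 2, for example 0x100
    slave_addr -- I2C slave address of the BMC (0xBC by default)
    length     -- number of data bytes to read
    """
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 2 bytes")
    high, low = offset >> 8, offset & 0xFF  # high byte is sent first
    return [
        "ipmitool", "i2c", f"bus={bus}",
        f"0x{slave_addr:02X}", str(length),
        f"0x{high:02x}", f"0x{low:02x}",
    ]


if __name__ == "__main__":
    # Reproduces the board temperature command shown earlier (without sudo).
    print(" ".join(build_read_command("0x20", 0x100)))
```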