Recommended steps to resolve a Nutanix* Life Cycle Manager (LCM) firmware upgrade or LCM inventory that gets stuck because the BMC hangs
On Intel servers, LCM inventory fails partway through with the following error:
2021-05-19 19:22:23 ERROR exception.py:87 LCM Exception [LcmExceptionHandler]: Inventory Failed - found the following errors:
Inventory failed for DCB Firmware on xx.xx.xx.6 (environment: hypervisor) with [Unable to get the fru details. Failed to run payload_info cmd]
Traceback (most recent call last):
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/exception.py", line 953, in wrapper
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/framework.py", line 878, in __run_operations
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/metrics/metric_entity.py", line 2079, in __call__
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/metrics/metric_entity.py", line 2174, in _execution
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/base_classes/base_op.py", line 251, in run
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/ops/inventory_op.py", line 210, in _run
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/ops/inventory_op.py", line 255, in _detect_inventory
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/ops/inventory_op.py", line 327, in _distributed_inventory
File "/home/circleci/project/build/python-tree/bdist.linux-x86_64/egg/framework/main/ops/distribute_op_tasks.py", line 274, in monitor_tasks_and_report_errors
LcmRecoverableError: Inventory Failed - found the following errors:
Inventory failed for DCB Firmware on xx.xx.xx.6 (environment: hypervisor) with [Unable to get the fru details. Failed to run payload_info cmd]
The lcm_ops.out log on the Controller VM (CVM) of the affected node shows the following:
2021-05-19 19:20:39 INFO helper.py:109 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) DEBUG: [2021-05-19 19:20:39.639205] Failed to run payload_info cmd. OS type: ahv, Error: Get SDR 0000 command failed: Timeout
2021-05-19 19:20:39 INFO helper.py:109 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) Get SDR 0000 command failed: Timeout
2021-05-19 19:20:39 INFO helper.py:109 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) Get SDR 0000 command failed: Timeout
2021-05-19 19:20:39 INFO helper.py:109 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) Get SDR 0000 command failed: Timeout
2021-05-19 19:20:39 INFO helper.py:109 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) Get SDR 0000 command failed: Timeout
2021-05-19 19:20:39 INFO helper.py:109 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf)
2021-05-19 19:20:39 INFO helper.py:109 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) DEBUG: [2021-05-19 19:20:39.639332] Unable to get the fru details
2021-05-19 19:20:39 ERROR helper.py:106 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) EXCEPT:{"err_msg": "Unable to get the fru details. Failed to run payload_info cmd", "name": "DCB Firmware"}
2021-05-19 19:20:39 INFO lcm_ops_by_host:906 (xx.xx.xx.6, inventory, 2cf0ad40-ff89-431f-80ee-a6968051c4bf) Raising inventoryOpError with DCB Firmware, Unable to get the fru details. Failed to run payload_info cmd
, xx.xx.xx.6
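The log signature above can be checked for programmatically. The following is a minimal sketch in Python, not a Nutanix* or Intel tool. It assumes that lcm_ops.out lines follow the "(host, operation, task ID) message" layout shown in the excerpt. The function name hosts_with_sdr_timeout is hypothetical.

```python
import re

# Assumed layout, taken from the excerpt above:
#   "<timestamp> <level> <source> (host, operation, task-id) message"
LINE_RE = re.compile(
    r"\((?P<host>[^,]+), (?P<op>[^,]+), (?P<task>[0-9a-f-]+)\)\s*(?P<msg>.*)"
)
SIGNATURE = "Get SDR 0000 command failed: Timeout"


def hosts_with_sdr_timeout(log_text):
    """Return {host: number of log lines containing the SDR timeout signature}."""
    counts = {}
    for line in log_text.splitlines():
        if SIGNATURE not in line:
            continue
        match = LINE_RE.search(line)
        if match:
            host = match.group("host")
            counts[host] = counts.get(host, 0) + 1
    return counts
```

Passing the contents of lcm_ops.out to this function gives a per-host count of timeout lines, which helps identify the node whose BMC is not responding.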
Running 'ipmitool fru list' from the host may fail with an error similar to the following:
[root@AHV ~]# ipmitool fru list
FRU Device Description : Builtin FRU Device (ID 0)
Chassis Type : Rack Mount Chassis
Chassis Part Number : ..................
Chassis Serial : ..................
Chassis Extra : Nutanix-Qualified
Chassis Extra : LCM 2.00
Board Mfg Date : Tue Oct 17 7:15:00 2019
Board Mfg : Intel Corporation
Board Product : S2600WF0
Board Serial : BQWFXXXXX222
Board Part Number : H99999-999
Product Manufacturer : Intel Corporation
Product Name : S2600WF0
Product Part Number : K99999-001
Product Version : LWF2312NXXxxxx
Product Serial : BQF9XXXXXXX
Product Asset Tag : ....................
Get SDR 0000 command failed: Timeout
Get SDR 0000 command failed: Timeout
Get SDR 0000 command failed: Timeout
Get SDR 0000 command failed: Timeout
Get SDR 0000 command failed: Timeout
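The same check can be scripted on the host. The following is a rough sketch in Python that runs the 'ipmitool fru list' command shown above and counts the timeout lines. The helper names and the 120-second limit are assumptions for illustration, and the sketch assumes ipmitool is on the PATH.

```python
import subprocess

SDR_TIMEOUT = "Get SDR 0000 command failed: Timeout"


def count_sdr_timeouts(output):
    """Count lines that consist solely of the SDR timeout message."""
    return sum(1 for line in output.splitlines() if line.strip() == SDR_TIMEOUT)


def check_fru_list(timeout_s=120):
    """Run 'ipmitool fru list' and return the number of SDR timeout lines.

    subprocess.TimeoutExpired is raised if the command itself does not
    return within timeout_s seconds (an arbitrary limit chosen here).
    """
    result = subprocess.run(
        ["ipmitool", "fru", "list"],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    # The message may be printed on either stream, so inspect both.
    return count_sdr_timeouts(result.stdout + "\n" + result.stderr)
```

A nonzero result from check_fru_list() matches the symptom described in this article.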
Nutanix* has released LCM-2.4.3.3, which addresses the issue.
To download LCM-2.4.3.3, go to the Nutanix* portal (a Nutanix* account is required).
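To decide whether an upgrade is still needed, the installed LCM version can be compared with LCM-2.4.3.3. The following is a small sketch in Python. It assumes the installed version is available as a dotted numeric string (for example, as read from the LCM page). The function names are hypothetical.

```python
# Version named in this article as containing the fix.
FIXED_VERSION = (2, 4, 3, 3)


def parse_version(text):
    """Convert '2.4.3.3' or 'LCM-2.4.3.3' into a tuple of integers."""
    text = text.strip()
    if text.upper().startswith("LCM-"):
        text = text[4:]
    return tuple(int(part) for part in text.split("."))


def predates_fix(installed):
    """Return True if the installed version is older than LCM-2.4.3.3."""
    return parse_version(installed) < FIXED_VERSION
```

For example, predates_fix("2.4.3.2") returns True, while predates_fix("LCM-2.4.3.3") returns False.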
Note: Refer to the Third Party Content section in the Intel Terms of Use.