Customer configuration

HDM version: 6.12 V100R001B06D012SP01
BIOS version: 5.74 V100R001B05D091
CPLD version: V00F
PFR CPLD version: V00B
Motherboard PCB version: VER.A
Slot1: NIC-ETH-RP1000P2SFP-LP-2P
Slot2: NIC-ETH-SF400T-LP-4P-GE(0x401)
Slot3: RAID-LSI-9460-16i(4G)
Slot4: NIC-ETH-RP1000P2SFP-LP-2P

HDM on this machine reported an alarm indicating no response from the RAID controller, while in-band operation remained normal.
Analysis shows that the alarm stems from an out-of-band (OOB) information anomaly caused by the RP1000 network card. It does not affect business operations, and the RAID controller itself shows no abnormalities.
1. Mitigation solution 1
1. For machines that have not reported the alarms, send the following IPMI commands to downgrade the log level of the two RAID controller no-response alarms to INFO:
ipmitool -H X.X.X.X -I lanplus -U user -P password raw 0x36 0x09 0xa2 0x63 0x00 0x36 0x02 0x6f 0x28 0x04 0x00 0x00 0x00 0x01 0x04
ipmitool -H X.X.X.X -I lanplus -U user -P password raw 0x36 0x09 0xa2 0x63 0x00 0x36 0x02 0x6f 0x28 0x00 0x00 0x00 0x00 0x01 0x04
2. For machines that have already reported RAID controller no-response alarms, restart the host first to clear the alarm, then send the IPMI commands above to downgrade the alarm log level to INFO.
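When many machines need the downgrade, the two commands above can be wrapped in a small helper. The following is a minimal sketch, not a vendor-supplied tool: the function name `downgrade_raid_alarms` and the variables `HDM_USER`, `HDM_PASS` and `DRY_RUN` are hypothetical names introduced here, and the raw payloads are copied verbatim from the commands above. With `DRY_RUN=1` (the default) it only prints what would be sent.

```shell
#!/bin/sh
# Sketch only: HDM_USER, HDM_PASS, DRY_RUN and the function name are
# hypothetical names introduced for this example.
HDM_USER="${HDM_USER:-user}"
HDM_PASS="${HDM_PASS:-password}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually send them

# Raw payloads copied verbatim from the two commands in this note.
RAW_1="0x36 0x09 0xa2 0x63 0x00 0x36 0x02 0x6f 0x28 0x04 0x00 0x00 0x00 0x01 0x04"
RAW_2="0x36 0x09 0xa2 0x63 0x00 0x36 0x02 0x6f 0x28 0x00 0x00 0x00 0x00 0x01 0x04"

# Send (or, in dry-run mode, print) both downgrade commands for one HDM address.
downgrade_raid_alarms() {
    host="$1"
    for payload in "$RAW_1" "$RAW_2"; do
        if [ "$DRY_RUN" = "1" ]; then
            # The password is masked in the dry-run output.
            echo "ipmitool -H $host -I lanplus -U $HDM_USER -P **** raw $payload"
        else
            # $payload is intentionally unquoted so each byte becomes one argument.
            ipmitool -H "$host" -I lanplus -U "$HDM_USER" -P "$HDM_PASS" raw $payload || return 1
        fi
    done
}
```

Calling `downgrade_raid_alarms X.X.X.X` once per HDM address, first in dry-run mode and then with `DRY_RUN=0`, gives a chance to review the exact commands before anything is sent.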
2. Mitigation solution 2
To prevent I2C8 exceptions entirely, HDM can be downgraded to version 3.51, which does not include the merged RP1000-compatible MCTP I2C over LAN information-reading feature.
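Before and after a downgrade it can help to confirm which HDM version a machine is running. The sketch below assumes that HDM exposes its version in the `Firmware Revision` line of `ipmitool mc info` output; that mapping is an assumption and should be checked against the HDM web interface. The function name `hdm_version_from_mc_info` is hypothetical.

```shell
#!/bin/sh
# Assumption: the HDM version appears in the "Firmware Revision" line
# printed by `ipmitool mc info`.

# Read `ipmitool mc info` output on stdin and print the firmware revision.
hdm_version_from_mc_info() {
    awk -F': *' '/^Firmware Revision/ { print $2; exit }'
}

# Example invocation (not executed here):
#   ipmitool -H X.X.X.X -I lanplus -U user -P password mc info | hdm_version_from_mc_info
```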
3. Solution 3
The root cause is the RP1000 network card firmware: it causes HDM to read I2C information abnormally, which in turn makes the I2C main process malfunction. Upgrading to the new RP1000 firmware will resolve the problem completely. The new network card firmware is planned for release in April 2025.