2008-4-22 20:34
enbo_lee
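Help needed with a FAStT600 problem
Here is the rough picture. The customer has two 570 hosts, each with one HBA card, attached to a FAStT600 with two controllers, one HBA per controller. A while ago one controller on the FAStT600 showed an amber light, and later one of the 570s could no longer find the disks on the storage. We assumed that controller had failed. While we were working on it, the other controller developed a problem too, and the second host lost its connection as well. After we power-cycled the storage, the rightmost light on the front panel was blue, none of the controller lights at the back were lit, the power supply light was amber, and we could log in through neither the serial port nor the management port. We then brought in two known-good spare controllers and installed them, but they did not respond either and we still could not log in, so we suspected the backplane.
Today we shipped the unit back, found a known-good FAStT600, and plugged the two original controllers into it. The front panel is normal, the controllers show green lights, and the warning light is amber. Over the management port, SM cannot log in to either controller. The management IP answers ping, but login fails, while serial login works. On restart the controller reports memory errors, and initializing it does not help. At first I suspected a mismatch between SM and the firmware, but when I swap in a spare controller, SM logs in as soon as it connects. So I cannot tell whether these controllers are good or bad. I am baffled. I hope the experts on this forum can help find the cause and suggest some commands.
My plan is to determine first whether the controllers are good or bad, and then look at the backplane. With both controllers in this state I don't know what to do. Below is the output of some commands I ran over the serial port.
[[i] Last edited by enbo_lee on 2008-4-23 08:05 [/i]]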
2008-4-22 20:38
enbo_lee
Reply to post #1 by enbo_lee
ifShow
gei (unit number 0):
Flags: (0x8063) UP BROADCAST MULTICAST ARP RUNNING
Type: ETHERNET_CSMACD
Internet address: 192.168.128.102
Broadcast address: 192.168.128.255
Netmask 0xffffff00 Subnetmask 0xffffff00
Ethernet address is 00:a0:b8:13:82:38
Metric is 0
Maximum Transfer Unit size is 9000
0 octets received
0 octets sent
0 packets received
5 packets sent
0 unicast packets received
0 unicast packets sent
0 non-unicast packets received
5 non-unicast packets sent
0 input discards
0 input unknown protocols
0 input errors
0 output errors
0 collisions; 0 dropped
lo (unit number 0):
Flags: (0x8069) UP LOOPBACK MULTICAST ARP RUNNING
Type: SOFTWARE_LOOPBACK
Internet address: 127.0.0.1
Netmask 0xff000000 Subnetmask 0xff000000
Metric is 0
Maximum Transfer Unit size is 32768
0 packets received; 0 packets sent
0 multicast packets received
0 multicast packets sent
0 input errors; 0 output errors
0 collisions; 0 dropped
value = 29 = 0x1d
-> N netCfgShow
==== NETWORK CONFIGURATION ====
Interface Name : gei0
My MAC Address : 00:a0:b8:13:82:38
My Host Name : target
My IP Address : 192.168.128.102
Server Host Name : host
Server IP Address : 0.0.0.0
Gateway IP Address : 0.0.0.0
Subnet Mask : 255.255.255.0
Network Init Flags : 0x00
Network Mgmt Timeout : 30
Shell Password : ************
User Name : guest
User Password : ************
NFS Root Path : (null)
NFS Group ID Number : 0
NFS User ID Number : 0
value = 27 = 0x1b
->
BOOT OPERATIONS MENU
1) Perform Isolation Diagnostics 10) Serial Interface Mode Menu
2) Download Permanent File 11) Display Hardware Configuration
3) Reserved 12) Change Hardware Configuration Menu
4) Dump NVSRAM Group 13) Development Options Menu
5) Patch NVSRAM Group 14) Display Memory Error Log
6) Set Real Time Clock 15) Manufacturing Setup Menu
7) Display Board Configuration R) Restart Controller
8) Special Services Menu Q) Quit Menu
9) Display Exception Message
Enter Selection: 11
HARDWARE CONFIGURATION
Model name: 2882
Model title: 2882
Board ID code: 0x23
Hardware option code: 0
Number host channels: 2
Number drive channels: 2
Host-side SCSI ID: 5
Drive-side SCSI ID: 7
Processor cache enabled? Yes
Memory fault detection? Yes
Fibre Channel feature? Yes
Press <Enter> to continue
Program memory base: 0x40000000
Program memory size: 0x08000000 (128 MB)
I/O buffer memory base: 0x48000000
I/O buffer memory size: 0x08000000 (128 MB)
Boot flash memory base: 0
Boot flash memory size: 0 (0 KB)
File flash memory base: 0x08000000
File flash memory size: 0x01000000 (16 MB)
Level 2 cache size: 0 (0 KB)
DRAM memory base: 0x40000000
DRAM memory size: 0x10000000 (256 MB)
Processor type: Arm Processor
Core Register Base: 0xFFFFE000
Board Register Base: 0x10080000
Processor rate (MHz): 200
Interval timer rate (Hz): 0
Countdown timer rate (Hz): 0
Battery feature supported? Yes
Battery may be on-board? Yes - feature revision #0
Press <Enter> to continue
Ethernet controller present? Yes
Network enabled? Yes
Ethernet revision: 2
Ethernet Adapter_ID: 0x43FF035C
Ethernet IRQ: 5
Base Address: Base = 0x80020000 Max size = 0x20000
Press <Enter> to continue
Host channels in use: 2
Expected number of host chann 2
controller type 1 controller 0x010015BC
revision level 1 revision lev 5
channel mode 1 channel mode: Fibre Channel
channel width 1 channel width 8
rate (MHz) 1 rate (MHz): 0
max IDs 1 max IDs: 8
Adapter_ID 1 Adapter_ID: 0x43FF05FC
Device Number 1 Device Number 0x08 ID Sel: 0x18 Func Num: 0x00
IRQ 1 IRQ: 12
Base Address: Base = 0x80000000 Max size = 0x2000
Press <Enter> to continue
controller type 2 controller 0x010015BC
revision level 2 revision lev 5
channel mode 2 channel mode: Fibre Channel
channel width 2 channel width 8
rate (MHz) 2 rate (MHz): 0
max IDs 2 max IDs: 8
Adapter_ID 2 Adapter_ID: 0x43FF0554
Device Number 2 Device Number 0x08 ID Sel: 0x18 Func Num: 0x01
IRQ 2 IRQ: 13
Base Address: Base = 0x80002000 Max size = 0x2000
Press <Enter> to continue
Drive channels in use: 2
Drive interface mode: Fibre Channel
Drive interface rate (MHz): 0
controller type 1 controller 0x010015BC
revision level 1 revision lev 5
max IDs 1 max IDs: 8
Adapter_ID 1 Adapter_ID: 0x43FF04AC
Device Number 1 Device Number 0x09 ID Sel: 0x19 Func Num: 0x00
IRQ 1 IRQ: 14
Base Address: Base = 0x80004000 Max size = 0x2000
Press <Enter> to continue
controller type 2 controller 0x010015BC
revision level 2 revision lev 5
max IDs 2 max IDs: 8
Adapter_ID 2 Adapter_ID: 0x43FF0404
Device Number 2 Device Number 0x09 ID Sel: 0x19 Func Num: 0x01
IRQ 2 IRQ: 15
Base Address: Base = 0x80006000 Max size = 0x2000
Press <Enter> to continue
BOOT OPERATIONS MENU
1) Perform Isolation Diagnostics 10) Serial Interface Mode Menu
2) Download Permanent File 11) Display Hardware Configuration
3) Reserved 12) Change Hardware Configuration Menu
4) Dump NVSRAM Group 13) Development Options Menu
5) Patch NVSRAM Group 14) Display Memory Error Log
6) Set Real Time Clock 15) Manufacturing Setup Menu
7) Display Board Configuration R) Restart Controller
8) Special Services Menu Q) Quit Menu
9) Display Exception Message
Enter Selection: 13
2008-4-22 20:41
enbo_lee
NOTICE: The BOOT OPERATIONS MENU has been invoked too late for
proper operation of some activities, including Isolation Diagnostics.
You may wish to restart this controller again and press Control-B
IMMEDIATELY after seeing the start-up indicator ("-=<###>=-").
BOOT OPERATIONS MENU
1) Perform Isolation Diagnostics 10) Serial Interface Mode Menu
2) Download Permanent File 11) Display Hardware Configuration
3) Reserved 12) Change Hardware Configuration Menu
4) Dump NVSRAM Group 13) Development Options Menu
5) Patch NVSRAM Group 14) Display Memory Error Log
6) Set Real Time Clock 15) Manufacturing Setup Menu
7) Display Board Configuration R) Restart Controller
8) Special Services Menu Q) Quit Menu
9) Display Exception Message
Enter Selection: 1
Date: 04/22/2008 Time: 08:21:44
Isolation Diagnostics
1. Reset options
2. Select options
3. Display options
4. Run tests
D. Debugger
S. Shell
Q. Quit
Enter Selection: NET: BOOTP network configuration failed.
Network ready
4
Power-Up Diagnostics - Loop 1 of 1
3700 Buffer DRAM Skipped
3600 Processor DRAM
01 Data lines Passed
02 Address lines Passed
6590 Host Channel 1--Tachyon DX2
01 TachLite Register Test Passed
6591 Host Channel 2--Tachyon DX2
01 TachLite Register Test Passed
6BA1 Drive Channel 1--Tachyon DX2
01 TachLite Register Test Passed
6BA2 Drive Channel 2--Tachyon DX2
01 TachLite Register Test Passed
4409 Ethernet
01 Register read Passed
02 Register address lines Passed
03 Register data lines Passed
04 MDI Register test Passed
05 Interrupt test Skipped
06 Serial EEPROM validation Passed
3202 Flash 2 Skipped
5600 Acorn Skipped
4412 Interrupt Tests Skipped
3300 NVSRAM
01 Data lines Passed
3900 Real-Time Clock
01 RT Clock Tick Passed
Isolation Diagnostics
1. Reset options
2. Select options
3. Display options
4. Run tests
D. Debugger
S. Shell
Q. Quit
Enter Selection: 04/22/08-08:21:53 (GMT) (IOSched): NOTE: ChipFreezeStalled...reset Tachyon
04/22/08-08:21:53 (GMT) (IOSched): NOTE: hddConfigInProgress ==> Chan 0
04/22/08-08:21:53 (GMT) (IOSched): NOTE: hddConfigDone
Q
Press Y to confirm exit: Y
Diagnostic Manager exited normally.
BOOT OPERATIONS MENU
1) Perform Isolation Diagnostics 10) Serial Interface Mode Menu
2) Download Permanent File 11) Display Hardware Configuration
3) Reserved 12) Change Hardware Configuration Menu
4) Dump NVSRAM Group 13) Development Options Menu
5) Patch NVSRAM Group 14) Display Memory Error Log
6) Set Real Time Clock 15) Manufacturing Setup Menu
7) Display Board Configuration R) Restart Controller
8) Special Services Menu Q) Quit Menu
9) Display Exception Message
Enter Selection:
*** Invalid Selection
Enter Selection:
*** Invalid Selection
Enter Selection: 1
Date: 04/22/2008 Time: 08:22:14
Isolation Diagnostics
1. Reset options
2. Select options
3. Display options
4. Run tests
D. Debugger
S. Shell
Q. Quit
Enter Selection: 4
Power-Up Diagnostics - Loop 1 of 1
3700 Buffer DRAM Skipped
3600 Processor DRAM
01 Data lines Passed
02 Address lines Passed
6590 Host Channel 1--Tachyon DX2
01 TachLite Register Test Passed
6591 Host Channel 2--Tachyon DX2
01 TachLite Register Test Passed
6BA1 Drive Channel 1--Tachyon DX2
01 TachLite Register Test Passed
6BA2 Drive Channel 2--Tachyon DX2
01 TachLite Register Test Passed
4409 Ethernet
01 Register read Passed
02 Register address lines Passed
03 Register data lines Passed
04 MDI Register test Passed
05 Interrupt test Skipped
06 Serial EEPROM validation Passed
3202 Flash 2 Skipped
5600 Acorn Skipped
4412 Interrupt Tests Skipped
3300 NVSRAM
01 Data lines Passed
3900 Real-Time Clock
01 RT Clock Tick Passed
Isolation Diagnostics
1. Reset options
2. Select options
3. Display options
4. Run tests
D. Debugger
S. Shell
Q. Quit
Enter Selection: 04/22/08-08:22:19 (GMT) (IOSched): NOTE: ChipFreezeStalled...reset Tachyon
04/22/08-08:22:19 (GMT) (IOSched): NOTE: hddConfigInProgress ==> Chan 0
04/22/08-08:22:19 (GMT) (IOSched): NOTE: hddConfigDone
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=73416c46, [1]=644f6d48
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=00000000, [1]=00000000
A5 -- SDRAM Preparation ERROR: Basic Memory Access failure in address 0x40001000
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=00000000, [1]=00000000
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=00000000, [1]=00000000
A5 -- SDRAM Preparation ERROR: Basic Memory Access failure in address 0x4fffffe0
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=00000000, [1]=00000000
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=00000000, [1]=00000000
A5 -- SDRAM Preparation ERROR: Basic Memory Access (bank 2) failure in address 0x48000000
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=00000000, [1]=00000000
A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
addr=0x4079ffa0, req=0, type=0, dir=1, syn=0x83, [0]=00000000, [1]=00000000
A5 -- SDRAM Preparation ERROR: Basic Memory Access (bank 2) failure in address 0x47ffffe0
-=<###>=-
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@ @@@@ @@ @@ @@@@ @@@ @@ @@ @@
@@ @@ @@ @@ @@@ @ @ @@@@ @ @@@@@
@@ @@@ @@ @@ @@@ @ @ @ @ @ @@@@@
@@ @@ @@ @@ @@@ @ @ @ @ @ @@@@@
@@@@ @@@@ @@ @@ @@ @@@ @@ @@ @@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
SHV RAID Controller
LSI Logic Storage Systems, Inc.
SHVRAID Version 05.33.07.00
Current Date and Time: April 22, 2008 08:22:40
Send <BREAK> for shell access or baud rate change
WARNING: Restart by watchdog time out
NMI: Multiple unknown ECC errors occurred
-=<###>=-
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@ @@@@ @@ @@ @@@@ @@@ @@ @@ @@
@@ @@ @@ @@ @@@ @ @ @@@@ @ @@@@@
@@ @@@ @@ @@ @@@ @ @ @ @ @ @@@@@
@@ @@ @@ @@ @@@ @ @ @ @ @ @@@@@
@@@@ @@@@ @@ @@ @@ @@@ @@ @@ @@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
SHV RAID Controller
LSI Logic Storage Systems, Inc.
SHVRAID Version 05.33.07.00
Current Date and Time: April 22, 2008 08:22:53
Send <BREAK> for shell access or baud rate change
NET: Requesting boot parameters from a boot server
04/22/08-08:22:57 (GMT) (tRAID): WARN: Battery Age has exceeded specified limit.
04/22/08-08:22:57 (GMT) (tRAID): NOTE: lockMgr Role is Master
04/22/08-08:22:58 (GMT) (IOSched): NOTE: hddConfigInProgress ==> Chan 0
04/22/08-08:22:58 (GMT) (IOSched): NOTE: hddConfigInProgress ==> Chan 0
04/22/08-08:23:03 (GMT) (IOSched): NOTE: hddConfigInProgress ==> Chan 1
04/22/08-08:23:03 (GMT) (IOSched): NOTE: Extended Link Down ==> Chan 1
04/22/08-08:23:03 (GMT) (IOSched): NOTE: hddConfigDone
04/22/08-08:23:03 (GMT) (tRAID): NOTE: WWN baseName 000800a0-b8138238 (valid==>NoPrevAlt)
04/22/08-08:23:03 (GMT) (tRAID): WARN: Memory multi-bit ECC error detected
04/22/08-08:23:03 (GMT) (tRAID): WARN: Suspending Start-Of-Day Task
NET: BOOTP network configuration failed.
Network ready
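The SDRAM errors in the serial log above can be grouped by address and matched against the memory map printed by "Display Hardware Configuration" (program memory at 0x40000000, I/O buffer memory at 0x48000000, 128 MB each). The following is a minimal Python sketch, not a vendor tool. The function names `region_of` and `summarize` are made up for illustration, and the regular expression only assumes the two message shapes that appear in this log.

```python
import re
from collections import Counter

# Memory regions copied from the "Display Hardware Configuration" output
# in this thread: (name, base address, size in bytes).
REGIONS = [
    ("program memory", 0x40000000, 0x08000000),
    ("I/O buffer memory", 0x48000000, 0x08000000),
]

# Matches lines such as:
#   A6 -- SDRAM Preparation ERROR: ECC failure in address 0x4079ffa0
#   A5 -- SDRAM Preparation ERROR: Basic Memory Access (bank 2) failure in address 0x48000000
ERROR_RE = re.compile(
    r"^(A\d) -- SDRAM Preparation ERROR: (.+?) failure in address (0x[0-9a-fA-F]+)"
)


def region_of(addr):
    """Return the name of the region containing addr, or "unknown"."""
    for name, base, size in REGIONS:
        if base <= addr < base + size:
            return name
    return "unknown"


def summarize(log_text):
    """Count SDRAM errors per (code, kind, address, region)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:
            code, kind = match.group(1), match.group(2)
            addr = int(match.group(3), 16)
            counts[(code, kind, addr, region_of(addr))] += 1
    return counts
```

Run against the log in this post, the repeated ECC failure at 0x4079ffa0 and the access failures at 0x40001000 and 0x47ffffe0 fall in program memory, while 0x48000000 and 0x4fffffe0 fall in I/O buffer memory. That only shows where the reported addresses lie; it does not by itself tell a bad DIMM from a bad controller board.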
2008-4-22 21:48
五“宅”一生
Find a known-good 600, put your controllers into it, and see whether they work. That will tell you whether the controllers are at fault.
2008-4-22 21:59
enbo_lee
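Did I not explain it clearly? The original 600 can no longer see the controllers at all. Everything I posted was taken with these two controllers in a known-good chassis, but I can only get in through the serial port. The management port answers ping but will not accept a login, and both controllers behave the same way. Since I no longer need the data, I initialized one controller with sysWipe and then restarted it with sysReboot. Nothing changed: ping works, login does not. When I insert a good controller together with one of the problem controllers, even the good controller's management port becomes unreachable. If I insert a good spare controller on its own, SM can log in and everything works normally. I don't know why, and it is driving me crazy. Does anyone reading this know good commands for checking whether a controller is healthy or really faulty?
[[i] Last edited by enbo_lee on 2008-4-22 22:05 [/i]]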
2008-4-22 23:45
炸鸡
So long, so tiring. Can't help. :L
2008-4-23 09:33
笑看风云淡
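It looks like the controller memory has failed.
What you need to do now is collect the logs and send them to IBM.
As I recall, the 2882 has a maintenance mode that shows lower-level information, but it does not suit your situation.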
2008-4-23 09:58
flysnowpp
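If it is still under warranty, your best option is to have IBM resolve it.
I suspect it is out of warranty, though. The OP works in third-party maintenance, right?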
2008-4-23 10:09
cnpmc
Too long, and I still don't understand it. I can't help with this one.
2008-4-23 10:12
炸兔子
The write-up is rather messy and the reasoning is unclear.
2008-4-23 10:43
enbo_lee
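Thanks for the support, everyone. After talking it over, memory is the only thing left that I can think of. This afternoon I will find a memory module and swap it in to see whether that helps. If it does, great; what remains after that is the backplane question. I asked an IBM engineer, who gave me these commands:
ld </Debug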
date
arrayPrintSummary
inetstatShow
netCfgShow
memShow
memoryShow
moduleShow
moduleList
ghsList
getObjectGraph_MT 8
showEnclosures
cfgUnitList
cfgPhyList
vdAll cfgPhy
vdAll cfgUnit
vdAll vdShow
spmShow
fcDevs 2
fcDevs 4
fcDevs 11
fcAll
ccmStateAnalyze 8
cacheAnalyze
unld "Debug"
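But they don't analyze the output themselves either. They pass the data on to the engineers in Japan for analysis, or simply swap in spare parts. I don't know how to narrow it down any further myself, so all I can do is try the memory first.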
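Since the output of these commands usually ends up with someone else for analysis, it can help to split a saved serial capture into one section per command first. The sketch below is only an illustration: it assumes the capture is a plain-text file and that each command line starts with the shell prompt `->`, as in the `netCfgShow` output earlier in this thread. `split_by_command` is a hypothetical helper name, not part of any IBM or LSI tool.

```python
def split_by_command(capture):
    """Split a serial capture into (command, output) pairs.

    A line starting with "->" is treated as a shell prompt. Text after the
    prompt is the command; the following lines are its output. A bare
    prompt ends the current section, so unrelated text after it (for
    example a boot menu) is not attributed to the previous command.
    """
    sections = []
    current = None
    for line in capture.splitlines():
        if line.startswith("->"):
            command = line[2:].strip()
            current = None
            if command:
                current = (command, [])
                sections.append(current)
        elif current is not None:
            current[1].append(line)
    return [(cmd, "\n".join(body)) for cmd, body in sections]
```

Each pair can then be written to its own file or pasted separately, which makes a long capture easier for support staff to read.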
2008-4-23 10:52
enbo_lee
Reply to post #8 by flysnowpp
It is out of warranty.
2008-4-23 11:51
bennial
What firmware version is it running?
Does getObjectGraph_MT 99 on the serial port return a normal result?
2008-4-23 13:51
笑看风云淡
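[quote]Originally posted by [i]enbo_lee[/i] on 2008-4-23 10:43
Thanks for the support, everyone. After talking it over, memory is the only thing left that I can think of. This afternoon I will find a memory module and swap it in to see whether that helps. If it does, great; what remains after that is the backplane question. I asked an IBM engineer, who gave me these commands: ld [/quote]
IBM does OEM this product from LSI, but when something goes wrong you generally end up going straight to LSI, because IBM cannot solve much on its own. LSI does not have many people in Asia either.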
2008-4-23 19:30
enbo_lee
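No luck: swapping the memory did not help either, and I am out of ideas. Today I ran one more test. I inserted the good controller, connected to it with SM, and then hot-plugged one of the problem controllers. It was recognized, but right-clicking the problem controller in SM does nothing. Is there really nothing left to try? Someone suggested upgrading the firmware on the good controller, since its firmware is fairly old. Would that work? Is this really the end of the road? Does nobody in China repair these? I feel the fault is not a big one at this point: serial access works and ping works, so the problem would seem to lie in the connection between SM and the subsystem inside the controller. Experts, please help.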
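Ping only shows that the controller answers ICMP; it says nothing about whether the management service is accepting TCP connections. A small Python sketch can check that separately. The port number 2463 is an assumption here (it is the port commonly documented for out-of-band Storage Manager access on this controller family) and should be verified against the documentation for your firmware level; `tcp_port_open` is a made-up helper name.

```python
import socket

# Assumption: out-of-band Storage Manager traffic uses TCP port 2463.
# Verify this against the documentation for your firmware level.
SM_PORT = 2463


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

For example, `tcp_port_open("192.168.128.102", SM_PORT)` tests the management address shown in the `netCfgShow` output. If it returns False while ping succeeds, the network path is probably fine and the management task on the controller is not listening, which would fit the "Suspending Start-Of-Day Task" warning in the boot log.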
2008-4-23 19:35
enbo_lee
Reply to post #13 by bennial
I have not tried what you suggested.
2008-4-23 19:39
enbo_lee
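[quote]Originally posted by [i]bennial[/i] on 2008-4-23 11:51
What firmware version is it running?
Does getObjectGraph_MT 99 on the serial port return a normal result? [/quote]
Which firmware do you mean? The good controller's? I tried a firmware upgrade today, but every image I have is 6.x or later, and the controller refuses the upgrade and reports an error. I had one good controller inserted together with one bad one; the good one is on 5.x firmware. I tried several images and none of them worked.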
2008-4-23 19:50
enbo_lee
Is there anyone in Beijing who repairs these?
2008-4-23 21:12
bennial
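[quote]Originally posted by [i]笑看风云淡[/i] on 2008-4-23 13:51
IBM does OEM this product from LSI, but when something goes wrong you generally end up going straight to LSI, because IBM cannot solve much on its own. LSI does not have many people in Asia either. [/quote]
You are underestimating IBM's investment in China. Level 3 support for the DS4000 series used to be in Japan, but there is now level 3 support in China as well.
In practice, level 3 support can resolve more than 90% of problems, and cases that need to go to LSI are fairly rare.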
2008-4-23 21:20
yddll
Last time a fastt200 failed, it was people from Japan who came to repair it.
2008-4-23 22:15
enbo_lee
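[quote]Originally posted by [i]bennial[/i] on 2008-4-23 21:12
You are underestimating IBM's investment in China. Level 3 support for the DS4000 series used to be in Japan, but there is now level 3 support in China as well.
In practice, level 3 support can resolve more than 90% of problems, and cases that need to go to LSI are fairly rare. [/quote]
Where can it be repaired, then? Can it be repaired at all? I can't very well hand the customer a whole replacement unit now. Has anyone ever seen the controllers and the backplane fail together? I doubt anyone would believe it.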
2008-4-24 10:07
bennial
Anything can fail. Once it is broken there is no repairing it; just replace it.
2008-4-24 11:29
老农
This is pure fumbling around. You won't learn until you have been burned a few times.
2008-4-24 14:09
jackwork_80
What is the matter? Please give me some pointers, boss. Which part did I get wrong?