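Failure data from large computer systems at a US national laboratory. Can anyone suggest some ideas for how to use this data?
It covers twenty-odd large computer clusters.
The fields are as follows: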
System
machine type
number of nodes
number of processors total
number of processors per node
node number as it existed (either starts with 0 or 1)
node number starting with zero
install date
production date
decommission date
field replacable unit type
mememory
cpu type
mememory type
number of interconnects
node purpose - compute, front end, graphics
Problem Started (mm/dd/yy hh:mm)
Problem Fixed (mm/dd/yy hh:mm)
Down Time
------ below here are interrupt blame categories ---------------
Facilities
Hardware
Human Error
Network
Undetermined
Software
Same Event - related events
Example:
2,cluster,49,6152,80,0,0,5-Apr,5-Jun,current,part,80,1,1,0,graphics.fe,6/21/2005 10:54,6/21/2005 11:00,6,,GraphicsAccelHdwr,,,,,N
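As one possible starting point, here is a minimal Python sketch that maps the example record onto the 26 fields listed above, pulls out the non-empty blame category, and recomputes the downtime from the two timestamps. The snake_case field names and the helper functions are my own hypothetical choices, not part of the dataset. The sketch also assumes that the date and time are separated by a space and that the year has four digits, as in the example record (the header says mm/dd/yy).

```python
import csv
from datetime import datetime
from io import StringIO

# Hypothetical snake_case names for the 26 columns, in the order listed above.
FIELDS = [
    "system", "machine_type", "num_nodes", "num_procs_total",
    "num_procs_per_node", "node_num_original", "node_num_zero_based",
    "install_date", "production_date", "decommission_date",
    "fru_type", "memory", "cpu_type", "memory_type",
    "num_interconnects", "node_purpose",
    "problem_started", "problem_fixed", "down_time",
    "facilities", "hardware", "human_error", "network",
    "undetermined", "software", "same_event",
]
# The six interrupt blame categories (everything after down_time except same_event).
BLAME_COLUMNS = FIELDS[19:25]

# The example record, assuming a space between date and time.
SAMPLE = ("2,cluster,49,6152,80,0,0,5-Apr,5-Jun,current,part,80,1,1,0,"
          "graphics.fe,6/21/2005 10:54,6/21/2005 11:00,6,,"
          "GraphicsAccelHdwr,,,,,N")


def parse_record(line):
    """Split one comma-separated record and label each value by column."""
    values = next(csv.reader(StringIO(line)))
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def blame(record):
    """Return only the blame categories that are filled in for this record."""
    return {col: record[col] for col in BLAME_COLUMNS if record[col]}


def downtime_minutes(record, fmt="%m/%d/%Y %H:%M"):
    """Recompute the downtime in minutes from the start and fix timestamps."""
    started = datetime.strptime(record["problem_started"], fmt)
    fixed = datetime.strptime(record["problem_fixed"], fmt)
    return (fixed - started).total_seconds() / 60


record = parse_record(SAMPLE)
print(blame(record))
print(downtime_minutes(record), record["down_time"])
```

For the example record, the recomputed value agrees with the Down Time column, which suggests that column is in minutes. With every row parsed this way, you could count failures per blame category per system, or look at the time between consecutive failures on the same node.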