
BLU Discuss list archive



decoding MCE Logs? Possible hardware issue?



On Wed, September 29, 2010 10:10 am, Jerry Feldman wrote:
> On 09/29/2010 09:29 AM, Derek Atkins wrote:
>> Jerry Feldman <gaf-mNDKBlG2WHs at public.gmane.org> writes:
>>
>>
>>>> But I suspect there's still really a hardware problem somewhere.  :(
>>>>
>>>>
>>>>
>>> Just one thing to add. I have a number of servers with Supermicro
>>> boards, and one of them won't boot unless I blacklist one of the edac
>>> modules. That system has 64GB ECC memory and either 1 or 2 Intel Xeon
>>> CPUs (One of my systems only has 1 CPU the rest have 2).  If you are
>>> interested I can email you with the modules I am blacklisting.
>>>
>> Note that this is a Supermicro with AMD CPUs.  It only has 16GB RAM
>> right now, but I might extend that if I find that some of the RAM is
>> bad.  The system boots just fine, and I do not have any edac modules
>> loaded at all (according to lsmod).  So I'm not sure what blacklisting
>> it would accomplish?
>>
>> -derek
>>
>>
> I've got 5 systems with Supermicro X7DB8+ motherboards, and only one
> has problems with the edac modules. In my search for a solution to the
> udev hang problem I found a lot of pointers to Supermicro boards. I
> don't know why that one has the issue. It certainly is a much different
> issue than you have. My lsmod on another system shows:
> [gaf@boslc05 ~]$ lsmod | grep edac
> i5000_edac             42177  0
> edac_mc                60193  1 i5000_edac
>
> In any case I was just trying to provide some additional information.

Interesting!  I wonder if this is an Intel v. AMD thing?  Or perhaps a
2.6.27 v. 2.6.34 thing?  Or maybe it's an X7DB8+ v. H8DA3-2 thing?

I'd turn off edac if it looked like it was actually loading on my system.

Ahh, the joys of ECC RAM -- harder to tell when the RAM is bad.  ;)

-derek
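
For anyone who wants to script the "is edac actually loaded?" check from this thread, here is a minimal sketch. It takes lsmod-style text, picks out the modules with "edac" in the name, and prints the matching `blacklist` lines in the usual modprobe.d syntax. The function names and the SAMPLE text are only illustrative (the sample is the lsmod output quoted above), and the thread never says which module was actually blacklisted, so treat the printed lines as candidates rather than a recommendation.

```python
# Sketch only: helper names are hypothetical, not from any existing tool.

# lsmod output quoted earlier in the thread, plus the usual header line.
SAMPLE = """\
Module                  Size  Used by
i5000_edac             42177  0
edac_mc                60193  1 i5000_edac
"""


def edac_modules(lsmod_text):
    """Return names of loaded modules whose name contains 'edac'."""
    names = []
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if not fields or fields[0] == "Module":
            continue
        # Only the first column is the module name; later columns may list
        # dependents that also contain "edac", so they are ignored here.
        if "edac" in fields[0]:
            names.append(fields[0])
    return names


def blacklist_lines(names):
    """Format module names as modprobe.d 'blacklist' directives."""
    return ["blacklist " + name for name in names]


if __name__ == "__main__":
    found = edac_modules(SAMPLE)
    if found:
        print("\n".join(blacklist_lines(found)))
    else:
        print("no edac modules loaded")
```

On a live box you would pass in the real output of `lsmod` instead of SAMPLE. If nothing comes back, as on the AMD system described above, there is nothing to blacklist.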








BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org