
BLU Discuss list archive



mcelog reports AMD DRAM Parity Error?



Jarod Wilson <jarod-ajLrJawYSntWk0Htik3J/w at public.gmane.org> writes:

> On Nov 18, 2010, at 10:30 AM, Derek Atkins wrote:
>
>> Hey,
>> 
>> Back onto my mcelog issue from a while ago..
>
> Crap, I apologize; I'd meant to follow up on this, and it fell
> through the cracks... So I'm jumping on it right now.
>
>> I finally updated to the
>> newly released mcelog.x86_64 2:1.0-0.1.pre3.fc13 and when I ran mcelog
>> I got this output:
>> 
>> HARDWARE ERROR. This is *NOT* a software problem!
>> Please contact your hardware vendor
>> MCE 0
>> CPU 0 4 northbridge TSC 24b8cb30a62636 
>> MISC c008000001000000 ADDR 3c5e80c80 
>>  Northbridge DRAM Parity Error
>>       bit34 = err cpu2
>>       bit43 = L3 subcache in error bit 1
>>       bit46 = corrected ecc error
>>       bit59 = misc error valid
>>  memory/cache error 'generic read mem transaction, generic transaction, level generic'
>> STATUS 9c294834001d011b MCGSTATUS 0
>> SOCKETID 0 
>> 
>> Does this mean I have a busted CPU?  Or busted RAM?
>
> RAM. However, it's not a fatal error, it's simply a corrected
> ECC error. I'm told this is all a single event here, and the
> event was the corrected ECC error, anyway. So you might want
> to replace some memory at some point, but hey, it's ECC memory
> doing what it's designed to do here.
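
For reference, the flag bits in that STATUS word can be decoded by hand. A quick sketch (bit positions taken from the generic x86 MCA layout in the AMD/Intel manuals; the model-specific bits 34/43/46 below are just what mcelog itself printed):

```python
# Decode the high status bits of the MCE STATUS word quoted above.
# Bit positions follow the generic x86 MCA MCi_STATUS layout.

STATUS = 0x9C294834001D011B  # from the mcelog report

def bit(value, n):
    """Return bit n of value as 0 or 1."""
    return (value >> n) & 1

FLAGS = {
    63: "VAL    - error information valid",
    62: "OVER   - error overflow",
    61: "UC     - uncorrected error",
    60: "EN     - error reporting enabled",
    59: "MISCV  - MISC register valid",
    58: "ADDRV  - ADDR register valid",
    57: "PCC    - processor context corrupt",
}

for n, name in sorted(FLAGS.items(), reverse=True):
    print(f"bit {n}: {bit(STATUS, n)}  {name}")

# UC=0 together with the model-specific "corrected ecc error" bit (46)
# being set is why mcelog reports this as corrected rather than fatal.
print("corrected ECC error bit (46):", bit(STATUS, 46))
```

With UC clear and PCC clear, nothing was lost; the hardware fixed the flipped bit on the way through.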

Is there an easy way to figure out which bank of RAM had the error?

I guess I can wait until I have another issue..
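
One possibility, assuming the kernel's EDAC driver for the memory controller is loaded: the per-csrow corrected-error counters under /sys/devices/system/edac/mc accumulate by bank, so whichever csrow's count climbs is the suspect DIMM pair. A sketch that walks that tree (the sysfs layout is the standard EDAC one; the root path is a parameter so it's easy to try):

```python
# Walk an EDAC sysfs tree and report corrected-error (CE) counts per csrow.
# Standard layout: <root>/mc<N>/csrow<M>/ce_count

import os

def edac_ce_counts(root="/sys/devices/system/edac/mc"):
    """Return {(controller, csrow): corrected_error_count} from an EDAC tree."""
    counts = {}
    if not os.path.isdir(root):
        return counts  # EDAC driver not loaded on this machine
    for mc in sorted(os.listdir(root)):
        mcdir = os.path.join(root, mc)
        if not mc.startswith("mc") or not os.path.isdir(mcdir):
            continue
        for csrow in sorted(os.listdir(mcdir)):
            ce_file = os.path.join(mcdir, csrow, "ce_count")
            if csrow.startswith("csrow") and os.path.isfile(ce_file):
                with open(ce_file) as f:
                    counts[(mc, csrow)] = int(f.read().strip())
    return counts

if __name__ == "__main__":
    for (mc, csrow), ce in edac_ce_counts().items():
        print(f"{mc}/{csrow}: {ce} corrected errors")
```

`edac-util -v` from the edac-utils package prints the same counters, if you'd rather not poke sysfs directly. Mapping a csrow back to a physical DIMM slot still takes the motherboard manual, though.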

> I'd probably not worry about the memory too much, unless it's
> happening at least daily, and/or if it's causing some sort of
> noticeable performance hit.

Well, when I was running the F13 kernel my VMs would get into a snit and
the virtual disks would lock up, causing "disk IO errors" inside the
VMs.  The same hardware running the F10 kernel doesn't exhibit this
problem.  So, is this a performance hit?  I would say so.  Or it could
be an issue with vmware-server and the F13 kernel.

-derek
-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord-DPNOqEs/LNQ at public.gmane.org                        PGP key available






BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org