Failing hardware part 2
First, thanks to everyone who wrote in on that last entry, and thanks also to Robin and Tony for some specific kernel/hardware ideas.
My workstation continues to lock up not-so-randomly, with the majority of freezes occurring while gaming.
I remembered that I undervolted my CPU a respectable amount a long time ago, so I upped the voltage just a little bit, to 1.175V, thinking that maybe it was undervolted a bit too far, and the issue took a long time to manifest. No change. Can't say I noticed any difference.
Next I thought I'd upgrade to the latest hardmasked nvidia-drivers. No change. Then Marius suggested using the nv driver to see if that fixed things, but the problem there is that a good number of the freezes occur while I'm doing something with 3D graphics, which nv can't do. So that wouldn't tell me much. Also, the graphical corruption occurs at boot and prevents the Grub menu from displaying, so I figured the X drivers may not be related to the issue.
To test this, Robin suggested that I remove framebuffer support from my kernel entirely. It made the issues worse! Now I seem to be on the right track. Previously, with the framebuffer enabled, the Grub screen didn't display, but as soon as the kernel loaded uvesafb and my initrd early in the boot process, the screen cleared up and returned to normal. The fbsplash theme displayed just fine.
With framebuffer disabled, the severe corruption continues all the way up until init enters runlevel 3, as you can see in the following pictures I snapped with my cell phone.
Early in the boot process, just after grub loads the kernel

A clear screen once runlevel 3 begins

Tony figures the graphics card memory is bad, and based on what I see, I'm inclined to agree. I already figured it was probably time to replace the graphics card, so now I'm shopping around.
I've always liked my nVidia chips over the years, and this one has served me well. Still, now that the ATI RadeonHD 4670 is out, I'm strongly considering getting one. I like that it offers better performance than my 7600GT for about $80, which is half of what I paid for the nVidia card two years ago. It can handle UT2004 and any current games, assuming they run on Linux. I only have a 19" monitor, so I don't need to spend a lot of money on a card that stays powerful at extremely high resolutions.
Plus, the ATI card has basic support from both xf86-video-ati and xf86-video-radeonhd, though neither driver seems to be capable of 3D acceleration. I'd stick with the binary fglrx driver for the time being. The only thing I'm not sure about is whether or not the new & improved Catalyst Control Center Linux Edition (AMDCCCLE, phew!) actually supports fan adjustment, or if the temperatures can be queried and reported in a panel applet. I've been spoiled rotten by my passively cooled 7600GT, so I want to keep control over the fan noise. If there were passive 4670s on the market, I'd get one, but that hasn't happened yet. Maybe in the future, once I'm ready to use the FOSS driver, there will only be one. Right now I couldn't choose between the two; I'm hoping they merge down the road. Too confusing; this consumer wants less choice.
Y'know, I used to be a fairly avid nVidia Linux fan, simply because for so long ATI's support was a joke. It took them forever to come up with a hardware counter to nVidia's SLI tech, and then another ~3 years before the Linux support for it rolled around, and that's just the beginning of their disregard for Linux. But 2007 marked a turnaround, so now I'm actually rather impressed with the amount of work they've done for open-source drivers. They've really been making progress.
nVidia still shows no signs of opening up their stuff, aside from the largely-unnoticed-and-irrelevant CUDA initiative. But for years, their stuff just worked, as much as a binary driver can be expected to, especially compared to fglrx. Maybe old age has mellowed me a bit, as I'm feeling pragmatic enough to think that purchasing a card from the Red Team would be smarter than buying another Green card. I mean, sure, nVidia usually just works for me, but I know they don't plan on doing much featurewise; they've already killed off hardware video playback accel, and they haven't announced anything special for Linux or put out any kind of roadmap. ATI has, with things like UVD2, XvMC and many other features they intend to bring to Linux in the various drivers. They want users to actually use their hardware features. What a concept! nVidia, you listening? And for reasonable card prices, too.
I mean, nothing nVidia has at the $80 mark comes close to the performance of the 4670. I could immediately put it to use with fglrx in UT2004 and Xfwm4's light window compositing, while at the same time look forward to open-source acceleration in the months to come.
That seems like a win all around, but I'll need to do some more research. Hopefully I fix these hardware issues with just a new graphics card. I'd hate to end up purchasing a whole 'nother machine before the problems disappear.
12 comments
I'll let you know if I ever find a solution. (if you find it first, I'll be happy to hear it ;-))
Check whether '/boot/grub/splash.xpm.gz' exists and take a look at [1].
Cheers.
[1] http://bugs.gentoo.org/show_bug.cgi?id=231039
If some of them have a curved top side, they have started outgassing, have lost at least some (if not practically all) of their effect, and should be replaced.
http://www.neoseeker.com/Articles/Hardware/Reviews/hd4550/
You can use a vacuum cleaner and a brush, but DON'T USE any air spray, because it cools the card and makes it damp (from condensation).
Good luck!
I resolved the problem by re-emerging grub and reinstalling it.
http://www.hisdigital.com/html/product_ov.php?id=368
Wes:
It is a hardware problem, not a software issue. You must have missed the entirety of the posts that talked about how everything freezes while I'm working, not the corruption during boot.
And to fix your grub issue, just add the correct splashscreen location. It used to be installed in /boot, but now turns up in /usr/share/grub. Run emerge --config grub to copy it to the expected place in /boot. This behavior was added in a recent stable grub version.
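A rough sketch of that check, assuming the splash image lives at the path mentioned earlier in the comments (/boot/grub/splash.xpm.gz). The `check_splash` function name is made up for illustration, and the script only reports what it finds; it doesn't run emerge for you:

```shell
#!/bin/sh
# Hypothetical helper: report whether the grub splash image is where grub expects it.
# The default path is taken from the comment thread; pass another path as $1 to override.
check_splash() {
    splash="${1:-/boot/grub/splash.xpm.gz}"
    if [ -f "$splash" ]; then
        echo "found: $splash"
    else
        # The suggested fix from the reply above; run it manually as root.
        echo "missing: $splash (try: emerge --config grub)"
        return 1
    fi
}
```

Calling `check_splash` with no argument checks the default location; if it reports the file missing, running `emerge --config grub` should copy it into place.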
Branko:
I already thought of that; all my caps are okay. Nothing leaking. Just as well; replacing the motherboard or PSU would have cost more than replacing the graphics card.
Pacho:
Yeah, I cleaned out my card as thoroughly as I could, as well as the entire inside of my machine. Didn't make any difference, aside from dropping temps about 5C. Which is nice, but didn't solve anything.
Loki and robbat2:
Interesting, but I already purchased my replacement card last night. Like Robin, I got my card from HIS; I went with a 4670.
In fact, you can have a bad cap that looks just fine.
Look for bowing of the top side of the cap. Even slight bowing is _bad_.
Especially in the PSU (take the cover off, of course with everything unplugged).
With regard to card cleaning: it doesn't seem to be the issue here, but if you do clean the connectors (which is a good thing to do):
1. Go over them with a hard pen rubber, which is a bit harder than a pencil rubber and has a bit of glass powder in it.
2. Finish with alcohol on tissue paper. In the absence of ethanol, any strong beverage is fine (whiskey, cognac, etc.).
But seriously, you seem to have a hardware issue.
Don't just start blindly buying things; check them out first. Even if your graphics card is bad, it may very well be that it got killed by bad caps in the PSU, on the board, or even on the card itself.
If not then this guide was very useful to me...
http://www.gentoo.org/doc/en/articles/hardware-stability-p2.xml
At the least it's an excellent primer on pci latencies.
