Playing games puts a tremendous amount of stress on the display adapter and the PSU in a PC. Overheating is an all-too-common problem, and extreme gaming cards are even more prone to it. We cannot overstress the value of regular cleaning of computer components. Modern parts run hot and blocked fans will fail.
If your GPU is running over 85C, read this page carefully.
For background information on video cards, see our GPU page.
We currently have a pair of GTX 260s and an HD 5450, which is passively cooled. The GTX 260s are definitely designed to survive better than the old 8600 GT we bought back in 2007. There are several thermal sensors on the GTX 260 that can be monitored by the driver.
If we ever get an AMD gaming-class card we will rotate the journals so that we can provide equal coverage. Our HD 5450 is a bottom-of-the-line card that we use for our web server when integrated graphics are not available.
The power on off cycle can change the temperature of the thermal interface material considerably. Over time the expansion and contraction of the semiconductor and heatsink will act like a pump. In effect the thermal interface material is squeezed thinner and thinner.
Thermal interface material can vary widely in capability. The best can manage 10W/mK thermal conductivity. The Arctic MX-4 we use is around 8.5W/mK. By comparison copper is around 385W/mK. This is why thermal interface material should be extremely thin. In practice most use far too much.
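To put rough numbers on why the layer should be thin, the sketch below uses the standard conduction formula R = t / (k * A) for a flat layer. The conductivity figures are the ones quoted above; the 20 mm x 20 mm contact patch and the layer thicknesses are hypothetical values chosen only for illustration, not measurements from our cards.

```python
def layer_resistance(thickness_m, conductivity_w_mk, area_m2):
    """Conductive thermal resistance of a flat layer in K/W: R = t / (k * A)."""
    return thickness_m / (conductivity_w_mk * area_m2)


# Hypothetical 20 mm x 20 mm contact patch (assumption, not a measured die size).
CONTACT_AREA_M2 = 0.020 * 0.020

MX4_K = 8.5       # W/mK, figure quoted in the text for Arctic MX-4
COPPER_K = 385.0  # W/mK, figure quoted in the text for copper

# Hypothetical layer thicknesses, from a thin film to a heavy-handed blob.
for thickness_mm in (0.05, 0.2, 0.5):
    t = thickness_mm / 1000.0
    r_paste = layer_resistance(t, MX4_K, CONTACT_AREA_M2)
    r_copper = layer_resistance(t, COPPER_K, CONTACT_AREA_M2)
    print(f"{thickness_mm:.2f} mm: paste {r_paste:.4f} K/W, copper {r_copper:.5f} K/W")
```

Resistance grows in direct proportion to thickness, so a layer ten times thicker costs ten times the temperature drop at the same power. For equal thickness the paste is always 385 / 8.5, or roughly 45 times, worse than copper.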
Epoxy thermal interface materials can perform better. Passive heatsinks on a motherboard are typically attached with it.
We use the popular MSI Afterburner and created a custom fan profile so that our GPU fan runs at 100% at 75C or higher, ensuring a defective driver never destroys our video card again. We set the fan to 30% below 50C, and it's a straight line from there to 100% at 75C. This keeps the noise down when we are not playing games and maxes the fan out when the game is demanding.
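The shape of that profile can be written as a small piecewise-linear function. This is only a sketch of the arithmetic behind the curve, using the 30%/50C and 100%/75C points from our profile; it is not MSI Afterburner's API, and the function and parameter names are made up for the example.

```python
def fan_percent(temp_c, low_temp=50.0, high_temp=75.0, low_fan=30.0, high_fan=100.0):
    """Fan duty in percent for a GPU temperature in Celsius.

    Flat at low_fan up to low_temp, a straight line between the two
    points, and pinned at high_fan from high_temp upward.
    """
    if temp_c <= low_temp:
        return low_fan
    if temp_c >= high_temp:
        return high_fan
    fraction = (temp_c - low_temp) / (high_temp - low_temp)
    return low_fan + fraction * (high_fan - low_fan)


for temp in (40, 50, 60, 70, 75, 90):
    print(f"{temp}C -> {fan_percent(temp):.0f}% fan")
```

Between the two points the fan gains 70% of duty over 25 degrees, which works out to 2.8% per degree.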
"Display driver nvlddmkm stopped responding and has successfully recovered" This is only found with Windows Vista and higher.
Artifacts on the screen are common when the GPU or the VRAM is having problems. Generally the GPU tends not to fail as often as VRAM. When the GPU fails, the screen will go blank. VRAM problems typically present as snow, anomalies or speckles on the screen. Often only a single VRAM chip is at fault, and that chip is what causes the snow or streak-like effects.
The Zotac 8600 GT, Zotac 8500 GT and Zotac 8400 GT cards that we had to replace in the shop all had the same VRAM problem. We noticed that these lower end cards lack the thermal sensors found on more expensive models. Thermal sensors are not expensive and should be universal. This observation is based on the revisions of the popular GPU-Z that were current at the time.
We have also seen others with more recent cards like the GTX 200 series with the same VRAM problems. Given that VRAM runs hot, it's best to provide as much cooling as possible.
Sometimes it is possible to fix the problem, other times it's not. The first step is to try adjusting the clocks of the GPU and VRAM down somewhat to see if that helps. If the thermal grease has failed, chances are the card is finished. The NVIDIA driver version 196.75 had a defective fan profile that quickly led to large numbers of dead video cards.
Tuning utilities such as MSI Afterburner allow a user to adjust GPU and VRAM clock speeds, and they are popular with the overclocking crowd. These programs also allow users to increase the fan speed or to create a custom fan profile. We use a custom fan profile that is much more aggressive than the factory configuration.
Try reducing the VRAM clock first by 5% and see if that clears up the artifacts. Try another 5% until it clears up. If you cannot clear it up then the VRAM chip has burned out and the video card is trash.
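As a rough aid for planning the step-downs, the sketch below lists the clocks to try. It assumes each step takes a further 5% of the stock clock (rather than 5% of the previous setting), and the 1000 MHz stock clock and the four-step cutoff are hypothetical values for illustration; read the real stock clock from your tuning utility.

```python
def stepdown_clocks(stock_mhz, step_fraction=0.05, max_steps=4):
    """Return the VRAM clocks to try, each one a further step_fraction
    of the stock clock lower than the one before it."""
    return [
        round(stock_mhz * (1.0 - step_fraction * n), 1)
        for n in range(1, max_steps + 1)
    ]


# Hypothetical 1000 MHz stock memory clock.
for clock in stepdown_clocks(1000.0):
    print(f"Try {clock} MHz and check for artifacts")
```

Test each setting under load before dropping to the next one, and stop at the first clock where the artifacts disappear.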
We have seen rare reports of success with baking the video card in an oven for several minutes. Given the card is already on its way to the trash, what can it hurt?
The idea is to reflow solder that has become damaged from oxidation or stresses.
Set the oven to 375F. Make sure all the plastic parts are removed. Place the card on some aluminum foil and then bake it for about 7-8 minutes.
Some solders seem to be problematic, and this approach seems to be able to recover some dead cards. The problem stems from RoHS, which restricted lead in solder. The newer lead-free solders have experienced tin whiskers and other problems.
After the card cools down, apply some thermal grease to the GPU and RAM chips and reassemble the fan assembly. Then try it out. If it works, congratulations. Otherwise it's back to the trash can with it.
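For readers whose ovens are marked in Celsius, the 375F setting used in the baking step converts to roughly 190C. A minimal conversion sketch (the function name is ours):

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


oven_f = 375  # oven setting from the baking step
print(f"{oven_f}F is about {fahrenheit_to_celsius(oven_f):.1f}C")
```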
We generally use NVIDIA cards, but we also use Radeon cards. Both have their respective enthusiasts.
Our old 8600 GT was destroyed completely by the bad driver problem. The card has thermal sensors but the GPU lacks the ability to adjust the clock speed so the BIOS is powerless to prevent failure.
A bad driver (196.75) caused widespread damage to video cards. Zotac used mediocre thermal grease and there is no thermal sensor, so the card simply overheated until it failed catastrophically. We attempted repairs but capacitors kept failing. Our GTX 260s were also affected by the driver problems, but thermal sensors prevented these cards from failing.
Once a capacitor is popped it has to be replaced. The problem is that collateral damage may frustrate any repair. Given our experience, popped capacitors mean the card is garbage.
The capacitors on more expensive video cards are able to tolerate higher temperatures. The best ones are solid capacitors. NVIDIA mentioned their GPU can tolerate 105C before it will fail.
Our EVGA GTX 260 has been a minor nuisance. The card was running far too hot, and upon inspection we noted that the thermal grease was compromised. The card now runs much cooler after regreasing. We are also now watching the temperatures on both cards so that if the BFG card gets too warm it can be regreased as well. The card is simply a provisional solution while we await the upcoming 20nm lineup. Searching with Google, we found many threads about regreasing cards damaged by the 196.75 driver.
Evidently the thermal interface material had degraded. It resembled a gel-like material with some fibers left over from the original thermal pads. Curiously, our BFG GTX 260 card is just as old and it does not overheat even with Furmark running for extended periods of time.
We simply disassembled the GTX 260 by removing all of the large screws on the back. There are two small screws in the DVI bracket that also have to be removed.
For some reason the fan connector would not release, so we carefully opened the card to reveal the GPU and other parts. A cotton ball and some isopropyl alcohol work best for cleaning the heat sink and semiconductor surfaces. It's not a bad idea to use a small brush to clear away bulk material if necessary.
After cleaning up the mess, we applied a small dab of fresh thermal grease to all areas. It's important not to forget the regulator area, as the regulators also get warm.
Reassembling the card brought the idle temperature down significantly. Clearly the thermal interface pads used by EVGA are not suitable. They degrade when they should be able to tolerate 105C.
The Arctic MX-4 claims an 8-year service life, which is more than the usual service life of a video card. In effect MX-4 doubles the service life of a video card.
The better MX-4 reduced the GTX 260 temperatures enough that throttling is no longer a problem. Obviously the GPU needs high-end thermal grease to be able to operate efficiently. Furmark no longer causes the card to overheat.
Using a custom fan profile with higher fan speeds is one way to keep the video card cool. The other is regular cleaning. Canned air can be blown into the video card fan assembly to remove dust that can block fans. Using an air purifier will reduce dust in the gaming room and extend the life of your valuable hardware.