
BSOD of hell


Hulk'O'Saurus


Hello, 

I have an MSI gaming laptop that started throwing BSODs consistently about two months ago. I tried a few different things, including a full Windows 10 reinstall and a BIOS update, but nothing helped. The BSODs happened in Windows during various activities, including while idling on the desktop.

I shipped the unit to their main depot in Poland, where they replaced a sound board that they themselves said was faulty. The unit came back with a document stating that various hardware checks were performed and passed, but lo and behold, BSODs still happen while in Windows.

At this point any help is appreciated.  

IP5ok2U.png

m0x5eY5.pngtBxm170.png


How did you determine it was the motherboard? I would probably run Memtest86 as my first step to check the RAM.


How I have existed fills me with horror. For I have failed in everything - spelling, arithmetic, riding, tennis, golf; dancing, singing, acting; wife, mistress, whore, friend. Even cooking. And I do not excuse myself with the usual escape of 'not trying'. I tried with all my heart.

In my dreams, I am not crippled. In my dreams, I dance.


4 hours ago, Bartimaeus said:

How did you determine it was the motherboard? I would probably run Memtest86 as my first step to check the RAM.

I did not. Prior to shipping it to Poland I was afraid it might be the motherboard, but their technician changed a faulty sound board, as they themselves said. I don't know enough about these things to make any final judgement. 

I will look into memtest86 and see what I can come up with. 


@Hulk'O'Saurus

Get Memtest86 from the Memtest86 website. Use the included imageusb.exe to write the image to a USB flash drive (or a tool such as Rufus if you prefer), then boot from the flash drive (let me know if you need help figuring that out) and start the test once the Memtest86 interface has loaded. If there are any errors at all, the RAM is most likely the problem (re-seat it and re-test one RAM stick at a time, and maybe in a different motherboard RAM slot, to rule out a bad slot or poor contact).

Edited by Bartimaeus

2 hours ago, Bartimaeus said:

@Hulk'O'Saurus

Get Memtest86 from the Memtest86 website. Use the included imageusb.exe to write the image to a USB flash drive (or a tool such as Rufus if you prefer), then boot from the flash drive (let me know if you need help figuring that out) and start the test once the Memtest86 interface has loaded. If there are any errors at all, the RAM is most likely the problem (re-seat it and re-test one RAM stick at a time, and maybe in a different motherboard RAM slot, to rule out a bad slot or poor contact).

Thanks for the reply :). I will try the test tomorrow. 

In the meantime I got a few suggestions from another forum. One of them was to run the laptop unplugged for as long as it would last while doing something light on it. It didn't crash. Usually it would crash after about 10 minutes of work, and quite consistently so. But not this time.

It appears that the fault might be in the power supply and cables, and that could be why the technicians didn't find anything else while working on the unit. 

Will call MSI tomorrow, as well. 


@Hulk'O'Saurus

That's...odd. Never heard of something like that, though I have significantly less experience with laptops than I do desktops. What about being plugged in would cause the PSU to act abnormally? You'd think it'd be the other way around if anything, that it might not be able to fully deliver power if it's running off of battery...


Battery power is already DC, wall power is AC and has to be run through a transformer to get to DC? So if the transformer (or rectifier maybe?) is bad... it causes fluctuating current and the over/ under volt protection in the laptop kicks in?

I mean, I would have thought that would cause outright crashes/ dying rather than BSODs or revert to using battery power, but 'bad' power from a dying psu can give some pretty baffling symptoms as well. I've also had plenty of non computer stuff fix itself by changing a power cord.

Edited by Zoraptor

OK... 

I did another test today to see whether unplugged will work further. 

This is what happened. 

I charged the laptop while it was switched off. Once it was charged I unplugged it and used it for about 40 minutes without any problem, then I remembered that it was set to switch off after 5 minutes of inactivity on battery power. I went to Control Panel to change that and it froze, with no BSOD. I plugged it in and switched it off and on, then went back to doing something, and it hit a BSOD within 5 minutes while plugged in, with this stop code:

h8djKTS.jpg

I unplugged it and waited for Windows to restart. It hit a BSOD twice on Windows startup with WHEA_UNCORRECTABLE_ERROR, then showed a message that Windows did not start properly, at which point I selected restart anyway. It did get into Windows this time, but at that point I switched it off and went back to the drawing board :)

Edited by Hulk'O'Saurus


@Zoraptor "'bad' power from a dying psu can give some pretty baffling symptoms[...]" Amen to that.

@Hulk'O'Saurus So wait, the system froze when you tried to enter control panel while on battery power?

This is a weird situation. So if it were me, I'd be doing a number of things:

1. Since we suspect a power issue, set the CPU power level in Windows' advanced power plan settings to minimum (5% usually, I think).

2. Use temperature-monitoring software (hwinfo64) to make sure the CPU and GPU temps aren't weird.

3. Memtest86 to see if the RAM errors out (though if the power supply is bad and causing power fluctuations or something, it may not necessarily be the RAM's fault that it's erroring out). Random BSOD errors like the ones you're getting are a frequent symptom of bad or malfunctioning RAM. I would probably also test plugged vs. unplugged, since you've mentioned that Windows behaves differently between the two, even if the problems don't totally go away when running unplugged.

4. Run an external OS off of an external hard drive or a flash drive. Does it still keep BSODing?

5. Test more with battery power. Are you consistently getting significantly better uptime vs. when using plugged power? Is there any difference in the way you've used the two (i.e. waiting a while for the battery to charge to get a literal cold boot with battery power vs. frequent hot boots with plugged power)?
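For step 5, jotting down each session and averaging the numbers can keep the comparison honest. This is only a rough sketch: the session values are placeholders loosely taken from figures mentioned earlier in this thread, and `mean_minutes_to_failure` is a made-up helper name, so swap in your own notes.

```python
from statistics import mean

# Hand-recorded sessions: (power source, minutes until freeze or BSOD).
# Placeholder values loosely based on figures mentioned in this thread;
# replace them with your own observations.
sessions = [
    ("battery", 40),
    ("plugged", 5),
    ("plugged", 10),
]


def mean_minutes_to_failure(records):
    """Group sessions by power source and average the minutes to failure."""
    grouped = {}
    for source, minutes in records:
        grouped.setdefault(source, []).append(minutes)
    return {source: mean(values) for source, values in grouped.items()}


summary = mean_minutes_to_failure(sessions)
for source in sorted(summary):
    print(f"{source}: {summary[source]:.1f} min on average")
```

Three sessions prove nothing on their own; the point is just to collect enough runs under the same conditions (cold vs. hot boot, same workload) before blaming the power supply.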

Edited by Bartimaeus

Is the BSOD stop code/module consistent at all?
IRQ errors would seem strange to be power related.
You can also check whether running 'high performance' settings while on battery or 'power saving' settings while plugged in makes a difference.
 


He listed two different types of errors in his previous post, WHEA_UNCORRECTABLE_ERROR & IRQL_NOT_LESS_OR_EQUAL, so I assume not. When I looked up both of them, it seemed like people all said "yeah, you're going to have to look at your bsod dump to figure that one out", which is not fun to do.
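For reference, here is a tiny lookup for the two stop codes seen so far. The hex values are the bug check codes Microsoft documents for these names; the one-line descriptions are my own paraphrase, and `describe` is just a hypothetical helper for illustration.

```python
# Bug check values documented by Microsoft for the two stop codes in this
# thread. The descriptions are paraphrased, not official wording.
STOP_CODES = {
    "WHEA_UNCORRECTABLE_ERROR": (
        0x124,
        "fatal hardware error reported through WHEA",
    ),
    "IRQL_NOT_LESS_OR_EQUAL": (
        0x0A,
        "kernel-mode code accessed an invalid address at a raised IRQL",
    ),
}


def describe(name):
    """Return a one-line summary for a known stop code name."""
    key = name.upper()
    code, meaning = STOP_CODES[key]
    return f"{key} (0x{code:08X}): {meaning}"


for stop_code in STOP_CODES:
    print(describe(stop_code))
```

Neither code names a culprit by itself, which is why the dump files still matter.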


2 hours ago, Bartimaeus said:

 

@Hulk'O'Saurus So wait, the system froze when you tried to enter control panel while on battery power?

It froze after I entered control panel :). 

2 hours ago, Bartimaeus said:

2. Use temperature-monitoring software (hwinfo64) to make sure the CPU and GPU temps aren't weird.

3. Memtest86 to see if the RAM errors out (though if the power supply is bad and causing power fluctuations or something, it may not necessarily be the RAM's fault that it's erroring out). Random BSOD errors like the ones you're getting are a frequent symptom of bad or malfunctioning RAM. I would probably also test plugged vs. unplugged, since you've mentioned that Windows behaves differently between the two, even if the problems don't totally go away when running unplugged.

I am running Memtest86 right now. It's in its 4th hour and nearly done with the 3rd pass, with one more to go. I am running the test while plugged in, and so far the laptop hasn't shown any problems and the test has reported 0 errors. Could that be important somehow?

I also notice that the memtest has a temp check for the CPU which shows stable 55-60 centigrade. 

2 hours ago, Bartimaeus said:

1. Since we suspect a power issue, set the CPU power level in Windows' advanced power plan settings to minimum (5% usually, I think).

5. Test more with battery power. Are you consistently getting significantly better uptime vs. when using plugged power? Is there any difference in the way you've used the two (i.e. waiting a while for the battery to charge to get a literal cold boot with battery power vs. frequent hot boots with plugged power)?

That I will. 

2 hours ago, Bartimaeus said:

4. Run an external OS off of an external hard drive or a flash drive. Does it still keep BSODing?

I've heard of Ubuntu boot from USB. Perhaps a stupid question, but is it free ;)?


If it's on its 4th hour and no errors detected, it's good, especially for a system crashing that quickly and consistently. Temperature's probably fine as well.

I would stick with a copy of Windows, because you want to replicate your current setup but with a different drive to control for any other factors. You can download an installable copy of Windows 10 here: https://www.microsoft.com/en-us/software-download/windows10

Select "create installation media", pick the version of Windows that matches yours, select .iso file or to USB depending on what you want to do, then use WinToUSB to install directly to a flash drive (Microsoft does not normally let you install Windows to flash drives - you have to use software like WinToUSB to do so).

Actually, I recall now that you said you did a full reinstall of Windows 10 as it is, so you probably only need to follow the WinToUSB thing if you already have install media.

If this were a desktop, the easiest thing to test would be to swap out the PSU. Stupid laptops.

Edited by Bartimaeus

15 hours ago, Bartimaeus said:

If it's on its 4th hour and no errors detected, it's good, especially for a system crashing that quickly and consistently. Temperature's probably fine as well.

I would stick with a copy of Windows, because you want to replicate your current setup but with a different drive to control for any other factors. You can download an installable copy of Windows 10 here: https://www.microsoft.com/en-us/software-download/windows10

Select "create installation media", pick the version of Windows that matches yours, select .iso file or to USB depending on what you want to do, then use WinToUSB to install directly to a flash drive (Microsoft does not normally let you install Windows to flash drives - you have to use software like WinToUSB to do so).

Actually, I recall now that you said you did a full reinstall of Windows 10 as it is, so you probably only need to follow the WinToUSB thing if you already have install media.

If this were a desktop, the easiest thing to test would be to swap out the PSU. Stupid laptops.

Actually, I have always used the laptop's built-in reset function. It keeps a Windows image somewhere, and when it installs, it always comes with all the firmware plus all the updates. In other words, that Windows reset option keeps itself updated. When this whole BSOD thing started I looked a few things up online, and apparently there had been a small upsurge in blue screens all over the net after the then-latest Windows update. I didn't think too much of it at the time, but when the next update rolled out and didn't fix my issue I shipped it back, etc., etc.

On the other hand Memtest86 returned 0 errors and was on for more than 6 hours while plugged in. 

 jhnfXL4.jpg

19 hours ago, pmp10 said:

Is the BSOD stop code/module consistent at all?
IRQ errors would seem strange to be power related.
You can also check whether running 'high performance' settings while on battery or 'power saving' settings while plugged in makes a difference.
 

It's WHEA_UNCORRECTABLE_ERROR the vast majority of the time. 

I can remember seeing IRQL appearing only once so far. 

I will check different performance setups today. 

Edited by Hulk'O'Saurus


Managed to open bluescreenviewer in Windows while unplugged. It froze once in between but did not produce a dmp file. This is what I managed to capture:

https://i.imgur.com/MWIfXfb.png

https://i.imgur.com/8C86zTe.png

https://i.imgur.com/K2jjkfa.png

https://i.imgur.com/oVbnx7W.png

https://i.imgur.com/xvcDYqv.png

On first glance they all look like driver related issues. Will try safe mode some time later on. 

If anything rings a bell...
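Since one freeze left no dmp file, it may help to list which dumps actually exist and when they were written. A minimal sketch, assuming the default minidump folder `C:\Windows\Minidump` (the location can be changed in the Startup and Recovery settings); `list_dumps` is a hypothetical helper name.

```python
from datetime import datetime
from pathlib import Path

# Default minidump folder on a standard Windows install (an assumption;
# it can be changed under Startup and Recovery).
DEFAULT_DUMP_DIR = Path(r"C:\Windows\Minidump")


def list_dumps(dump_dir=DEFAULT_DUMP_DIR):
    """Return (timestamp, filename) pairs for .dmp files, oldest first."""
    folder = Path(dump_dir)
    if not folder.is_dir():
        return []
    entries = [
        (datetime.fromtimestamp(path.stat().st_mtime), path.name)
        for path in folder.glob("*.dmp")
    ]
    return sorted(entries)


for stamp, name in list_dumps():
    print(f"{stamp:%Y-%m-%d %H:%M}  {name}")
```

Comparing those timestamps against the crashes you remember shows which BSODs were recorded and which hangs left nothing behind. Reading that folder may need an elevated prompt.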


My next step would definitely be to install an actually fresh copy of Windows onto a USB drive like I suggested above. Also, disable your internet connection beforehand if you can, so that Windows 10 doesn't immediately start auto-updating; that way you can see whether the fresh install is stable over a long period before updating Windows or drivers. You can also install GSmartControl to check the vitals of your hard drives, but for something as extreme as this I think Windows would probably detect that something's awry and automatically start chkdsk, which I'd expect you to have mentioned if it did, so I'm not too concerned about the drive(s).

If the system is still unstable on a fresh install on a separate drive, before any updates or driver installs, I think you can pretty safely conclude that it's the PSU, especially given all the other symptoms. If the problem doesn't persist, though, that's where it'll get interesting.

The "at fault" files shown in those images are all like critical kernel and abstraction layer files, so it doesn't really say anything besides that there is something seriously wrong with the system.

Edited by Bartimaeus

@Hulk'O'Saurus Before doing what I said in the previous post, it occurs to me that we haven't even tried as basic of a thing as putting the system into safe mode (which I see you actually just mentioned in your previous post). Please do that and test stability before anything else.

Edited by Bartimaeus

Just now, Bartimaeus said:

@Hulk'O'Saurus Before doing what I said in the previous post, it occurs to me that we haven't even tried as basic of a thing as putting the system into safe mode. Please do that and test stability before anything else.

I actually literally put it in safe mode about half an hour ago, plugged in. 

Have the bluescreenviewer on and it has been running without issue. 

This is something that is coming from Tom's Hardware:

I suspect a driver issue. Most common might be the graphics driver.
Windows has been known to update graphics drivers to less than optimal versions.
In a gaming laptop you have both integrated and discrete adapters.
I might try reinstalling the graphics drivers for your laptop directly from asus.
I note that the driver available seems to be old, released on 8/2016
Alternately, install the mobile graphics driver directly from nvidia.

If you can access the desktop, select to run using the integrated adapter.
That will be the default when running on battery.
Plugged in, the default will be the discrete adapter.
Look through your dumps and google any code that looks involved; that might give you a clue as to what is involved.


Does your "reset" Windows 10 come with drivers pre-installed? If so, start uninstalling them one by one between safe and not safe mode boots until the system stops freezing outside of safe mode, :).

Edited by Bartimaeus

22 minutes ago, Bartimaeus said:

Does your "reset" Windows 10 come with drivers pre-installed? If so, start uninstalling them one by one between safe and not safe mode boots until the system stops freezing outside of safe mode, :).

Yes it does. Usually it only needs Nvidia Experience and it's good to go. 

I will need to do some browsing... x)

EDIT: Jebus... scratch that, as I was literally about to hit the Win + R button the thing BSOD again while in safe mode... IRQL error. Plugged in.

Edited by Hulk'O'Saurus


42 minutes ago, pmp10 said:

Honestly I'd say if it can be reproduced in safe mode then it should be covered by warranty.
If you are really up for a trip down the rabbit-hole you can troubleshoot further with driver verifier but it will get messier with every step forward. 

The thing is almost 3 years old and out of warranty 😄 

Anyways...

It is running under Diagnostic Startup atm, and that driver verifier is a thing I'll check out. 


Slight update. 

It has run for about 4 hours now under Diagnostic Startup. Slight caveat: I can't turn sleep off under this setup, so it goes to sleep every now and again. I don't know if that's significant in any way, but it hasn't given any blue screens.

I was about to make a Windows USB image when I ran into some write-protect issues with my 125 GB pen drive. Ordered a new one for tomorrow.


Started it up under Ubuntu from a flash drive and it has run for hours, plugged in and without issue.

I called MSI, as well. I wanted to try and restore through the factory image, but it wasn't working. Instead they said that because I've shipped the unit recently for inspection I am entitled to a 3 months extended warranty period for free. They have stated somewhere in their documents that they do not handle software problems, although I do not know how the issue would be classified. 

Opinions?

Edited by Hulk'O'Saurus

