I encountered this issue as well while using an RTX 3090. Monitoring my temperatures with OCCT, I noticed that while my GPU temp hovered around 85°C, other areas of the card were reaching as high as 105°C. After identifying some airflow issues in my case (I have a new case on the way) and adjusting my voltage via MSI Afterburner, I was able to stabilize my temps and eliminate crashes.
One of the biggest improvements came from lowering my card’s clock speed by about 200 MHz using the curve editor in MSI Afterburner. Surprisingly, this had no noticeable impact on game performance, but it significantly reduced power draw and heat output, making my system more stable overall. That said, I clearly still need a case with better airflow. I normally play games that are less GPU intensive, so I might never have noticed the issue otherwise.
Also, unlike many of you, I was able to play for about an hour before encountering the "Fatal Error." If anyone else is troubleshooting, I highly recommend checking all of the GPU temperature readings your monitoring tool exposes, not just the primary GPU temp. It is also worth trying a slight underclock, as this adjustment made a huge difference for me!
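If it helps anyone, here is a rough Python sketch of the "check every reading, not just the main one" idea. It assumes you have copied the readings out of your monitoring tool by hand. The sensor names and the 95°C limit are placeholders I made up for illustration, not values from OCCT or from NVIDIA, so swap in whatever your own tool reports and whatever limit you are comfortable with.

```python
from typing import Dict, List, Tuple


def find_hot_sensors(
    readings_c: Dict[str, float], limit_c: float
) -> List[Tuple[str, float]]:
    """Return (sensor, temperature) pairs above limit_c, hottest first."""
    hot = [(name, temp) for name, temp in readings_c.items() if temp > limit_c]
    return sorted(hot, key=lambda pair: pair[1], reverse=True)


# Hypothetical sensor names; the two temperatures are the ones from my post.
readings = {"gpu_core": 85.0, "other_sensor": 105.0}

# Assumed limit for illustration only, not a vendor specification.
LIMIT_C = 95.0

for name, temp in find_hot_sensors(readings, LIMIT_C):
    print(f"{name}: {temp:.0f} C is above the {LIMIT_C:.0f} C limit")
```

With my numbers, only the second sensor gets flagged, which is exactly the kind of thing you miss if you only watch the main GPU temp.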
I hope you all get this fixed.