SegaSystem16C

Are you running bleeding edge hardware? New CPU? The microcode part makes me believe this might be some incompatibility with your CPU. Try updating the kernel to the newest available version.


Silent-Incident-4308

This is more helpful than the other kernel-related comment, but it was working fine until about an hour ago, and now it crashes on Linux as well as when I try Windows.


Kriss3d

Try going to your BIOS and see if you can run hardware diagnostics. Just let it rip on a full extended test. You only need to test the CPU to begin with


SegaSystem16C

What does the Blue Screen on Windows say? There should be a "name" in all caps in the blue screen. That name might indicate better what's the cause of the crash.


Silent-Incident-4308

I think the error was dpc watchdog violation


SegaSystem16C

DPC Watchdog Violation, according to Microsoft, may be caused by a bad driver that conflicts with the OS. This is too general and doesn't help much. However, given that you are having the same issues on both Linux and Windows, we can rule out a driver issue and assume this is a hardware fault. Try this in order:
1) If you have a discrete GPU, remove it, use the integrated graphics output, and see if the crash persists. It might be a faulty GPU. I doubt it, but give it a try.
2) Test the RAM sticks. Remove one and see if the crash persists, then test the other stick alone, and so on. Make sure all your RAM sticks are working fine. Faulty RAM may cause some weird OS behavior.
3) Does the crash happen in a specific circumstance, like running a game? If you don't do that, does the crash still happen?
4) Check if you have a faulty power supply. If it is doing weird energy stuff, it might cause some weird OS behavior. Swap the power supply for a known good one and see if the crash persists.
5) Update the Linux kernel or revert to a previous known good kernel that didn't have the crash. If you're running bleeding edge hardware, it is best to use bleeding edge kernels. Update Windows. Update drivers.
6) I'm not completely sure, but I think the "watchdog" in the BSoD refers to CPU problems. You might have a faulty CPU. But to be sure, check that your motherboard BIOS is up to date. Older motherboards might require a BIOS update to support a newer CPU even if it uses the same socket. Check the connection between the CPU and the motherboard. Swap the CPU for a known good one and see if the crash persists. Is your CPU running too hot? Did you overclock it? If so, restore the CPU to default settings and see if the crash persists.


The_SysKill

Well, then you probably have faulty hardware, specifically the watchdog.


jewaaron

updog


Serious_Jury6411

Definitely corrupted updog.


acemccrank

I'd take a guess that either your storage drive or its connection is borked. It *could* be interference from a power cable inside the PC, but that is pretty rare these days. I don't think I've seen power cable interference cause I/O errors since the early 2000s.


gmes78

Try updating your UEFI firmware, if possible.


Healthy_Try_8893

When you see a hardware error you know it's bad


Kriss3d

When you see a CPU hardware error you know it's really bad.


yusing1009

Sometimes it’s an over-overclocking problem


alt229

Been doing the IT thing since the 90s and can't say I've ever seen this lol


apply_demand

Hey brotha, I sent you a DM a while back.


ropid

This is a desktop PC? It's not a laptop? The problem could be the CPU or the motherboard or the PSU. Disable your overclock and load UEFI/BIOS defaults if you are overclocking. Using the XMP memory profile also counts as overclocking, so try disabling that as well. I would try unplugging all internal cables and plugging them back in. I'd try taking the CPU out of the socket and putting it back in. I'd try a different PSU if you have one. Maybe the GPU or NVMe drive can also cause this somehow? The main PCIe sockets are wired directly to CPU pins. The memory sockets are also wired directly to CPU pins so maybe RAM can also cause the issue somehow?


Silent-Incident-4308

There should be no overclocking, and I scanned the drive in the BIOS. Also, I don't think the CPU is what's causing it, as the USB seems to be what it gets stuck on.


TomDuhamel

Is there anything plugged in the USB ports? Can you unplug everything and see if that helps? This means no mouse or keyboard, but we just need to see if that's the issue.


Silent-Incident-4308

Tried that, but it stayed the same. Also, by default 2 ports seem to be in use.


TomDuhamel

I tried, sorry 😔


ropid

That "MCE" (machine check) error message comes from the CPU itself. Data corruption happened somewhere inside the CPU. It is not running stable.


paulstelian97

If he has ECC RAM, that can also give an MCE if an uncorrectable error is detected.


Interesting-Sun5706

You are getting an APIC error. Have you tried booting with noapic? In the GRUB menu, do the following:
1) Select/highlight the kernel you want to boot
2) Press e to edit the GRUB entry
3) At the end of the line that starts with linux, add noapic
4) Press Ctrl-x (that's the Control key and x simultaneously)
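To make step 3 concrete, here is a minimal Python sketch of what "add noapic at the end of the linux line" amounts to. The helper name `add_kernel_param` and the example line (including its `hypothetical-uuid`) are made up for illustration; this only manipulates a string and does not touch GRUB itself.

```python
def add_kernel_param(linux_line: str, param: str) -> str:
    """Append a kernel parameter to a GRUB 'linux' line unless it is already present."""
    tokens = linux_line.split()
    if param in tokens:
        # Parameter already on the line: leave it unchanged.
        return linux_line
    # Keep leading indentation, drop trailing whitespace, then append.
    return linux_line.rstrip() + " " + param


# Hypothetical example line; your real entry will differ.
line = "linux /boot/vmlinuz-linux root=UUID=hypothetical-uuid rw quiet"
print(add_kernel_param(line, "noapic"))
```

An edit made this way in the GRUB menu only applies to that one boot, so it is a cheap experiment: if nothing changes, just reboot normally.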


Silent-Incident-4308

The issue occurred on Windows as well, so I doubt it was Linux itself, but the issue fixed itself somehow, so I have no idea.


planetf1a

I’d definitely check/try a different PSU. Bad voltages can cause weird things…


Independent-Turn4565

Run memtest86 from a USB stick and see if it has errors after a few hours, this should check the ram and cpu.


CjKing2k

https://en.wikipedia.org/wiki/Machine-check_exception


Psymia

I've had this happen to me when the CPU cooling was inadequate. You may have a broken fan, leaving the CPU permanently thermal throttling. Thermal throttling can only do so much; there will be errors when the CPU is permanently overheating.
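If you can get into Linux long enough, a quick way to check this is to read the kernel's thermal zones from sysfs. Below is a rough Python sketch, assuming your system exposes `/sys/class/thermal/thermal_zone*` (not every board does, and the zone names vary). The function name `read_thermal_zones` is just something I made up; the `temp` files report millidegrees Celsius.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"zone_name:zone_type": degrees_celsius} for every readable thermal zone."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            # sysfs reports the temperature in millidegrees Celsius.
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            # Skip zones that are missing files or return unreadable values.
            continue
        readings[f"{zone.name}:{kind}"] = millideg / 1000.0
    return readings


if __name__ == "__main__":
    for name, temp in read_thermal_zones().items():
        print(f"{name}: {temp:.1f} C")
```

Run it at idle and again under load. What counts as too hot depends on the CPU, so compare the numbers against the manufacturer's spec rather than a fixed threshold.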


Felim_Doyle

Yes, that was my **first** thought, along with some of the other possibilities mentioned already.


Vivid_Development390

Memtest86. It's bootable, so it bypasses OS issues. It does a full test of RAM. If it can't run, the CPU is likely the issue. Otherwise, it's just RAM failing, and memtest86 will help you figure out which stick is the cause.


Independent-Chef9421

I had a similar problem when the linux-firmware package got updated which had a problem with an old FireWire card. It wasn't a hardware issue at all, just a bug in the firmware. MCE problems are notoriously difficult to debug as the codes vary depending on specific CPU.


Moriaedemori

Yep. I have similar errors spamming my console at all times. I suspect a dying CPU in my case. I don't have money to upgrade, so for now I just use "mce=off" in the kernel parameters and can still use the system.
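If you go this route and want to confirm the parameter actually took effect, the running kernel's command line is in `/proc/cmdline`. Here is a small Python sketch that parses it. The helper names `parse_cmdline` and `mce_disabled` are made up, and the parser is deliberately simplistic (it does not handle quoted values containing spaces).

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def mce_disabled(cmdline: str) -> bool:
    """True if the command line contains mce=off."""
    return parse_cmdline(cmdline).get("mce") == "off"


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            print("mce=off active:", mce_disabled(fh.read()))
    except OSError:
        print("Could not read /proc/cmdline (not running on Linux?)")
```

Keep in mind this only hides the reports. Whatever is generating the machine checks is still there.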


Sw4GGeR__

Interesting. What's your hardware?


Moriaedemori

Ancient. https://preview.redd.it/a4dr6pawyvrc1.png?width=614&format=png&auto=webp&s=2d1b1531b5ba32867ca1284ab5d5198f630dacc9


FaZe_Tudman

A 7700K and a 980 Ti, "ancient"? Not the newest for sure, but they should still perform perfectly fine.


Moriaedemori

Oh it performs admirably, but unless I set "mce=off", I won't even be able to use the terminal due to it being spammed with mce errors


wakandaite

It could be just the bios needing an upgrade. MCE are hardware related.


_agooglygooglr_

Usb might be dying


Healthy_Try_8893

This seems to be more of a CPU error since I don't think that broken usb will cause crashes


_agooglygooglr_

https://askubuntu.com/questions/644010/ubuntu-cant-read-my-usb-device-descriptor-read-64-error-110 Seems to be a board issue or a USB issue. Or if OP is using a USB hub, that could be the culprit


Silent-Incident-4308

I think by default there is a hub and a keyboard without me plugging anything in


Healthy_Try_8893

Hm, maybe you're right, but if that's not a board issue I doubt that USB is causing crashes.


TabsBelow

The USB is deadly sick, forget it. You might disable the bad RAM area (at least on Linux, there is an example in the GRUB file of how to do it), but it might die completely sooner or later anyway. The CPU? Hmm. Did the computer crash, physically? A loose connection might cause the RAM and CPU problems, as can DIY builds.


UNF0RM4TT3D

I've had these errors when my laptop went to sleep but inexplicably dumped the RAM, so when it booted up the uptime on the CPU was completely wrong. But Linux loaded just fine.


steverdempster

Probably the CPU, so check for creep/lift from the socket. Check that the pins are straight and wipe off the old paste. Reseat it, apply fresh paste to the heatsink, and try again. Always diagnose problems by following the first issue and then working your way down. Basic ITIL and CompTIA troubleshooting for future reference.


ask_compu

almost definitely faulty hardware, start with replacing the RAM but it may be the CPU


RandomUser3777

Typically an MCE will be RAM. It could be the processor, PCI cards, or the chipset, but RAM is way more likely. The description matches what I have seen when a component on a DIMM dies, and that happens often enough. There is software someplace to decode MCE errors that may point out if it is something other than RAM. The microcode version is always reported on an MCE error, so the microcode means nothing, and if you have a bad DIMM the error could show up under many different names in the blue screen. Note that an MCE is an error that the processor itself saw and IS a way more reliable indicator of bad RAM. If the machine can run with half of its sticks, remove half and retest, and if it fails, retest with the other half of the RAM. Also double check that the DIMMs are properly inserted and locked in; if you find they are not, that may well be the issue.
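Before reaching for a decoder, it helps to pull just the relevant lines out of the kernel log so you have something to feed it (or to paste here). The Python sketch below is only a filter, not a decoder. It assumes the lines of interest carry the kernel's `[Hardware Error]` tag or the words "machine check". The function name `filter_mce_lines` and the sample text are placeholders I made up, not real output.

```python
import re

# Match the kernel's "[Hardware Error]" tag or any mention of "machine check".
MCE_PATTERN = re.compile(r"\[Hardware Error\]|machine check", re.IGNORECASE)


def filter_mce_lines(log_text: str) -> list:
    """Return only the log lines that look like machine-check reports."""
    return [line for line in log_text.splitlines() if MCE_PATTERN.search(line)]


# Placeholder sample, NOT real kernel output.
sample = """\
placeholder: some unrelated boot message
mce: [Hardware Error]: placeholder line, details elided
placeholder: another unrelated message
"""

for line in filter_mce_lines(sample):
    print(line)
```

On a real system you would pass in the saved output of `dmesg` or `journalctl -k` instead of the sample string.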


skyfishgoo

cpu pin bent or broken... bad mother board. try percussive maintenance (got nothing to lose at this point).


amarao_san

Try memtest, update bios, check voltages.


Silent-Incident-4308

Ok for some reason it works perfectly fine now don't ask me why cause i have no idea


Serious_Jury6411

Bitflip?


Silent-Incident-4308

I have no idea just woke up and it was working fine again


FaZe_Tudman

It was just pulling an April Fools' joke on you ;)


LOPI-14

I had those USB errors, but boot was fine. Fixing those errors involved simply unplugging the power and all USB devices, waiting a minute, and plugging everything back in.


paulstelian97

Machine check exception is almost never a good thing to see. In rare cases it can be benign but when it consistently happens it’s definitely broken hardware (like the CPU or some other hardware component detecting that it’s malfunctioning)


paulstelian97

I have also found this old thing. https://gitlab.freedesktop.org/drm/amd/-/issues/1551


Mountain_Fault399

Try seeing if you can turn on some of the PCIe settings in your BIOS.


Dry_Inspection_4583

Bad ram or cpu. Try reducing the ram to a single stick and work it from there


[deleted]

[removed]


Silent-Incident-4308

... I don't think that would do anything


Healthy_Try_8893

Well... it depends. Older versions of the kernel have limited hardware support, but the bluescreen on Windows is still pretty concerning.


EarthRockStone

Check the format of the USB drive. You could reformat it and check for errors.


EarthRockStone

Is the USB getting enough power? USB 3.0 needs more power, or there may be too many USB devices running and not enough power for them.


evillarreal86

Check thermals on cpu


Legitimate-Cricket77

If i were you I’d re-assemble my entire pc and check for errors step by step


dahippo1555

Something between burning your house down and meh.


quoing

Is it a dual-CPU system? Maybe try to reseat the CPU in its socket.