Antsle Forum

I killed my Antsle One!

Hopefully someone can point me in the right direction. I've had an Antsle One for a little over 2 years. I recently decided to set up a small k8s cluster using RancherOS KVMs on my antsle. After a few weeks, the Antsle box started becoming completely unresponsive (no Antsle GUI, no antlets running), although I COULD still SSH to edgeLinux. Usually a simple "sudo reboot" fixed it for another week or so, and then it would happen again.

Saturday it happened again, and after I issued the "sudo reboot" command, it never came back up. I connected a monitor and keyboard and rebooted: I see the Supermicro boot screen for about a quarter of a second, then the screen clears to a flashing cursor and stays there indefinitely (it has been 2 full days now with no change).

So I plugged an ethernet cable into the IPMI port and used the IPMI web page to look at console activity, and it looks like the boot SSD may have died, likely due to heat. The IPMI event log also shows a lot of critical and non-recoverable temperature events (screen capture attached).

I can replace the SSDs and reinstall edgeLinux without issue. What I'm worried about is heat killing 2 more of them. Any ideas what I need to look at?

Uploaded files:
  • Capture.PNG