Resolved -
Dear Partner,
We are pleased to let you know that the problem has been resolved and we are up and running again.
We apologize for any inconvenience this may have caused.
Best regards,
Cloud Factory Team
Feb 17, 13:11 CET
Monitoring -
Dear Partner,
The issue has been resolved. We are closely monitoring the environment to ensure stability while Nutanix continues with log extraction and follow-up on the incident.
I/O latency is almost back to normal, and we expect that all standard workloads will now function as usual. However, as the environment is still under heavy load, some high-workload servers may experience temporary slowdowns.
As mentioned previously, we highly recommend checking that your servers are up and running and verifying that all services are functioning as expected for both you and your end customers.
We will not move the incident to the "Resolved" state until we have completed a thorough Root Cause Analysis.
We sincerely apologize for any inconvenience this has caused.
Best regards,
Cloud Factory Team
Jan 31, 18:13 CET
Update -
Dear Partner,
The I/O latency is decreasing rapidly, which is a good sign. We expect to return to normal latency levels within 1–2 hours.
Latency has already decreased considerably, so many applications and VMs will now run without issues.
However, as mentioned in our previous message, we are still seeing a significant impact on I/O performance, which is causing increased VM startup times. In some cases, this may result in VMs failing to start as expected or starting up with errors.
This issue particularly affects VMs with relatively high I/O loads, where services may not start properly.
We recommend checking whether your VMs have started correctly and verifying that services are running as expected. Please note that, due to the ongoing I/O latency, some VMs may still experience slow performance.
We will provide a new status update once I/O levels have returned to normal and are considered stable.
Best regards,
Cloud Factory Team
Jan 31, 15:35 CET
Identified -
Dear Partner,
We have identified the problem causing the previously reported issues.
The cluster service on Cluster 01 triggered a reboot of all VMs on the cluster.
All VMs have been powered on again.
However, we are seeing a very high impact on I/O performance, which is causing increased VM startup times. In some cases, this may result in VMs failing to start as expected or starting up with errors.
This is especially affecting VMs with relatively high I/O loads, where services may not start properly.
We anticipate that the high I/O load will persist for some time, likely 1 to 3 hours, during which I/O latency will be higher than usual. This means that VM response times will be slower.
We recommend checking whether your VMs have started correctly and verifying that services are running as expected. However, please note that due to the ongoing I/O latency, VM performance may still be slow.
We are actively working to resolve the issue and will provide a new status update within 30 minutes.
Best regards,
Cloud Factory Team
Jan 31, 14:58 CET
Investigating -
Dear Partner,
We are currently experiencing technical difficulties with Cluster 01.
It appears that VMs on the cluster have been rebooted.
The issue is being investigated and we will post an update ASAP.
Best regards,
Cloud Factory Team
Jan 31, 14:41 CET