Some systems are experiencing issues

About This Site

Welcome to the T2B Cluster status page.

Please find status information about critical T2B cluster components, incidents and planned maintenance.

Mail subscription is available to get a notification when a component's status changes.

T2B Cluster

Mass Storage (/pnfs)

Operational

Batch System

Operational

User Interfaces - mX machines

Operational

Network

Operational

Past Incidents

Thursday 30th January 2025

No incidents reported

Wednesday 29th January 2025

No incidents reported

Tuesday 28th January 2025

No incidents reported

Monday 27th January 2025

No incidents reported

Sunday 26th January 2025

No incidents reported

Saturday 25th January 2025

No incidents reported

Friday 24th January 2025

Mass Storage (/pnfs) issues with slow pnfs affecting all machines

Hello,

Unfortunately, standard access to /pnfs has been slow lately. To fix this, we have been forced to restart the nfs service for pnfs.

As a consequence, many of our machines have lost access to /pnfs entirely. We are busy fixing that, and might have to restart the mX machines.

More information will follow as soon as possible.

Sorry for the trouble, The T2B Admin Team

Unfortunately, pnfs is still very unstable.

Restarting the services or even rebooting the whole machine does not help.

We're still investigating what could cause the instability.

We had to restart the whole pnfs system again, and that fixed everything. /pnfs is accessible again from all mX machines without any need to reboot them. Unfortunately, a lot of worker nodes went down because of the issue; we have remotely restarted as many as possible.

Current cluster capacity:

EL7 - free: 420 + run: 3499 [1368+2131] + drain: 0 = 3919

EL9 - free: 28 + run: 5220 [5026+194] + drain: 0 = 5248
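
For anyone who wants to cross-check the capacity figures, here is a minimal Python sketch that parses the capacity text exactly as posted and verifies that the numbers add up. The function names (`parse_capacity`, `is_consistent`) and the field names `part_a` and `part_b` are hypothetical. The update does not say what the two bracketed numbers mean, so the script treats them only as an unlabeled breakdown of the "run" figure.

```python
import re

# Capacity text copied from the status update above.
CAPACITY = (
    "EL7 - free: 420 + run: 3499 [1368+2131] + drain: 0 = 3919 "
    "EL9 - free: 28 + run: 5220 [5026+194] + drain: 0 = 5248"
)

# One match per OS group. The two bracketed numbers are captured as an
# unlabeled breakdown of "run"; their meaning is not stated in the update.
PATTERN = re.compile(
    r"(?P<group>EL\d+) - free: (?P<free>\d+) \+ run: (?P<run>\d+) "
    r"\[(?P<part_a>\d+)\+(?P<part_b>\d+)\] \+ drain: (?P<drain>\d+) = (?P<total>\d+)"
)


def parse_capacity(text):
    """Return {group: {field: int}} for every group found in text."""
    result = {}
    for match in PATTERN.finditer(text):
        fields = {
            key: int(value)
            for key, value in match.groupdict().items()
            if key != "group"
        }
        result[match.group("group")] = fields
    return result


def is_consistent(fields):
    """True if the breakdown sums to run and free + run + drain equals total."""
    return (
        fields["part_a"] + fields["part_b"] == fields["run"]
        and fields["free"] + fields["run"] + fields["drain"] == fields["total"]
    )


capacity = parse_capacity(CAPACITY)
for group, fields in capacity.items():
    status = "consistent" if is_consistent(fields) else "mismatch"
    print(group, fields["total"], status)
```

Both groups pass the check with the figures posted above.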