All systems are operational

About This Site

Welcome to the T2B Cluster status page.

Here you will find status information about critical T2B cluster components, incidents, and planned maintenance.

Mail subscription is available to get a notification when a component's status changes.

Past Incidents

Friday 14th February 2025

No incidents reported

Thursday 13th February 2025

Mass Storage (/pnfs) issue with a storage node affecting /pnfs
a month ago

A storage node appears to be unstable and needs to be rebooted. It affected part of /pnfs, leaving ~500TB inaccessible.

We are working to get it back up and running as soon as possible, and since this is its second offense, we are decommissioning it.

  • The array has finally finished repairing safely.

    Unfortunately, the filesystem on top also had consistency issues because of the disk failures. It was also repaired in the morning, but 29 files could not be recovered.

    You can find the list of those files on the cluster: /group/lost_files/20250219.filelist.txt. Please check it, as some of your files might be affected (a few users/experiments have files impacted).

    In the meantime, the /pnfs system is rebuilding the file DB on the pool and rescanning all files. This process is also expected to take time, but thankfully each file becomes available again on /pnfs as soon as it has been scanned. This means that fewer and fewer files will remain inaccessible over time.

    24 days ago
  • The issue has been identified as 2 disks failing simultaneously on the same RAID array.

    No data has been lost (we are protected against 2 failing disks), and the array is busy rebuilding onto 2 new hot-spare disks. Unfortunately, that will take some time (at minimum 3-4 days), and the 120TB this pool hosts will not be accessible in the meantime.

    a month ago
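To check whether any of your files appear in the lost-file list mentioned above, something along these lines may help. It is only a minimal sketch: it assumes the list is a plain-text file with one path per line (the actual format is not stated here), and the function name `find_lost_files` and the search string are hypothetical.

```python
def find_lost_files(list_path, needle):
    """Return the entries of a lost-file list that contain `needle`.

    Assumption: the list is plain text with one file path per line.
    """
    matches = []
    with open(list_path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            entry = line.strip()
            # Skip blank lines and keep only entries mentioning the search string.
            if entry and needle in entry:
                matches.append(entry)
    return matches


# Example call on the cluster (replace the search string with your own
# username or experiment directory):
# for path in find_lost_files("/group/lost_files/20250219.filelist.txt", "your_username"):
#     print(path)
```

An empty result means no entry in the list contains the search string. It is still worth trying a second search on your experiment's directory name, since files may be stored under a group path rather than a user path.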
Wednesday 12th February 2025

No incidents reported

Tuesday 11th February 2025

No incidents reported

Monday 10th February 2025

No incidents reported

Sunday 9th February 2025

No incidents reported

Saturday 8th February 2025

No incidents reported