Some systems are experiencing issues

About This Site

Welcome to the T2B Cluster status page.

Please find status information about critical T2B cluster components, incidents and planned maintenance.

Mail subscription is available if you want to be notified when a component's status changes.

Past Incidents

Tuesday 26th March 2024

User Interfaces - mX machines Issues with /user

Hello,

Unfortunately we have encountered the same issue with /user as last Friday. Connection to some mX machines can be slow, and /user storage is either slow or blocked.

We are still trying to find the source and fix it permanently. Note that /pnfs is not affected.

Sorry for all the issues this causes.

Cheers, The T2B IT Team

  • Hello,

    We have finally found the source of the issues with /user. They were caused by a user's faulty workflow, and everything has been working again since we removed the jobs in question.

    On this note, please make sure NEVER to have a single input data file on /user that is read by all your jobs. Our /user storage system cannot cope with thousands of jobs trying to read a single file. The correct workflow is to put the file(s) on /pnfs and then inform us, so that we can make duplicates and protect the storage system.

    Cheers, The T2B IT Team
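As an illustration of the rule above, here is a minimal sketch of a check you could run on a shared input file before submitting jobs. It is not a T2B tool: the function names, the return labels and the example paths are hypothetical, and the only things taken from this notice are the /user and /pnfs mount points.

```python
from pathlib import PurePosixPath

# Assumption: these mount points are the ones named in the notice above.
SHARED_HOME = PurePosixPath("/user")
MASS_STORAGE = PurePosixPath("/pnfs")


def is_under(path: str, root: PurePosixPath) -> bool:
    """Return True if `path` is `root` itself or lies inside it."""
    p = PurePosixPath(path)
    return p == root or root in p.parents


def check_shared_input(path: str, n_jobs: int) -> str:
    """Classify one input file that `n_jobs` jobs would all read.

    The labels are hypothetical:
      "move-to-pnfs": the file is on /user and shared by several jobs
      "ok":           the file is on /pnfs
      "unknown":      anything else, decide by hand
    """
    if n_jobs > 1 and is_under(path, SHARED_HOME):
        return "move-to-pnfs"
    if is_under(path, MASS_STORAGE):
        return "ok"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(check_shared_input("/user/alice/input.root", n_jobs=2000))
```

A file flagged "move-to-pnfs" should be copied to /pnfs first, and the IT team should still be informed so that duplicates can be made. The check only looks at the path, it does not touch the storage itself.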

    Monday 25th March 2024

    No incidents reported

    Sunday 24th March 2024

    No incidents reported

    Saturday 23rd March 2024

    No incidents reported

    Friday 22nd March 2024

    No incidents reported

    Thursday 21st March 2024

    No incidents reported

    Wednesday 20th March 2024

    Mass Storage (/pnfs) pnfs slowness

    Hello,

    Unfortunately /pnfs is under heavy pressure from the CMS global redirector. The server load goes above 30k, which slows everything down. Everything works, but slowly.

    Since yesterday we have stopped the redirector service for a few hours at a time, and while it is off everything works fine. We regularly try to bring it back, so periods of slowness are unfortunately unavoidable for now.

    We'll inform you as soon as the situation is resolved.

    Cheers, The IT Team

  • Hello,

    We have found and applied a fix to make sure the xrootd storm does not happen again and negatively impact all /pnfs transfers.

    Cheers, Romain