Some systems are experiencing issues

About This Site

Welcome to the T2B Cluster status page.

Please find status information about critical T2B cluster components, incidents and planned maintenance.

Mail subscription is available if you want to be notified when a component's status changes.

Stickied Incidents

Monday 12th August 2024

Mass Storage (/pnfs) Some issues with mass storage /pnfs [Rucio & Crab]

Hello,

Several users have reported that:

1/ Rucio does not allow copies to our RSE with error: Details: RSE excluded; not available for writing.

2/ Crab also complains that tasks can't be started because you are not allowed to write on your home directory on our site: Checkwrite Result: Unable to check write permission in /store/user/rougny on site T2_BE_IIHE

We are investigating both issues.

On the other hand, standard grid commands on your files (e.g. gfal-copy) seem to work without any issues.

Cheers, Romain

  • Dear all,

    After we consulted central CMS IT services, it seems that they have resolved the problem on their end. Users have also confirmed that Rucio and Crab work as expected again.

    Kind regards,

    Olivier, for the T2B Admin team

Past Incidents

    Saturday 30th March 2024

    No incidents reported

    Friday 29th March 2024

    No incidents reported

    Thursday 28th March 2024

    No incidents reported

    Wednesday 27th March 2024

    No incidents reported

    Tuesday 26th March 2024

    User Interfaces - mX machines Issues with /user

    Hello,

    Unfortunately we have encountered the same issue with /user as last Friday. Connection to some mX machines can be slow, and /user storage is either slow or blocked.

    We are still trying to find the source and fix it permanently. /pnfs is not affected.

    Sorry for all the issues this causes.

    Cheers, The T2B IT Team

  • Hello,

    We have finally found the source of the issues with /user. It was caused by a user's faulty workflow; once the offending jobs were removed, everything returned to normal.

    On this note, please make sure to NEVER have a single input data file that is read by all your jobs on /user. Our /user storage system cannot cope with thousands of jobs trying to read a single file. The correct workflow is to put the file(s) on /pnfs, then inform us so that we can make duplicates, which protects the storage system from harm.

    Cheers, The T2B IT Team
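
As a rough illustration of the advice above, the following Python sketch checks a set of job input lists before submission and reports any file under /user that more than one job would read. It is only a sketch under stated assumptions: the function name `shared_user_inputs`, the job names, and all example paths are hypothetical, and it is not a tool provided by T2B.

```python
from collections import Counter
from pathlib import PurePosixPath


def shared_user_inputs(job_inputs, prefix="/user", max_readers=1):
    """Return {path: reader_count} for files under `prefix` that are
    read by more than `max_readers` jobs.

    `job_inputs` maps a job name to the list of input paths it reads.
    """
    root = PurePosixPath(prefix)
    counts = Counter()
    for inputs in job_inputs.values():
        # Count each file once per job, even if a job lists it twice.
        for path in set(inputs):
            # Match on whole path components so /username is not
            # mistaken for /user.
            if root in PurePosixPath(path).parents:
                counts[path] += 1
    return {path: n for path, n in counts.items() if n > max_readers}


if __name__ == "__main__":
    # Hypothetical job description: every job reads the same file on /user.
    jobs = {
        "job_0": ["/user/alice/weights.root", "/pnfs/example/data_0.root"],
        "job_1": ["/user/alice/weights.root", "/pnfs/example/data_1.root"],
        "job_2": ["/user/alice/weights.root", "/pnfs/example/data_2.root"],
    }
    for path, readers in shared_user_inputs(jobs).items():
        print(f"{path} is read by {readers} jobs; consider moving it to /pnfs")
```

Any file reported by such a check is a candidate for moving to /pnfs and asking the admin team for duplicates, as described above.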

    Monday 25th March 2024

    No incidents reported

    Sunday 24th March 2024

    No incidents reported