|
Rolling reboot of all clusters, starting from Wednesday morning, April 19, 2017 |
Batch, Maintenance, Owens, Ruby |
Resolved |
1:40PM 4/27/2017 Update: Rolling reboots are completed.
3:10PM 4/18/2017 Update: Rolling reboots on Owens have started to address GPFS errors that occurred... |
8 years 10 months ago |
8 years 9 months ago |
|
Email Issues |
client portal |
Resolved |
OSU is having ongoing periodic problems with Microsoft (their mail hosting provider) severely delaying outbound email. There is no solution being offered and no timeline for getting it resolved.... |
6 years 4 months ago |
6 years 2 weeks ago |
|
Slurm database repair on 01/25/2024 |
Outage |
Resolved |
We have scheduled a Slurm database repair, which is planned to start at 8:30 am US/Eastern on Thursday, January 25, 2024. During the repair, the Slurm database will be offline; running jobs and... |
2 years 3 weeks ago |
2 years 2 weeks ago |
|
Lustre is still offline. HPC systems back up |
Maintenance |
Resolved |
Day One of the scheduled downtime has been completed, and HPC operations have resumed. As planned, Lustre work will extend into Day Two. Jobs using /fs/lustre or $PFSDIR cannot run until this work... |
11 years 7 months ago |
11 years 7 months ago |
|
Replacement of Owens Ethernet switches from Dec 14, 2018 |
Network, Owens |
Resolved |
Updated on Jan 16, 2019, at 09:20 AM:
The replacement is done except for the three switches including the login nodes of Owens. We posted another notice for more... |
7 years 5 months ago |
7 years 4 weeks ago |
|
MVAPICH2 build of CP2K 6.1 |
Pitzer |
Resolved |
We have found that some types of CP2K jobs fail or perform poorly when using cp2k.popt and cp2k.psmp from the MVAPICH2 build (gnu/4.8.5 mvapich2/2.3). This version will be removed on December 15th... |
5 years 2 months ago |
4 years 11 months ago |
|
WARN SparkSession in Jupyter + Spark instance |
Software |
Resolved (workaround) |
You may encounter the following warning message when running a Spark instance using the default PySpark kernel in a Jupyter + Spark application:
WARN SparkSession: Using an... |
9 months 4 days ago |
9 months 3 days ago |
|
June 7th downtime to finish at 6:30PM |
Connectivity, filesystem, Infrastructure, login, Login Problems, Maintenance, Operations, Outage |
Resolved |
Update: Downtime completed at 6:30PM, June 7th.
The June 7th downtime is now slated to be completed at 6:30PM. The previous estimate was 5PM.
All systems and services will... |
9 years 8 months ago |
9 years 8 months ago |
|
Issues accessing /fs/ess/ locations |
filesystem, Operations, Outage |
Resolved |
15 Aug 2022 resolved
... |
3 years 6 months ago |
3 years 6 months ago |
|
Poor network performance on some filesystems |
filesystem |
Resolved |
We are experiencing some network performance issues on a cluster of servers involved with providing GPFS and some project filesystems. GPFS appears to be functioning acceptably, but proj01, proj02... |
12 years 6 months ago |
12 years 6 months ago |