15 July 2016, 5:00PM update: some additional issues we are facing

Known Issues

Title Cat. Res. Description Post Upd.
NFS service disruption 6/29/16 filesystem Resolved

OSC experienced errors with NFS services on the morning of June 29, between 08:37 and 09:12, that may have caused some jobs to fail or behave unexpectedly.  The... (Read more)

4 weeks 1 day ago 4 weeks 1 day
June 7th downtime to finish at 6:30PM Connectivity, filesystem, Infrastructure, login, Login Problems, Maintenance, Operations, Outage Resolved

Update: Downtime completed at 6:30PM, June 7th.

 

The June 7th downtime is now slated to be completed at 6:30PM.  Previous estimate was 5PM.

All systems and services will... (Read more)

1 month 3 weeks ago 1 month 2 weeks
Submit filter bug after downtime Batch Resolved

A change was made to a part of our batch software during the downtime that should have only affected users who are a part of multiple projects. We have found that there is a bug in the changes... (Read more)

5 months 2 weeks ago 5 months 2 weeks
Scheduling temporarily suspended on Oakley Batch Resolved

We are migrating the batch scheduler on Oakley to a new virtual machine. In order to accomplish this, the scheduler will be temporarily offline for about four hours on December 16th. Running jobs... (Read more)

7 months 2 weeks ago 7 months 2 weeks
Downtime Update: All Major Services Online Resolved

Friday, Sept 25th, 12PM (noon):

  • Oakley is back online and has resumed running jobs.  
  • Ruby... (Read more)
10 months 2 weeks ago 10 months 1 week
Problems with Project Space (/nfs/gpfs) filesystem Resolved

(9/8/15 14:21 Eastern) Project space appears to be back to normal operation. We are running some tests to verify that the problem is fully resolved.


As of early afternoon, Sept. 8,... (Read more)

10 months 3 weeks ago 10 months 3 weeks
Lustre bug causing Oakley login node crashes filesystem, login, Oakley Resolved

Over the past two weeks we have experienced Oakley login node crashes potentially caused by a Lustre bug.  The bug (or other issue) seems to be activated when a user does operations on a... (Read more)

11 months 1 week ago 9 months 3 weeks
Unscheduled GPFS Outage filesystem Resolved

As of 11:30PM on June 16th, we have removed the GPFS filesystem from service due to a number of hardware failures. At this point, further hardware failures would put a large portion of the entire... (Read more)

1 year 1 month ago 1 year 1 month
warning: libhwloc.so.1 may conflict with libhwloc.so.5 Resolved

Sometimes, when building MPI programs, the following warning appears.  It is harmless and can be safely ignored.

ld: warning: libhwloc.so.1, needed by /usr/local/mvapich2/1.7-intel/lib/... (Read more)          
1 year 2 months ago 9 months 3 weeks
Matlab PCT broken due to pbsrsh modification Matlab Resolved

A change was made to the system-wide pbsrsh script, which Matlab relies on.  It has been discovered that this change has broken the parallel computing toolbox (... (Read more)

1 year 3 months ago 9 months 3 weeks
