Known issues

Unresolved known issues

A known issue with an Unresolved Resolution state is an active problem under investigation; a temporary workaround may be available.

Resolved known issues

A known issue with a Resolved (workaround) Resolution state is an ongoing problem; a permanent workaround is available, which may include using different software or hardware.

A known issue with a Resolved Resolution state has been corrected.

Known Issues

Title Category Resolution Description Posted Updated
Lustre jobs suspended filesystem Resolved

The Lustre filesystem ($PFSDIR and /fs/lustre) crashed several times on Friday evening (8/15). We have temporarily degraded this service while we work to isolate the actions that are triggering...

10 years 6 months ago 10 years 5 months ago
Lustre Updates filesystem Resolved

9/10/14 - We have not seen any additional crashes of the Lustre servers since making this change.

8/26/14 - Lustre jobs are being accepted as of 10AM this...

10 years 5 months ago 10 years 5 months ago
Lustre, Infiniband Operational and Being Monitored Closely filesystem Resolved

UPDATE: Most users should no longer see any issues with Lustre.

Again, please continue to notify OSC Help of any errors you see in job output. For example, you might see "...

10 years 6 months ago 10 years 6 months ago
Maintenance for OnDemand and other web-based services Resolved

Update (12/13/14 10am): Maintenance has finished as planned.

OnDemand, AweSim applications, and other web-based services will be down starting Wednesday, January 31 at 8:30AM for...

10 years 1 month ago 10 years 1 month ago
Maintenance outage on the cluster export services Maintenance, OnDemand, Ruby Resolved

Update on 14 April 2020, 0903:

Work is completed.

Original message:

There will be maintenance on cluster export services on Tuesday, April...

4 years 10 months ago 4 years 10 months ago
Major network switch outage Network Resolved

01:20 PM 11/14/2018 Update:

...

6 years 3 months ago 6 years 3 months ago
Matlab PCT broken due to pbsrsh modification Matlab Resolved

A change was made to the system-wide pbsrsh script, which Matlab relies on. It has been discovered that this change has broken the parallel computing toolbox (...

9 years 9 months ago 9 years 4 months ago
Missing shared library of some mvapich2 modules Owens, Pitzer Resolved

Update on Feb 25 2022:

This issue is fixed. 

Original Post:

Users may see a missing shared library issue with some mvapich2 modules...

2 years 11 months ago 2 years 11 months ago
module spider/avail/show not showing MPI dependent modules Ruby Resolved

On Ruby, the commands:

  • module spider
  • module avail
  • module show...

9 years 9 months ago 9 years 4 months ago
MOE license server down Licensing Resolved

The MOE license server is experiencing an unknown issue and is potentially down. We are working to resolve the issue.

1 year 4 months ago 1 year 4 months ago
