ORCA

ORCA is an ab initio quantum chemistry program package that contains modern electronic structure methods, including density functional theory, many-body perturbation theory, coupled cluster, multireference methods, and semi-empirical quantum chemistry methods. Its main fields of application are larger molecules, transition metal complexes, and their spectroscopic properties. ORCA is developed in the research group of Frank Neese. Visit the ORCA Forum for additional information.

We have found that ORCA 5 jobs with heavy I/O on the scratch/project filesystems can degrade the performance of those filesystems. For optimal performance, we recommend running such ORCA jobs on local disk ($TMPDIR), as discussed in the ORCA forum:

    https://orcaforum.kofo.mpg.de/viewtopic.php?f=8&t=10935&p=45270&hilit=di...
    https://orcaforum.kofo.mpg.de/viewtopic.php?f=9&t=10835&p=44967&hilit=di...

We also recommend using ORCA 4.2.1 unless ORCA 5 is necessary for your job. To run an ORCA job using $TMPDIR, please refer to the example in the Usage section below.
To avoid potential memory issues, it is important to tune the %maxcore value based on the number of cores you request. Please refer to the "Best practices" section in the Usage guidelines below for more details.

Availability and Restrictions

Versions

ORCA is available on the OSC clusters. These are the versions currently available:

Version  Owens  Pitzer  Cardinal  Notes
4.0.1.2  X      X                 openmpi/2.1.6-hpcx
4.1.0    X      X                 openmpi/3.1.4-hpcx
4.1.1    X      X                 openmpi/3.1.4-hpcx
4.1.2    X      X                 openmpi/3.1.4-hpcx
4.2.1    X*     X*                openmpi/3.1.6-hpcx
5.0.0    X      X                 openmpi/5.0.2-hpcx
5.0.2    X      X                 openmpi/5.0.2-hpcx
5.0.3    X      X                 openmpi/5.0.2-hpcx
5.0.4    X      X       X         openmpi/5.0.2
* Current default version. The notes indicate the MPI module likely to produce the best performance, but see the Known Issue below named "Bind to CORE".

You can use module spider orca to view available modules for a given machine. Feel free to contact OSC Help if you need other versions for your work.

Access

ORCA is available to OSC academic users; users need to sign up for the ORCA Forum. You will receive a registration confirmation email from the ORCA management. Please contact OSC Help with the confirmation email for access.

Publisher/Vendor/Repository and License Type

ORCA, Academic (Computer Center)

Usage

Usage on Owens and Pitzer

Set-up

ORCA usage is controlled via modules. Load one of the ORCA modulefiles at the command line, in your shell initialization script, or in your batch scripts. To load the default version of the ORCA module, use module load orca. To select a particular software version, use module load orca/{version}. For example, use module load orca/4.2.1 to load ORCA version 4.2.1.

IMPORTANT NOTE: You need to load the correct compiler and MPI modules before you use ORCA. To find out which modules you need, use module spider orca/{version}.
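
For example, to check the prerequisites for ORCA 5.0.4 and then load it, the sequence would look roughly like the following (a sketch using the compiler/MPI combination recommended in the Known Issues section below; always use the combination that module spider reports for your cluster):

module spider orca/5.0.4
module reset
module load intel/19.0.5 openmpi/5.0.2-hpcx orca/5.0.4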

Batch Usage

When you log into owens.osc.edu or pitzer.osc.edu, you are actually logged into a Linux machine referred to as the login node. To gain access to the multiple processors in the computing environment, you must submit your job to the batch system for execution. Batch jobs can request multiple nodes/cores and compute time up to the limits of the OSC systems. Refer to Queues and Reservations and Batch Limit Rules for more info. Batch jobs run on the compute nodes of the system and not on the login node. Batch execution is desirable for big problems since more resources can be used.

Interactive Batch Session

For an interactive batch session one can run the following command:

sinteractive -A <project-account> -n 1 -t 00:20:00

which requests one core (-n 1), for a walltime of 20 minutes (-t 00:20:00). You may adjust the numbers per your need.
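
Once the interactive session starts, you can load the modules and run a small test from node-local disk, as recommended above. A minimal sketch (h2o.inp and the output file names are placeholders):

module reset
module load openmpi/3.1.6-hpcx orca/4.2.1
cp h2o.inp $TMPDIR && cd $TMPDIR    # run from local disk
$ORCA/orca h2o.inp | tee h2o.out
cp h2o.out h2o.gbw $OLDPWD          # copy results back before the session ends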

Non-interactive Batch Job

A batch script can be created and submitted for a serial or parallel run. You can create the batch script using any text editor you like in a working directory on the system of your choice. Below is an example batch script for a parallel run:

#!/bin/bash
#SBATCH --job-name=orca_mpi_test
#SBATCH --time=0:10:0
#SBATCH --nodes=2 --ntasks-per-node=<number-of-cores-per-node>
#SBATCH --account=<project-account>

module reset
module load openmpi/3.1.6-hpcx
module load orca/4.2.1
module list

# Copy the input file to node-local $TMPDIR on every allocated node
sbcast -p h2o_b3lyp_mpi.inp $TMPDIR/h2o_b3lyp_mpi.inp
# Run ORCA from local disk and write the output back to the submit directory
cd $TMPDIR
$ORCA/orca h2o_b3lyp_mpi.inp > $SLURM_SUBMIT_DIR/h2o_b3lyp_mpi.out

Please note that the <number-of-cores-per-node> cannot exceed the maximum cores per node. You can refer to Cluster Computing for the maximum number for each cluster.

Best practices

Set correct value for %maxcore

In general, we recommend setting %maxcore to 3000 (MB), which is 75% of the usable memory per core on each cluster. However, you may need to increase %maxcore depending on the method and the molecular system; in that case, you can decrease the number of cores used by the same job. For example, if you have the following script to run an 80-core ORCA job on two Pitzer 40-core nodes:

#!/bin/bash
#SBATCH --nodes=2 --ntasks-per-node=40

module reset
module load openmpi/3.1.6-hpcx
module load orca/4.2.1
module list

sbcast -p h2o_b3lyp_mpi.inp $TMPDIR/h2o_b3lyp_mpi.inp
cd $TMPDIR
$ORCA/orca h2o_b3lyp_mpi.inp > $SLURM_SUBMIT_DIR/h2o_b3lyp_mpi.out

If you need to increase %maxcore to 4000, you can run ORCA with 60 cores (30 cores per node) in the same job script by replacing the ORCA command line with:

$ORCA/orca h2o_b3lyp_mpi.inp "--npernode=30" > $SLURM_SUBMIT_DIR/h2o_b3lyp_mpi.out
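
The number of processes and the per-core memory are set in the ORCA input file, so %pal nprocs must match the total core count implied by the command line (2 nodes x 30 processes = 60 here). A minimal sketch of the corresponding input blocks (the method keywords and geometry are placeholders):

%maxcore 4000
%pal
  nprocs 60
end
! B3LYP def2-SVP
* xyz 0 1
  O   0.000000   0.000000   0.000000
  H   0.000000   0.757000   0.586000
  H   0.000000  -0.757000   0.586000
*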

Known Issues

Multi-node job hang 

Resolution: Resolved
Update: 03/13/2024
Version: 5.0.x

A multi-node job may hang if it runs into a module that requires heavy I/O, e.g., CCSD. This can also degrade the performance of our GPFS filesystems. We have identified the cause as an MPI I/O issue in OpenMPI 4.1. To remedy this, we will take the following steps:

On April 15, 2024, we will deprecate all ORCA 5.0.x modules installed under OpenMPI 4.1.x. It is recommended to switch to orca/5.0.4 under openmpi/5.0.2-hpcx with intel/19.0.5 or intel/2021.10.0. If you need another ORCA version, please inform us.

Intermittent failure of default CPU binding

Name: Bind to CORE
Resolution: Resolved (workaround)
Update: 4/27/2023
Version: At least through 5.0.4

The default CPU binding for ORCA jobs can fail sporadically. The failure is almost immediate and produces a cryptic error message, e.g.:

$ORCA/orca h2o.in
.
.
.
--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:

Bind to: CORE
Node: o0033
#processes: 2
#cpus: 1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------
.
.
.
[file orca_tools/qcmsg.cpp, line 465]:
.... aborting the run 

Three workarounds are known. The first is to invoke ORCA without CPU binding:

$ORCA/orca h2o.in "--bind-to none"

The second is to use a non-hpcx MPI module with ORCA:

module load openmpi/4.1.2-tcp orca/5.0.4
$ORCA/orca h2o.in

The third is to use more SLURM ntasks relative to ORCA nprocs, which does not prevent the failure but merely reduces its likelihood:

#SBATCH --ntasks=10
cat << EOF > h2o.in
%pal
  nprocs 5
end
.
.
.
EOF
$ORCA/orca h2o.in

Note that each workaround can have performance side effects, and the last workaround can have direct charging consequences.  We recommend that users benchmark their jobs to gauge the most desirable approach.

Immediate failure of MPI job

Resolution: Resolved
Update: 10/24/2022
Version: 4.1.2, 4.2.1, 5.0.0 and above

Update on 10/24/2022

The issue was resolved after upgrading SLURM to version 22. We have restored the MPI command in ORCA to mpirun.

Issue

If your MPI job fails immediately, please remove all extra mpirun parameters from the command line, e.g., change

$ORCA/orca h2o_b3lyp_mpi.inp "--machinefile $PBS_NODEFILE"  > h2o_b3lyp_mpi.out

to

$ORCA/orca h2o_b3lyp_mpi.inp > h2o_b3lyp_mpi.out

We found a bug in OpenMPI following a recent SLURM update, which caused multi-node MPI jobs using mpirun to fail immediately. We implemented a workaround by replacing mpirun with srun in ORCA.

ORCA 4.1.0 issue with scratch filesystem

Resolution: Resolved
Update: 04/17/2019 
Version: 4.1.0

An MPI job that requests multiple nodes can be run from a globally accessible working directory, e.g., a home or scratch directory. This is useful if you need more space for temporary files. However, ORCA 4.1.0 CANNOT run a job on our scratch filesystem. The issue has been reported on the ORCA forum and has been resolved in ORCA 4.1.2. In the examples listed, scratch storage was used (--gres=pfsdir & $PFSDIR).
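
For ORCA 4.1.2 or later, a job that needs globally accessible scratch space can request it with --gres=pfsdir and run from $PFSDIR instead of $TMPDIR. A minimal sketch, reusing the input file name from the earlier examples:

#SBATCH --gres=pfsdir

cp h2o_b3lyp_mpi.inp $PFSDIR
cd $PFSDIR
$ORCA/orca h2o_b3lyp_mpi.inp > $SLURM_SUBMIT_DIR/h2o_b3lyp_mpi.out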

Further Reading

Scratch storage information is available from the Storage Documentation.

 
