OnDemand provides two related job management tools: one lets you create and submit jobs from your web browser, and the other lets you monitor your queued and running jobs.
Selecting "My Jobs" in the Jobs menu will open an application that allows you to create new jobs, submit them to the cluster, and inspect the results of jobs submitted via this tool.
The core functionality of this tool is provided by the "New Job" button. Clicking on it will open the job creation wizard.
Follow these steps in order:
- Input a job name.
- Select which cluster you wish to run the job on from the pull-down list.
- Select whether you want to use one of our provided job templates as a starting point, use a job you have already created with the wizard, or use a template for which you know the directory.
- Expanding the list items will show you what files are associated with that template. Click on the template you wish to use to ensure it is selected.
- Click "Copy" to copy the template files to your new job.
- Select the file you wish to submit from the job directory. Double-click it to load it into the editor pane.
- Make any necessary changes to the script.
At this point, you can either submit the job to the queue using the "submit" button, use the "save" button to save the job (if you need to make more advanced changes to the input files, for example, or just want to save it until later), or "cancel".
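As a rough illustration of the kind of script you edit in the wizard, the sketch below generates a minimal batch script. It assumes a PBS/Torque-style scheduler (suggested by the qpeek and qdel commands mentioned later on this page); the function name, job name, resource values, and command are all hypothetical placeholders, and the provided templates may look quite different.

```python
def make_job_script(job_name, nodes=1, cores_per_node=1,
                    walltime="00:10:00", command="echo hello"):
    """Return the text of a minimal PBS/Torque-style batch script.

    All default values here are illustrative assumptions, not site defaults.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                          # job name
        f"#PBS -l nodes={nodes}:ppn={cores_per_node}",  # nodes and cores per node
        f"#PBS -l walltime={walltime}",                 # maximum run time (HH:MM:SS)
        "",
        "cd $PBS_O_WORKDIR",                            # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a script for a hypothetical job called "my_test_job".
    print(make_job_script("my_test_job"))
```

The lines beginning with #PBS are the ones you would most often adjust in the editor pane before submitting.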
After you exit the wizard, you will be returned to the main My Jobs screen.
On the main screen for "My Jobs" you can click on a job to examine it.
If the job has not been submitted, the "submit job" button will be active, and will submit the selected job to the queue.
The "Edit" pull-down menu will allow you to edit or delete a job, depending on its status.
The "View" pull-down menu will allow you to view the script, job output, and job error files.
The "Go" pull-down menu will launch the file transfer client or ssh client, starting in the job directory, or jump directly to the job monitor for that job, depending on the job status.
The "Active Jobs" application will show you all of your jobs currently in the queue (running or queued), regardless of how the jobs were submitted.
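For comparison, a similar listing can usually be obtained from a login shell. The sketch below assumes a PBS/Torque-style scheduler where qstat -u lists one user's jobs; the helper names are hypothetical, and the output may not match what "Active Jobs" displays.

```python
import getpass
import subprocess


def active_jobs_command(user=None):
    """Build the qstat command that lists one user's queued and running jobs."""
    user = user or getpass.getuser()  # default to the current login name
    return ["qstat", "-u", user]


def list_active_jobs(user=None):
    """Run qstat and return its text output (only works where qstat is installed)."""
    result = subprocess.run(active_jobs_command(user),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling list_active_jobs() on a cluster login node would return the scheduler's own text table, covering jobs submitted by any means.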
Across the top are four links. The first jumps back to the default screen for "Active Jobs". The other three provide system status for the two supercomputers and your home directory file server. All three of these screens will look like the following screenshot.
Across the top, you can select the timeframe you are interested in examining. The four charts show the percentage of cores of the entire system that are in use, the total system load, the total cluster memory use, and the total network traffic on the cluster. A "CPU Report" showing less than 100% use means that some cores are not currently being used by a job, but that doesn't necessarily mean that they are available to be scheduled; there may be a system reservation that is preventing the scheduler from utilizing all of those cores.
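The distinction between idle cores and schedulable cores can be made concrete with a little arithmetic. This is only a sketch: the function name and the core counts are made-up illustrations, not figures for any real cluster.

```python
def cpu_report(total_cores, cores_in_use, reserved_idle_cores=0):
    """Return (percent of cores in use, idle cores that could still be scheduled).

    reserved_idle_cores counts idle cores held back by a system reservation.
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    if not 0 <= cores_in_use <= total_cores:
        raise ValueError("cores_in_use must be between 0 and total_cores")
    idle = total_cores - cores_in_use
    if not 0 <= reserved_idle_cores <= idle:
        raise ValueError("reserved_idle_cores must be between 0 and the idle core count")
    percent_in_use = 100.0 * cores_in_use / total_cores
    schedulable = idle - reserved_idle_cores
    return percent_in_use, schedulable


if __name__ == "__main__":
    # Hypothetical system: 1000 cores, 900 busy, 80 of the idle cores reserved.
    print(cpu_report(1000, 900, reserved_idle_cores=80))  # (90.0, 20)
```

In this hypothetical case the chart would show 90% use, yet only 20 of the 100 idle cores could actually accept new work.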
On the main "Active Jobs" screen, you can click on a job in the list to select it, and then perform various inspection tasks.
Selecting the "Job Status" button will open a screen similar to the system status, but allowing inspection of each node in the job individually, over the duration of the entire job.
This screen will allow you to select each node individually (listed as tabs across the top) and examine each node's state. You can check whether your job is working well in parallel, whether the network communication is high or low, whether the memory is swapping, and so on.
The "Peek Job" option will display the output of qpeek on that job.
The delete option will execute qdel on the job, removing it from the queue.
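The two commands named above can also be run by hand from a login shell. The sketch below only builds and runs their simplest forms, with the job ID as the single argument. It assumes both commands are on your PATH; qpeek in particular is a site-provided utility whose options may vary. The helper names and the job ID are hypothetical.

```python
import subprocess


def peek_command(job_id):
    """Command that shows the output produced so far by a running job."""
    return ["qpeek", str(job_id)]


def delete_command(job_id):
    """Command that removes a job from the queue."""
    return ["qdel", str(job_id)]


def run(command):
    """Run a command and return its text output (requires the command to exist)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Example, using a made-up job ID; run these only on a cluster login node:
#   print(run(peek_command("123456")))
#   run(delete_command("123456"))
```

Deleting a job cannot be undone, so it is worth peeking at the output first.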