This page outlines ways to generate and view performance data for your program using tools available at OSC.
This section describes how to use performance tools from Intel. Make sure that you have an Intel module loaded to use these tools.
Intel VTune is a tool to generate profile data for your application. Generating profile data with Intel VTune typically involves three steps:
You need executables with debugging information to view source code line detail: re-compile your code with the -g option added among the other appropriate compiler options. For example:
mpicc wave.c -o wave -g -O3
Profiles are normally generated in a batch job. To generate a VTune profile for an MPI program:
mpiexec <mpi args> amplxe-cl <vtune args> <program> <program args>
where <mpi args> represents arguments to be passed to mpiexec, <vtune args> represents arguments to be passed to the VTune executable amplxe-cl, <program> is the executable to be run, and <program args> represents arguments passed to your program.
For example, if you normally run your program with mpiexec -n 12 wave_c, you would use
mpiexec -n 12 amplxe-cl -collect hotspots -result-dir r001hs wave_c
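The wrapping pattern above can be sketched as a small shell helper. This is a dry run only: the function name vtune_wrap and the "--" separator convention are hypothetical and not part of VTune or mpiexec, and the helper prints the command instead of running it. The -collect hotspots analysis type is simply the one used in the example above.

```shell
# Hypothetical helper (not part of VTune): print the VTune-wrapped form of an MPI launch.
# Usage: vtune_wrap <result-dir> [mpi args...] -- <program> [program args...]
vtune_wrap() {
    result_dir=$1
    shift
    mpi_args=""
    # Everything before the "--" separator is treated as mpiexec arguments.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        mpi_args="$mpi_args $1"
        shift
    done
    # Drop the "--" separator itself, if present.
    if [ "$#" -gt 0 ]; then
        shift
    fi
    # What remains is the program followed by its own arguments.
    echo "mpiexec$mpi_args amplxe-cl -collect hotspots -result-dir $result_dir $*"
}

vtune_wrap r001hs -n 12 -- wave_c
```

Printing the command first makes it easy to check the argument order before pasting it into a batch script.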
To profile a non-MPI program:
amplxe-cl <vtune args> <program> <program args>
As a result of this step, a subdirectory that contains the profile data files is created in your current directory. The subdirectory name is based on the -result-dir argument and the node id, for example, r001hs.o0674.ten.osc.edu.
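Since the directory name includes the node id, it can be handy to list every directory that shares a given -result-dir prefix. A minimal sketch, assuming the <prefix>.<node id> naming shown above; the function name list_vtune_results is hypothetical:

```shell
# Hypothetical helper: list result directories named <prefix>.<node id>
# in the current directory, one per line.
list_vtune_results() {
    prefix=$1
    for d in "$prefix".*/; do
        # If nothing matches, the pattern stays literal and fails the -d test.
        if [ -d "$d" ]; then
            printf '%s\n' "${d%/}"
        fi
    done
}

list_vtune_results r001hs
```

Any name printed here can be passed to the GUI in the analysis step.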
3. Analyze your profile data.
You can open the profile data using the VTune GUI in interactive mode. For example:
amplxe-gui r001hs.o0674.ten.osc.edu
To run the GUI, use an OnDemand VDI (Virtual Desktop Interface) or enable X11 forwarding (see Setting up X Windows). Note that X11 forwarding can be distractingly slow for interactive applications.
Intel Trace Analyzer and Collector (ITAC) is a tool to generate trace data for your application. Generating trace data with Intel ITAC typically involves three steps:
You need to compile your executable with the -tcollect option added among the other appropriate compiler options to insert instrumentation probes calling the ITAC API. For example:
mpicc wave.c -o wave -tcollect -O3
mpiexec -trace <mpi args> <program> <program args>
For example, if you normally run your program with mpiexec -n 12 wave_c, you would use
mpiexec -trace -n 12 wave_c
As a result of this step, .anc, .f, .msg, .dcl, .stf, and .proc files will be generated in your current directory.
You will need to use traceanalyzer to view the trace data. To open Trace Analyzer:
traceanalyzer /path/to/<stf file>
where the base name of the .stf file will be the name of your executable.
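Since the .stf base name follows the executable name, the path to pass to traceanalyzer can be derived from it. A minimal sketch, assuming the trace was written to the current directory; the function name stf_for is hypothetical, and the script only prints the command instead of launching the GUI:

```shell
# Hypothetical helper: derive the expected .stf file name from an executable path.
stf_for() {
    exe_name=$(basename "$1")
    printf '%s.stf\n' "$exe_name"
}

stf=$(stf_for ./wave_c)
if [ -f "$stf" ]; then
    # The trace file exists, so show the command that would open it.
    echo "traceanalyzer $PWD/$stf"
else
    echo "no trace file $stf in $PWD yet"
fi
```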
To view the trace data, use an OnDemand VDI (Virtual Desktop Interface) or enable X11 forwarding (see Setting up X Windows). Note that X11 forwarding can be distractingly slow for interactive applications.
Intel's Application Performance Snapshot (APS) is a tool that provides a summary of your application's performance. Profiling HPC software with Intel APS typically involves four steps:
Regular executables can be profiled with Intel APS, but source code line detail will not be available. You need executables with debugging information to view source code line detail: re-compile your code with the -g option added among the other appropriate compiler options. For example:
mpicc wave.c -o wave -g -O3
Profiles are normally generated in a batch job. To generate profile data for an MPI program:
mpiexec <mpi args> aps <program> <program args>
where <mpi args> represents arguments to be passed to mpiexec, <program> is the executable to be run, and <program args> represents arguments passed to your program.
For example, if you normally run your program with mpiexec -n 12 wave_c, you would use
mpiexec -n 12 aps wave_c
To profile a non-MPI program:
aps <program> <program args>
The profile data is saved in a subdirectory in your current directory. The directory name is based on the date and time, for example, aps_result_YYYYMMDD/.
To generate the HTML profile file from the result subdirectory:
aps --report=./aps_result_YYYYMMDD
This creates the file aps_report_YYYYMMDD_HHMMSS.html.
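When several runs have left result directories behind, the report step can be pointed at the most recent one. A minimal sketch, assuming the aps_result_YYYYMMDD naming shown above, so that alphabetical order matches date order; the function name latest_aps_result is hypothetical, and the script only prints the aps command rather than running it:

```shell
# Hypothetical helper: print the newest aps_result_* directory in the
# current directory, judged by the date stamp in its name.
latest_aps_result() {
    latest=""
    # Glob matches are sorted, so the last match carries the latest date stamp.
    for d in aps_result_*/; do
        if [ -d "$d" ]; then
            latest=${d%/}
        fi
    done
    if [ -n "$latest" ]; then
        printf '%s\n' "$latest"
    fi
}

result=$(latest_aps_result)
if [ -n "$result" ]; then
    echo "aps --report=./$result"
else
    echo "no aps_result_* directory found in $PWD"
fi
```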
You can open the profile data file using a web browser on your local desktop computer. This option typically offers the best performance.
This section describes how to use performance tools from ARM.
Instructions for how to use MAP are available here.
Instructions for how to use DDT are available here.
Instructions for how to use Performance Reports are available here.
This section describes how to use other performance tools.
Rice University's HPC Toolkit is a collection of performance tools. Instructions for how to use it at OSC are available here.
TAU Commander is a user interface for University of Oregon's TAU Performance System. Instructions for how to use it at OSC are available here.