PGI: Difference between revisions
Current revision as of 21 December 2020, 17:03
This software is obsolete; do not use it. This documentation will be unpublished soon.
Features
The PGI Cluster Development Kit (CDK) version 14.10 from The Portland Group, Inc. is installed on the INRIA Sophia cluster. The PGI CDK 14.10 consists of the following components:
Two network-floating seats of the following compilers and tools:
- Floating multi-user seats for PGI's parallel Fortran, C, and C++ compilers for Linux -- industry-leading single-processor performance and integrated native support for all 3 popular parallel programming models: HPF, OpenMP, and MPI.
- Graphical MPI and OpenMP Linux Cluster debugging (PGDBG®) and parallel performance profiling (PGPROF®) tools.
- Pre-compiled/pre-configured MPICH message-passing libraries and utilities (including MVAPICH with InfiniBand support)
- Pre-compiled ScaLAPACK parallel math library
- Optimized BLAS and LAPACK serial math libraries
- Tutorial examples and programs to help you get your codes up and running quickly using HPF, OpenMP, and MPI messaging
A partial list of the technical features supported by the PGI compilers includes the following:
- PGHPF data parallel compiler with native Full HPF language support
- PGF95 OpenMP and auto-parallel Fortran 95 compiler
- PGF77 OpenMP and auto-parallel FORTRAN 77 compiler
- PGC++ OpenMP and auto-parallel ANSI and cfront-compatible C++ compiler
- PGCC OpenMP and auto-parallel ANSI/K&R C compiler
- PGDBG multi-process/multi-thread graphical debugger
- PGPROF multi-process/multi-thread graphical performance profiler
- Full 64-bit support on AMD Opteron, AMD Athlon 64 and Intel Pentium and Xeon with EM64T including full support for -mcmodel=medium and single data objects > 2GB
- Includes separate 32-bit x86 and 64-bit EM64T/AMD64 development environments and compilers
- Optimizing 64-bit code generators with automatic or manual platform selection
- Executables generated by PGI's 32-bit x86 compilers can run unchanged on AMD64 or EM64T processor-based systems
- AMD Opteron and Intel EM64T optimizations including SSE/SSE2, prefetching, use of extended register sets, and 64-bit addressing
- Intel Pentium II/III/4/Xeon and AMD Athlon XP/MP optimizations including SSE/SSE2 and prefetching where supported in hardware
- Large file (> 2GB) support in Fortran on 32-bit x86 systems
- -r8/-i8 compilation flags, 64-bit integers
- Full support for Fortran 95 extensions
- Optimized ACML version 2.5 math library supported on all targets
- Highly-tuned math intrinsics library routines
- One pass interprocedural analysis (IPA)
- Interprocedural optimization of libraries
- Profile feedback optimization
- Function inlining including library functions
- Vectorization, loop interchange, loop splitting
- Loop unrolling, loop fusion, and cache tiling
- Support for creation of shared objects on Linux and DLLs on Windows
- Cray/DEC/IBM compatibility (including Cray POINTERs)
- Support for SGI-compatible DOACROSS in PGF77 and PGF95, and for SGI-compatible parallelization pragmas in PGCC C and C++
- Byte-swapping I/O for RISC/UNIX interoperability
- Integrated cpp pre-processing
- Threads-based auto-parallelization using PGF77, PGF95, and PGCC C and C++
- Full support for OpenMP in PGF77, PGF95, and PGCC C and C++
- Process/CPU affinity support in SMP/OpenMP applications
- FORALL and F95 array assignment merging
- Re-use of communication schedules
- Complete implementation of the HPF Library
- Parallelization of irregular DO loops, FORALLs, and array assignments
- HPF parallelization using direct accesses to shared memory
- Fully upward compatible with PGHPF for high-end parallel systems
- Support for graphical HPF profiling and performance tuning
PGI 2010 New Features and Performance:
- PGI Accelerator™ x64+GPU native Fortran 95/03 and C99 compilers now support the full PGI Accelerator Programming Model v1.0 standard for directive-based GPU programming and optimization.
- Now supported on Linux, MacOS and Windows
- Device-resident data using MIRROR, REFLECTED, UPDATE directives
- COMPLEX and DOUBLE COMPLEX data, Fortran derived types, C structs
- Automatic GPU-side loop unrolling, support for the UNROLL clause
- Support for Accelerator regions nested within OpenMP parallel regions
- PGI CUDA Fortran extensions supported in the PGI 2010 Fortran 95/03 compiler enable explicit CUDA GPU programming
- Declare variables in CUDA GPU device, constant or shared memory
- Dynamically allocate page-locked pinned host memory, CUDA device main memory, constant memory and shared memory
- Move data between host and GPU with Fortran assignment statements
- Declare explicit CUDA grids/thread-blocks to launch GPU compute kernels
- Support for CUDA Runtime API functions and features
- Efficient host-side emulation for easy CUDA Fortran debugging
- PGI Fortran 2003 incremental features. See full list below.
- PGC++/PGCC enhancements include the latest EDG release 4.1 front-end with enhanced GNU and Microsoft compatibility, extern inline support, improved BOOST support, and thread-safe exception handling
- PGI Visual Fortran supports launching and debugging of MSMPI programs on Windows clusters from within Visual Studio, adds support for the PGI Accelerator Programming model and PGI CUDA Fortran on NVIDIA CUDA-enabled GPUs, and now includes the standalone PGPROF performance profiler with CCFF support.
- Compiler optimizations and enhancements include OpenMP support for up to 256 cores, support for AVX code generation, C++ inlining and executable size improvements
- PGPROF parallel OpenMP performance analysis and tuning tool
- Uniform cross-platform performance profiling without re-compiling or any special software privileges on Linux, MacOS and Windows
- PGI Accelerator and CUDA Fortran GPU-side performance statistics
- Updated graphical user interface
- Updated Documentation including the PGI Users Guide, PGI Tools Guide and PVF Users Guide
Documentation
Documentation includes the following:
- PGI User's Guide, PGHPF User's Guide, PGHPF Reference Manual, and Release Notes
See also the documentation files in nef-devel:/misc/opt/pgi/linux86-64/14.10/doc/ on the cluster.
Usage
Setup
To initialize your environment to use the PGI CDK, issue the following command:
% module load pgi/pgi-14.10
If you also want the PGI version of MPI in your path, use this command instead:
% module load mpi/pgi-14.10
You'll be able to use the PGI Fortran, C, and C++ compilers and tools on any Linux workstation networked to cluster.inria.fr (the licence server). The commands used to invoke the compilers are as follows:
- pgf77 - FORTRAN 77
- pgf90 - Fortran 90
- pgf95 - Fortran 95
- pghpf - High Performance Fortran
- pgcc - ANSI and K&R C
- pgCC - ANSI and cfront-compatible C++
- pgprof - Graphical Performance profiler
- pgdbg - Graphical debugger
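After loading the module, it can be useful to confirm that the commands listed above are actually visible in your path. The following is a minimal sketch; `check_tools` is a hypothetical helper name introduced here for illustration and is not part of the PGI CDK.

```shell
#!/bin/sh
# Report which of the given commands can be found in PATH.
# check_tools is a hypothetical helper, not a PGI or cluster command.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# Check the compiler and tool commands listed on this page.
check_tools pgf77 pgf90 pgf95 pghpf pgcc pgCC pgprof pgdbg
```

If any command is reported as missing, the module was probably not loaded in the current shell.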
Compiler options
After executing the commands above to initialize your environment, you should be able to bring up man pages for any of the above commands. If you aren't sure which options to use, PGI recommends:
-fast
for all of the Fortran compilers and the C compiler, and:
-fast -Minline=levels:10 --no_exceptions
for the C++ compiler.
By default, PGI compilers generate code that is optimized for the type of processor on which compilation is performed (the compilation host). This can be a problem if you want to run your application on all the nodes of the cluster, which include both Xeon and Opteron processors.
The PGI 14.10 compilers can produce PGI Unified Binary object or executable files containing code streams fully optimized and supported for both AMD and Intel x64 CPUs.
To generate code optimized for both architectures, use: -tp x64
To generate code optimized only for Xeon/quadcore, use:
-tp core2-64
To generate code optimized only for Opteron, use:
-tp k8-64
To link with the MPICH libraries, add -Mmpi or -Mmpi2 (for MPICH2) to the link line for your Fortran applications.
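The target options above can be combined with -fast and -Mmpi on a single command line. The sketch below only prints such a command line rather than running the compiler. `pgi_tp_flag` and the target names `both`, `xeon` and `opteron` are hypothetical names chosen for this example; the flags themselves are the ones quoted on this page, and whether this exact combination suits your code is an assumption to check against the PGI man pages.

```shell
#!/bin/sh
# Map a target name to the -tp option quoted on this page.
# pgi_tp_flag and its target names are hypothetical, for illustration only.
pgi_tp_flag() {
    case "$1" in
        both)    echo "-tp x64" ;;       # unified binary for AMD and Intel
        xeon)    echo "-tp core2-64" ;;  # Xeon/quadcore only
        opteron) echo "-tp k8-64" ;;     # Opteron only
        *)       echo "unknown target: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) a possible compile line for an MPI Fortran program.
echo "pgf90 -fast $(pgi_tp_flag both) -Mmpi -o mpihello mpihello.f"
```

Choosing `both` trades a larger executable for the ability to run on every node type.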
Job submission
To submit jobs using torque, you must be logged in to the host nef.inria.fr (or nef-devel2.inria.fr). The following is an example of a program and a job script used to run an MPI "hello world" program:
% cat mpihello.f
      program hello
      include 'mpif.h'
      integer ierr, myproc
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, myproc, ierr)
      print *, "Hello world! I'm node", myproc
      call mpi_finalize(ierr)
      end
% cat mpihello.sh
#!/bin/sh
# The job
source /etc/profile.d/modules.sh
module load mpi/pgi-14.10
mpirun -machinefile $OAR_NODEFILE -launcher-exec oarsh ./mpihello
% oarsub -l /nodes=2/core=2 ./mpihello.sh
Hello world! I'm node 0
Hello world! I'm node 2
Hello world! I'm node 1
Hello world! I'm node 3
%
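A job script like the one above fails with a fairly opaque message if the executable has not been built or if the script is run outside a reservation. The following sketch wraps the same mpirun line in a few checks. `run_job` is a hypothetical helper name; the mpirun options are copied from the script above, and the assumption that OAR_NODEFILE is only set inside a job is mine.

```shell
#!/bin/sh
# Launch an MPI binary with the mpirun line used on this page,
# after checking that the binary and the node file are available.
# run_job is a hypothetical helper, not a cluster-provided command.
run_job() {
    binary=$1
    if [ ! -x "$binary" ]; then
        echo "error: $binary not found or not executable" >&2
        return 1
    fi
    if [ -z "${OAR_NODEFILE:-}" ]; then
        echo "error: OAR_NODEFILE is not set; run this under oarsub" >&2
        return 1
    fi
    mpirun -machinefile "$OAR_NODEFILE" -launcher-exec oarsh "$binary"
}

# Inside a job script, after loading the module, one would call:
#   run_job ./mpihello
```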
Debugger
To debug MPI applications, you have to use the MPICH2 implementation provided by PGI (it will not work with Open MPI), i.e. compile with -Mmpi2.
- Reserve a node interactively, for example eight cores on one node: oarsub -I -l /nodes=1
- Configure the PGI environment: module load pgi/pgi-14.10
- Run pgdbg with your application: pgdbg -mpi:$PGI/linux86-64/14.10/mpi/mpich/bin/mpirun -n 8 ./myapp
- Once pgdbg has started, click the 'resume' button to start your application.
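The pgdbg command line in the steps above is long and easy to mistype. As a minimal sketch, the helper below assembles it from the number of processes and the application path. `pgdbg_cmd` is a hypothetical name, and the sketch assumes that the PGI environment variable is set by the module, as the path in the step above implies. It prints the command instead of running it.

```shell
#!/bin/sh
# Build the pgdbg command line shown on this page.
# pgdbg_cmd is a hypothetical helper; it only prints the command.
pgdbg_cmd() {
    nprocs=$1
    app=$2
    if [ -z "${PGI:-}" ]; then
        echo "error: PGI is not set; load the pgi module first" >&2
        return 1
    fi
    echo "pgdbg -mpi:${PGI}/linux86-64/14.10/mpi/mpich/bin/mpirun -n ${nprocs} ${app}"
}

# Print the command for eight processes when the environment is configured.
if [ -n "${PGI:-}" ]; then
    pgdbg_cmd 8 ./myapp
fi
```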
More documentation
All of the documentation for the PGI compilers and tools is on the cluster in nef-devel:/usr/local/pgi/linux86-64/current/doc.
For more information on the PGI compilers and tools, see the URLs:
For more information on HPF in general, see the High Performance Fortran homepage at: http://hpff.rice.edu/
For more information on OpenMP in general, see the OpenMP homepage at: http://www.openmp.org
For more information on the open source components of the PGI CDK, see the URLs: