Use on Datarmor

Note

A reminder of units, for dummies...

  • 1 b = 1 bit
  • 1 o = 1 octet = 8 bits = 1 B = 1 Byte
  • 1 Go = 1024 Mo
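
For example: 1 To = 1024 Go, and since an octet is just a byte, the 50 Go quota below is the same as 50 GB.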

Basic environment on Datarmor (disks)

Environment variables:

  • $HOSTNAME : login node (datarmor0-3)
  • $HOME : /homeX/datahome/userlogin (50 Go, daily backup)
  • $DATAWORK : /homeX/datawork/userlogin (1 To, no backup)
  • $SCRATCH : /homeX/scratch/userlogin (10 To, no backup)

Directory visible to everyone (from Datarmor):

  • /scratch/tmp (contents deleted after 15 days)
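
To check these locations from a login node (standard shell commands):

echo $HOME $DATAWORK $SCRATCH    (print your personal paths)
df -h $DATAWORK                  (free space left on the datawork filesystem)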

Disk quotas

dukmyhome    (usage of $HOME)
dukmywork    (usage of $DATAWORK)

How to change the permissions on datawork so that other people can access my data (read-only)

chmod g+rx $DATAWORK    (group: read and traverse)
chmod o+rx $DATAWORK    (others: read and traverse)
chmod g+rx $HOME        (same for the group on $HOME)
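
You can verify the result with:

ls -ld $DATAWORK    (the listing should now show r-x for group and others)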

How to use PBS commands

qstat

To watch your jobs:

qstat -u user

Use qstat -q to see the full list of queues.

Name of the queues on Datarmor
type of job     queue name   nb of nodes   max nb of cpus   elapsed time
None            seq          1             1                180h
OMP             omp          1             28               72h
MPI / hybrid    mpi_1        1             28               48h
MPI / hybrid    mpi_2        2             56               48h
MPI / hybrid    mpi_X        X             X*28             48h
MPI / hybrid    mpi_18       18            504              48h
XXXXXX          short        X             ?                0h05
MPI / hybrid    big          19-36         532-1008         24h

To see the status of your job, use:

qstat -f job_id

qsub

To submit a job from a shell script, do:

qsub batch_shell

Your batch_shell should look like this:

#!/bin/csh
#PBS -q mpi_X
#PBS -l select=X:ncpus=28:mpiprocs=14:ompthreads=2
#PBS -l mem=10G
#PBS -l walltime=HH:MM:SS
#PBS -N name_of_your_job
time mpirun mars.exe >& output

Note that you can freely choose the number of cores only in the “omp” queue. For the “mpi_X” queues, you must use a multiple of 28 cores (for example, 84 cores means 3 nodes, i.e. the mpi_3 queue).

-q: specify the queue name, chosen according to the number of cpus you want to use:

-q queue_name

select: number of nodes (one node = 28 ncpus)

ncpus: total number of cores used per node (28 max)

mpiprocs: number of MPI processes per node

ncpus and mpiprocs are defined only once and apply to each node

ompthreads: number of OpenMP threads per MPI process

ncpus must be equal to mpiprocs*ompthreads
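
For example, select=1:ncpus=28:mpiprocs=14:ompthreads=2 is consistent because 14 * 2 = 28.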

Note

  • If you do not set ompthreads, the number of OMP threads is automatically calculated as 28 / mpiprocs.
  • For an OpenMP-only run, you can use the omp queue with fewer than 28 threads on the node: set ncpus=8 if you want 8 threads, for example (see the sketch after this note).
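
For instance, a minimal header for this 8-thread case could be (the walltime value is only a placeholder):

#PBS -q omp
#PBS -l select=1:ncpus=8
#PBS -l walltime=01:00:00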

qdel

To kill a job:

qstat -u user          (to get the job ID)
qdel jobid
qdel -W force jobid    (if the job does not terminate with a normal qdel)

Methodology for launching batch jobs on Datarmor

Sequential case

Just set the memory and the walltime of your job

#!/bin/csh
#PBS -l mem=1g
#PBS -l walltime=HH:MM:00
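
A complete minimal script could look like this; your_prog.exe is a placeholder for your own executable:

#!/bin/csh
#PBS -l mem=1g
#PBS -l walltime=01:00:00
# run from the directory the job was submitted from
cd $PBS_O_WORKDIR
time ./your_prog.exe >& output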

OMP ONLY case

Set the number of threads (YY) you want to use

#!/bin/csh
#PBS -q omp
#PBS -l select=1:ncpus=YY:mem=5gb
#PBS -l walltime=HH:MM:00

And you must add

  • OMP options:

    setenv OMP_SCHEDULE "dynamic,1"
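
If the thread count is not set automatically on your installation, you can also export the standard OpenMP variable yourself (YY = number of threads, as above):

    setenv OMP_NUM_THREADS YY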
    

MPI ONLY case

Set the number of nodes (NN) given the number of MPI procs you want to use

#!/bin/csh
#PBS -q mpi_NN
#PBS -l mem=6gb
#PBS -l walltime=HH:MM:00

You will then use NN*28 MPI processes.
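
For example, submitting to mpi_4 gives 4 * 28 = 112 MPI processes.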

HYBRID case

  1. Choose the total number of cores you want and distribute them over the nodes, knowing that the total is X nodes * 28 ncpus.
  2. Then split them between MPI processes and threads. On each node, ncpus = mpiprocs * ompthreads = 28.

Here are examples with 2 nodes in total, split in different ways:

#PBS -q mpi_2
#PBS -l select=2:ncpus=28:mpiprocs=1:ompthreads=28      | per node: 1 MPI proc with 28 threads        = 2 MPI * 28 OMP in total
#PBS -l select=2:ncpus=28:mpiprocs=2:ompthreads=14      | per node: 2 MPI procs with 14 threads each  = 4 MPI * 14 OMP in total
#PBS -l select=2:ncpus=28:mpiprocs=4:ompthreads=7       | per node: 4 MPI procs with 7 threads each   = 8 MPI * 7 OMP in total
#PBS -l select=2:ncpus=28:mpiprocs=7:ompthreads=4       | per node: 7 MPI procs with 4 threads each   = 14 MPI * 4 OMP in total
#PBS -l select=2:ncpus=28:mpiprocs=14:ompthreads=2      | per node: 14 MPI procs with 2 threads each  = 28 MPI * 2 OMP in total

And you must add

  • OMP options:

    setenv OMP_SCHEDULE "dynamic,1"
    

Some example batch scripts are provided here:

/appli/services/exemples/mars/

SCRATCH

Executable and input files are copied under the $SCRATCH directory, with the same directory tree as under datawork plus a .$PBS_JOBID suffix (so, for instance, a run prepared under $DATAWORK/myrun would execute under something like $SCRATCH/myrun.$PBS_JOBID; the exact name here is only illustrative).

The data are kept for 15 days under $SCRATCH.

After each run, the simulated data are copied back to datawork, unless you use qsub batch_xxx_no_scratch_back.

It is also possible not to use $SCRATCH at all, with the command qsub batch_no_scratch, but the performance of this mode is not known yet.

Miscellaneous info

  • Benchmarks tuned by SGI under /mnt/data/home/datarmor/SGI-IFREMERII_Application_src_and_logs/