LNF
Computing Service’s Linux Farm

 


The Computing Service's Linux farm is accessible to all users who have an LNF AFS account.
The nodes accept connections only via the SSH protocol, and only from hosts inside the laboratory network.

To connect to the farm nodes, use the alias lxcalc.lnf.infn.it.


 

Hardware

5 x Slot 1U HP Proliant DL360
- CPU: 2 x Intel Xeon 2.8 GHz / 512 KB L2 Cache
- RAM: 3 GB
- HDD: 1 x 18 GB @ 15,000 rpm SCSI U320
- LAN: 1 x GigabitEthernet 10/100/1000

Software

Scientific Linux 3.04
ASIS                            
ATLAS
BLAS high perf lib by K.Goto    
Blacs
Boost
CERNLIB                         
CMT
Clhep
F90gl
F95
Fluka
Ftncheck
Garfield                        
Geant4                          
Glut
Glut-3.7.1
Gnuplot4
Grace
Intel C/C++ Compiler            
Intel Debugger                  
Intel Eclipse                   
Intel Fortran Compiler          
Intel JRockit JVM               
Intel MKL (Math Kernel Library) 
Kdiff3
Mathematica                     
Mercury
Mesa
Mpich
Nail
OpenDx
Pgplot
Plusfort
Ppower4
Root                            
Scalapack
- OpenPBS 2.3.13
    4 queues:
      small     max.cput =    1 H
      medium    max.cput =    8 H
      long      max.cput =   24 H
      verylong  max.cput =   72 H
    max_user_run         =    2
    max_group_run        =    6
    resources_max.file   = 2047 MB
    resources_max.vmem   = 2047 MB
- Mathematica 5
    For the graphical version, add
    fontserver.lnf.infn.it to the font servers of your
    X11 server/emulator and run `mathematica`
- Geant4.6.2
    Before using the CERN Geant4 libraries,
    run `geant4.env.setup`, which
    sets the environment variables.
- Available temporary storage areas:
    /scratch/nfs/<groupname>/<username>   ($scratchnfs)
    /scratch/local/<groupname>/<username> ($scratchlocal)
    /tmp (quota 30 MB)
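
As an illustration of the queue limits listed above, the following sketch picks the smallest queue whose max.cput covers a given number of CPU hours. The function name queue_for_hours is hypothetical and not part of OpenPBS, and the limits are simply copied from the list above, so they may differ from the live configuration.

```shell
#!/bin/sh
# Hypothetical helper (not an OpenPBS command): map a CPU-time request,
# given in whole hours, to the smallest queue whose max.cput covers it.
# The limits are the ones quoted in the software list above.
queue_for_hours() {
    hours=$1
    if   [ "$hours" -le 1 ];  then echo small
    elif [ "$hours" -le 8 ];  then echo medium
    elif [ "$hours" -le 24 ]; then echo long
    elif [ "$hours" -le 72 ]; then echo verylong
    else
        echo "no queue allows $hours hours of CPU time" >&2
        return 1
    fi
}
```

For example, queue_for_hours 5 prints medium, because 5 hours exceeds the small limit but fits within the 8 hours of medium.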

 

Most Used PBS Commands

– qdel

Deletes a previously submitted job from its queue:

[dmaselli@lxcalc4:~]> qdel <JobID>

 

– qstat

Displays status information about the submitted jobs:

[dmaselli@lxcalc4:~]> qstat
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
431.lxcalc1      test             dmaselli                0 E long
432.lxcalc1      test             dmaselli                0 R verylong
433.lxcalc1      test             dmaselli                0 R verylong
434.lxcalc1      test             dmaselli                0 R small
435.lxcalc1      test             dmaselli                0 R verylong
436.lxcalc1      test             dmaselli                0 R verylong
437.lxcalc1      test             dmaselli                0 Q verylong
438.lxcalc1      test             dmaselli                0 Q verylong
439.lxcalc1      test             dmaselli                0 Q verylong

The S field indicates the job status. It can be:
E – Exiting: the job has finished and will be removed from the queue shortly.
R – Running: the job is running.
Q – Queued: the job is queued and is waiting for the scheduler to start it.
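
To get a quick summary of how many jobs are in each state, the default qstat listing can be piped through a small filter. This is only a sketch: count_by_state is a hypothetical name, and it assumes the six-column layout shown above (two header lines, state letter in the fifth column).

```shell
#!/bin/sh
# Hypothetical helper: read default `qstat` output on standard input and
# print one "STATE COUNT" line per job state, sorted by state letter.
# Assumes two header lines and the state letter in the fifth column.
count_by_state() {
    awk 'NR > 2 && NF >= 6 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Typical use on a farm node would be: qstat | count_by_state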

To get detailed information about the submitted jobs:

[dmaselli@lxcalc4:~]> qstat -f
Job Id: 440.lxcalc1.lnf.infn.it
    Job_Name = test
    Job_Owner = dmaselli@lxcalc4.lnf.infn.it
    job_state = R
    queue = verylong
    server = lxcalc1.lnf.infn.it
    Checkpoint = u
    ctime = Mon Sep 13 13:21:28 2004
    Error_Path = lxcalc4.lnf.infn.it:/scratch/nfs/calcolo/dmaselli/test.err
    exec_host = lxcalc3/0
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = abe
    mtime = Mon Sep 13 13:21:29 2004
    Output_Path = lxcalc4.lnf.infn.it:/scratch/nfs/calcolo/dmaselli/test.log
    Priority = 0
    qtime = Mon Sep 13 13:21:28 2004
    Rerunable = False
    Resource_List.cput = 72:00:00
    Resource_List.file = 2047mb
    Resource_List.nodect = 1
    Resource_List.nodes = Linux
    Resource_List.vmem = 512mb
    session_id = 14031
    Variable_List = PBS_O_HOME=/afs/lnf/user/d/dmaselli,
        PBS_O_LANG=en_US.iso885915,PBS_O_LOGNAME=dmaselli,
        PBS_O_PATH=.:/afs/lnf/user/d/dmaselli/bin:/usr/lnf/bin:/usr/afsws/bin:
        /usr/afsws/etc:/usr/kerberos/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/cus
        tom/openssh/bin:/usr/X11R6/bin:/usr/local/bin:/usr/sbin:/usr/local/bin/
        X11:/cern/pro/bin:/usr/sbin:/usr/lnf/root/bin,
        PBS_O_MAIL=/var/mail/dmaselli,PBS_O_SHELL=/bin/tcsh,
        PBS_O_HOST=lxcalc4.lnf.infn.it,
        PBS_O_WORKDIR=/scratch/nfs/calcolo/dmaselli,PBS_O_QUEUE=default
    comment = Job started on Mon Sep 13 at 13:21
    etime = Mon Sep 13 13:21:28 2004
Job Id: 441.lxcalc1.lnf.infn.it
    Job_Name = test
    Job_Owner = dmaselli@lxcalc4.lnf.infn.it
    job_state = R
    queue = verylong
    server = lxcalc1.lnf.infn.it
    Checkpoint = u
    ctime = Mon Sep 13 13:21:30 2004
    Error_Path = lxcalc4.lnf.infn.it:/scratch/nfs/calcolo/dmaselli/test.err
    exec_host = lxcalc2/0
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = abe
    mtime = Mon Sep 13 13:21:30 2004
    Output_Path = lxcalc4.lnf.infn.it:/scratch/nfs/calcolo/dmaselli/test.log
    Priority = 0
    qtime = Mon Sep 13 13:21:30 2004
    Rerunable = False
    Resource_List.cput = 72:00:00
    Resource_List.file = 2047mb
    Resource_List.nodect = 1
    Resource_List.nodes = Linux
    Resource_List.vmem = 512mb
    session_id = 14049
    Variable_List = PBS_O_HOME=/afs/lnf/user/d/dmaselli,
        PBS_O_LANG=en_US.iso885915,PBS_O_LOGNAME=dmaselli,
        PBS_O_PATH=.:/afs/lnf/user/d/dmaselli/bin:/usr/lnf/bin:/usr/afsws/bin:
        /usr/afsws/etc:/usr/kerberos/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/cus
        tom/openssh/bin:/usr/X11R6/bin:/usr/local/bin:/usr/sbin:/usr/local/bin/
        X11:/cern/pro/bin:/usr/sbin:/usr/lnf/root/bin,
        PBS_O_MAIL=/var/mail/dmaselli,PBS_O_SHELL=/bin/tcsh,
        PBS_O_HOST=lxcalc4.lnf.infn.it,
        PBS_O_WORKDIR=/scratch/nfs/calcolo/dmaselli,PBS_O_QUEUE=default
    comment = Job started on Mon Sep 13 at 13:21
    etime = Mon Sep 13 13:21:30 2004
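
The full listing is long, so it is often convenient to extract a single attribute. The sketch below prints, for each job, the node it runs on. The name job_hosts is hypothetical, and the filter assumes the "Job Id:" and "exec_host = ..." line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `qstat -f` output on standard input and print
# "JOBID EXEC_HOST" for every job that has an exec_host attribute.
job_hosts() {
    awk '
        $1 == "Job" && $2 == "Id:" { id = $3 }       # start of a new job record
        $1 == "exec_host"          { print id, $3 }  # e.g. exec_host = lxcalc3/0
    '
}
```

Typical use on a farm node would be: qstat -f | job_hosts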

With specific options, qstat also shows queue information (-q and -Q) or server information (-B). Examples:

[dmaselli@lxcalc4:~]> qstat -Q
Queue            Max Tot Ena Str Que Run Hld Wat Trn Ext Type
---------------- --- --- --- --- --- --- --- --- --- --- ----------
verylong           6   0 yes yes   0   0   0   0   0   0 Execution
long               8   0 yes yes   0   0   0   0   0   0 Execution
medium            10   0 yes yes   0   0   0   0   0   0 Execution
small             14   0 yes yes   0   0   0   0   0   0 Execution
default            0   0 yes yes   0   0   0   0   0   0 Route
[dmaselli@lxcalc4:~]> qstat -B
Server           Max Tot Que Run Hld Wat Trn Ext Status
---------------- --- --- --- --- --- --- --- --- ----------
lxcalc1.lnf.infn   0   0   0   0   0   0   0   0 Active

 

– pbsnodes

Shows status information about the cluster nodes; it is especially useful for system administration. Example:

[dmaselli@lxcalc4:~]> pbsnodes -a
lxcalc1
     state = free
     np = 2
     properties = Linux,lxcalc1
     ntype = cluster
lxcalc2
     state = free
     np = 2
     properties = Linux,lxcalc2
     ntype = cluster
lxcalc3
     state = free
     np = 2
     properties = Linux,lxcalc3
     ntype = cluster
lxcalc4
     state = free
     np = 2
     properties = Linux,lxcalc4
     ntype = cluster
lxcalc5
     state = free
     np = 2
     properties = Linux,lxcalc5
     ntype = cluster
axcalc
     state = free
     np = 6
     properties = AIX,axcalc
     ntype = cluster
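
As a rough way to see how much capacity is idle, the np values of the nodes in state free can be summed. This is a sketch only: free_cpus is a hypothetical name, and it assumes the layout shown above, with each node name on an unindented line followed by indented "state" and "np" lines in that order.

```shell
#!/bin/sh
# Hypothetical helper: read `pbsnodes -a` output on standard input and
# print the total np (processors) of the nodes whose state is "free".
free_cpus() {
    awk '
        /^[^ \t]/                     { state = "" }    # unindented line: new node record
        $1 == "state"                 { state = $3 }    # remember the state of this node
        $1 == "np" && state == "free" { total += $3 }   # count processors of free nodes
        END                           { print total + 0 }
    '
}
```

Typical use on a farm node would be: pbsnodes -a | free_cpus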

 

Submit a Job to PBS
NOTE: The STDOUT and STDERR files, the executable and the data to be processed must not reside in the AFS area. Put a copy of the executable and of the data in the NFS scratch area (/scratch/nfs/<groupname>/<username>).
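
The copy into the scratch area can be scripted. The following is a minimal sketch, not a site-provided tool: stage_to_scratch is a hypothetical name, and the destination directory is whatever you pass as the first argument.

```shell
#!/bin/sh
# Hypothetical helper: copy the given files into a scratch directory,
# creating the directory if needed, and print the directory name.
# usage: stage_to_scratch <destination-dir> <file>...
stage_to_scratch() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        cp -p "$f" "$dest"/ || return 1
    done
    echo "$dest"
}
```

For instance, assuming $scratchnfs points to your NFS scratch area and that test.exe and input.dat are your own files, stage_to_scratch "$scratchnfs/run1" test.exe input.dat would place both files in a run1 subdirectory (run1 is just an example name).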

It is also possible to submit a job directly from the command line:

qsub -

<commands>
Press CTRL-D to end the command input.

However, this procedure is strongly discouraged.
Usually a job is submitted to PBS with:

qsub <file.pbs>  [-l (Linux|AIX)]

where <file.pbs> is a job script containing a list of qsub directives followed by the actual job instructions.
The option -l <architecture> is optional; it is needed only when you want to send a job from the Linux farm to axcalc (AIX), or vice versa.
By default, the job runs on a machine with the same architecture as the one from which it was submitted.

– How to create a script job for PBS –

The first line of the script, before any #PBS directive, must be #!/shell, where shell is the full path of the chosen shell.
For example:

#!/bin/sh
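
A job script can be checked for this first line before it is submitted. The sketch below is only an illustration; has_shebang is a hypothetical name, and the check merely looks for "#!" followed by an absolute path on line 1.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first line of the given file names
# an interpreter by absolute path, for example "#!/bin/sh".
has_shebang() {
    head -n 1 "$1" | grep -q '^#![[:space:]]*/'
}
```

It can be used as a guard, for example: has_shebang short.pbs && qsub short.pbs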

Directives can be given up to the first command. Here is a quick overview of the most important parameters that you can set (for the complete list, run man qsub):

To assign a name to the job:

 

#PBS -N <job name>

By default the job has the same name as the PBS file.

To assign a name to the error file:

#PBS -e <path>

A relative path is interpreted relative to the working directory.
If this parameter is not set, the standard name is used: <job name>.e followed by the job identification number.

Similarly, to assign a name to the output file:

#PBS -o <path>

If this parameter is not set, the standard name is used: <job name>.o followed by the job identification number.

To run a job in interactive mode:

#PBS -I

If this option is selected, the job runs in interactive mode: its standard input, standard output and standard error streams are connected through qsub to the terminal from which the job was submitted.

To keep the output or error file on the execution node:

#PBS -k <argument>

The k stands for "keep". It establishes whether the standard output or standard error should be retained on the execution node of the job. The possible arguments are e (standard error only), o (standard output only), eo or oe (both), and n (neither, the default). The files are saved with their standard names.

To indicate the resources required by the job:

#PBS -l <resource_list>

For example, to indicate the CPU time needed by the job, the script will contain:

#PBS -l cput=01:00:00

Several values can be set in the same directive, separated by commas. If a resource is given with no value, it is set to infinite. Example:

#PBS -l cput

To specify whether to send an e-mail notification:

#PBS -m <mail_options>

The possible options are:
a – E-mail is sent if the job is aborted.
b – E-mail is sent when the job begins execution.
e – E-mail is sent when the job ends execution.
n – No e-mail is sent.
The default option is "a".

To indicate where to send the email notification:

#PBS -M <user_list>

If more than one e-mail address is given, the addresses must be separated by commas. By default, the e-mail is sent to the job owner, that is, the user who submitted the job.

To indicate the shell to use:

#PBS -S <path_list>

It tells PBS where to find the shell, usually /bin/sh.
If it is not specified, PBS uses the login shell of the user on the execution node.

To indicate the name of the job owner:

#PBS -u <user_list>

Here too, the default is the user who submitted the job.

– It is important to ALWAYS specify the ABSOLUTE PATH of I/O files and executables
(e.g. /scratch/nfs/calcolo/dmaselli/test.exe)

Example:

Input file (short.pbs)

 

#!/bin/sh
# The shell has just been defined (it is bash)
#PBS -S /bin/sh
#PBS -M dmaselli@lnf.infn.it
#PBS -m e
#PBS -l cput=01:01:00
#PBS -o risultato
#PBS -e errori
# comment
# This is a comment; the next line is a command
echo ""
DATE=`date`
# Be careful to use the right quotes and not to insert spaces!
echo "$DATE"
sleep 5
echo "Here are a few interesting things you may want to know"
echo "This job has been identified as $PBS_JOBID and is named $PBS_JOBNAME"
echo "it was initially placed in the queue $PBS_O_QUEUE"
echo "and it was executed in the queue $PBS_QUEUE"
echo "It was submitted from the machine: $PBS_O_HOST"
echo "It was executed on the machine: `hostname`"
date
echo ""
#PBS -o risultato         This directive is ignored

Output file 1

 Mon Mar 24 16:46:40 CET 2003
 Here are a few interesting things you may want to know
 This job has been identified as 56.lxcalc3.lnf.infn.it and is named short.pbs
 it was initially placed in the queue default
 and it was executed in the queue long
 It was submitted from the machine: lxcalc5
 It was executed on the machine: lxcalc3
 Mon Mar 24 16:46:45 CET 2003

Output file 2

 Mon Mar 24 16:48:07 CET 2003
 Here are a few interesting things you may want to know
 This job has been identified as 57.lxcalc3.lnf.infn.it and is named short.pbs
 it was initially placed in the queue default
 and it was executed in the queue long
 It was submitted from the machine: lxcalc4
 It was executed on the machine: lxcalc2
 Mon Mar 24 16:48:12 CET 2003
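
In short.pbs the final #PBS line is ignored because it comes after the first command. To see which directives of a script are expected to take effect, one can list only the #PBS lines that precede the first command. This is a sketch: effective_directives is a hypothetical name, and it assumes, as in short.pbs, that ordinary comments and blank lines between directives do not end the directive header.

```shell
#!/bin/sh
# Hypothetical helper: print the #PBS lines that precede the first
# command of a job script, i.e. the ones expected to take effect.
effective_directives() {
    awk '
        /^#PBS/    { print; next }  # directive in the header: keep it
        /^#/       { next }         # shebang or ordinary comment: skip
        /^[ \t]*$/ { next }         # blank line: skip
        { exit }                    # first command: stop reading
    ' "$1"
}
```

Running effective_directives short.pbs on the script above should list the six directives from -S to -e and omit the trailing -o line.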

 

Parallel MPI Jobs in PBS

 

MPI is supported on our Linux farm via MPICH. Compilers, libraries and include files are in /usr/lnf/farmsw/mpich/

When submitting MPI jobs to PBS, use mpiexec in the PBS script, NOT mpirun (or mpich.mpirun).

To specify the number of nodes and the CPUs per node, add them on the qsub command line. For example, to run a job on 3 nodes with 2 CPUs per node:

qsub -l nodes=3:Linux:ppn=2 <script-pbs>

Do not put these directives in the PBS script.

Sample PBS Script:

 

#!/bin/sh
### Job name
#PBS -N testing
### Declare job non-rerunable
#PBS -r n
### Output files
#PBS -e /scratch/nfs/calcolo/dmaselli/MPI/test.err
#PBS -o /scratch/nfs/calcolo/dmaselli/MPI/test.log
### Mail to user
#PBS -m ae
#
### This job's working directory
echo Working directory is $PBS_O_WORKDIR
cd $PBS_O_WORKDIR    
#
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This job runs on the following processors:
echo `cat $PBS_NODEFILE`
### Define number of processors
NPROCS=`wc -l < $PBS_NODEFILE`
echo This job has allocated $NPROCS processors
#
### Run the parallel MPI executable
mpiexec /scratch/nfs/calcolo/dmaselli/MPI/mpi.exe
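
The NPROCS line in the script counts the lines of $PBS_NODEFILE. The sketch below shows the same count together with the number of distinct nodes. It assumes that the node file contains one line per allocated processor, so that a node requested with ppn=2 appears twice; summarize_nodefile is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: summarize a PBS node file.
# Assumes one line per allocated processor, so a node with ppn=2
# is listed twice.
summarize_nodefile() {
    nodefile=$1
    # Total lines = allocated processors; unique lines = distinct nodes.
    nprocs=$(wc -l < "$nodefile" | tr -d ' ')
    nnodes=$(sort -u "$nodefile" | wc -l | tr -d ' ')
    echo "$nprocs processes on $nnodes nodes"
}
```

Inside a job script it would be called as summarize_nodefile "$PBS_NODEFILE"; under the assumption above, a request for nodes=3:ppn=2 gives "6 processes on 3 nodes".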