
DL_POLY

General information

Version 4.02 of the MD program for macromolecules, polymers, ionic systems, solutions and other molecular systems, developed at the Daresbury Laboratory. On Pendulo the 2.2 version remains. There is also the DL_POLY_CLASSIC version, which is no longer under active development.

How to submit to the queue

The program is installed on all the architectures, Arina and Pendulo (DL_POLY 2.2). To execute it, include in your scripts:

/software/bin/DL_POLY/DL_POLY.Z

The program will execute on GPGPUs if it starts on that kind of node. These nodes can be selected by using the gpu label within [intlink id=”244″ type=”post”]the queue system[/intlink].
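
For instance, a minimal Torque script sketch for a GPGPU run (the resource requests are placeholders to adapt to your job, and we assume the DL_POLY.Z wrapper takes care of the MPI startup, since the documentation above invokes it directly):

#!/bin/bash
#PBS -l nodes=1:ppn=8:gpu    # the gpu property selects GPGPU nodes
#PBS -l walltime=04:00:00    # placeholder walltime
#PBS -l mem=4gb              # placeholder memory request

cd $PBS_O_WORKDIR            # run from the directory the job was submitted from
/software/bin/DL_POLY/DL_POLY.Z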

The GUI is also installed. To execute it use:

/software/bin/DL_POLY/gui

Some utilities have been installed in the /software/bin/DL_POLY/ directory.

Benchmark

We show some small benchmarks performed with DL_POLY 4.02. We study the parallel scaling as well as the performance of the GPGPUs. All execution times are in seconds.

System          | 1 core | 4 cores | 8 cores | 16 cores | 32 cores | 64 cores
Itanium 1.6 GHz | 1500   | 419     | 248     | 149      | 92       | 61
Opteron         | 1230   | 503     | 264     | 166      | 74       |
Xeon 2.27 GHz   | 807    | 227     | 126     | 67       | 37       | 25

The first benchmark shows that DL_POLY scales very well and that the Xeon nodes are the fastest, so we recommend them for large jobs.

System          | 1 core | 4 cores | 8 cores | 16 cores | 32 cores
Itanium 1.6 GHz | 2137   | 303     | 165     | 93       | 47
Opteron         | 1592   | 482     | 177     | 134      | 55
Xeon 2.27 GHz   | 848    | 180     | 92      | 48       | 28
1 GPGPU         |        | 125     | 114     | 104      | 102
2 GPGPU         |        |         | 77      | 72       | 69
4 GPGPU         |        |         |         | 53       | 50
8 GPGPU         |        |         |         |          | 37

System        | 1 core | 4 cores | 8 cores | 16 cores | 32 cores | 64 cores
Xeon 2.27 GHz | 2918   | 774     | 411     | 223      | 122      | 71
1 GPGPU       |        | 362     | 333     | 338      | 337      |
2 GPGPU       |        |         | 240     | 222      | 220      |
4 GPGPU       |        |         |         | 145      | 142      |
8 GPGPU       |        |         |         |          | 97       |

We show that the GPGPUs speed up the calculation, but each time we double the number of GPGPUs the speedup only grows by a factor of about 1.5. Because of this, for large numbers of GPGPUs or cores it is better to use the parallelization over cores. For example, one node has 8 cores and 2 GPGPUs; the 2 GPGPUs need 220 s while 8 cores need 411 s. Likewise, 4 GPGPUs are still faster than 16 cores, but 64 cores (71 s) are faster than 8 GPGPUs (97 s). Therefore, the GPGPUs can speed up jobs on PCs or single nodes, but for jobs that require higher parallelization, the parallelization over cores is more effective.

DL_POLY is designed for big systems and can make use of up to thousands of cores. According to the documentation:

The DL_POLY_4 parallel performance and efficiency are considered very-good-to-excellent as long as (i) all CPU cores are loaded
with no less than 500 particles each and (ii) the major linked cells algorithm has no dimension less than 4.
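
For example, rule (i) implies that a simulation with 32,000 particles should be spread over at most 32,000 / 500 = 64 cores.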

More information

DL_POLY web page.

DL_POLY user guide (pdf).

DL_POLY GUI user guide (pdf).

Espresso

General information

opEn-Source Package for Research in Electronic Structure, Simulation, and Optimization

ESPRESSO is an integrated suite of computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials (both norm-conserving and ultrasoft).

Version 6.1 is available. The home page of the code is at the DEMOCRITOS National Simulation Center of the Italian INFM.

Quantum ESPRESSO builds upon newly-restructured electronic-structure codes (PWscf, PHONON, CP90, FPMD, Wannier) that have been developed and tested by some of the original authors of novel electronic-structure algorithms – from Car-Parrinello molecular dynamics to density-functional perturbation theory – and applied in the last twenty years by some of the leading materials modeling groups worldwide. Innovation and efficiency are still its main focus.

How to use

See the [intlink id=”4795″ type=”post”]how to send Espresso[/intlink] section.

Monitoring

  • remote_vi: shows the *.out file of an Espresso job.
  • myjobs: shows the CPU and memory (SIZE) usage of a running job.

Benchmark

We show various benchmark results for pw.x and ph.x on the machines of our service. The Xeon nodes perform best and scale well up to 32 cores. Notice that the communication network of the Xeon nodes is better.

Table 1: Execution times for pw.x (version 4.2.1).

System       | 8 cores | 16 cores | 32 cores
Xeon         | 1405    | 709      | 378
Itanium2     | 2614    | 1368     | 858
Opteron 2.4  | 4320    | 2020     | 1174
Core2duo 2.1 |         |          |
Table 2: Execution times for ph.x (version 4.2.1).

System       | 8 cores | 16 cores | 32 cores
Xeon         | 2504    | 1348     | 809
Itanium2     | 2968    | 1934     | 1391
Opteron 2.4  | 6240    | 3501     | 2033
Core2duo 2.1 |         |          |

More information

ESPRESSO Web page.

Online documentation.

ESPRESSO Wiki.

send_espresso


To submit Espresso calculations to the queue system, the send_espresso script is available. Executing it without arguments, send_espresso [Enter], prints the syntax of the command:

send_espresso Input Executable Nodes Procs_per_node Time Mem ["Other queue options"]

  • Input: name of the Espresso input file, without extension.
  • Executable: name of the Espresso program you want to use (pw.x, ph.x, cp.x, …).
  • Nodes: number of nodes.
  • Procs_per_node: number of processors per node.
  • Time: the walltime (in hh:mm:ss format) or the queue name.
  • Mem: memory in GB (without the unit).
  • ["Other queue options"]: see the examples below.

Examples

Example1: send_espresso job1 pw.x 1 4 04:00:00 1
Example2: send_espresso job2 cp.x 2 4 192:00:00 8 "-W depend=afterany:1234"
Example3: send_espresso job5 pw.x 4 8 192:00:00 8 "-m bea -M email@address.com"

Traditional way

The executables can be found in /software/Espresso. For instance, to execute pw.x in a queue script use:

source /software/Espresso/compilervars.sh
/software/Espresso/bin/pw.x -npool ncores < input_file > output_file
In the -npool ncores option, substitute ncores with the number of cores of the job.
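
Putting it together, a minimal queue script sketch for pw.x (the Torque resource requests are placeholders; the launch line is the one documented above, with the core count taken from the node file Torque provides):

#!/bin/bash
#PBS -l nodes=2:ppn=4        # placeholder resource request
#PBS -l walltime=04:00:00    # placeholder walltime
#PBS -l mem=1gb              # placeholder memory request

cd $PBS_O_WORKDIR                         # run from the submission directory
source /software/Espresso/compilervars.sh
NCORES=$(wc -l < $PBS_NODEFILE)           # total cores assigned to the job
/software/Espresso/bin/pw.x -npool $NCORES < input_file > output_file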

How to send Turbomole

send_turbo

To submit Turbomole calculations to the queue system, send_turbo is available. Executing send_turbo without arguments prints the syntax of the command and some examples:

send_turbo "EXEC and Options" JOBNAME TIME[or QUEUE] PROCS[property]  MEM [``Other queue options'' ]

  • EXEC: name of the Turbomole program you want to use, together with its options.
  • JOBNAME: name of the Turbomole control file (usually control).
  • TIME[or QUEUE]: the walltime (in hh:mm:ss format) or the queue name.
  • PROCS[property]: the number of processors (optionally with a node-type property, as in the syntax above).
  • MEM: memory in GB (without the unit).
  • ["Other queue options"]: see the examples below.

Examples

To run Turbomole (jobex) with the control input file in 8 cores and 1 GB of RAM execute:

send_turbo jobex control 04:00:00 8 1

To run Turbomole (jobex -ri) with the control input file in 16 cores, 8 GB of RAM and after 1234 job has finished execute:

send_turbo "jobex -ri" control 192:00:00 16 8 "-W depend=afterany:1234"

Turbomole

Presently TURBOMOLE is one of the fastest and most stable codes available for standard quantum chemical applications. Unlike many other programs, the main focus in the development of TURBOMOLE has not been to implement all new methods and functionals, but to provide a fast and stable code which is able to treat molecules of industrial relevance at reasonable time and memory requirements.

General information

TURBOMOLE is used by academic and industrial researchers, in research areas ranging from homogeneous and heterogeneous catalysis, inorganic and organic chemistry to various types of spectroscopy, and biochemistry. The philosophy behind the development of the code was, and still is, its usefulness for applications.
It provides:
  • all standard and state-of-the-art methods for ground state calculations (Hartree-Fock, DFT, MP2, CCSD(T))
  • excited state calculations at different levels (full RPA, TDDFT, CIS(D), CC2, ADC(2), …)
  • geometry optimizations, transition state searches, molecular dynamics calculations
  • various properties and spectra (IR, UV/Vis, Raman, CD)
  • fast and reliable code; approximations like RI are used to speed up the calculations without introducing uncontrollable or unknown errors
  • parallel version for almost all kinds of jobs
  • free graphical user interface

How to use it

The program is installed on Guinness at /software/TURBOMOLE. We have created the send_turbo script to make it easy to submit Turbomole calculations to the queue. See [intlink id=”4755″ type=”post”]How to send Turbomole[/intlink].

TmoleX is also available to help with input creation and analysis of the results. TmoleX is a free download that you can install on your PC, and it is also available on Guinness. To use TmoleX execute:

TmoleX

To cleanly stop a job after the current iteration, for example the 1234.arina job, use the command:

turbomole_stop 1234

Remember to delete the “stop” file in the directory if you want to resubmit the calculation.
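
For example, assuming the stop file sits in the job's working directory next to the control file:

cd /path/to/job_directory   # the directory containing the control file
rm stop                     # remove the stop file before resubmitting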

More information

Turbomole web page.

Turbomole Manual

Turbomole Tutorial