Informática Aplicada a la Investigación

September 2015:

Qbox

General Information

Version: 1.62.3

Qbox is a C++/MPI scalable parallel implementation of first-principles molecular dynamics (FPMD) based on the plane-wave, pseudopotential formalism. Qbox is designed for operation on large parallel computers.

How to use it:

To submit Qbox jobs to the queue we have created the send_qbox utility:

send_qbox  JOBNAME NODES PROCS_PER_NODE[property] TIME

Running send_qbox [Enter] with no arguments shows more options. The program is installed in /software/qbox.

More Information

See the Qbox web page.

Qbox

General Information

Version: 1.62.3

Qbox is a C++/MPI scalable parallel implementation of first-principles molecular dynamics (FPMD) based on the plane-wave, pseudopotential formalism. Qbox is designed for operation on large parallel computers.

How to use it

To submit jobs we have prepared the send_qbox tool, which is used as follows:

send_qbox JOBNAME NODES PROCS_PER_NODE[property] TIME

Running send_qbox [Enter] shows the other available options. The program is located in the /software/qbox directory.

More Information

Qbox web page.

Qbox

General information

Version: 1.62.3

Qbox is a C++/MPI scalable parallel implementation of first-principles molecular dynamics (FPMD) based on the plane-wave, pseudopotential formalism. Qbox is designed for operation on large parallel computers.

How to use

To submit jobs to the queue you can use the command

send_qbox  JOBNAME NODES PROCS_PER_NODE[property] TIME

Running send_qbox [Enter] shows more usage options.

More information

Qbox web page.

IDBA-UD

General information

IDBA-UD 1.1.1 is an iterative de Bruijn graph de novo assembler for short-read sequencing data with highly uneven sequencing depth. It is an extension of the IDBA algorithm. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively with a cutoff threshold that rises from low to high, to reduce errors in low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to generate some of the k-mers missing in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph up to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.

How to use

To submit jobs to the queue you can use the following command

send_idba-ud

which submits the job after you answer a few questions.

Performance

IDBA-UD scales well up to 8 cores. Beyond that we have not observed any improvement. The benchmark was run with the --mink 40 --step 20 options, and when we reduced the step it parallelized worse. In the second table too, with a step of 10, performance is not good beyond 4 cores.

                   1 core as base              2 cores as base
Cores   Time (s)   Speed up   Performance (%)  Speed up   Performance (%)
1       480        1          100              -          -
2       296        1.6        81               1.0        100
4       188        2.6        64               1.6        79
8       84         5.7        71               3.5        88
12      92         5.2        43               3.2        54

The second benchmark was run on a larger file, with 10 million bases, and the --mink 20 --step 10 --min_support 2 options. We see more regular behaviour, and that it does not scale well beyond 4 cores.

Cores   Time (s)   Speed up   Performance (%)
1       13050      1          100
2       6675       2.0        98
4       3849       3.4        85
8       3113       4.2        52
16      2337       5.6        35
20      2409       5.4        27

More information

IDBA-UD web page.

IDBA-UD

General information

IDBA-UD 1.1.1 is an iterative de Bruijn graph de novo assembler for short-read sequencing data with highly uneven sequencing depth. It is an extension of the IDBA algorithm. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively with a cutoff threshold that rises from low to high, to reduce errors in low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to generate some of the k-mers missing in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph up to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.

How to use

To submit jobs to the queue you can use the command

send_idba-ud

which configures the job after asking a few questions.

Performance

IDBA-UD performs and scales well up to 8 cores. Beyond that we did not measure any improvement. The benchmark was run with the --mink 40 --step 20 options. When we decreased the step, the scaling got worse, a trend that can also be seen in the second table.

                   1 core as base              2 cores as base
Cores   Time (s)   Speed up   Performance (%)  Speed up   Performance (%)
1       480        1          100              -          -
2       296        1.6        81               1.0        100
4       188        2.6        64               1.6        79
8       84         5.7        71               3.5        88
12      92         5.2        43               3.2        54
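
The speed-up and Performance (%) columns can be reproduced from the measured times alone. The following is a minimal sketch, assuming that speed-up is the base time divided by the measured time and that Performance (%) is the parallel efficiency, that is, the speed-up divided by the ratio of cores to base cores. The function name `scaling` is only illustrative and is not part of IDBA-UD or of the send_idba-ud utility.

```python
# Wall-clock times (s) from the first benchmark table, keyed by core count.
times = {1: 480, 2: 296, 4: 188, 8: 84, 12: 92}


def scaling(times, base_cores):
    """Return {cores: (speedup, efficiency_percent)} relative to base_cores.

    Assumed definitions (not stated explicitly in the text):
      speedup    = time(base_cores) / time(cores)
      efficiency = 100 * speedup / (cores / base_cores)
    """
    base_time = times[base_cores]
    result = {}
    for cores, t in sorted(times.items()):
        if cores < base_cores:
            continue  # runs smaller than the baseline are not compared
        speedup = base_time / t
        efficiency = 100.0 * speedup / (cores / base_cores)
        result[cores] = (speedup, efficiency)
    return result


one_core = scaling(times, 1)  # "1 core as base" columns
two_core = scaling(times, 2)  # "2 cores as base" columns

for cores in sorted(times):
    s, e = one_core[cores]
    print(f"{cores:>2} cores: speed up {s:.1f}, performance {e:.0f}%")
```

Taking the 2-core run as the baseline removes the poor 1-to-2 core step from the comparison, which is why the efficiencies in the right-hand columns are higher.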

The second benchmark was run on a larger file, with 10 million bases, and the --mink 20 --step 10 --min_support 2 options. We observe more regular behaviour than in the previous benchmark, and that the parallelization is good up to 4 cores.

Cores   Time (s)   Speed up   Performance (%)
1       13050      1          100
2       6675       2.0        98
4       3849       3.4        85
8       3113       4.2        52
16      2337       5.6        35
20      2409       5.4        27
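
When choosing how many cores to request, one option is to take the largest core count whose parallel efficiency stays above a chosen threshold. The sketch below applies this to the times of the second benchmark. The 80% threshold and the function name `max_efficient_cores` are assumptions made for illustration, not a recommendation taken from the measurements.

```python
# Wall-clock times (s) from the second benchmark table, keyed by core count.
times = {1: 13050, 2: 6675, 4: 3849, 8: 3113, 16: 2337, 20: 2409}


def max_efficient_cores(times, min_efficiency=0.8):
    """Largest core count whose efficiency relative to 1 core is >= min_efficiency.

    min_efficiency=0.8 is an arbitrary, illustrative threshold.
    """
    base_time = times[1]
    good = [cores for cores, t in times.items()
            if (base_time / t) / cores >= min_efficiency]
    return max(good)


best = max_efficient_cores(times)      # largest core count still used efficiently
fastest = min(times, key=times.get)    # core count with the shortest run time
print(f"efficient up to {best} cores; shortest run time at {fastest} cores")
```

With these times the shortest run is at 16 cores, but cores are only used efficiently up to 4, which matches the observation that the parallelization is good up to 4 cores.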

More information

IDBA-UD web page.

IDBA-UD

General information

IDBA-UD 1.1.1 is an iterative de Bruijn graph de novo assembler for short-read sequencing data with highly uneven sequencing depth. It is an extension of the IDBA algorithm. Like IDBA, IDBA-UD iterates from a small k to a large k. In each iteration, short and low-depth contigs are removed iteratively with a cutoff threshold that rises from low to high, to reduce errors in low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to generate some of the k-mers missing in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph up to a very large value with fewer gaps and fewer branches, forming long contigs in both low-depth and high-depth regions.

How to use

To submit jobs to the queue you can use the command

send_idba-ud

which asks a few questions to configure the calculation.

Performance

IDBA-UD runs in parallel with good measured performance up to at least 8 cores. Beyond that no appreciable improvement has been measured. The benchmark was run with --mink 40 --step 20. For some reason this calculation shows an appreciable qualitative jump between 1 and 2 cores. With a step of 10, multi-core performance gets worse, as can be seen in the second table.

                   1 core as base              2 cores as base
Cores   Time (s)   Speed up   Performance (%)  Speed up   Performance (%)
1       480        1          100              -          -
2       296        1.6        81               1.0        100
4       188        2.6        64               1.6        79
8       84         5.7        71               3.5        88
12      92         5.2        43               3.2        54

The second benchmark was run on a larger file, with 10 million bases, and the --mink 20 --step 10 --min_support 2 options. We observe more regular behaviour than in the previous benchmark, and that the parallelization is good up to 4 cores.

Cores   Time (s)   Speed up   Performance (%)
1       13050      1          100
2       6675       2.0        98
4       3849       3.4        85
8       3113       4.2        52
16      2337       5.6        35
20      2409       5.4        27

More information

IDBA-UD web page.

SPAdes

General information

SPAdes 3.6.0, the St. Petersburg genome assembler, is intended for both standard isolates and single-cell MDA bacterial assemblies. It works with Illumina or IonTorrent reads and can produce hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. You can also provide additional contigs, which will be used as long reads. It supports paired-end reads, mate-pairs and unpaired reads, and can take several paired-end and mate-pair libraries as input simultaneously. Note that SPAdes was initially designed for small genomes. It was tested on single-cell and standard bacterial and fungal data sets.

How to use

To submit calculations to the queue you can use the

send_spades

command, which configures the calculation by asking a few questions.

Performance

No improvement has been measured when using several cores in a normal calculation of this type:

spades.py --pe1-1 file1 --pe1-2 file2 -o outdir

We recommend using a single core, unless you know that better performance will be obtained with more cores.

More information

SPAdes web page.

SPAdes

General information

SPAdes 3.6.0, the St. Petersburg genome assembler, is intended for both standard isolates and single-cell MDA bacterial assemblies. It works with Illumina or IonTorrent reads and can produce hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. You can also provide additional contigs, which will be used as long reads. It supports paired-end reads, mate-pairs and unpaired reads, and can take several paired-end and mate-pair libraries as input simultaneously. Note that SPAdes was initially designed for small genomes. It was tested on single-cell and standard bacterial and fungal data sets.

How to use

To submit jobs to the queue you can use the

send_spades

command, which asks a few questions to configure the job.

Performance

We have not measured any performance improvement or reduction in run time when using several cores in a standard calculation such as:

spades.py --pe1-1 file1 --pe1-2 file2 -o outdir

We recommend using 1 core, unless you know that your calculation performs better with several cores.

More information

SPAdes web page.

SPAdes

General information

SPAdes 3.6.0, the St. Petersburg genome assembler, is intended for both standard isolates and single-cell MDA bacterial assemblies. It works with Illumina or IonTorrent reads and can produce hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. You can also provide additional contigs, which will be used as long reads. It supports paired-end reads, mate-pairs and unpaired reads, and can take several paired-end and mate-pair libraries as input simultaneously. Note that SPAdes was initially designed for small genomes. It was tested on single-cell and standard bacterial and fungal data sets.

How to use

To submit jobs to the queue you can use the command

send_spades

which asks a few questions to configure the calculation.

Performance

No improvement or reduction in calculation time has been measured when configuring more than one core in a calculation of this type:

spades.py --pe1-1 file1 --pe1-2 file2 -o outdir

We recommend using 1 core unless you know that better performance will be obtained with more cores.

More information

SPAdes web page.