Using Platanus for genome assembly

1. Software Installation

Download two files: the main program and its documentation (README).

 

$ mkdir -p /opt/biosoft/platanus
$ wget http://platanus.bio.titech.ac.jp/Platanus_release/20130901010201/platanus -P /opt/biosoft/platanus
$ wget http://platanus.bio.titech.ac.jp/Platanus_release/20130901010201/README -P /opt/biosoft/platanus
$ chmod 755 /opt/biosoft/platanus/platanus
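
The download step leaves the binary in a directory that is not on PATH. A minimal sketch of making it callable by name for the current shell session follows. It assumes the install directory used above; the variable name PLATANUS_DIR is only illustrative.

```shell
# Install directory used in the commands above.
PLATANUS_DIR=/opt/biosoft/platanus

# Put the directory at the front of PATH for the current shell session.
PATH="$PLATANUS_DIR:$PATH"
export PATH

# Report whether the binary is ready to run.
if [ -x "$PLATANUS_DIR/platanus" ]; then
    echo "platanus found: $(command -v platanus)"
else
    echo "platanus is missing or not executable in $PLATANUS_DIR"
fi
```

To make the change permanent, the PATH assignment could be added to a shell startup file such as ~/.bashrc.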

 

2. Usage

Platanus comprises three commands: assemble, scaffold, and gap_close. Their usage is as follows.
Parameters common to all three commands:

-t number of threads to use; must be <= 100, the default is 1.
-o PREFIX prefix of the output files; the default is out.

3. Assemble

This command assembles contigs using a de Bruijn graph algorithm.

-f FILE1 [FILE2 ...]
Input read files; at most 100 files are supported. Files may be in FASTQ or FASTA format, and the software detects the format automatically. Base quality values are not used, so they have no effect on the assembly.
-k INT
Initial k-mer size; the default is 32. If the data coverage is low, set a smaller value.
-s INT
Step size by which the k-mer size is increased. Must be >= 1; the default is 10. The program assembles contigs using several k-mer sizes.
-n INT
Initial k-mer coverage cutoff. The default is 0, which means the value is chosen automatically from the k-mer frequency distribution. If that distribution looks abnormal, set the value manually.
-c INT
Minimum k-mer coverage; the default is 2. As the k-mer size grows, k-mer coverage drops and the cutoff is lowered with it, but the cutoff never falls below the value set here.
-a FLOAT
Safety level for k-mer extension; the default is 10.0. It determines how far the k-mer size is ultimately increased. To extend contigs at the cost of accuracy, set a lower value such as 5.0.
-u FLOAT
Maximum difference allowed when removing bubbles; the default is 0.1. The larger the value, the more readily bubbles are removed. For highly heterozygous genomes in particular, a higher value such as 0.2 is recommended.
-d FLOAT
Coverage threshold that decides when a branch is cut; the default is 0.5. The smaller the value, the higher the accuracy. If the base error rate is low, a lower value such as 0.3 is appropriate.
-m INT
Memory limit in GB; the default is 16. If the program needs more memory than this, it prints a warning but does not stop.
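
The options above can be combined into a single call. The following is a sketch only: the read file names, thread count, and memory limit are hypothetical, and the binary path is the one from the installation step. It prints the command and runs it only if the program and both input files are present.

```shell
# Path from the installation step; adjust if installed elsewhere.
PLATANUS=/opt/biosoft/platanus/platanus

# Hypothetical inputs and settings; replace with your own.
READS_1=reads_1.fq
READS_2=reads_2.fq
PREFIX=out      # -o: prefix of the output files
THREADS=8       # -t: must be <= 100
MEM_GB=32       # -m: memory limit in GB

# -k 32 and -s 10 are the documented defaults, written out for clarity.
CMD="$PLATANUS assemble -o $PREFIX -f $READS_1 $READS_2 -k 32 -s 10 -t $THREADS -m $MEM_GB"
echo "$CMD"

# Run only when the program and both inputs exist.
# $CMD is left unquoted on purpose so the shell splits it into words;
# this assumes none of the paths contain spaces.
if [ -x "$PLATANUS" ] && [ -f "$READS_1" ] && [ -f "$READS_2" ]; then
    $CMD 2> "${PREFIX}_assemble.log"
fi
```

For a highly heterozygous genome, the same call could add -u 0.2 as suggested in the parameter list.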
 

Output files of this command:

PREFIX_contig.fa assembled contig sequences
PREFIX_contigBubble.fa bubble sequences that were removed or merged
PREFIX_kmerFrq.tsv frequency distribution of k-mers
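
A quick way to check the result is to count the sequences and total bases in the contig file. The helper below is a generic FASTA summary, not part of Platanus; the function name fasta_summary is illustrative, and the file name assumes the default prefix out.

```shell
# Print "<number of sequences><TAB><total bases>" for a FASTA file.
fasta_summary() {
    awk '/^>/ { n++; next }            # header line: count one sequence
         { bases += length($0) }       # sequence line: add its length
         END { printf "%d\t%d\n", n, bases }' "$1"
}

# Assumes the default output prefix "out".
CONTIGS=out_contig.fa
if [ -f "$CONTIGS" ]; then
    fasta_summary "$CONTIGS"
fi
```

The same function can be applied to PREFIX_contigBubble.fa to see how much sequence was set aside as bubbles.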

Reference Source:

http://www.chenlianfu.com/?author=1&paged=13




Origin www.cnblogs.com/bio-mary/p/11410992.html