Whole genome re-sequencing of non-model organisms: lessons from unmapped reads
Anaïs GOUIN, Fabrice LEGEAI, Pierre NOUHAUD, Annabel WHIBLEY,
Jean-Christophe SIMON, Claire LEMAITRE
Individual sampling and assignment
Pea aphids were collected in late August 2011 within an area 30 km in diameter in Eastern France, on different plants known to harbor ten of the pea aphid biotypes (biotypes associated with Cytisus scoparius, Lathyrus pratensis, Lotus corniculatus, Melilotus spp., Medicago lupulina, Ononis spinosa, Securigera varia, Vicia cracca, Medicago sativa and Trifolium pratense). As their host plant had already been harvested at this time, individuals belonging to the Pisum sativum biotype were sampled in Western France in November 2011. All individuals were genotyped at seven microsatellite loci (AlA09M, AlB07M, AlB08M, AlB12M, ApF08M, ApH08M and ApH10M, see Caillaud et al, 2004) following Peccoud et al (2008). Individuals with multiple clonal copies of the same genotype were discarded from the dataset on the basis of this multilocus genotype. The clustering software STRUCTURE (Pritchard et al, 2000) was then used to identify individuals showing good assignment to the genetic cluster associated with their collection plant, excluding migrants from other plants and possible hybrids. The number of clusters was set to K = 11; 100 000 MCMC iterations were run after a burn-in of 25 000 iterations, and analyses were performed with the admixture model and without any prior information on the sampling plant or site. In parallel, symbiont typing was carried out for each individual through diagnostic PCR using specific primers and following Peccoud et al (2014).
Individual selection and whole-genome resequencing
For each biotype, three different individuals were selected on the basis of their STRUCTURE assignment score (mean 92.3 %, min. 78.0 %). The characteristics of the 33 individual genotypes that were included in the re-sequencing project are given in Table S1. To obtain a sufficient amount of genetic material, DNA was extracted from three fourth-instar (clonal) larvae per individual using the DNeasy Blood & Tissue Kit (Qiagen) according to the manufacturer’s instructions. DNA concentration was determined for each sample by spectrometry after RNase treatment using a PHERAstar (BMG Labtech). Each of the 33 samples was processed according to the standard Illumina protocol for preparing libraries for paired-end sequencing (mean insert size of 250 bp). Libraries were sequenced with a 100-bp paired-end run on an Illumina HiSeq2000.
2. Pipeline: command lines used
Initial mapping:
bowtie2 --non-deterministic --rg-id $ID --rg $SM --rg $LB --rg $PL -x reference_index -1 fastq1.fq -2 fastq2.fq -S output.sam
Unmapped read trimming:
prinseq-lite.pl -verbose -fastq unmapped.fastq -trim_ns_right 0 -trim_qual_right 20 -trim_qual_window 10 -trim_qual_step 2 -trim_qual_type mean -min_len 66
Compareads:
./Compareads1.2_Beta5.sh -a fastq1.fq -b fastq2.fq -k 33 -t 2 -s 0
Assembly using ABySS:
ABYSS -k31 fastq.fq -o $out
Mapping with Stampy:
python stampy.py -g references.index -h references.index -M fastq1.fq fastq2.fq > out.sam
SNP calling pipeline (GATK):
1/ Duplicate removal:
java -Xmx4g -jar MarkDuplicates.jar INPUT=file.bam OUTPUT=out.markdup.bam METRICS_FILE=out.markdup.metrics REMOVE_DUPLICATES=true ASSUME_SORTED=true MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=800
2/ Indel realignment:
java -Xmx4g -jar GenomeAnalysisTK.jar -I out.markdup.bam -R references.fasta -T RealignerTargetCreator -o out.forIndelRealigner.intervals
java -Xmx4g -Djava.io.tmpdir=tmp/ -jar GenomeAnalysisTK.jar -I out.markdup.bam -R references.fasta -T IndelRealigner -targetIntervals out.forIndelRealigner.intervals -o out.realigned.bam
3/ Unified Genotyper:
java -Xmx4g -jar GenomeAnalysisTK.jar -R references.fasta -T UnifiedGenotyper -I out.realigned.bam [-I ...] -o out.vcf
Assembly using SPAdes:
spades.py -s reads.fa -k 31,41,63,81,89 -o out
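The listing above does not show how unmapped.fastq was derived from the Bowtie2 output. One possible way, given here as an assumption rather than as the step actually used, is to keep the SAM records whose FLAG field has bit 0x4 (read unmapped) set and to write them in FASTQ format. The function name sam_unmapped_to_fastq is hypothetical. Mate pairing is not preserved, and records stored reverse-complemented (FLAG bit 0x10) would need extra handling.

```shell
# Hypothetical helper: convert unmapped SAM records to FASTQ.
# SAM columns used: 1 = read name, 2 = FLAG, 10 = sequence, 11 = qualities.
sam_unmapped_to_fastq() {
  awk -F '\t' '
    /^@/ { next }                 # skip SAM header lines
    int($2 / 4) % 2 == 1 {        # FLAG bit 0x4 set: read is unmapped
      print "@" $1
      print $10
      print "+"
      print $11
    }' "$@"
}

# Possible usage (file names as in the listing above):
#   sam_unmapped_to_fastq output.sam > unmapped.fastq
```

The bit test is written with integer arithmetic instead of a bitwise operator so that it runs with any POSIX awk.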
Two phylogenetic analyses based on 16S rDNA sequences were carried out to establish the placement of the two symbiont genomes (Rickettsia and Spiroplasma) revealed in the unmapped read sets of some pea aphid individuals.
3.a Rickettsia 16S phylogeny
To obtain the 16S ribosomal RNA gene sequence of the Rickettsia genome present in our pea aphid data, we extracted the reads from individual Ps2 (the individual of the P. sativum biotype with the highest coverage after mapping to the Rickettsia bellii genome) that mapped to the 16S gene of the Rickettsia bellii genome, and assembled them using Minia (Chikhi and Rizk, 2012). The assembly led to a single contig of 1640 bp.
16S ribosomal RNA genes from all available Rickettsia species with a completely sequenced genome were collected from NCBI (NR_044656.1, NR_103923.1, NR_074480.1, NR_074394.1, NR_118678.1, NR_074496.1, NR_074474.1, NR_074486.1, NR_074485.1, NR_025967.1, NR_074472.1, NR_025921.1, NR_074497.1, NR_074470.1, NR_074459.1, NR_074469.1, NR_074471.1, NR_074488.1, NR_074527.1, NR_074483.1, NR_074487.1) and from the SILVA database (Quast et al, 2013) (ACLC01000066, CP000849). The Phylogeny.fr pipeline (Dereeper et al, 2008) was used to infer a maximum likelihood phylogenetic tree (multiple alignment with MUSCLE, alignment curation with Gblocks, then tree inference with PhyML).
The obtained phylogenetic tree in Fig. S1 confirms that the closest relative to the Rickettsia pea aphid symbiont is Rickettsia bellii.
Figure S1: Phylogeny of Rickettsia based on 16S rRNA gene sequences. The assembled 16S sequence of the pea aphid individual Ps2 is highlighted in red. Branch support values are indicated in % in dark blue on top of each branch.
3.b Spiroplasma 16S phylogeny
To obtain the 16S ribosomal RNA gene sequence of the symbiont present in the V. cracca biotype, the 16S sequence of Spiroplasma melliferum KC3 was aligned with BLAST to the contigs obtained from de novo assembly. A unique match with 99 % identity was obtained, resulting in a sequence of 1444 bp.
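The BLAST program and output format used for this step are not stated. Assuming a tabular report with the 12 standard columns of BLAST+ "-outfmt 6", hits can be screened by percent identity as in the following sketch. The function name filter_hits_by_identity and the default threshold are illustrative choices.

```shell
# Hypothetical helper: keep BLAST tabular hits at or above an identity threshold.
# Assumed columns (BLAST+ -outfmt 6):
#   qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
# Prints subject id, percent identity and alignment length for each retained hit.
filter_hits_by_identity() {
  awk -F '\t' -v min_id="${1:-99}" '$3 + 0 >= min_id + 0 { print $2, $3, $4 }'
}

# Possible usage (hits.tsv is a placeholder file name):
#   filter_hits_by_identity 99 < hits.tsv
```

A single line of output would correspond to the situation described above, where only one contig matches the query.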
Other Spiroplasma and outgroup species were selected based on a recent phylogeny of the Spiroplasma genus (Ku et al, 2013): the 10 Spiroplasma species with a completely sequenced genome and two without a reference genome (NCBI accessions NR_121737.1, NR_103945.1, NR_121738.1, NR_121708.1, NR_036849.1, NR_121701.1, NR_121702.1, NR_121794.1, NR_103946.1, GU585671, GU993266, and accession AGBZ01000004 in the SILVA database). Three outgroup sequences of the Mycoplasma and Phytoplasma genera were gathered from the MolliGen database (Barré et al, 2004). Finally, the 16S sequence obtained from a Spiroplasma symbiont reported in a Japanese pea aphid strain (named Spiroplasma sp. SM) was added (Fukatsu et al, 2001, NCBI accession AB048263). The Phylogeny.fr pipeline (Dereeper et al, 2008) was used to build the phylogenetic tree.
The obtained phylogenetic tree in Fig. S2 is consistent with previous analyses based on 16S rDNA (Ku et al, 2013). Importantly, it confirms the membership of the symbiont we assembled to the Spiroplasma genus and the absence of any close relative with a complete genome available in the databases.
Figure S2: Phylogeny of the Spiroplasma genus based on 16S rRNA gene sequences. The 16S sequence retrieved from the assembly of the pea aphid symbiont in the V. cracca biotype is highlighted in red. Species whose complete genome sequence is not available are indicated in grey. Branch support values are indicated in % in dark blue on top of each branch.
4. Additional tables and figures
The last nine columns are symbionts.
| Individual genotype | Plant origin | Assignment coefficient (STRUCTURE) | Spiroplasma | R. insecticola | H. defensa | Rickettsiella | S. symbiotica | Rickettsia | PAXS | Wolbachia | B. aphidicola |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cs1 | C. scoparius | 0.95 | - | - | - | - | - | - | - | - | ● |
| Cs2 | C. scoparius | 0.93 | - | - | - | - | - | - | - | - | ● |
| Cs3 | C. scoparius | 0.96 | - | - | - | - | - | - | - | - | ● |
| Lc1 | L. corniculatus | 0.94 | - | - | - | - | - | - | - | - | ● |
| Lc2 | L. corniculatus | 0.95 | - | - | - | - | - | - | - | - | ● |
| Lc3 | L. corniculatus | 0.88 | - | - | - | - | - | - | - | - | ● |
| Lp1 | L. pratensis | 0.96 | - | - | - | - | - | - | - | - | ● |
| Lp2 | L. pratensis | 0.97 | - | - | - | - | - | - | - | - | ● |
| Lp3 | L. pratensis | 0.97 | - | - | - | - | - | - | - | - | ● |
| Ml1 | M. lupulina | 0.86 | - | - | - | - | - | ● | - | - | ● |
| Ml2 | M. lupulina | 0.94 | - | - | - | - | - | ● | - | - | ● |
| Ml3 | M. lupulina | 0.87 | - | - | - | - | - | ● | - | - | ● |
| Mo1 | M. officinalis | 0.95 | - | - | - | - | - | - | - | - | ● |
| Mo2 | M. officinalis | 0.97 | - | - | - | - | - | - | - | - | ● |
| Mo3 | M. officinalis | 0.96 | - | - | - | - | - | - | - | - | ● |
| Ms1 | M. sativa | 0.78 | - | - | ● | - | - | - | - | - | ● |
| Ms2 | M. sativa | 0.91 | - | - | ● | ● | - | - | - | - | ● |
| Ms3 | M. sativa | 0.94 | - | - | ● | - | - | - | - | - | ● |
| Os1 | O. spinosa | 0.97 | - | - | ● | - | - | - | - | - | ● |
| Os2 | O. spinosa | 0.96 | - | - | - | - | - | - | - | - | ● |
| Os3 | O. spinosa | 0.97 | - | - | ● | - | - | - | - | - | ● |
| Sv1 | S. varia | 0.96 | - | - | - | - |  | - | - | - | ● |
| Sv2 | S. varia | 0.92 | - | - | - | - | ● | - | - | - | ● |
| Sv3 | S. varia | 0.94 | - | - | - | - | ● | - | - | - | ● |
| Tp1 | T. pratense | 0.93 | - | ● | - | - | - | - | - | - | ● |
| Tp2 | T. pratense | 0.93 | - | ● | - | - | - | - | - | - | ● |
| Tp3 | T. pratense | 0.91 | ● | ● | - | - | - | - | - | - | ● |
| Vc1 | V. cracca | 0.93 | ● | - | - | - | ● | - | - | - | ● |
| Vc2 | V. cracca | 0.82 | - | - | - | - | - | - | - | - | ● |
| Vc3 | V. cracca | 0.81 | ● | - | - | - | ● | - | - | - | ● |
| Ps1 | P. sativum | 0.92 | - | - | - | - | ● | ● | - | - | ● |
| Ps2 | P. sativum | 0.84 | - | - | - | - | - | ● | - | - | ● |
| Ps3 | P. sativum | 0.96 | - | - | - | - | - | - | - | - | ● |
Table S1: Characteristics of the 33 individual genotypes selected for whole-genome re-sequencing. For each individual genotype, black dots indicate the presence of one or several of the 9 symbionts (1 obligate and 8 facultative) reported for the pea aphid, as detected with specific PCR tests. The obligate symbiont Buchnera is present in all pea aphid genotypes and its PCR-based detection is used as a positive control.
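As a consistency check, the assignment coefficients of Table S1 can be summarised to recover the mean and minimum scores quoted for the selected individuals (92.3 % and 78.0 %). The following is a minimal sketch; the function name summarise_coeffs is hypothetical and the values are copied from the table in row order.

```shell
# Assignment coefficients from Table S1, in row order (Cs1 ... Ps3).
coeffs="0.95 0.93 0.96 0.94 0.95 0.88 0.96 0.97 0.97 0.86 0.94 0.87
        0.95 0.97 0.96 0.78 0.91 0.94 0.97 0.96 0.97 0.96 0.92 0.94
        0.93 0.93 0.91 0.93 0.82 0.81 0.92 0.84 0.96"

# Hypothetical helper: print count, mean and minimum as percentages.
summarise_coeffs() {
  printf '%s\n' "$@" | awk '
    NR == 1 || $1 < min { min = $1 }   # track the smallest coefficient
    { sum += $1 }
    END { printf "n=%d mean=%.1f%% min=%.1f%%\n", NR, 100 * sum / NR, 100 * min }'
}

summarise_coeffs $coeffs   # prints: n=33 mean=92.3% min=78.0%
```

The variable is deliberately left unquoted in the call so that the shell splits it into one argument per coefficient.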
Table S2: Coverage for each individual obtained after initial mapping (Bowtie2) of paired-end read sets onto a set of reference genomes (Acyrthosiphon pisum, mitochondrial genome, Buchnera aphidicola, Spiroplasma melliferum KC3, Hamiltonella defensa 5AT, Rickettsiella grylli, Regiella insecticola R5.15, Wolbachia sp. strain wRi, Rickettsia sp. endosymbiont of Ixodes scapularis, Serratia symbiotica str. Tucson). Coverage obtained after mapping of unmapped reads to Rickettsia bellii and to annotated contigs of the Spiroplasma draft is also provided. Coverage higher than 2x is highlighted in grey and the sizes of the reference genomes are indicated alongside the corresponding names. PCR test indicates the presence (+) or absence (-) of the facultative symbionts of A. pisum based on their detection with species-specific primers.
| Individual | nDNA A. pisum | mtDNA A. pisum | Buchnera aphidicola | Spiroplasma melliferum | Spiroplasma A. pisum (partial) | PCR test (Spiroplasma) | Hamiltonella defensa | PCR test (Hamiltonella) | Rickettsiella grylli | PCR test (Rickettsiella) | Regiella insecticola | PCR test (Regiella) | Wolbachia Strain wRi | PCR test (Wolbachia) | Rickettsia ixodes | Rickettsia bellii | PCR test (Rickettsia) | Serratia symbiotica | PCR test (Serratia) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reference size | 530 Mb | 17 kb | 600 kb | 1.29 Mb | 780 kb |  | 2.11 Mb |  | 1.58 Mb |  | 2 Mb |  | 1.45 Mb |  | 2.1 Mb | 1.5 Mb |  | 2.60 Mb |  |
| Ms1 | 14.3 | 283.94 | 248.25 | 0 | 0 | - | 103.59 | + | 0.21 | + | 0.65 | - | 0 | - | 0 | 0 | - | 0.07 | - |
| Ms2 | 12.34 | 384.54 | 534.9 | 0 | 0 | - | 117.7 | + | 0 | - | 2.76 | + | 0 | - | 0 | 0 | - | 0.27 | - |
| Ms3 | 12.93 | 541.43 | 557.86 | 0 | 0 | - | 14.06 | + | 0 | - | 0.05 | - | 0 | - | 0 | 0.01 | - | 0.02 | - |
| Tp1 | 11.85 | 365.78 | 297.77 | 0 | 0.13 | - | 0.53 | - | 0 | - | 52.96 | + | 0 | - | 0 | 0 | - | 0.02 | - |
| Tp2 | 12.77 | 677.67 | 427.31 | 0 | 0.21 | - | 0.22 | - | 0 | - | 38.64 | + | 0 | - | 0 | 0 | - | 0.02 | - |
| Tp3 | 13.71 | 2080.8 | 1501.75 | 0.21 | 273.13 | + | 0.1 | - | 0 | - | 35.4 | + | 0 | - | 0 | 0 | - | 0.01 | - |
| Vc1 | 16.84 | 1241.77 | 900.81 | 0.13 | 277.65 | + | 0.03 | - | 0 | - | 0.08 | - | 0 | - | 0 | 0.03 | - | 5.23 | + |
| Vc2 | 11.91 | 733.97 | 702.45 | 0 | 0.12 | - | 0.01 | - | 0 | - | 0.02 | - | 0 | - | 0 | 0 | - | 0 | - |
| Vc3 | 11.36 | 777.87 | 726.5 | 0.71 | 1185.31 | + | 0.09 | - | 0 | - | 0.01 | - | 0 | - | 0 | 0.01 | - | 6.69 | + |
| Ps1 | 14.25 | 1846.42 | 820.04 | 0 | 0.28 | - | 0.04 | - | 0 | - | 0.02 | - | 0 | - | 0.91 | 13.51 | + | 7.65 | + |
| Ps2 | 12.23 | 392.91 | 138.08 | 0 | 0.61 | - | 0.01 | - | 0 | - | 0.01 | - | 0 | - | 3.67 | 59.26 | + | 0.07 | - |
| Ps3 | 11.95 | 483.42 | 294.02 | 0 | 0.07 | - | 0.09 | - | 0 | - | 0.19 | - | 0 | - | 0 | 0.01 | - | 0 | - |
| Ml1 | 15.84 | 540.63 | 623.68 | 0 | 0 | - | 0.04 | - | 0 | - | 0.02 | - | 0 | - | 0.47 | 8.01 | + | 0 | - |
| Ml2 | 17.91 | 1161.81 | 901.62 | 0 | 0 | - | 0 | - | 0 | - | 0.01 | - | 0 | - | 3.51 | 54.33 | + | 0 | - |
| Ml3 | 10.42 | 257.09 | 170.1 | 0 | 0 | - | 0 | - | 0 | - | 0.01 | - | 0 | - | 0.63 | 10.71 | + | 0 | - |
| Lc1 | 13.56 | 387.48 | 182.24 | 0 | 0.63 | - | 0 | - | 0 | - | 0.12 | - | 0 | - | 0 | 0 | - | 0 | - |
| Lc2 | 19.96 | 2046.81 | 1290.62 | 0 | 0.10 | - | 0 | - | 0 | - | 0.01 | - | 0 | - | 0 | 0 | - | 0 | - |
| Lc3 | 13.79 | 797.32 | 530.88 | 0 | 2.05 | - | 0 | - | 0 | - | 0.01 | - | 0 | - | 0 | 0 | - | 0.02 | - |
| Mo1 | 13.37 | 1338.12 | 1091.21 | 0 | 0 | - | 0.04 | - | 0 | - | 0.01 | - | 0 | - | 0 | 0 | - | 0 | - |
| Mo2 | 20.16 | 971.29 | 1269.51 | 0 | 0 | - | 0.06 | - | 0 | - | 0.02 | - | 0 | - | 0 | 0 | - | 0 | - |
| Mo3 | 14.21 | 406.15 | 436.51 | 0 | 0 | - | 0.03 | - | 0 | - | 0.01 | - | 0 | - | 0 | 0 | - | 0 | - |
| Sv1 | 16.03 | 696.52 | 621.32 | 0 | 0.02 | - | 0.01 | - | 0 | - | 0.02 | - | 0 | - | 0 | 0 | - | 0.01 | - |
| Sv2 | 15.79 | 444.72 | 464.11 | 0 | 0 | - | 0.05 | - | 0 | - | 0.01 | - | 0 | - | 0 | 0 | - | 9.01 | + |
| Sv3 | 13.52 | 1690.28 | 985.69 | 0 | 0 | - | 0.21 | - | 0 | - | 0.02 | - | 0 | - | 0 | 0 | - | 14.84 | + |
| Cs1 | 10.6 | 771.1 | 933.57 | 0 | 0 | - | 0 | - | 0 | - | 0.08 | - | 0 | - | 0 | 0 | - | 0 | - |
| Cs2 | 15.1 | 3248.6 | 1509.03 | 0 | 0 | - | 0 | - | 0 | - | 0.01 | - | 0 | - | 0 | 0 | - | 0 | - |
| Cs3 | 17.53 | 1389.32 | 1349.63 | 0 | 0 | - | 0 | - | 0 | - | 0.13 | - | 0 | - | 0 | 0 | - | 0 | - |
| Os1 | 15.41 | 1067.73 | 836.98 | 0 | 0 | - | 2.42 | + | 0 | - | 0.01 | - | 0 | - | 0 | 0.01 | - | 0.01 | - |
| Os2 | 16.12 | 1295.36 | 1097.57 | 0 | 0 | - | 0.09 | - | 0 | - | 0.01 | - | 0 | - | 0.01 | 0.05 | - | 0 | - |
| Os3 | 12.34 | 331.39 | 1459.96 | 0 | 0.01 | - | 49.97 | + | 0 | - | 0.47 | - | 0 | - | 0.02 | 0 | - | 0.35 | - |
| Lp1 | 14.95 | 796.29 | 602.21 | 0 | 0 | - | 0 | - | 0 | - | 0.02 | - | 0 | - | 0 | 0 | - | 0.02 | - |
| Lp2 | 13.89 | 918.28 | 545.64 | 0 | 0 | - | 0.01 | - | 0 | - | 0.02 | - | 0 | - | 0 | 0 | - | 0 | - |
| Lp3 | 14.64 | 850.72 | 657.28 | 0 | 0 | - | 0 | - | 0 | - | 0.02 | - | 0 | - | 0 | 0 | - | 0 | - |
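The comparison between sequencing coverage and PCR detection in Table S2 can be sketched as follows. A symbiont is called present when its coverage exceeds a threshold, here the 2x cut-off used for highlighting in the table, and the call is compared with the PCR result. The function name compare_coverage_to_pcr and the input layout are assumptions made for this illustration; the four input lines are copied from Table S2.

```shell
# Hypothetical helper: compare a coverage-based call with the PCR result.
# Input lines: individual symbiont coverage pcr_result(+ or -)
# Output adds the coverage-based call and whether it agrees with PCR.
compare_coverage_to_pcr() {
  awk -v threshold="${1:-2}" '{
    call = ($3 + 0 > threshold + 0) ? "+" : "-"
    status = (call == $4) ? "agree" : "disagree"
    print $1, $2, $3, call, $4, status
  }'
}

# Values copied from Table S2 (Hamiltonella and Rickettsiella columns).
printf '%s\n' \
  'Ms1 Hamiltonella 103.59 +' \
  'Os1 Hamiltonella 2.42 +' \
  'Tp1 Hamiltonella 0.53 -' \
  'Ms1 Rickettsiella 0.21 +' |
  compare_coverage_to_pcr 2
```

With these inputs the first three lines agree, while the last is reported as a disagreement: Ms1 has a positive Rickettsiella PCR test but only 0.21x coverage on Rickettsiella grylli.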
Figure S3: Percentages of similar reads between one individual and all others. Each dot represents the percentage of similarity between the studied individual and each of the remaining 32 individuals. Red dots correspond to comparisons between two individuals of the same biotype.
Literature cited
Barré A., de Daruvar A. and Blanchard A. (2004). MolliGen, a database dedicated to the comparative genomics of Mollicutes. Nucleic Acids Res. 32, Database issue, D307-310 URL : http://www.molligen.org
Caillaud MC, Mondor-Genson G, Levine-Wilkinson S, Mieuzet L, Frantz A, Simon JC et al (2004). Microsatellite DNA markers for the pea aphid Acyrthosiphon pisum. Mol Ecol Notes 4(3): 446-448.
Chikhi R. and Rizk G. (2012) Space-efficient and exact De Bruijn graph representation based on a bloom filter. In WABI , volume 7534 of Lecture Notes in Computer Science , pages 236–248. Springer.
Dereeper A., Guignon V., Blanc G., Audic S., Buffet S., Chevenet F., Dufayard J.-F., Guindon S., Lefort V., Lescot M., Claverie J.-M., Gascuel O. (2008) Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucl. Acids Res. Jul 1; 36 (Web Server Issue):W465-9. Epub 2008 Apr 19.
Fukatsu T, Tsuchida T, Nikoh N, Koga R (2001). Spiroplasma symbiont of the pea aphid, Acyrthosiphon pisum (Insecta: Homoptera). Appl Environ Microbiol 67(3): 1284-1291.
Ku C, Lo WS, Chen LL, Kuo CH (2013). Complete genomes of two dipteran-associated Spiroplasmas provided insights into the origin, dynamics, and impacts of viral invasion in Spiroplasma. Genome Biol Evol 5(6): 1151-1164.
Peccoud J, Figueroa CC, Silva AX, Ramirez CC, Mieuzet L, Bonhomme J et al (2008). Host range expansion of an introduced insect pest through multiple colonizations of specialized clones. Mol Ecol 17(21): 4608-4618.
Pritchard JK, Stephens M, Donnelly P (2000). Inference of population structure using multilocus genotype data. Genetics 155: 945-959.
Peccoud J, Bonhomme J, Mahéo F, de la Huerta M, Cosson O, Simon JC (2014). Inheritance patterns of secondary symbionts during sexual reproduction of pea aphid biotypes. Insect Sci 21: 291–300.
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO (2013) The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucl. Acids Res. 41 (D1): D590-D596.