Description
VSEARCH is a free, open-source, multithreaded 64-bit tool for processing metagenomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar 2010), for which the source code is not publicly available, the algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. VSEARCH can handle large datasets that the free 32-bit version of USEARCH cannot process.
VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering (by length, abundance or user-defined order), chimera detection (reference based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e. format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion.
Home page
https://github.com/torognes/vsearch
Documentation
https://github.com/torognes/vsearch
License
The VSEARCH code is dual-licensed either under the GNU General Public License version 3 or under the BSD 2-clause license.
Usage
Use
module avail VSEARCH
to see which versions of VSEARCH are available. Use
module load VSEARCH/version
to get access to VSEARCH, replacing version with one of the versions listed by module avail.
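Once the module is loaded, a first run might look like the sketch below, which dereplicates full-length sequences and writes abundance annotations to the output headers. The file names reads.fasta and reads.derep.fasta are hypothetical placeholders, and the fallback branch is only there so the snippet can be tried before VSEARCH or the input file is in place. It is an illustration, not a recommended pipeline.

```shell
# Assumes 'module load VSEARCH/version' has already been run in this shell.

# Hypothetical input and output names; replace with your own files.
input="reads.fasta"
output="reads.derep.fasta"

# Full-length dereplication; --sizeout adds abundance annotations to the headers.
derep_cmd="vsearch --derep_fulllength $input --output $output --sizeout"

# Run only if vsearch is on the PATH and the input file exists;
# otherwise just print the command that would be executed.
if command -v vsearch >/dev/null 2>&1 && [ -f "$input" ]; then
    $derep_cmd
else
    echo "Would run: $derep_cmd"
fi
```

The other commands mentioned in the description (clustering, chimera detection, merging of paired reads, and so on) follow the same pattern of one command option plus input and output options. Run vsearch --help for the full list of options in the installed version.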