MashMap

Software Description:

MashMap implements a fast and approximate algorithm for computing local alignment boundaries between long DNA sequences. It can be useful for mapping genome assembly or long reads (PacBio/ONT) to reference genome(s). Given a minimum alignment length and an identity threshold for the desired local alignments, Mashmap computes alignment boundaries and identity estimates using k-mers. It does not compute the alignments explicitly, but rather estimates a k-mer based Jaccard similarity using a combination of Minimizers and MinHash. This is then converted to an estimate of sequence identity using the Mash distance. An appropriate k-mer sampling rate is automatically determined using the given minimum local alignment length and identity thresholds. The efficiency of the algorithm improves as both of these thresholds are increased.


Software Home Page: MashMap


Software Documentation:

To run this software in a Linux environment on LCC,
run the command(s):

#To load module
module load ccs/conda/mashmap-2.0

#To unload module
conda deactivate

Center for Computational Sciences