CD-HIT

Software Description:

CD-HIT is a widely used program for clustering and comparing protein or nucleotide sequences. It was originally developed by Dr. Weizhong Li in Dr. Adam Godzik's lab at the Burnham Institute (now Sanford-Burnham Medical Research Institute).

CD-HIT is very fast and can handle extremely large databases. It significantly reduces the computational and manual effort in many sequence analysis tasks, and it aids in understanding the data structure and correcting the bias within a dataset.


Software Home Page: CD-HIT


Software Documentation:

To run this software in a Linux environment on LCC,
run the command(s):

#To load the module, first swap to the GCC compilers
module swap intel gnu8/8.3.0
#then load CD-HIT
module load ccs/cd-hit/v4.8.1-2019-0228

#To unload the module
module unload ccs/cd-hit/v4.8.1-2019-0228
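
While the module is loaded, a basic protein clustering run might look like the sketch below. The file names (proteins.fasta, proteins_nr90) are hypothetical placeholders, and the 90% identity threshold, memory limit, and thread count are illustrative assumptions rather than site recommendations; adjust them to your data and to the resources your job has been allocated. The script only prints the command if cd-hit is not on the PATH or the input file is missing.

```shell
# Hypothetical input/output names; replace with your own files.
INPUT=proteins.fasta
OUTPUT=proteins_nr90

# Assemble the command:
#   -c 0.9    cluster at 90% sequence identity
#   -n 5      word length 5 (suitable for identity thresholds of 0.7 and above)
#   -M 16000  memory limit in MB (assumed value)
#   -T 4      number of threads (assumed value)
CMD="cd-hit -i $INPUT -o $OUTPUT -c 0.9 -n 5 -M 16000 -T 4"

# Run only if cd-hit is available (module loaded) and the input file exists;
# otherwise just show what would be executed.
if command -v cd-hit >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
else
    echo "Would run: $CMD"
fi
```

On success, CD-HIT writes the representative sequences to the output file (here proteins_nr90) and the cluster membership listing to a companion file with the .clstr suffix (proteins_nr90.clstr). For nucleotide sequences, the package provides cd-hit-est, which takes the same -i, -o, and -c options.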

Center for Computational Sciences