Software Description:
CD-HIT is a widely used program for clustering and comparing protein or nucleotide sequences. CD-HIT was originally developed by Dr. Weizhong Li at Dr. Adam Godzik's Lab at the Burnham Institute (now Sanford-Burnham Medical Research Institute).
CD-HIT is very fast and can handle extremely large databases. It significantly reduces the computational and manual effort in many sequence analysis tasks, and it aids in understanding the structure of a dataset and correcting bias within it.
Software Home Page: CD-HIT
Software Documentation:
To run this software in a Linux environment on LCC, run the command(s):
# To load the module, first swap in the GCC compilers
module swap intel gnu8/8.3.0
# then load CD-HIT
module load ccs/cd-hit/v4.8.1-2019-0228
# To unload the module
module unload ccs/cd-hit/v4.8.1-2019-0228
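Once the module is loaded, a typical protein clustering run might look like the sketch below. The file names proteins.fasta and proteins_nr90 are hypothetical placeholders, and the threshold, memory, and thread values are illustrative assumptions rather than site recommendations. The snippet only prints the command if cd-hit or the input file is not found.

```shell
# Hypothetical input/output names; replace with your own data.
input="proteins.fasta"
output="proteins_nr90"

# -c 0.9   : cluster at 90% sequence identity
# -n 5     : word size (5 suits protein thresholds of 0.7 to 1.0)
# -M 16000 : memory limit in MB (assumed value; match your job allocation)
# -T 4     : number of threads (assumed value; match your job allocation)
cmd="cd-hit -i $input -o $output -c 0.9 -n 5 -M 16000 -T 4"

if command -v cd-hit >/dev/null 2>&1 && [ -f "$input" ]; then
    # Run the clustering; word splitting of $cmd is intentional here.
    $cmd
else
    echo "Would run: $cmd"
fi
```

CD-HIT writes the representative sequences to the output file and the cluster membership to a companion file with the .clstr suffix.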
Center for Computational Sciences