About fetchMG
The program “fetchMG” was written to extract the 40 marker genes (MGs) from genomes and metagenomes easily and accurately. It does this using Hidden Markov Models (HMMs) trained on protein alignments of known members of the 40 MGs, together with a calibrated cutoff for each of the 40 MGs. Please note that these cutoffs are only accurate when complete protein sequences are used as input. The program outputs the protein sequences of the identified proteins, and also their nucleotide sequences if the nucleotide sequences of all complete genes are supplied as an additional input.
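The idea of a per-MG calibrated cutoff can be illustrated with a minimal Python sketch. It is not the fetchMG implementation. It assumes the cutoffs are score thresholds applied to HMM hits, and the identifiers `MG_A` and `MG_B`, the cutoff values, and the function `filter_hits` are hypothetical names chosen for illustration only.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical cutoffs for two made-up marker gene families.
# The real calibrated values ship with fetchMG and are not reproduced here.
EXAMPLE_CUTOFFS: Dict[str, float] = {"MG_A": 210.0, "MG_B": 240.0}

# One HMM hit: (protein_id, marker_gene_id, score)
Hit = Tuple[str, str, float]


def filter_hits(hits: Iterable[Hit], cutoffs: Dict[str, float]) -> List[Hit]:
    """Keep only hits whose score reaches the cutoff of their own marker gene."""
    kept = []
    for protein_id, mg_id, score in hits:
        cutoff = cutoffs.get(mg_id)
        # Hits to families without a known cutoff are dropped.
        if cutoff is not None and score >= cutoff:
            kept.append((protein_id, mg_id, score))
    return kept


example_hits = [
    ("prot_1", "MG_A", 350.2),  # above the MG_A cutoff: kept
    ("prot_2", "MG_A", 95.0),   # below the MG_A cutoff: dropped
    ("prot_3", "MG_B", 240.0),  # exactly at the MG_B cutoff: kept
    ("prot_4", "MG_X", 500.0),  # no cutoff defined for MG_X: dropped
]
print(filter_hits(example_hits, EXAMPLE_CUTOFFS))
```

Using a separate threshold per family, rather than one global value, reflects the statement above that each of the 40 MGs has its own calibrated cutoff.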
fetchMG is available as a stand-alone package and also as a built-in part of MOCAT. If you use fetchMG in your work, please cite the fetchMG article. If you have used the implementation of fetchMG available in MOCAT, please cite both articles below.
The fetchMG script requires the cdbyank, cdbfasta and HMMer3 executables. These programs are (c) their respective authors and are already installed in the bin folder within the fetchMG folder.
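If fetchMG fails to start, one quick check is whether the required executables are actually present in the bin folder. The following Python sketch is one possible way to do that and is not part of fetchMG. The helper `missing_tools`, the path `fetchMG/bin`, and the use of `hmmsearch` as the HMMer3 executable name are assumptions; adjust them to your installation.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str], bin_dir: str) -> List[str]:
    """Return the names that cannot be found as executables in bin_dir."""
    missing = []
    for name in names:
        # shutil.which searches only bin_dir when it is passed as `path`.
        if shutil.which(name, path=bin_dir) is None:
            missing.append(name)
    return missing


# Assumed names and location; "hmmsearch" stands in for the HMMer3 executable.
required = ["cdbyank", "cdbfasta", "hmmsearch"]
not_found = missing_tools(required, "fetchMG/bin")
if not_found:
    print("Missing executables:", ", ".join(not_found))
else:
    print("All required executables were found.")
```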
The fetchMG article is coming soon. The full MOCAT article is available at PLoS ONE or as a PDF.