msh

http://edamontology.org/format_3911


Mash sketch is a format for sequence or sequence-checksum information. To make a sketch, each k-mer in a sequence is hashed, which creates a pseudo-random identifier. The hashes are then sorted, and a small subset from the top of the sorted list (the smallest values, or min-hashes) represents the entire sequence.
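
The sketching procedure described above can be illustrated with a minimal Python example. It is only a sketch under stated assumptions, not the Mash implementation: the function names `kmer_hashes` and `sketch`, the parameter names, and the use of BLAKE2b as the hash are all illustrative choices, and the example does not reproduce the tool's own hash function, its handling of reverse complements, or its Cap'n Proto file layout.

```python
import hashlib


def kmer_hashes(sequence, k):
    """Yield a 64-bit integer hash for every k-mer in the sequence.

    BLAKE2b is a stand-in hash chosen for illustration only.
    """
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
        yield int.from_bytes(digest, "big")


def sketch(sequence, k, sketch_size):
    """Return up to sketch_size of the smallest distinct k-mer hashes, ascending."""
    distinct = set(kmer_hashes(sequence, k))   # duplicate k-mers hash identically
    return sorted(distinct)[:sketch_size]      # keep the top of the sorted list


if __name__ == "__main__":
    # Toy input; real sequences and parameters would be far larger.
    for value in sketch("ACGTACGTAC", k=4, sketch_size=3):
        print(value)
```

Because the hash values behave like pseudo-random identifiers, the retained subset acts as a random sample of the sequence's k-mers, which is why a few values can stand in for the whole sequence.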

Synonyms: min-hash sketch, Mash sketch

Term info

Subsets

formats, edam

Created in

1.22

Documentation

https://mash.readthedocs.io/en/latest/sketches.html, https://raw.githubusercontent.com/marbl/Mash/master/src/mash/capnp/MinHash.capnp, https://en.wikipedia.org/wiki/MinHash, https://doi.org/10.1186/s13059-016-0997-x

Example

https://mash.readthedocs.io/en/latest/tutorials.html#querying-read-sets-against-an-existing-refseq-sketch

Information standard

https://capnproto.org/cxx.html

Organisation

https://www.genome.gov/27562809/phillippy-group/, http://bnbi.org