Somatic Truth

Proposed Cover Art


A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis.

Nature Communications Biology 3, Article number: 744 (2020)  


Image featured as banner art on the Nature Communications Biology homepage (see screenshot below)

One of the main difficulties in analyzing sequencing data from somatic (cancer) samples stems from their inherent noisiness. To properly evaluate and compare different algorithms, robust datasets that separate the noise from the true somatic variation are needed. The paper proposes a dataset of short somatic mutations that are validated using a known cell lineage. The illustration playfully references the biblical story of the tree of knowledge. The fruits on the tree reveal lineage diagrams that are either luminous or ghost-like. The luminous fruits depict mutations that adhere to the lineage structure, can thus be explained by a single event, and are considered 'true'. The ghost fruits depict non-real mutations, or noise events, that do not fall under a single branch of the lineage and are thus labeled 'false'.
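
The 'true' versus 'false' distinction pictured in the fruits can be sketched in a few lines of Python. This is only an illustration of the idea, not the paper's actual validation procedure: the toy lineage, the cell names, and the helper functions `clade_leaf_sets` and `is_lineage_consistent` are all hypothetical, and the sketch assumes that a mutation counts as lineage-consistent when the cells carrying it are exactly the descendants of one branch.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def clade_leaf_sets(tree: Dict[str, List[str]], root: str) -> Set[FrozenSet[str]]:
    """Collect, for every node in the lineage, the set of leaf cells below it."""
    clades: Set[FrozenSet[str]] = set()

    def leaves_under(node: str) -> FrozenSet[str]:
        children = tree.get(node, [])
        if not children:
            # A node without children is a sampled cell (a leaf).
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(leaves_under(c) for c in children))
        clades.add(leaves)
        return leaves

    leaves_under(root)
    return clades


def is_lineage_consistent(carriers: Iterable[str], clades: Set[FrozenSet[str]]) -> bool:
    """True if the carrier cells match the leaves under a single branch."""
    return frozenset(carriers) in clades


# Hypothetical toy lineage: a root that splits into two branches of two cells each.
toy_tree = {
    "root": ["A", "B"],
    "A": ["a1", "a2"],
    "B": ["b1", "b2"],
}
toy_clades = clade_leaf_sets(toy_tree, "root")

# Seen in both cells of branch A: explained by a single event (a luminous fruit).
luminous = is_lineage_consistent({"a1", "a2"}, toy_clades)

# Seen in one cell from each branch: no single branch explains it (a ghost fruit).
ghost = is_lineage_consistent({"a1", "b1"}, toy_clades)
```

In this toy setup `luminous` evaluates to `True` and `ghost` to `False`. Real data would also need to tolerate missed calls and sequencing errors, which this sketch deliberately ignores.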