Misassemblies in noisy assemblies
I think everyone who has ever done a genome assembly one day asks: "OK, my assembly is cool, but how can I be sure that it's the best and doesn't contain a lot of errors?" We have many techniques to evaluate the quality of assemb…
Why I stopped C++
I have the feeling that bioinformaticians generally use two languages: one for small scripts, rapid analysis, and prototyping (usually an interpreted language), and another for when we need performance (usually a compiled language). Until recently thes…
How to reduce the impact of your PAF file on your disk by 95%
Last week Shaun Jackman posted this tweet: I have a 1.2 TB PAF.gz file of minimap2 all-vs-all alignments of 18 flowcells of Oxford Nanopore reads. Yipes. I believe that's my first file to exceed a terabyte. Is there a better way? Perhaps removing th…
State-of-the-art long reads overlapper-compare
Introduction: In 2017, Chu et al. wrote a review [1] presenting and comparing 5 long-read overlapping tools on 4 datasets (including 2 synthetic ones). This paper is very cool and clear. The authors compare overlappers with respect to peak memory, wall clock …