11 Feb 2021

Nanodrop 260/230 ratios decline with time

 

260/230 ratios as estimated by the Nanodrop can vary a lot among repeat measurements, and they decline over time as the drop is left on the pedestal. In contrast, the 260/280 ratio and the concentration remain stable over several minutes. I have confirmed this behaviour in multiple tests. I do not know the mechanism behind it: if it were caused by evaporation, the concentration should increase. In conclusion: for better accuracy, make replicate measurements, but do so quickly!


28 Jan 2021

Benchmarking Nanopore basecallers: some observations on the Bonito basecaller

 We have sequenced several fish genomes on our MinION. Whenever there is a new version of the Guppy basecaller I re-basecall a small dataset from each species and align the raw sequences to previously published, independent references. Using Heng Li's one-liner for sequence identity, I get an estimate of the raw error rate of the sequences. 
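As a rough illustration of this kind of identity estimate, here is a minimal sketch. It is not Heng Li's one-liner itself; it is a simpler calculation that assumes minimap2 PAF output produced with base-level alignment (`-c`), where column 10 holds the number of matching bases and column 11 the alignment block length. The function name and the file names in the usage comment are placeholders.

```shell
# Sketch: percent identity per primary alignment from minimap2 PAF on stdin.
# Assumes PAF column 10 = matching bases, column 11 = alignment block length.
# Only lines tagged tp:A:P (primary alignments) are reported.
paf_identity() {
  awk -F'\t' '
    /tp:A:P/ && $11 > 0 { printf "%s\t%.2f\n", $1, 100 * $10 / $11 }
  '
}

# Hypothetical usage (file names are placeholders):
# minimap2 -c -x map-ont reference.fa reads.fastq | paf_identity > identity.tsv
```

The output is one tab-separated line per primary alignment (read name, percent identity), which can then be plotted as a frequency distribution.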




Frequency distributions of percent identity to reference for Species 1. 


The Bonito 441 basecaller (using the res_dna_r941_min_crf_v031 model from Rerio) shows a nice improvement in raw accuracy. At the moment this comes at the cost of slower basecalling (~3 times slower on our GTX 1080 GPU). According to ONT, a speed upgrade should be coming soon with a new Guppy release!

In my tests Bonito produced slightly fewer total bases, but a slightly higher proportion of the reads mapped to the reference (using minimap2 and Samtools).
 



In Bonito the low default chunk_size of 720 may be slightly reducing accuracy. Setting chunk_size to 1000 instead gave a small improvement in accuracy. Setting it to 1200 or higher caused Bonito to crash. 

Lastly, the fastq quality scoring seems to be broken in Bonito: there is no relationship between the quality scores and percent match when mapping the reads to the reference genome (unlike in Guppy):





Plots were made in NanoPlot



25 Jan 2021

Basecalling on the MinION Mk1C - speed up by 3x!

We recently received our new Mk1C MinION sequencer/mini PC. It has a GPU for basecalling, but it is much weaker than the GTX1080 in our standalone MinION PC, so it will probably not be used much for basecalling. Since the Mk1C runs Ubuntu Linux, one can ssh in and run commands from the terminal. In this way I did some benchmarking with various Guppy parameters. This revealed that while the basecalling speed with the "fast" model cannot be improved much, the "HAC" (High Accuracy) model can be sped up almost 3 times! 



Increasing chunks_per_runner seems to be the only setting that makes much difference (thanks to https://github.com/sirselim/jetson_nanopore_sequencing). Increasing it above 512 caused hangs and crashes. In one case I had to force a reboot of the Mk1C by holding the power button for ~10 seconds. All tests were done on a single fast5 file using Guppy423 (MinION Release 20.10.3). Use these settings at your own risk! 
 
Best Mk1C basecaller speed:
guppy_basecaller --config dna_r9.4.1_450bps_hac.cfg --input_path /data/jon/fast5 --save_path /data/jon/Guppy423 --qscore_filtering --device auto --num_callers 1  --gpu_runners_per_device 2 --chunks_per_runner 512
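
A benchmark like this can be repeated for several chunks_per_runner values with a small loop. The sketch below is a dry run: it only prints the commands, using the same flags as above. The function name and the per-run suffix on the save path are hypothetical additions for illustration.

```shell
# Sketch: print one guppy_basecaller command per chunks_per_runner value.
# Dry run only; nothing is executed. The "_cpr<N>" save-path suffix is a
# made-up convention so that each run writes to its own folder.
guppy_sweep() {
  for chunks in "$@"; do
    echo "guppy_basecaller --config dna_r9.4.1_450bps_hac.cfg" \
      "--input_path /data/jon/fast5 --save_path /data/jon/Guppy423_cpr${chunks}" \
      "--qscore_filtering --device auto --num_callers 1" \
      "--gpu_runners_per_device 2 --chunks_per_runner ${chunks}"
  done
}

# Hypothetical usage: review the printed commands before running any of them.
# guppy_sweep 48 256 512
```

Given the hangs described above, it is probably safest to run the printed commands one at a time rather than piping them straight into a shell.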





Memory use
I used this command to log the memory use every five seconds:
top -d 5 -b | grep 'KiB Mem' >> freeMem.txt
Below is the minimum amount of free memory during each benchmark session (HAC model):

chunks_per_runner    free memory (MB)
48                   816
256                  286
512                  78
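
The minimum can be pulled out of the log with a few lines of awk. This is a minimal sketch that assumes each logged line looks like `KiB Mem :  3964204 total,   835584 free, ...`, i.e. the free value is in KiB and sits directly before the word `free,`. Other versions of top may format this line differently. The function name is a placeholder.

```shell
# Sketch: smallest "free" value in a log of top's "KiB Mem" lines,
# printed in MB (KiB / 1024, truncated to a whole number).
# Reads the files given as arguments, or stdin if none are given.
min_free_mb() {
  awk '
    {
      for (i = 2; i <= NF; i++)
        if ($i == "free,") {
          v = $(i - 1) + 0
          if (!seen || v < min) { min = v; seen = 1 }
        }
    }
    END { if (seen) printf "%d\n", min / 1024 }
  ' "$@"
}

# Hypothetical usage:
# min_free_mb freeMem.txt
```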


Getting temperature readings from the terminal:
As a Linux novice I just copy and paste commands I find online and hope they work: 
paste <(cat /sys/devices/virtual/thermal/thermal_zone*/type) <(cat /sys/devices/virtual/thermal/thermal_zone*/temp) | column -s $'\t' -t | sed 's/\(.\)..$/.\1°C/'

Example result:
BCPU-therm        36.5°C
MCPU-therm        36.5°C
GPU-therm         35.0°C
PLL-therm         36.5°C
Tboard_tegra      32.0°C
Tdiode_tegra      33.0°C
PMIC-Die          100.0°C
thermal-fan-est   35.9°C

The 100 °C shown for PMIC-Die is not a real reading. I did a full basecalling of a previous run to see if the basecaller would be stable with the new settings. There were no issues, but it took several days to complete. The temperatures never got very high. But the fan does make a bit of noise! 






27 Aug 2020

Benchmarking Nanopore basecallers

We have sequenced several fish genomes on our MinION. Whenever there is a new version of the Guppy basecaller I re-basecall a small dataset from each species and align the raw sequences to previously published, independent references. Using Heng Li's one-liner for sequence identity, I get an estimate of the raw error rate of the sequences. 




Each color represents a different fish species' genome. These were all basecalled with the high accuracy model (HAC). In all cases HAC was more accurate than the fast model. As can be seen, new versions often had no improvement in the accuracy, but may have had other improvements with regard to e.g. speed, stability, new functionality, etc. This figure will be updated as new basecaller versions are released. 


The estimated error rate may seem a little on the optimistic side. The rate corresponds roughly to the peak in the distribution of error rates. This distribution has a long tail of higher error rates (although some of this will be removed by quality filtering the reads). Also, it is not clear that an increase in raw accuracy always leads to better assemblies in terms of contig sizes and gene completeness. In our (still limited) experience, the slightly older models have often given better assemblies. Therefore it might be a good idea to keep the older basecaller versions around and not automatically assume the latest is always the best. 



25 Aug 2020

Phylogenetic tree of Nanopore library kits

About once per year I make this "phylogenetic" tree of Nanopore kits. I find it a great way to quickly get an overview of the kits and their characteristics. In time I hope to do the same for Illumina kits. Feel free to use this figure if you find it useful. 

Nanopore library kits 2020