3 Dec 2021

ORG.one sequencing success (and some oddities)

ORG.one is an initiative from Oxford Nanopore through which one can apply for free Nanopore consumables for genome sequencing of critically endangered species. Together with a colleague I applied earlier this year, and we were accepted. The sequencing reagents arrived a few weeks ago: two flow cells and one LSK110 sequencing kit. This is the first time I have used the LSK110 kit (until now I have used LSK109), and it seems to work very well. The yields were high: 34 and 37 Gbases for the two flow cells. Both flow cells stayed alive for almost a week (with frequent nuclease flushes). I actually added another 24 hours of sequencing time after 5 days of runtime, and got another ~1.6 Gbases of sequence!

A few oddities during the runs: The translocation speed on the flow cell on our MinION Mk1B was slightly high, starting out just above the green zone. The quality score was also marginally lower than for the other flow cell.



After 4 days, out of nowhere, reads suddenly began going to the "Skipped" folder. A few hours later this behaviour stopped. I have no idea why. 



The other flow cell was run simultaneously on our Mk1C. After a nuclease flush, the pores on the sides of the sensor chip suddenly stopped working, and a large proportion of the channels had changed status to "Saturated". However, multiple manual Mux scans gradually brought them back to life. The same happened after every subsequent nuclease flush.




27 Jul 2021

MiSeq post run washes: beware of expired sodium hypochlorite

Bleach, or sodium hypochlorite (NaOCl), is optionally used during MiSeq post-run washes to eliminate run-to-run carryover of library template. I noticed a cloudiness in our 5% sodium hypochlorite. This stock solution was purchased several years ago and stored in the fridge as indicated on the label. The bottle had no expiration date. After some online searching it became obvious that bleach has a (very) limited shelf life, depending on the temperature and concentration. After several years our stock had decomposed to saltwater, and seemed to have some fungal growth! Fortunately our MiSeq seemed unaffected; a cursory check found no indications of run cross-contamination, and a later annual instrument maintenance found the capillaries clear and clean. Nevertheless, the moral of the story is: regularly buy fresh NaOCl for your post-run washes!






20 Jul 2021

Guppy update - "super-accurate" model

Towards the end of May, Oxford Nanopore released a new version of the Guppy basecaller. This version includes the Bonito basecaller model, which I tested previously and found to have broken quality scoring. You can now select among three models: fast, HAC, and sup, with sup ("super accurate") being the slowest but most accurate. I put our five genomic test datasets through the new version, using the sup model. I am pleased to see that the quality scoring problem from Bonito has been fixed. The sup model shows a small increase in raw accuracy, at the cost of slower basecalling speeds. In conclusion, another nice upgrade in accuracy. One of these days I must do some assembly benchmarks* to see if this translates into better assemblies! Previous testing by a colleague of mine indicated that this was not always the case.

* I just need to learn how to do assemblies :-)



Error rates were calculated using Heng Li's one-liner. No quality trimming was applied, except for Species 5, which had a minimum quality score of 7.




The read quality estimates are now at least somewhat correlated with how similar the sequences are to the reference.





6 May 2021

2021 phylogenetic tree of Nanopore library kits

It's that time of the year again: time for my annual phylogenetic tree of Nanopore library kits. It should be pretty self-explanatory. The devices are:

F - Flongle
M - MinION
G - GridION
P - PromethION





Note that the LSK109 kit will be discontinued on Sept. 10 2021, except for COVID-related projects, which will be supported indefinitely. Please let me know if you spot any mistakes or have suggestions for improvements.


11 Feb 2021

Nanodrop 260/230 ratios decline with time

 

260/230 ratios as estimated by the Nanodrop can vary a lot among repeat measurements, and decline over time as the drop is left on the pedestal. In contrast, the 260/280 ratio and the concentration remain stable over several minutes. I have confirmed this behaviour in multiple tests. I have no idea what mechanism is behind this: if it were evaporation, the concentrations should increase. In conclusion: for better accuracy, make replicate measurements, but do so quickly!


28 Jan 2021

Benchmarking Nanopore basecallers: some observations on the Bonito basecaller

 We have sequenced several fish genomes on our MinION. Whenever there is a new version of the Guppy basecaller I re-basecall a small dataset from each species and align the raw sequences to previously published, independent references. Using Heng Li's one-liner for sequence identity, I get an estimate of the raw error rate of the sequences. 
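The one-liner itself is not reproduced in this post. As a rough sketch of the idea only, the function below computes gap-compressed identity, one of the definitions discussed in Heng Li's blog post on sequence identity, from minimap2 PAF output. It assumes the alignments were produced with minimap2 -c, so that each line carries NM:i: and cg:Z: tags, and that only primary alignments (tp:A:P) should count. The function name is made up for illustration, and this may not be the exact variant used for the numbers reported here. The raw error rate is then one minus the printed value.

```shell
# Gap-compressed identity from PAF lines (file arguments or stdin).
# Assumption: PAF comes from "minimap2 -c", so NM:i: and cg:Z: tags are present.
paf_gap_compressed_identity() {
    awk '
    /tp:A:P/ {                                   # primary alignments only
        nm = -1; cigar = ""
        for (i = 13; i <= NF; i++) {             # optional tags start at column 13
            if ($i ~ /^NM:i:/) nm = substr($i, 6) + 0
            if ($i ~ /^cg:Z:/) cigar = substr($i, 6)
        }
        if (nm < 0 || cigar == "") next
        n += nm                                  # mismatches + gap bases
        while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
            len = substr(cigar, 1, RLENGTH - 1) + 0
            op  = substr(cigar, RLENGTH, 1)
            if (op == "M") m += len              # aligned columns
            else if (op == "I" || op == "D") { g += len; o++ }   # gap bases, gap openings
            cigar = substr(cigar, RLENGTH + 1)
        }
    }
    # Each gap counts as a single event regardless of its length.
    END { if (m + o > 0) printf "%.4f\n", 1 - (n - g + o) / (m + o) }' "$@"
}
```

Usage would look like `minimap2 -c ref.fa reads.fq | paf_gap_compressed_identity`, where ref.fa and reads.fq are placeholder file names.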




Frequency distributions of percent identity to reference for Species 1.


The Bonito 441 basecaller (using the res_dna_r941_min_crf_v031 model from Rerio) has a nice improvement in raw accuracy. At the moment this comes at the cost of slower basecalling speeds (~3 times slower on our GTX 1080 GPU). According to ONT a speed upgrade should be coming soon with a new Guppy release!

In my tests Bonito resulted in slightly fewer total bases, but a slightly higher proportion of the reads mapped to the reference (using minimap2 and SAMtools).
 



In Bonito the low default chunk_size of 720 may be slightly reducing the accuracy. Setting chunk_size to 1000 instead resulted in a small improvement. Setting it to 1200 or higher caused Bonito to crash.

Lastly, it seems the fastq quality scoring is broken in Bonito: there is no relation between the quality scores and the percent match when mapping the reads to the reference genome (unlike in Guppy):





Plots were made with NanoPlot.



25 Jan 2021

Basecalling on the MinION Mk1C - speed up by 3x!

We recently received our new Mk1C MinION sequencer/mini PC. It has a GPU for basecalling, but it is much weaker than the GTX 1080 in our standalone MinION PC, so it will probably not be used much for basecalling. Since the Mk1C runs Ubuntu Linux, one can ssh in and run commands from the terminal. In this way I did some benchmarking with various Guppy parameters. This revealed that while the basecalling speed with the "fast" model cannot be improved much, the "HAC" (High Accuracy) model can be sped up by almost 3 times!



Increasing chunks_per_runner seems to be the only setting that makes much difference (thanks to https://github.com/sirselim/jetson_nanopore_sequencing). Increasing it above 512 caused hangs and crashes. In one case I had to force-reboot the Mk1C by holding the power button for ~10 seconds. All tests were done on a single fast5 file using Guppy423 (MinION Release 20.10.3). Use these settings at your own risk!
 
Best Mk1C basecaller speed:
guppy_basecaller --config dna_r9.4.1_450bps_hac.cfg --input_path /data/jon/fast5 --save_path /data/jon/Guppy423 --qscore_filtering --device auto --num_callers 1  --gpu_runners_per_device 2 --chunks_per_runner 512
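For anyone wanting to repeat the comparison, a loop along these lines could time several chunks_per_runner values in turn. This is only a sketch: the flags are copied from the command above, but the function name, the DRY_RUN switch and the per-run save paths (/data/jon/bench_cpr_N) are made up for illustration, and whole-second timing is crude.

```shell
# Time the HAC model at each chunks_per_runner value given as an argument.
# Set DRY_RUN=1 to skip the actual basecalling and just exercise the loop.
# The save paths below are hypothetical; adjust them to your own setup.
benchmark_chunks() {
    for cpr in "$@"; do
        start=$(date +%s)
        if [ "${DRY_RUN:-0}" != 1 ]; then
            guppy_basecaller --config dna_r9.4.1_450bps_hac.cfg \
                --input_path /data/jon/fast5 \
                --save_path "/data/jon/bench_cpr_$cpr" \
                --qscore_filtering --device auto --num_callers 1 \
                --gpu_runners_per_device 2 --chunks_per_runner "$cpr"
        fi
        end=$(date +%s)
        echo "chunks_per_runner=$cpr seconds=$((end - start))"
    done
}
```

Calling `benchmark_chunks 48 256 512` would cover the values in the memory table below. Given the hangs described above, values over 512 are best left out.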





Memory use
I used this command to log the memory use every five seconds:
top -d 5 -b | grep 'KiB Mem' >> freeMem.txt
Below is the minimum amount of free memory during each benchmark session (HAC model):

chunks_per_runner    free memory (MB)
48                   816
256                  286
512                  78
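The minimum for each session can be pulled out of the log with a little awk. The sketch below assumes the log lines look like `KiB Mem :  3964132 total,   835584 free, ...`, i.e. that the free value is the number just before the word "free" and is given in KiB. The function name is made up, and dividing by 1024 gives MiB, which may differ slightly from how the MB figures in the table were derived.

```shell
# Print the smallest "free" value (converted from KiB to MiB) found in a log
# written by:  top -d 5 -b | grep 'KiB Mem' >> freeMem.txt
min_free_mb() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^free/) {                  # the number sits just before "free,"
                v = $(i - 1) + 0
                if (!seen || v < min) { min = v; seen = 1 }
            }
        }
    }
    END { if (seen) printf "%d\n", min / 1024 }' "$1"
}
```

Run it as `min_free_mb freeMem.txt` after each benchmark session, and delete or rename the log between sessions so the readings do not mix.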


Getting temperature readings from the terminal:
As a Linux novice I just copy and paste commands I find online and hope they work:
paste <(cat /sys/devices/virtual/thermal/thermal_zone*/type) <(cat /sys/devices/virtual/thermal/thermal_zone*/temp) | column -s $'\t' -t | sed 's/\(.\)..$/.\1°C/'

Example result:
BCPU-therm       36.5°C
MCPU-therm       36.5°C
GPU-therm        35.0°C
PLL-therm        36.5°C
Tboard_tegra     32.0°C
Tdiode_tegra     33.0°C
PMIC-Die         100.0°C
thermal-fan-est  35.9°C

The 100 degrees for the PMIC-Die is not real. I did a full basecalling of a previous run to see if the basecaller would be stable with the new settings, and there were no issues, but it took several days to complete. The temperatures never got very high. But the fan does make a bit of noise!
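The temperature one-liner above is compact but hard to read. A more explicit version might look like the sketch below. It assumes, as the sed expression implies, that each thermal zone reports its temperature in millidegrees Celsius, and that readings are non-negative. The function name and the optional base-directory argument are made up for illustration. Like the original, it truncates to one decimal rather than rounding.

```shell
# Print each thermal zone's name and temperature in degrees Celsius.
# Optional argument: base directory (defaults to the sysfs path used above).
print_temps() {
    base="${1:-/sys/devices/virtual/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip zones without readable type/temp files (or an unmatched glob).
        [ -r "$zone/type" ] && [ -r "$zone/temp" ] || continue
        name=$(cat "$zone/type")
        milli=$(cat "$zone/temp")
        whole=$((milli / 1000))                  # whole degrees
        tenth=$(((milli % 1000) / 100))          # first decimal, truncated
        printf '%-16s %d.%d°C\n' "$name" "$whole" "$tenth"
    done
}
```

Running `print_temps` with no argument reads the same sysfs files as the one-liner.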