13 Mar 2022

MiSeq post run washes: an update

In a previous post I described how our 5% bleach solution for post-run washes had gone bad. Recently I did three MiSeq runs in a row with various amplicons. This was a chance to investigate potential carryover contamination between runs, since usually much time passes and multiple washes occur between runs. I was surprised to find that amplicon read numbers in the first two runs were correlated. Carryover levels were around 0.02%, whereas Illumina's Technical Support Note indicates "as low as 0.001%" after carrying out the bleach post-run wash. Between the second and third runs I did the post-run wash twice; this seems to have eliminated any carryover. The current batch of bleach arrived ~8 months ago; it seems this is already too old.



Run2 read numbers represent carryover contamination from Run1, for the loci ITS and Uni18s. Each point represents an index combination which was used in Run1 but not Run2. 
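For anyone wanting to repeat this check, here is a minimal sketch of how such a carryover level could be computed. The function name, the dictionary layout and the example read counts are all made up for illustration; they are not my actual data or script. The idea is simply to take the index combinations used in Run1 but not in Run2, and divide the reads they received in Run2 by the reads they received in Run1.

```python
def carryover_fraction(run1_reads, run2_reads, run2_indexes):
    """Estimate run-to-run carryover as a fraction.

    run1_reads / run2_reads: dicts mapping index combination -> read count
    (hypothetical layout). run2_indexes: set of index combinations that were
    genuinely loaded in Run2 and must therefore be excluded.
    """
    # Keep only index combinations used in Run1 but NOT in Run2.
    carry = {idx: n for idx, n in run1_reads.items() if idx not in run2_indexes}
    total_run1 = sum(carry.values())
    if total_run1 == 0:
        return 0.0
    # Any Run2 reads carrying these indexes can only come from carryover.
    total_run2 = sum(run2_reads.get(idx, 0) for idx in carry)
    return total_run2 / total_run1


# Invented numbers, chosen only to show the arithmetic.
run1 = {"N701-S501": 100000, "N702-S501": 50000, "N703-S501": 80000}
run2 = {"N701-S501": 20, "N702-S501": 10, "N703-S501": 70000}
fraction = carryover_fraction(run1, run2, run2_indexes={"N703-S501"})
print(f"carryover: {fraction:.4%}")
```

With these invented counts, 30 stray reads against 150,000 original reads gives 0.02%.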



3 Dec 2021

ORG.one sequencing success (and some oddities)

ORG.one is an initiative from Oxford Nanopore wherein one can apply for free Nanopore consumables for genome sequencing of critically endangered species. Together with a colleague I applied earlier this year, and we were accepted. The sequencing reagents arrived a few weeks ago - two flow cells and one LSK110 sequencing kit. This is the first time I have used the LSK110 kit (until now I have used LSK109), and it seems to work very well. The yields were high: 34 and 37 Gbases for the two flow cells. These flow cells stayed alive for almost a week (with frequent nuclease flushes). I actually added another 24 hours of sequencing time after 5 days of runtime, and got another ~1.6 Gbase of sequence!

A few oddities during the runs: The translocation speed on the flow cell on our MinION Mk1B was slightly high; starting out just above the green zone. The quality score was also marginally lower than for the other flow cell. 



After 4 days, out of nowhere, reads suddenly began going to the "Skipped" folder. A few hours later this behaviour stopped. I have no idea why. 



The other flow cell was run simultaneously on our Mk1C. After a nuclease flush, suddenly the pores on the sides of the sensor chip no longer worked, and a large proportion of the channels had changed status to "Saturated". However, multiple manual Mux scans gradually brought them back to life. The same happened on every subsequent nuclease flush. 




27 Jul 2021

MiSeq post run washes: beware of expired sodium hypochlorite

Bleach, or sodium hypochlorite (NaOCl), is optionally used during MiSeq post-run washes in order to eliminate run-to-run carryover of library template. I noticed a cloudiness in our 5% sodium hypochlorite. This stock solution was purchased several years ago and stored in the fridge as indicated on the label. The bottle had no expiration date. After some online searching it became obvious that bleach has a (very) limited shelf life, depending on the temperature and concentration. After several years our stock had decomposed to saltwater, and seemed to have some fungal growth! Fortunately our MiSeq seemed unaffected; a cursory check found no indications of run cross-contamination, and a later annual instrument maintenance found the capillaries clear and clean. Nevertheless, the moral of the story is: regularly buy fresh NaOCl for your post-run washes!






20 Jul 2021

Guppy update - "super-accurate" model

Towards the end of May Oxford Nanopore released a new version of the Guppy basecaller. This version includes the Bonito basecaller model, which I tested previously and found to have broken quality scoring. You can now select among three models: fast, HAC, and sup, with sup ("super accurate") being the slowest but most accurate. I put our five genomic test datasets through the new version, using the sup model. I am pleased to see that the quality scoring problem from Bonito has been fixed. The sup model shows a small increase in the raw accuracy. This comes at the cost of slower basecalling speeds. In conclusion, another nice upgrade in accuracy. One of these days I must do some assembly benchmarks* to see if this translates into better assemblies! Previous testing by a colleague of mine indicated that this was not always the case.

* I just need to learn how to do assemblies :-)



Error rates were calculated using Heng Li's one-liner. No quality trimming was applied, except for Species 5, where reads were filtered at a minimum quality score of 7.




The read quality estimates are now at least somewhat correlated to how similar the sequences are to the reference.
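To make the comparison concrete, here is a small sketch of how a per-read quality estimate can be turned into an expected identity and then set against the observed identity to the reference. It assumes standard Phred+33 FASTQ encoding and averages the per-base error probabilities rather than the Q-scores themselves, which is one common convention; I am not claiming this is exactly what Guppy or NanoPlot do internally. The function names are my own.

```python
import math


def mean_error_probability(quality_string, offset=33):
    """Mean per-base error probability of one read (Phred+33 assumed)."""
    if not quality_string:
        raise ValueError("empty quality string")
    # Q = -10 * log10(p)  =>  p = 10 ** (-Q / 10)
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return sum(probs) / len(probs)


def mean_read_quality(quality_string, offset=33):
    """Read-level Q-score derived from the mean error probability."""
    return -10 * math.log10(mean_error_probability(quality_string, offset))


def expected_identity(quality_string, offset=33):
    """Identity the basecaller is implicitly predicting for this read."""
    return 1.0 - mean_error_probability(quality_string, offset)


# '+' encodes Q10 and '5' encodes Q20 in Phred+33.
qual = "+5"
print(round(mean_read_quality(qual), 2), round(expected_identity(qual), 3))
```

If the quality scoring is working, reads with a higher expected identity should also tend to show a higher observed identity when mapped to the reference.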





6 May 2021

2021 phylogenetic tree of Nanopore library kits

It's that time of the year again. It is time for my annual phylogenetic tree of Nanopore library kits. It should be pretty self-explanatory. The devices are:

F - Flongle
M - MinION
G - GridION
P - PromethION





Note that the LSK109 kit will be discontinued on Sept. 10 2021, except for COVID-related projects, which will be supported indefinitely. Please let me know if you spot any mistakes or have suggestions for improvements.


11 Feb 2021

Nanodrop 260/230 ratios decline with time

 

260/230 ratios as estimated by the Nanodrop can vary a lot among repeat measurements, and decline over time as the drop is left on the pedestal. In contrast, the 260/280 ratio and the concentration remain stable over several minutes. I have confirmed this behaviour in multiple tests. I have no idea what the mechanism behind it is - if it were due to evaporation, the concentrations should increase. In conclusion - for better accuracy, make replicate measurements, but do so quickly!


28 Jan 2021

Benchmarking Nanopore basecallers: some observations on the Bonito basecaller

 We have sequenced several fish genomes on our MinION. Whenever there is a new version of the Guppy basecaller I re-basecall a small dataset from each species and align the raw sequences to previously published, independent references. Using Heng Li's one-liner for sequence identity, I get an estimate of the raw error rate of the sequences. 
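For readers who have not seen it, the calculation can be sketched in Python as below. I am assuming here that the one-liner in question is the "gap-compressed" identity from Heng Li's blog post on the definition of sequence identity, where each insertion or deletion counts as a single event regardless of its length. The inputs are an alignment's CIGAR string and its NM (edit distance) tag, as reported by minimap2 with base-level alignment turned on. The function name and the example alignment are invented for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def gap_compressed_identity(cigar, nm):
    """Gap-compressed identity from a CIGAR string and the NM tag.

    Each gap (run of inserted or deleted bases) counts as one difference,
    however long it is.
    """
    aligned = 0    # columns where a read base is paired with a reference base
    gap_bases = 0  # total inserted + deleted bases
    gap_opens = 0  # number of separate insertion/deletion events
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":
            aligned += length
        elif op in "ID":
            gap_bases += length
            gap_opens += 1
    if aligned + gap_opens == 0:
        raise ValueError("no aligned columns in CIGAR")
    # NM counts mismatches plus every gap base, so remove the gap bases.
    mismatches = nm - gap_bases
    divergence = (mismatches + gap_opens) / (aligned + gap_opens)
    return 1.0 - divergence


# Invented alignment: 100 aligned columns, one 2 bp deletion,
# one 1 bp insertion, and 3 mismatches (NM = 3 + 2 + 1 = 6).
print(gap_compressed_identity("50M2D30M1I20M", nm=6))
```

The raw error rate is then simply one minus this identity. The original one-liner pools the counts over all primary alignments before dividing; the sketch above handles a single alignment.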




Frequency distributions of percent identity to reference for Species 1.


The Bonito 441 basecaller (using the res_dna_r941_min_crf_v031 model from Rerio) has a nice improvement in raw accuracy. At the moment this comes at the cost of slower basecalling speeds (~3 times slower on our GTX 1080 GPU). According to ONT a speed upgrade should be coming soon with a new Guppy release!

In my tests Bonito produced slightly fewer total bases, but a slightly higher proportion of the reads mapped to the reference (using minimap2 and Samtools).
 



In Bonito the low default chunk_size of 720 may be slightly reducing the accuracy. Setting chunk_size to 1000 instead resulted in a small improvement in accuracy. Setting it to 1200 or higher caused Bonito to crash.

Lastly, it seems the fastq quality scoring is broken in Bonito, since there is no relation between the quality scores and the percent match when mapping the reads to the reference genome (unlike in Guppy):





Plots were made in NanoPlot