Today I Learned: May 17, 2016 at 04:02AM

Today I learned:

1) DNA synthesis companies have agreements to screen the DNA in the orders they receive, to make sure they aren't being asked to make dangerous stuff like toxin genes or genes from human pathogens. Today I learned that the screening guidelines are ONLY RECOMMENDATIONS, that all such screening is voluntary, and that the official recommendations only apply to sequences longer than 200 base pairs. That means that if you can construct, say, a polio virus genome from only 100 bp oligos (which *I* certainly can), then you can do it with ordered DNA. On the one hand, that's a little dismaying. On the other hand, I think if someone's willing to go through the trouble of assembling a genome from 100 bp oligos, then simple screening processes aren't going to stop them. The screens are there to stop amateurs and terrorists without a ton of biological expertise from simply ordering a deadly genome custom-built for them, without hampering the overwhelming majority of DNA synthesis that is totally harmless (and sometimes useful).

2) DNA has a theoretical storage density of roughly 455 exabytes per gram (that's the figure the paper below gives for single-stranded DNA at 2 bits per nucleotide). For some examples of storage density and total readable capacity, see this cool figure: http://ift.tt/1Oxhw5V Here, "This work" is "Next-Generation Digital Information Storage in DNA" (http://ift.tt/1re3BNa if you have Science access, possibly http://ift.tt/1Aixs81 if you don't), in which the authors stored and read back a book encoded as HTML, images and all. They got 22 errors out of something like a megabyte of total stored information. Not particularly close to silicon-level accuracy by any means, but not bad for an in-vitro biological system. They used an encoding scheme where each nucleotide carried 1 bit of information (simply A or C for 0, T or G for 1, or something like that).
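To make the one-bit-per-nucleotide idea concrete, here's a minimal Python sketch. It assumes the mapping as I described it (A or C for 0, G or T for 1) and uses a made-up rule for picking between the two allowed bases: take whichever one differs from the previous base. The function names are mine, and this is not the paper's actual pipeline, which also has to deal with things like splitting the data into short addressed oligos.

```python
ZERO_BASES = "AC"  # either base encodes a 0 bit (assumed mapping)
ONE_BASES = "GT"   # either base encodes a 1 bit (assumed mapping)


def encode(data):
    """Encode bytes as DNA, one bit per base, never repeating a base twice in a row."""
    bases = []
    previous = ""
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            bit = (byte >> shift) & 1
            options = ONE_BASES if bit else ZERO_BASES
            # Use the redundancy in the code: pick the option that
            # differs from the base we just wrote.
            base = options[0] if options[0] != previous else options[1]
            bases.append(base)
            previous = base
    return "".join(bases)


def decode(sequence):
    """Decode a sequence produced by encode() back into bytes."""
    out = bytearray()
    for start in range(0, len(sequence), 8):
        byte = 0
        for base in sequence[start:start + 8]:
            if base in ZERO_BASES:
                bit = 0
            elif base in ONE_BASES:
                bit = 1
            else:
                raise ValueError("unexpected base: %r" % base)
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def longest_run(sequence):
    """Length of the longest stretch of one repeated base."""
    best = current = 0
    previous = ""
    for base in sequence:
        current = current + 1 if base == previous else 1
        best = max(best, current)
        previous = base
    return best
```

With this rule, a run of zero bits comes out as ACAC... rather than AAAA..., so single-base runs disappear, but short alternating repeats are still there.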
The advantage of that system is that there's some flexibility in encoding, which is particularly nice because it lets you avoid long runs of the same nucleotide, which are hard to both synthesize and sequence. There are other schemes that are a bit more compressed and can explicitly avoid repeats of any kind, which should make them even more amenable to this kind of storage.

Why store digital information in DNA, you might ask? Reading off a hard drive is a hell of a lot faster and cheaper than synthesizing a bunch of DNA and then sequencing it. Hell, reading off a FLOPPY disk is faster and cheaper than synthesizing a bunch of DNA and then sequencing it, even if you don't own a floppy drive! Well, that's true, and DNA sequencing is unlikely to ever get faster than in-silico information retrieval. BUT! The costs of DNA synthesis and readout are plummeting much, much faster than the costs of in-silico data storage devices, per unit of information. In something like a few decades, DNA synthesis and sequencing should be *cheaper* than whatever the future equivalent of hard disks will be.

Furthermore, DNA is actually really, really stable over long timescales, even relative to pretty stable stuff like hard drives. If you want to store information for a thousand years, DNA is a surprisingly good choice. Also, as the authors of the above paper point out, "DNA’s essential biological role provides access to natural reading and writing enzymes and ensures that DNA will remain a readable standard for the foreseeable future" -- in other words, your floppy drive might be obsolete and therefore effectively unreadable in 100 years, but you can be sure we're going to have technology for sequencing DNA as long as we have a civilization.

So, in summary, DNA-based data storage is SLOW and expensive, but it will probably eventually be cheaper than hardware-based data storage, and it stores really well. It's a poor choice of storage medium for your laptop.
If, on the other hand, you want to take a snapshot of the internet and archive it for future generations, then your best bet might be to encode it into nucleotide sequences, synthesize the sequences, and bury a few tubes of the resulting libraries in some nice safe salt mines.

3) Windows has a cool tool for recording the steps needed to reproduce bugs and errors. It's called the Problem Steps Recorder, and you can run it by going to the start menu and typing "psr". Just hit "record", do some stuff, and eventually hit "stop recording". The output is a weird pseudo-HTML file (zipped up, of course) that shows exactly what you did while recording, step by step, with pictures and automatic highlighting on the most relevant objects on the screen. The downside -- the weird pseudo-HTML can only be interpreted by IE, as far as I can tell, and IE can't even export it as normal, readable HTML. You were so close to impressing me, Microsoft. So close!
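One possible workaround, assuming the file inside the zip is an MHTML (.mht) archive, which is just a MIME multipart message: Python's standard `email` module can pull the parts apart. Here's a rough sketch that dumps the HTML and the screenshots as ordinary files. The function name and the `partNNN` output filenames are made up for illustration, and I haven't checked it against every version of the recorder.

```python
import email
import zipfile
from pathlib import Path

# Map MIME types to file extensions; anything else is saved as .bin.
EXTENSIONS = {"text/html": ".html", "image/jpeg": ".jpg", "image/png": ".png"}


def extract_psr_report(zip_path, out_dir):
    """Unpack the first .mht file in a recorder zip into plain files.

    Returns the list of paths written, in the order the parts appear.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        mht_names = [n for n in archive.namelist() if n.lower().endswith(".mht")]
        if not mht_names:
            raise ValueError("no .mht file found in the archive")
        message = email.message_from_bytes(archive.read(mht_names[0]))

    written = []
    for index, part in enumerate(message.walk()):
        if part.is_multipart():
            continue  # container parts carry no payload of their own
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        extension = EXTENSIONS.get(part.get_content_type(), ".bin")
        target = out / ("part%03d%s" % (index, extension))
        target.write_bytes(payload)
        written.append(target)
    return written
```

The extracted HTML will still refer to its images by their original names inside the archive, so the pictures won't show up in a normal browser until those references are rewritten to point at the extracted files.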
