Thursday, May 26, 2016
May 26, 2016 at 03:40AM
Today I Learned:

1) Snail mouths are beautiful! Example: http://ift.tt/27TpMsI On a similar note, have you ever looked inside a sea turtle's mouth? It's bizarre: http://ift.tt/1OOFgT8. Those spiky bits go quite far down the turtle's esophagus. Sea turtles have super-long, coiled-up esophagi, presumably because the jellyfish they largely subsist on aren't very nutrient-rich and take a while to digest, so the turtles need to store a lot of them. The spikes keep the jellyfish from escaping when the turtle expels seawater from its stomach. Thanks for sending me down this rabbit hole, Sarah Seid!

2) This is something I already knew about, but every time I run across it I'm amazed anew. In class today, our professor showed us an analysis of data from some sort of pulsatile gene in one of the university's labs. The data were really simple -- just a list of times between pulses of the gene. If you plot a histogram of those times, it looks pretty much like an exponential distribution -- lots of very short inter-pulse times, with a long tail of longer ones.

The question we addressed was this: is the distribution of inter-pulse times actually exponential, or is it something that just looks superficially like an exponential distribution? Why that question? Well, if the gene's pulses were random, uncorrelated events with equal probability of happening at any particular time, then their inter-pulse times would be exponentially distributed. If they're *not* exponentially distributed, it means there's something else going on.

In this case, we compared three hypotheses: the exponential distribution, which, as I said, you get if you assume that every pulse is random and independent of the others; a Weibull distribution, which is what you would get if the underlying process was random but became either more or less likely over time (kind of like the likelihood of equipment failure -- it's more or less random, but becomes more likely as the equipment gets older); and a double-exponential distribution, which would occur if there were two distinct states, with different pulse rates, that the cell could go into after each pulse.
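If it helps to make those three hypotheses concrete, here's a minimal sketch of fitting all of them to a list of inter-pulse times by maximum likelihood. To be clear, this is my reconstruction, not the analysis from class: the data are synthetic stand-ins from a hypothetical two-state process, and every name and number in it is invented.

```python
# Fit three candidate models to inter-pulse times (synthetic stand-in data).
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 2000
t = np.where(rng.random(n) < 0.7,
             rng.exponential(0.5, n),   # hypothetical "fast" state
             rng.exponential(5.0, n))   # hypothetical "slow" state

# H1: exponential -- pulses are memoryless and independent (a Poisson process).
ll_exp = stats.expon.logpdf(t, scale=t.mean()).sum()   # MLE scale = sample mean

# H2: Weibull -- pulses get more or less likely as time since the last one grows.
c, _, scale = stats.weibull_min.fit(t, floc=0)
ll_weib = stats.weibull_min.logpdf(t, c, loc=0, scale=scale).sum()

# H3: double exponential -- a mixture of two exponentials, one per cell state.
def neg_ll_mix(p):
    w, s1, s2 = p
    return -np.log(w * stats.expon.pdf(t, scale=s1)
                   + (1 - w) * stats.expon.pdf(t, scale=s2)).sum()

fit = minimize(neg_ll_mix, x0=[0.5, 0.2, 2.0],
               bounds=[(1e-3, 1 - 1e-3), (1e-3, None), (1e-3, None)])
ll_mix = -fit.fun

# Class did a proper Bayesian comparison; BIC is a crude stand-in here that at
# least penalizes the extra parameters of the fancier models.
for name, ll, k in [("exponential", ll_exp, 1),
                    ("weibull", ll_weib, 2),
                    ("double-exp", ll_mix, 3)]:
    print(f"{name:12s} log-likelihood {ll:9.1f}   BIC {k * np.log(n) - 2 * ll:9.1f}")
```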
We very quickly went through a Bayesian analysis that gave us, in the end, a ratio of likelihoods between each pair of hypotheses. We also got the best-fit curve for each hypothesis and compared each to the data. The exponential distribution fit pretty darned well for moderate inter-pulse times, but deviated noticeably at small and large ones. The Weibull distribution fit slightly better, but not much. The double-exponential distribution fit the front half perfectly, and was about as accurate as the other two at long inter-pulse times.

So, which hypothesis was most likely? The double-exponential. Sure, fine, whatever -- what startled me was *how much more likely* it was. If you assume, before the experiment, that the double-exponential and exponential distributions are more or less equally likely, then after accounting for differences in model complexity, the double-exponential came out something like TEN to the ONE HUNDRED AND FIFTY-FOUR times more likely than the exponential. To put that number just a tiny little bit into perspective, let me write it out, with some benchmarks annotated:

10,000,000,000,000,000,000,000,000,000,000,000∞,000,000,000,000,000,000,000,000,000,000,000,000,00¥0,000◊,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,00†0,000,0°00,00§0,000‡,000*,000,000

* One million. This is a huge number. This, I'm guessing, is larger than most people can accurately conceive.
‡ Roughly the order of magnitude of the number of people on Earth right now.
§ Approximate US national debt, in $.
° The approximate age of the universe, in seconds.
† Roughly the number of atoms in a gram of aluminum.
◊ Lower bound on the number of atoms in the universe.
¥ Upper bound on the number of atoms in the universe.
∞ Roughly the number of molecular events that have occurred in the universe, assuming each atom reacts once every femtosecond. There are essentially no physically relevant numbers above this.

So... that number is about 10 billion billion *quadrillion* times larger than the largest number I would typically consider "even possibly physically relevant". The odds of the plain exponential being the actual process underlying those pulse data are, for all practical purposes, zero. All from a slight deviation at the extreme ends of the data...

Today I learned that when you crunch the numbers while estimating probabilities of hypotheses, it's really easy to get absolutely ludicrous probability ratios. Completely, utterly, bugnuts ridiculous probability ratios.
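For what it's worth, the arithmetic behind numbers like that is almost embarrassingly simple: independent data points each multiply the likelihood ratio, so even a modest average edge per point compounds exponentially with the size of the dataset. A toy version (all numbers invented, not from class):

```python
# Back-of-the-envelope: why likelihood ratios explode with dataset size.
import numpy as np

n_points = 2000   # hypothetical number of inter-pulse times
edge = 0.18       # hypothetical mean log-likelihood edge per point, in nats

log10_ratio = n_points * edge / np.log(10)
print(f"likelihood ratio ~ 10^{log10_ratio:.0f}")   # -> 10^156 for these numbers
```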
3) There's an effect sometimes called the "monad tutorial fallacy", which comes from a problem that springs up a lot around monads in the functional-language community. I should preface this by saying that I don't understand what a monad is. So if you're hoping to learn what a monad is from this TIL, I'm afraid you're going to have to wait (or go look up monads yourself).

Here's an apparently common occurrence when someone decides to learn about monads. The monad-expert-in-training goes and looks up an example of a monad. They struggle with that particular monad for a bit, then learn about another one, and then a couple more, and soon they've seen enough monads to get an idea of what a monad is. Then they have an epiphany about what a monad *really* is, or about a really good metaphor for monads, and they get really excited. In their excitement, this person writes a tutorial explaining their metaphor -- for example, that monads are like burritos. The metaphor works perfectly well for the newly-minted-knight-of-monads, but anyone reading the tutorial gets *really confused*, because they don't actually have any specific examples of what a monad might look like... they just know that monads are kind of like burritos. Except when they're not. Which isn't helpful. Thus, a new, confusing tutorial on monads is born.

The moral of the story? It's easier to learn by abstracting from concrete examples than it is to try to learn the abstract idea first.

Related: "Monads are hard because there are so many bad monad tutorials getting in the way of finally finding Wadler's nice paper." -- Luke Gorrie (a hacker and entrepreneur)

Also, it occurs to me that this series of TILs is exactly the kind of place an excited new monad-user would post a totally confusing summary of their epiphany on monads. If you notice me doing something similar, please call me out and direct me to this post.

Labels: IFTTT, TodayILearned