Friday, January 19, 2018

January 19, 2018 at 09:32PM

Today I Learned: 1) I've wondered for a long time how text-based loading bars work. Many command line programs feature progress bars that somehow fill in using only text... even if the progress bar is in the middle of a line. For example, the Arch Linux package manager prints a lot of lines that look like core 126.8 KiB 906K/s 00:00 [###############-------] 66% where those "-"s eventually fill up with "#"s. But how does a program overwrite its old text? Well, today I finally went and googled exactly that question, and it turns out to be pretty easy -- there's a character called a carriage return ('\r') that sends the print-output-cursor to the beginning of the line. If you carriage return and then start typing, it will overwrite whatever was already in the output, at least if you output to a terminal window (I assume if you write a \r to a file, it will faithfully copy an ASCII '\r' to the file and keep going, but I haven't tried it). I've run into '\r' before, but almost always coupled with a line feed ('\n') character; on some systems the pair '\r\n' is used to represent a new line (which, confusingly, is also what '\n' is usually called). It never occurred to me that carriage return could be used *without* a newline. This is some seriously useful magic. For one thing, it makes loop-counting diagnostic prints a hell of a lot cleaner -- now if I want to know how a long loop is progressing, I can have a single line of output that updates in place, instead of printing stacks and stacks of output. 2) Did you know that the sex ratio of humans at birth is not quite 1:1? It's actually more like 107:100 male:female, quite reliably. This is *at birth*, so before any effects of selective infanticide.
That 107:100 number is especially surprising in light of evolutionary game theory, which suggests that sex ratios for most two-sexed organisms should be very, very close to 1:1 -- essentially, if there are more males than females around, then you get a fitness boost by having more female offspring and vice versa, and the only stable equilibrium is an even birth ratio. The wiki page on human sex ratio points to at least one potentially good evolutionary explanation for the imbalanced sex ratio. Why do *you* think there might be more male human births than female human births? 3) ...about an alternative to tempered MCMC and annealed MCMC, called adiabatic MCMC, that has some nice theoretical properties over the tempered and annealed versions. A quick reminder: MCMC is, extremely roughly speaking, a class of methods for sampling from the inputs of a function so that the probability of drawing some input is proportional to the value of that function for that input. Usually MCMC is used to sample from probability distributions, in which case it's a way of randomly sampling from a (potentially very complicated) random process. Typical MCMC algorithms use some kind of "walkers" in the input space. By some algorithm, each walker takes a random jump in the input space, and then either accepts the jump or stays where it started, depending on the probabilities the algorithm calculates at the start and end points. Essentially each walker is a noisy hill-climber -- it will tend to move up hills toward high-probability regions, but it can wander enough that it will also sample lower-probability regions some of the time, too. A major problem for simple walker-based strategies is that if the probability distribution being sampled has two high-probability peaks that are separated by a low-probability valley, then walkers that start on one hill can take a really, really long time to wander over to the other one.
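To make the walker picture concrete, here is a minimal sketch of a single random-walk Metropolis walker on a two-peaked target. This is plain Metropolis, not the tempered, annealed, or adiabatic variants, and everything in it (the names bimodal_density and metropolis, the Gaussian bumps, the step size, the seed) is my own assumption for illustration.

```python
import math
import random

def bimodal_density(x, separation=6.0):
    """Unnormalized target: two unit-width Gaussian bumps centered at
    +separation/2 and -separation/2 (an illustrative choice)."""
    a = x - separation / 2
    b = x + separation / 2
    return math.exp(-0.5 * a * a) + math.exp(-0.5 * b * b)

def metropolis(density, start, n_steps, step_size=1.0, seed=0):
    """One random-walk Metropolis walker; returns its position after every step."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_steps):
        # Propose a random jump in the input space.
        proposal = x + rng.gauss(0.0, step_size)
        # Accept with probability min(1, p(proposal) / p(x)); otherwise stay put.
        if rng.random() < density(proposal) / density(x):
            x = proposal
        samples.append(x)
    return samples

if __name__ == "__main__":
    for sep in (2.0, 6.0, 12.0):
        chain = metropolis(lambda x: bimodal_density(x, sep),
                           start=sep / 2, n_steps=20000)
        # Count how often the walker switches sides of the valley at x = 0.
        crossings = sum(1 for a, b in zip(chain, chain[1:]) if (a > 0) != (b > 0))
        print(f"separation {sep:5.1f}: {crossings} valley crossings")
```

Running it for a few separations should show the walker hopping between hills freely when they are close and only rarely, if at all, when the valley between them is deep.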
Parallel-tempered MCMC and annealed MCMC try to get around the problem using a thermodynamic analogy. They essentially "heat" the walkers so that they can move around more easily, then "cool" them to trap them in local hilltops. This does kind of work... but unfortunately, the speed at which you heat and cool matters a lot, and there's no great way to pick a heating or cooling speed. Enter adiabatic MCMC. I'm not going to pretend to understand the mathematics of adiabatic MCMC yet, but I *will* pretend to understand Andrew Gelman's metaphor for it here: http://ift.tt/1oNbdSs. Essentially, tempered MCMC treats temperature as a tunable knob, which it turns manually to achieve nice mixing. Adiabatic MCMC, in contrast, manages temperature by connecting the walker system to a heat bath with some temperature, and lets energy flow between the two according to their relative temperatures. Don't ask me how that actually *happens* at an algorithmic level, but the bottom line is that this more-or-less solves the problem of temperature adjustment. (Addendum: This seems like an awful lot of effort and complex math for relatively little gain. Why do I care about tempered MCMC and adiabatic MCMC? Well, for Technical Reasons, tempered MCMC lets you compare the probabilities of different models in a Bayesian framework, according to some data. That's pretty powerful stuff, and it's really tricky to do in general. However, I've been warned by actual statisticians that using tempered MCMC for model comparison is Dangerous, so I'm interested in anything that can more safely replace it.)
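Back to item 1 for a moment: the carriage-return trick can be sketched in a few lines of Python. This is only a sketch under my own assumptions. The names render_bar and run, the bar width, and the sleep standing in for real work are all made up for illustration, and the output only looks right on a terminal that treats '\r' as "go back to the start of the line".

```python
import sys
import time

def render_bar(done, total, width=22):
    """Build one snapshot of the bar, e.g. '[#####-----]  50%'."""
    filled = width * done // total
    percent = 100 * done // total
    return "[" + "#" * filled + "-" * (width - filled) + f"] {percent:3d}%"

def run(total=50, delay=0.02, out=sys.stdout):
    """Redraw the bar in place once per loop iteration."""
    for i in range(1, total + 1):
        time.sleep(delay)  # stand-in for the real work in the loop
        # '\r' moves the cursor to the start of the line, so the new bar
        # is written over the old one instead of below it.
        out.write("\r" + render_bar(i, total))
        out.flush()  # no newline is written, so flush to show the update now
    out.write("\n")  # finish the line so later output starts fresh

if __name__ == "__main__":
    run()
```

The bar always has the same length here, which matters: a shorter line written after a '\r' leaves the tail of the old line on screen.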
