Monday, October 10, 2016
October 11, 2016 at 01:11AM
Today I Learned:

1) You know those fancy prepackaged salads with multiple lettuces of different colors? Apparently those lettuces are grown together, in parallel rows, so that they can be harvested, processed, and packaged together in the field for shipment. The effect is quite beautiful.

2) Last night I ran an experiment evolving agents in a series of iterated prisoner's dilemmas*. The agents were limited to "reactive strategies", in which each agent probabilistically decides whether to cooperate or defect based only on the opponent's last move, with the cooperation rate for each possible previous move tunable between 0 and 1. Each generation, every agent plays an iterated prisoner's dilemma against some fraction of the other agents (one at a time). The total fitness of each agent is calculated, then a new population of agents is randomly selected from the original (with duplication possible), where each agent's probability of being picked for the next generation is proportional to its fitness. Finally, each new agent's parameters are jiggled slightly to add variation to the next round of selection. (There's a sketch of what this loop might look like in code after item 2c below.) Here are a couple of lessons I've learned from the results:

2a) Taking a random move (Cauchy-distributed, in this case, which is like a bell curve but with a greater probability of large results) along one axis and then another independent random move along the other axis isn't even *close* to a good way to generate random spread around a point (this came up when generating mutations between generations); the second sketch below shows why.

2b) Tit-for-tat, which is generally considered a really, really good strategy in an iterated prisoner's dilemma, doesn't seem to be stable in my evolution simulator. If you start with all of the agents clustered in the "cooperate after the opponent cooperates; defect after the opponent defects" corner of strategy space, they pretty quickly start to spread out. Notably, around generation 20 I got a cluster of strategies that defected a lot, and they seemed to do much better than the tit-for-tatters. By about generation 65, the agents were spread all over strategy space. I've read that tit-for-tat is actually really vulnerable to "mistakes" in gameplay, as happens when an agent's probability of cooperating in response to cooperation isn't quite 1. If you have two agents that copy each other with 99% fidelity, then eventually one of them will mess up. When it does, they get stuck in a vicious cycle of betrayal, which isn't very profitable for either one (the third sketch below puts a number on just how unprofitable). Or maybe I've added too much mutation, and nothing is actually stable. Or, alternatively, my population size (100) could be too small to overcome genetic drift at the level of selection I'm applying.

2c) Some possible evidence to support the "too much mutation" hypothesis: in a much longer simulation (5000 generations), it looks like there's a ton of wandering around, without much in the way of stable strategies. I only printed out population snapshots every 50 generations, so it's hard for me to say right now exactly how the populations are moving, but they're definitely shifting around a lot between snapshots. A common -- but not universal -- trend is that defecting when the opponent cooperates is a highly profitable strategy, whereas whether or not the agents cooperate after a defection doesn't seem to matter much. I don't like the way I handled mutation in last night's run, so I've fixed it and will start another simulation tonight, this time sampling every generation (and perhaps not running quite as long).
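For concreteness, here's a minimal sketch of the kind of loop I'm describing in 2). The payoff values (the standard T=5, R=3, P=1, S=0), the mutation scale, the number of opponents per agent, and the assumption that everyone acts as if the opponent cooperated before round one are stand-ins, not necessarily what last night's run used:

import random

# Payoff to the "me" player: PAYOFF[(my_move, their_move)].
# Standard values T=5, R=3, P=1, S=0 (assumed, not necessarily my run's).
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

class ReactiveAgent:
    """A reactive strategy is just two numbers: p = P(cooperate | opponent
    cooperated last round) and q = P(cooperate | opponent defected)."""
    def __init__(self, p, q):
        self.p, self.q = p, q

    def move(self, opponents_last):
        rate = self.p if opponents_last == 'C' else self.q
        return 'C' if random.random() < rate else 'D'

def play_ipd(a, b, rounds=50):
    """Iterated game; both agents act as if the opponent cooperated before
    round one. Returns each agent's total score."""
    last_a = last_b = 'C'
    score_a = score_b = 0
    for _ in range(rounds):
        ma, mb = a.move(last_b), b.move(last_a)
        score_a += PAYOFF[(ma, mb)]
        score_b += PAYOFF[(mb, ma)]
        last_a, last_b = ma, mb
    return score_a, score_b

def clamp(x):
    return min(1.0, max(0.0, x))

def generation(pop, opponents_per_agent=10, mutation_scale=0.02):
    # Each agent plays an IPD against a random sample of the others.
    fitness = [0.0] * len(pop)
    for i, agent in enumerate(pop):
        others = [k for k in range(len(pop)) if k != i]
        for j in random.sample(others, opponents_per_agent):
            my_score, _ = play_ipd(agent, pop[j])
            fitness[i] += my_score
    # Fitness-proportional selection, with replacement...
    survivors = random.choices(pop, weights=fitness, k=len(pop))
    # ...then jiggle each new agent's parameters slightly.
    return [ReactiveAgent(clamp(a.p + random.gauss(0, mutation_scale)),
                          clamp(a.q + random.gauss(0, mutation_scale)))
            for a in survivors]

pop = [ReactiveAgent(0.99, 0.01) for _ in range(100)]  # start near tit-for-tat
for _ in range(100):
    pop = generation(pop)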
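And here's a second sketch, illustrating the mutation pitfall from 2a: independent Cauchy steps on each axis make the big mutations hug the axes, because it's very unlikely that both coordinates come up large at once. Drawing one Cauchy-distributed radius in a uniformly random direction keeps the spread isotropic. (The names and the 0.05 scale are again stand-ins.)

import math, random

def cauchy():
    # Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2)).
    return math.tan(math.pi * (random.random() - 0.5))

def per_axis_step(scale=0.05):
    # Independent Cauchy draw on each axis -- the problematic version.
    return scale * cauchy(), scale * cauchy()

def isotropic_step(scale=0.05):
    # One heavy-tailed radius, uniformly random direction.
    r = scale * abs(cauchy())
    theta = random.uniform(0, 2 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)

def axis_hugging(step_fn, trials=100_000, big=1.0):
    # Of the large steps, what fraction land nearly on an axis?
    near_axis = total_big = 0
    for _ in range(trials):
        dx, dy = step_fn()
        if math.hypot(dx, dy) > big:
            total_big += 1
            if min(abs(dx), abs(dy)) / max(abs(dx), abs(dy)) < 0.1:
                near_axis += 1
    return near_axis / max(total_big, 1)

print("per-axis:  ", axis_hugging(per_axis_step))   # most big jumps hug an axis
print("isotropic: ", axis_hugging(isotropic_step))  # direction roughly uniform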
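Finally, a sketch of the "mistakes" point from 2b: two nearly-deterministic tit-for-tat agents copying each other with 99% fidelity. Once errors creep in, the pair ends up cycling through all four move combinations about equally often, so the long-run payoff sinks toward (3+0+5+1)/4 = 2.25 points per round instead of the 3.0 of unbroken mutual cooperation:

import random

PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def noisy_tft(opponents_last, fidelity=0.99):
    # Copy the opponent's last move, but slip up 1% of the time.
    if random.random() < fidelity:
        return opponents_last
    return 'D' if opponents_last == 'C' else 'C'

def average_payoff(rounds=100_000):
    last_a = last_b = 'C'
    total = 0
    for _ in range(rounds):
        ma, mb = noisy_tft(last_b), noisy_tft(last_a)
        total += PAYOFF[(ma, mb)] + PAYOFF[(mb, ma)]
        last_a, last_b = ma, mb
    return total / (2 * rounds)

print(average_payoff())  # roughly 2.25, well below perfect cooperation's 3.0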
3) My new favorite party music playlist: http://ift.tt/2d8WCNW

* I'm assuming most of my readership knows what a prisoner's dilemma is; if I'm wrong about that, let me know and I'll explain in the comments.
Labels: IFTTT, TodayILearned