We’re well into the season of prospective graduate students visiting graduate schools and trying to decide where to spend possibly the next five or more years of their lives working pretty darn hard at something they hopefully love. There are a lot of things to consider when thinking about graduate schools, and far more thorough lists exist than I could hope to write, but one thing that many students probably don’t consider is health insurance (I know I didn’t).
“Under-appreciated” is definitely hard to quantify. However, these tools dramatically simplify my day-to-day workflow, and most people I talk to have only heard of them from me. So, my goal with this post is to introduce people to some useful programs they might not have heard of before. Since I use Linux Mint as my primary operating system, all of these programs are Linux-compatible. Most of them are also compatible with other operating systems.
I’ve been lucky in my graduate career to never need to TA to fund myself. However, my advisor made the excellent point that it is important to know whether you like teaching if you’re considering becoming a professor, and I honestly didn’t know whether I did or not. I have to admit that I was pretty intimidated by the idea of TAing at all, since we’ve all heard the horror stories of crazy amounts of work and horrible students. Therefore, when asked what my first choice of a class would be, I decided that the intro programming class that teaches brand-new students Python was a safe bet (my undergraduate career was pretty Python-heavy). I figured I should be fairly familiar with all the topics covered and wouldn’t need to brush up much, given that I still do a lot of scripting in Python.
The other day, I was reading a paper (paywall) by Rayfield et al. on using graph and network theory to quantify properties of ecological landscapes. It is a review summarizing:
- what properties of landscape networks we might want to measure,
- structural levels within networks at which we might want to measure these properties (e.g., node, neighborhood, connected component),
- and metrics that can be used to measure a given property at a given structural level.
The authors found that there was dramatic variation in the number of metrics available in these different categories.
I was particularly struck by this comment, offering a potential explanation for the complete lack of component-level route redundancy metrics:
“This omission could be attributed to, first, the importation of measures from other disciplines that prioritized network efficiency over network redundancy…”
In a later post I’m going to introduce a new artificial life system which I’m working on. This new system is based on Markov Network Brains so I figured I’d take a little time to talk about them. Markov Brains use binary variables and arbitrary logic to implement deterministic or probabilistic finite state machines. They have been used to study behavior, character recognition and game theory among other topics. The majority of the work with Markov Brains has been done in Chris Adami’s lab.
A Markov Brain consists of three parts:
- a set of binary variables called the Brain State
- a collection of logic gates
- connections between the variables and the gates
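To make the three parts above concrete, here is a minimal sketch in Python. Everything in it is illustrative rather than drawn from any particular Markov Brain implementation: the variable names, the dictionary-based gate representation, and the update rule (all gates read the old state, then write the new one) are my own simplifications, shown here with deterministic gates only.

```python
# The Brain State: a set of binary variables.
brain_state = [0, 1, 0, 1]

# Each gate reads some variables, looks up its truth table (inputs packed
# into a binary number), and writes the result to its output variables.
gates = [
    {"inputs": [0, 1], "outputs": [2],
     "table": {0: [0], 1: [1], 2: [1], 3: [0]}},  # XOR gate
    {"inputs": [2, 3], "outputs": [0],
     "table": {0: [0], 1: [0], 2: [0], 3: [1]}},  # AND gate
]

def step(state, gates):
    """One update: every gate reads the old state, then writes the new one."""
    new_state = list(state)
    for gate in gates:
        key = 0
        for i in gate["inputs"]:
            key = (key << 1) | state[i]  # pack input bits into a table index
        for out_index, bit in zip(gate["outputs"], gate["table"][key]):
            new_state[out_index] = bit
    return new_state

print(step(brain_state, gates))  # one tick of the finite state machine
```

A probabilistic Markov Brain would replace each truth-table row with a probability distribution over output patterns and sample from it, which is how the same architecture implements probabilistic finite state machines.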
One of my stepping stones towards becoming an evolutionary biologist was playing with engaging programs that combine evolution and artificial life. Although these “games” are often intended for neither educational nor research purposes, I found them instrumental in developing an appreciation for the power and creativity of evolution by natural selection. Here is a collection of my favorites, in no particular order:
I recently discovered the Paper Machines add-on to Zotero, which allows you to perform visualizations and topic modeling analyses on papers in your Zotero collection. I just so happened to have the complete proceedings of both GECCO 2014 and ALife 2014 kicking around in my Zotero database, so I decided to try comparing them. As a quick background, GECCO, which focuses on Genetic and Evolutionary Computation, and ALife, which focuses on Artificial Life, are the two main computer science* conferences that we in the Devolab tend to go to. There is substantial overlap between these conferences (GECCO has an Artificial Life track, after all), but there are also some fundamental differences in approach and focus.
“What I cannot create, I do not understand”
— Richard Feynman
In the Devolab, we use artificial life systems to improve our understanding of evolutionary dynamics. Specifically, we perform experiments on populations of self-replicating digital organisms that evolve in a natural and unconstrained manner. But why did we decide to focus on artificial life? What are the advantages and drawbacks of using these relatively complex computational systems?
Back in December, PNAS published a paper from Oliveira et al. called “Evolutionary limits to cooperation in microbial communities”. My research interests lie right at the intersection of evolution and cooperation, so I was fascinated by the idea that evolution imposes limits on cooperation. In this paper, Oliveira et al. examine the evolutionary dynamics of a community of microbes that can exchange a number of valuable secretions between different strains. This experimental setup enables the evolution of cooperation if one genotype focuses on producing one secretion and shares it with a different strain, while also gaining access to that other strain’s secretions. Ultimately, however, they found that cooperation only evolved under specific and limited conditions, because of the fitness cost an individual pays when it isn’t close enough to receive secretions from another strain.
Because I enjoy reading papers that are not in my immediate specialty, I frequently track the interesting information I learn in both the background literature review section and the main results of the study. In this feature, I discuss those bits of information that I found relevant to my interests and just generally cool.
I flip-flop between Python, R, and D3 for my data visualizations depending on what exactly I’m doing. My default, though, is definitely Python. One of the most well-established data visualization libraries in Python is Matplotlib. If you dig deep enough in it, you can find a wide variety of features beyond standard graphs. One of the less well-documented of these features is the animation library. The FuncAnimation class in particular is quite powerful, allowing you to programmatically generate the frames for your animation and compile them together. Jake VanderPlas has a great tutorial on using FuncAnimation which I’m not going to try to duplicate. Here, I’m just going to focus on a small but critical aspect of using FuncAnimation that is glossed over elsewhere: blitting.
Here’s how critical blitting is: My first attempted Matplotlib animation took around an hour to render. That wasn’t going to work. Thanks to blitting, I can now render the same animation in under a minute.
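The speedup comes from passing blit=True to FuncAnimation: with blitting, Matplotlib caches the static background (axes, labels, ticks) and on each frame redraws only the artists your update function returns, instead of the whole figure. Here is a minimal sketch of the pattern; the sine-wave content and the headless Agg backend are my own choices for a self-contained example, not anything from the post.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
ax.set_xlim(0, 2 * np.pi)
ax.set_ylim(-1.1, 1.1)
x = np.linspace(0, 2 * np.pi, 200)
(line,) = ax.plot([], [])

def init():
    # Drawn once so the cached background starts clean.
    line.set_data([], [])
    return (line,)

def update(frame):
    line.set_data(x, np.sin(x + frame / 10))
    # With blit=True, ONLY the artists returned here are redrawn each frame.
    return (line,)

anim = FuncAnimation(fig, update, frames=60, init_func=init, blit=True)
```

The key discipline blitting imposes is that init and update must return every artist that changes between frames; anything you forget to return simply won’t be redrawn.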