Sunday, December 27, 2015

big data, big ideas and a big gorilla

Not a raindrop, or a mushroom, or a dark-eyed junco, just a “>” and a “#” and a “/”.

I have been thinking a lot about the paradigm shift in biology towards the computational analysis of big datasets. The most important skill-set for modern biologists is no longer getting crazy dirty, sweaty and blissed-out studying wild populations in the field -- but understanding how to analyze big datasets, at your computer, sitting on your butt, in your half-office. 


Now, you can be a biologist and study gigabytes of information about the natural world (DNA sequences, gene expression data, geographic distributions, etc.) without ever being close to a bit of anything natural. Or you can be a biologist who studies the natural world, but does not really know the modern methods of study, just browsing the forest with a keen eye, an old-fashioned big idea (and a little secret). Is either of these biologists compromised, or can they both add valuable knowledge to our understanding of the world? Who wins: the indoorsy computer nerd, or the wild energetic explorer? Blessed are the unicorns who are both, but let’s imagine one isn’t.


These seemingly disparate activities are unified by thoughtfully designed research questions, and the traits of curiosity, critical thinking and grit. Wake up again and walk into the cool morning forest to find species x, even though you would rather rest more in your crappy tent. Wake up again and work on the script that won’t work, even though you would rather rest more on your pillow-top mattress. Nothing is working, so think of a new approach. Didn’t get the grant you applied for? Rewrite and resubmit. Paper got rejected? Resubmit to another journal. Whatever it is, forest or FASTA, poison or polytomy, keep going, keep looking and keep thinking critically and creatively.


So, what can we teach students who are interested in natural history and genetics and who now need to know Python and R? Do we start by teaching the mechanics of Python, and save the actual pythons for later? And what if you like catching snakes, but you aren’t good at, or drawn to, coding? Is there a place for you as a biologist in this new computational jungle?

E.O. Wilson said that you don’t have to be great at math to be a biologist, and some people freaked out about this. But what he said echoes into the practice of bioinformatics too. Can you be a computer dud, but still do science today? And will this new computational paradigm favor computer wizards who may not know, or care, about complex biological processes like species formation or migrations? And how do you learn to care about these abstract processes if not by seeing and knowing the breathtaking bird, or the scales of the snake, or the big gorilla itself?