Computing in the Cloud
Topic: Computing in the Cloud: What Every Computational Life Scientist Should Know
Meeting dates: April 6-8, 2014
Location: NIMBioS at the University of Tennessee, Knoxville
Russell Zaretzki, Statistics, Univ. of Tennessee
Michael Gilchrist, Ecology & Evolutionary Biology, Univ. of Tennessee
Eric Carr, NIMBioS, Univ. of Tennessee
George Ostrouchov, Oak Ridge National Laboratory, National Institute for Computational Sciences, and Univ. of Tennessee
Brian O'Meara, Ecology & Evolutionary Biology, Univ. of Tennessee
This tutorial brought together a diverse set of computational biologists and modelers who wanted to expand their expertise and learn how to harness big data and computation using the R language.
A wide range of HPC, cluster, and cloud computing resources is accessible to researchers, including Amazon EC2, NSF XSEDE, local clusters, and simple multiprocessor shared-memory machines. Participants learned about the strengths and weaknesses of the various platforms and how to enable R to use them. The strengths and limitations of R for big data and big computation were also discussed. Moving beyond these basics, further sessions gave participants hands-on experience and the chance to:
- Learn about the packages, tools, and data structures that are available in R for computing on HPC resources
- Understand tools such as Rcpp that allow R to easily interface with compiled code for improved performance
- Handle big matrix computations with the pbdR packages
- Produce elegant, publication-quality graphics with the ggplot2 package
In addition to the fundamentals, the tutorial gave attendees a perspective on how these tools can be put to use in biological research. Tutorial examples included applications such as Bayesian mixed models in genomics, phylogenetic biogeography, approximate Bayesian computation, and multivariate data reduction in ecological models. Finally, a special session on teaching with R provided insights on how to bring computational science research into the undergraduate classroom.
This hands-on tutorial gave participants an opportunity to begin applying these tools to their own problems. Presentations and sample code were available for all tutorial sessions. Attendees also consulted with presenters and platform experts to identify the right tools for their problems.
Participants should have a solid working knowledge of the R language. Experience with a lower-level programming language (C, C++, or Fortran) is also beneficial but not required.
Summary Report. TBA
Software, Data & Websites
Schmidt D, Chen WC, Ostrouchov G, Patel P. 2013. pbdBASE: An R package update. [Online]
Schmidt D, Chen WC, Ostrouchov G, Patel P, R Core Team. 2013. pbdDMAT: An R package update. [Online]
Evaluation report (PDF)