References, "Reproducible research tools" course, BIOS 692
General
-
A collection of links to learning resources about Unix, shell best practices, R and python tools for genomics. https://github.com/crazyhottommy/getting-started-with-genomics-tools-and-resources
-
“Tools for Reproducible Research” course by Karl Broman. http://kbroman.org/Tools4RR/
-
“Steps towards reproducible research” resources and reading. http://kbroman.org/steps2rr/pages/resources.html
-
Software Carpentry lessons on Unix, version control, automation, R & Python programming. http://software-carpentry.org/lessons/
-
Software Carpentry reading material on software engineering and scientific computing. http://software-carpentry.org/reading/
-
UW-Madison Software Carpentry Workshop covering best practices of coding. https://github.com/UW-Madison-ACI/boot-camps
-
Linux basics and R basics manuals and tutorials by Thomas Girke. https://sites.google.com/a/bioinformatics.ucr.edu/bioinformatics-manuals/home/linux-basics and https://sites.google.com/a/bioinformatics.ucr.edu/bioinformatics-manuals/home/R_BioCondManual
-
“R for Data Science” book by Garrett Grolemund & Hadley Wickham, covers the ecosystem of R tools for data analysis and visualization done right. http://r4ds.had.co.nz/
-
Biomedical Data Science Workshops by Stephen Turner. From R basics through data manipulation with dplyr, visualization with ggplot2, reproducible research with knitr. http://bioconnector.org/workshops/index.html
-
“Technical Foundations of Informatics” by Michael Freeman and Joel Ross. Introduction to R, Rmarkdown, plotly, shiny, git and github. https://info201-s17.github.io/book/index.html
-
Video: “How not to fool yourself with p-values and other statistics” by Regina Nuzzo, NIH Videocast. https://videocast.nih.gov/launch.asp?23420
-
“Guide to give talks” by Jeff Leek. https://github.com/jtleek/talkguide
-
Video: “How to Speak”, lecture tips from Patrick Winston. https://vimeo.com/101543862
-
Video: “How to Write a Great Research Paper” by Professor Simon Peyton Jones, Microsoft Research. https://www.youtube.com/watch?v=g3dkRsTqdDA
Steps in reproducible research
Overview
-
Geir Kjetil Sandve, Anton Nekrutenko, James Taylor, and Eivind Hovig. “Ten Simple Rules for Reproducible Computational Research.” PLoS Computational Biology 2013. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003285
-
List, Markus, Peter Ebert, and Felipe Albrecht. “Ten Simple Rules for Developing Usable Software in Computational Biology.” PLoS Computational Biology 2017. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005265
-
Millman, K Jarrod, and Fernando Pérez. “Developing Open-Source Scientific Practice.” Implementing Reproducible Research. 2014. http://www.jarrodmillman.com/oss-chapter.html. A thorough and practical account of all steps in computational reproducible research.
-
“Ten simple rules” collection of essays covering all professional aspects of a scientific career. http://collections.plos.org/ten-simple-rules
-
“Statistics for biologists” one-pagers about statistics, methods, and reproducibility by Nature journals. http://www.nature.com/collections/qghhqm
-
“Points of significance” collection of statistical primers by Nature Methods. https://www.nature.com/collections/qghhqm/pointsofsignificance
-
“Computational Biology Primers” one- or two-pagers on various topics of genomics and bioinformatics by Nature Biotechnology journal. https://liacs.leidenuniv.nl/~hoogeboomhj/mcb/nature_primer.html
Linux/bash basics
-
A Book for Anyone to Get Started with Unix. http://seankross.com/the-unix-workbench/, and the GitHub repository, https://github.com/seankross/the-unix-workbench
-
Data Coding 101 – Intro To Bash. Four episodes, video. https://data36.com/data-coding-bash-best-practices/
-
An interactive explainer of any shell command. http://explainshell.com/
-
Unix/Linux command reference sheets. https://cheat-sheets.s3.amazonaws.com/linux-commands-cheat-sheet-new.pdf and https://files.fosswire.com/2007/08/fwunixref.pdf
-
Survival guide for Unix newbies. http://matt.might.net/articles/basic-unix/
-
Settling into Unix tutorial. http://matt.might.net/articles/settling-into-unix/
-
Shell programming with bash tutorial. http://matt.might.net/articles/bash-by-example/
-
Master the power of command-line with a list of one-liner gems. http://www.commandlinefu.com/commands/browse
-
“The Unix shell”, Software Carpentry. https://swcarpentry.github.io/shell-novice/
-
A curated list of Terminal frameworks, plugins & resources for command-line interface (CLI) lovers. http://terminalsare.sexy and https://github.com/k4m4/terminals-are-sexy
Text manipulation with grep, awk, sed, vim
-
Tutorial on sed by Bruce Barnett. http://www.grymoire.com/Unix/Sed.html
-
Vim introduction and tutorial. https://blog.interlinked.org/tutorials/vim_tutorial.html
-
Interactive Vim tutorial. http://www.openvim.com/
-
Vim reference card. http://web.mit.edu/merolish/Public/vi-ref.pdf
Automating everything
Best practices of data/code organization
-
Tips for organizing projects from Karl Broman. http://kbroman.org/steps2rr/pages/organize.html
-
Organizing data in spreadsheets tutorial. http://kbroman.org/dataorg/. Or, read the paper https://github.com/kbroman/Paper_DataOrg
-
Clean Code, best practices for function names, patterns and anti-patterns, and more on good programming practices. http://www.cbs.dtu.dk/courses/27610/clean_code_index.html
-
Ten Simple Rules for Robustifying Your Software. https://github.com/oicr-gsi/robust-paper
-
“Code and Data for the Social Sciences: A Practitioner’s Guide” book by Matthew Gentzkow and Jesse Shapiro, PDF. https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf
-
Wilson, Greg, D. A. Aruliah, C. Titus Brown, Neil P. Chue Hong, Matt Davis, Richard T. Guy, Steven H. D. Haddock, et al. “Best Practices for Scientific Computing.” PLoS Biology 2014. http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001745
-
Organization of files, folders, code, by Data Carpentry. https://github.com/datacarpentry/rr-organization1
Makefiles
-
“A minimal tutorial on make” by Karl Broman. http://kbroman.org/minimal_make/
-
“Learning about Makefiles” by Dave Tang. http://davetang.org/muse/2015/05/31/learning-about-makefiles/
-
Automation and Make by Software Carpentry. https://swcarpentry.github.io/make-novice/
RStudio, R functions & packages
-
Transform repeated code into functions. http://kbroman.org/steps2rr/pages/functions.html
-
How to package functions. http://kbroman.org/steps2rr/pages/packages.html
-
Package tutorial by Hilary Parker. https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/
-
R package primer by Karl Broman. http://kbroman.org/pkg_primer/
-
“R packages” book by Hadley Wickham. http://r-pkgs.had.co.nz/
-
Jeff Leek on developing R packages. https://github.com/jtleek/rpackages
-
sinew R package for making templates of help headers for functions. https://github.com/metrumresearchgroup/sinew
-
pRojects R package for making project templates. https://github.com/lockedata/pRojects
Reproducible reports
Literate programming with Markdown/KnitR
-
“Turn scripts into reproducible reports” by Karl Broman. http://kbroman.org/steps2rr/pages/reports.html
-
“R Markdown” tutorial by Karl Broman. http://kbroman.org/knitr_knutshell/pages/markdown.html and http://kbroman.org/knitr_knutshell/pages/Rmarkdown.html
-
“A quick introduction to R/markdown” presentation by Peter Ralph, and some R Markdown gotchas (advanced). http://petrelharp.github.io/r-markdown-tutorial/using-rmarkdown.slides.html, and https://petrelharp.github.io/r-markdown-tutorial/gotchas.html
-
R Markdown guides from RStudio. https://support.rstudio.com/hc/en-us/articles/205368677-R-Markdown-Dynamic-Documents-for-R
-
R markdown reference sheets. https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf and https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf
-
Slidify: Modern, simple presentations written in R Markdown. https://benjaminlmoore.wordpress.com/2014/02/24/slidify-presentations-in-r-markdown/
-
Xaringan, presentation template based on remark.js by Yihui Xie. https://github.com/yihui/xaringan
-
md2googleslides, Markdown to Google Slides converter. https://github.com/googlesamples/md2googleslides
-
An R package to produce posters. https://github.com/pzhaonet/postr. And, a collection of templates to make posters, presentations and publications in R Markdown. https://github.com/exporl/kuleuven-templates
-
Create beautiful and semantically meaningful articles with pandoc. Example at https://pandoc-scholar.github.io, how to at https://github.com/pandoc-scholar/pandoc-scholar
-
Create PowerPoint presentations from R with the OfficeR package, R-bloggers. https://www.r-bloggers.com/create-powerpoint-presentations-from-r-with-the-officer-package/amp/
-
CV and resume in Markdown. https://github.com/ryanpeek/markdown_cv
Data manipulation and visualization in R
-
Data Manipulation Using R (& dplyr). PDF slides available at https://ramnarasimhan.files.wordpress.com/2014/10/data-manipulation-using-r_acm2014.pdf, and http://www.slideshare.net/Ram-N/data-manipulation-using-r-acm2014
-
Data Manipulation with dplyr. http://datascienceplus.com/data-manipulation-with-dplyr/
-
“Aggregating and analyzing data with dplyr” by Data Carpentry. http://www.datacarpentry.org/R-genomics/04-dplyr.html
-
Do your “data janitor work” like a boss with dplyr. http://www.gettinggeneticsdone.com/2014/08/do-your-data-janitor-work-like-boss.html
-
“Data visualization in R” by Data Carpentry. http://www.datacarpentry.org/R-genomics/05-data-visualization.html
-
“ggplot2 tutorial/slides/code examples/references” by Jenny Bryan. https://github.com/jennybc/ggplot2-tutorial
-
“R Graph Catalog”, visuals and code examples of graphs made with ggplot2. http://shiny.stat.ubc.ca/r-graph-catalog/
-
R Seminar: Introduction to ggplot2, comprehensive introduction, from UCLA. http://www.ats.ucla.edu/stat/r/seminars/ggplot2_intro/ggplot2_intro.htm
Reproducible presentation and web-publishing
-
Easy web publishing from R on Rpubs.com. http://rpubs.com/
-
“Easy websites with GitHub Pages” by Karl Broman. http://kbroman.org/simple_site/
-
Creating web documentation with Bookdown. https://github.com/rstudio/bookdown
-
Example, “Bookdown: Authoring Books with R Markdown” by Yihui Xie. https://bookdown.org/yihui/bookdown/
-
Github Pages template for academic personal websites. https://github.com/academicpages/academicpages.github.io
-
“Beautiful Jekyll,” Build a beautiful and simple website in minutes. http://deanattali.com/beautiful-jekyll/
Version control, sharing and collaboration
- How to share data with a statistician, by Jeff Leek's group. https://github.com/jtleek/datasharing
Git/GitHub
-
Simple one-page git guide. https://rogerdudler.github.io/git-guide/
-
One-pager of git commands. https://github.com/kbroman/Tools4RR/blob/master/04_Git/GitCommands/git_notes.md
-
Learn git interactively in 15 min. https://try.github.io/levels/1/challenges/1
-
Interactive git branching tutorial. http://learngitbranching.js.org/
-
“Git and GitHub guide” by Karl Broman. http://kbroman.org/github_tutorial/
-
Software Carpentry course on git. https://swcarpentry.github.io/git-novice/
-
Book “Version Control by Example” by Eric Sink. http://ericsink.com/vcbe/
-
Book(down) “Happy Git and GitHub for the useR” by Jenny Bryan. http://happygitwithr.com/
-
How to create pull requests. https://akrabat.com/the-beginners-guide-to-contributing-to-a-github-project/
-
Quick Git and GitHub videos. http://www.dataschool.io/git-and-github-videos-for-beginners/
-
GitHub training videos. https://www.youtube.com/user/GitHubGuides/videos
Licenses
-
“Licensing”, Software Carpentry. https://swcarpentry.github.io/git-novice/11-licensing.html
-
“Pick a License, Any License”. https://blog.codinghorror.com/pick-a-license-any-license/
-
“License your software” by Karl Broman. http://kbroman.org/steps2rr/pages/licenses.html
-
Morin, Andrew, Jennifer Urban, and Piotr Sliz. “A Quick Guide to Software Licensing for the Scientist-Programmer.” PLoS Computational Biology 2012. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002598
Data/code sharing repositories
- Goodman, Alyssa, Alberto Pepe, Alexander W. Blocker, Christine L. Borgman, Kyle Cranmer, Merce Crosas, Rosanne Di Stefano, et al. “Ten Simple Rules for the Care and Feeding of Scientific Data.” PLoS Computational Biology 2014. Lists all major data sharing repositories. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003542