Category Archives: Git/GitHub
An introduction to Git and GitHub can be found here. What follows is a simple command reference along with more advanced topics like work-flow forking, branch management and troubleshooting.
GitHub is a web-based hosting service for file archiving and version control. GitHub works with Git, a software tool installed locally. Data science projects use Git and GitHub to provide access to, and control of, project data, source code and narrative text files. In practice, RStudio provides a built-in interface to Git.
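To make the split between local Git and hosted GitHub concrete, here is a minimal sketch of the local half only. It assumes Git is installed; the throwaway directory, the file name and the author details are placeholders invented for illustration, not anything taken from this post.

```shell
set -eu

# Hypothetical scratch location so the example never touches a real project.
repo="$(mktemp -d)"

# Create an empty local repository.
git init -q "$repo"

# Add one file and record it as the first commit.
echo "# Demo project" > "$repo/README.md"
git -C "$repo" add README.md

# The author name and email are placeholders supplied only for this command.
git -C "$repo" -c user.name="Example Author" -c user.email="author@example.com" \
    commit -q -m "Add README"

# Show the recorded history.
git -C "$repo" log --oneline
```

Everything here happens on the local machine. Publishing the repository to GitHub would be a separate step that adds a remote and pushes to it.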
Version control is an essential feature of any project and the benefits are simple:
- Collaboration: Provide a central repository of files for collaboration;
Large data projects in R require consistent work flow principles. The goal is to improve project management. Above all, the work flow process has a clear priority: to shift time spent from low-value to high-value activity. The solution is simple: (1) use a basic project template to manage project files and directories; (2) write code with a common set of tools to improve code flow and efficiency; and (3) extend base R with well-accepted R packages that support improved work flow and code syntax.
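As a rough illustration of point (1), a project template can be as small as a fixed set of folders created the same way every time. The sketch below is one possible layout, not the template this post has in mind: the project name and the folder names (`data`, `R`, `output`, `docs`) are assumptions chosen for the example.

```shell
set -eu

# Hypothetical project location and name; replace with your own.
project="$(mktemp -d)/my-analysis"

# Assumed folder layout: raw data, R scripts, generated results, narrative text.
mkdir -p "$project/data" "$project/R" "$project/output" "$project/docs"

# A README gives collaborators a single starting point.
touch "$project/README.md"

# List the resulting template.
ls "$project"
```

Running the same script for each new project keeps file locations predictable, which is the point of a template: less time spent hunting for files and more on the analysis itself.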