Session 1_2. What are R and RStudio?
Questions
- What is R?
- Why use R?
- Why not use R?
- Why use RStudio and how does it differ from R?
Learning Objectives
- Know advantages of analyzing data in R
- Know advantages of using RStudio
- Be able to start RStudio on your computer
- Identify the panels of the RStudio interface
- Be able to customize the RStudio layout
Introduction
In this session, we will discuss the basics of R and RStudio, the two essential tools for data visualization in this course. We will cover the advantages of using R and RStudio, how to set up RStudio, and the different panels of the RStudio interface.
What is R?
R is
- A programming language
- Designed for statistical computing and graphics
- Widely used by statisticians, data scientists, and researchers for data analysis and visualization
- An open-source language, which means it is free to use, modify, and distribute
- Equipped with extensive libraries and powerful data manipulation capabilities
- Separate from RStudio
Why use R?
There are several reasons why R is a popular choice for data analysis, which include:
- Open-source: R is free to use and has a large community of developers who contribute to its growth and development.
- Extensive libraries: There are thousands of R packages available for a wide range of tasks, including specialized packages (e.g. bioinformatics). Many of these packages are widely used and well tested, and they are available for free.
- Data manipulation: R has powerful data manipulation capabilities, making it easy (or at least possible) to clean, process, and analyze large datasets.
- Graphics and visualization: R has excellent tools for creating high-quality graphics and visualizations that can be customized to meet the specific needs of your analysis. In most cases, graphics produced by R are publication-quality.
- Reproducible research: R enables you to create reproducible research by recording your analysis in a script, which can be easily shared and executed by others.
- Cross-platform: R runs on Windows, macOS, and Linux (as well as more obscure systems).
- Interoperability with other languages: R can interface with Fortran, C, and many other languages.
- Scalability: R is useful for small and large projects.
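As a small illustration of the data manipulation and graphics points above, here is a minimal sketch. It assumes the mtcars dataset that ships with R; the dataset and the object name mpg_by_cyl are choices made for this example only, not part of the course material.

```r
# Summarise a built-in dataset: mean miles per gallon by number of cylinders.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(mpg_by_cyl)

# Draw a simple bar chart of the summary with base R graphics.
barplot(
  mpg_by_cyl$mpg,
  names.arg = mpg_by_cyl$cyl,
  xlab = "Number of cylinders",
  ylab = "Mean miles per gallon",
  main = "Fuel economy in mtcars"
)
```

Because these lines live in a script, anyone with R installed can rerun them and get the same summary and plot, which is the idea behind reproducible research.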
Why not use R?
- R cannot do everything.
- R is not always the “best” tool for the job.
- R will not hold your hand. Often, it will slap your hand instead.
- The documentation can be opaque (but there is documentation).
- Finding the right package to do the job you want to do can be challenging; worse, some contributed packages are unreliable.
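The documentation is at least easy to reach from within R. The following sketch shows a few built-in ways to look things up; the function mean is used purely as an example.

```r
# Open the help page for a function (typing ?mean at the console does the same).
help("mean")

# List objects on the search path whose names start with "mean".
matches <- apropos("^mean")
print(matches)

# Run the examples given at the bottom of the help page.
example(mean)
```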
R License
R is free (yes, totally free!) and distributed under the GNU General Public License (GPL). In particular, this license allows one to:
- Download the source code
- Modify the source code to your heart’s content
- Distribute the modified source code and even charge money for it, but you must distribute the modified source code under the same GPL terms
This license means that R will always be available, will always be open source, and can grow organically without constraint.
What is RStudio?
- RStudio is an IDE (Integrated Development Environment)
- Provides a graphical user interface (GUI) for R
- Includes useful features such as a built-in console, syntax-highlighting editor, and tools for plotting, history, debugging, workspace management, and workspace viewing
- The Console is where R is actually running
  - You can work here “interactively”
  - Run a single command and see the result (for example, 2 + 2)
  - This is also where RStudio will run code written in the text editor
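For example, a short interactive session at the console might look like the sketch below. The object name total is illustrative only.

```r
# Typed at the console prompt, each line is evaluated immediately.
2 + 2

# Store a result in an object, then reuse it.
total <- 2 + 2
total * 10

# Functions are called the same way.
sqrt(16)
```

Each expression prints its result directly below the command, while the assignment to total prints nothing and simply stores the value.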
Getting started with RStudio
To get started with RStudio, you first need to install both R and RStudio on your computer. Follow these steps:
- Download and install R from the official R website.
- Download and install RStudio from the official RStudio website.
- Launch RStudio. You should see the RStudio interface with four panels.
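Once RStudio is open, one quick way to confirm that it has found your R installation is to ask R for its version in the console. This is a minimal sketch; the object name r_version is illustrative.

```r
# Print the version of R that this session is running.
r_version <- R.version.string
print(r_version)

# Show more detail: platform, locale, and attached packages.
sessionInfo()
```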
The RStudio Interface
RStudio’s interface consists of four “panels” (see Figure 2):
- The Source for your scripts and documents (top-left, in the default layout)
- Your Environment/History (top-right), which shows all the objects in your workspace (Environment) and your command history (History)
- Your Files/Plots/Packages/Help/Viewer (bottom-right)
- The R Console (bottom-left)
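To see how the panels relate to each other, you can try a sketch like the following in the console. The object names heights_cm and course_name are made up for this example.

```r
# Create two objects; in RStudio they appear in the Environment panel.
heights_cm <- c(150, 162, 175)
course_name <- "Data visualization"

# List the objects in the current workspace from the console.
ls()

# Draw a plot; in RStudio it appears in the Plots tab.
plot(heights_cm)

# Remove an object; it disappears from the Environment panel.
rm(course_name)
ls()
```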
You do not need RStudio to use R. However, RStudio makes it easier to write and execute R code, and it provides several useful features that are not available in the basic R console. Note that the only part of RStudio that interacts with R directly is the console. The other panels simply provide a GUI that enhances the user experience.
You can customize the layout of RStudio to suit your preferences. To do so, go to Tools > Global Options. Under Appearance, you can change the theme and font size; under Pane Layout, you can rearrange the panels. You can also resize the panels as needed to gain screen real estate (see Figure 3).
Summary
R and RStudio are powerful tools for data analysis and visualization. By understanding the advantages of using R and RStudio and familiarizing yourself with the RStudio interface, you can efficiently analyze and visualize your data. In the following sessions, we will delve deeper into the functionality of R to help you gain a comprehensive understanding of data analysis and visualization.