Introduction to R Programming

R is a programming language and a free software environment. It is used for predictive analysis, statistical analysis, graphical representation, data modeling, and reporting.

R Programming Language

R is an interpreted programming language.

In an interpreted programming language, source code runs without a separate compilation step.

Hence, R is often called a scripting language, because scripting languages are usually interpreted rather than compiled.

R is a domain-specific language, designed mainly for statistical computing and graphics.

R is a high-level programming language.

As a high-level language, R provides strong abstraction from the details and inner workings of the computer.

R is a versatile programming language.

R mixes several programming paradigms. At its core, R is an imperative language: the user writes a script, and its statements are executed one at a time. R supports object-oriented features such as encapsulation, where data and functions are bundled together inside a class. It also supports functional programming: functions are first-class objects, which means they behave like any other object and can be treated like variables. This mixture of imperative, object-oriented, and functional programming explains why R has a lot in common with several other languages.

The R Software Environment

R is an integrated suite of software facilities for calculation, data manipulation, and graphical display. The term ‘environment’ is meant to characterize it as a fully planned and coherent system, rather than an incremental accretion of specific and inflexible tools, as is often the case with other data analysis software. The R environment has a very large collection of packages. Most programs written in R are essentially ephemeral, written for a single piece of data analysis. Many statisticians, data analysts, data miners, researchers, and marketers prefer R for retrieving, cleaning, analyzing, visualizing, and presenting data.

Evolution of R Programming

The R language was created by Ross Ihaka and Robert Gentleman in the early 1990s.

Ross Ihaka and Robert Gentleman worked together in the statistics department at the University of Auckland, New Zealand.

R first appeared in 1993.

The R language is based on the S language.

The S language was developed mainly by John Chambers in the 1970s at Bell Laboratories.

R is open-source software and part of the GNU project; it is released under the GNU General Public License (GPL).

The R language and its software are developed by a group of people known as the ‘R Core Team’.

Features of R Programming

R is platform independent, meaning it runs on all major operating systems.

R is a simple, well-developed, and effective programming language that includes conditionals, loops, user-defined recursive functions, and input-output facilities.

R has effective data handling and storage facilities.

R provides a suite of operators for calculations on arrays, lists, vectors, and matrices.

R provides a large, coherent, integrated collection of standard tools for the data analysis process.

R provides graphical facilities for data analysis, with output displayed directly on screen or printed on paper.

R can integrate with other programming languages such as C, C++, Java, and Python.

R has a vibrant online community of users, and the community keeps growing over time.
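Several of the language features listed above can be tried directly at the R prompt. The following is a minimal sketch using only base R; the names v, m, and factorial_rec are illustrative and do not come from the original text.

```r
# Vectorised arithmetic: operators apply element by element.
v <- c(1, 2, 3, 4)
v_squared <- v^2

# Matrix operators: %*% is matrix multiplication, t() transposes.
m <- matrix(1:4, nrow = 2)  # filled column by column
m_product <- m %*% t(m)

# A loop with a conditional: count the even values in v.
even_count <- 0
for (value in v) {
  if (value %% 2 == 0) {
    even_count <- even_count + 1
  }
}

# A user-defined recursive function.
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

print(v_squared)
print(m_product)
print(even_count)
print(factorial_rec(5))
```

Note that no explicit loop is needed for v^2: most arithmetic operators in R work on whole vectors at once.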

Advantages of R Programming

Open Source

R is an open-source software system. The project is released under the GNU General Public License (GPL) and is freely available.

Cross-Platform Compatible or Platform Independent

R is platform independent, meaning it runs on operating systems such as Windows, Linux, and macOS.

Advanced Statistical Language

R supports advanced statistical features. Many packages extend it to cover advanced concepts in statistics.

Outstanding Graphs

Graphical representation is one of the most important features of R. It provides multiple facilities for drawing graphs effectively.
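As a rough illustration of these graphical facilities, the sketch below draws a line plot and a histogram with base R graphics. It writes to a temporary PDF file so that it also runs without a display; the variable names are illustrative, and mtcars is a sample dataset that ships with R.

```r
# Open a PDF graphics device that writes to a temporary file.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

# Page 1: a line plot of the sine function.
x <- seq(0, 2 * pi, length.out = 100)
plot(x, sin(x), type = "l", main = "Sine curve", xlab = "x", ylab = "sin(x)")

# Page 2: a histogram of fuel consumption from the built-in mtcars data.
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")

# Close the device so the file is written to disk.
invisible(dev.off())
print(plot_file)
```

In an interactive session, the pdf() and dev.off() calls can be dropped, and the plots appear in a graphics window instead.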

Flexible and Fun

R is very flexible to work with, and it has a user-friendly interface.

Extremely Comprehensive

Data miners use R for predictions, data analysts use it for data analysis, and enterprises use it for graphical representation and better visualizations. Scientists and statisticians prefer R for data modeling. Because R covers almost all of these tasks, it is considered comprehensive.

Supports Extensions

R is extended with many packages that provide advanced features. The R Core Team develops R, and R is extended by adding new packages to it.

Vast Community

R has a broad community of online users. As the popularity of R increases, more enthusiastic users join this community.

A Mixture of Programming Paradigms

R is a mixture of imperative, object-oriented, and functional programming, which is why it resembles other languages. For example, the curly brackets ‘{’ and ‘}’ used in R code look like those in C.
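The three paradigms can be seen side by side in a short sketch. This is one possible illustration, not the only way to write such code: the Account class and the names square and apply_twice are hypothetical, and the object-oriented part uses reference classes (setRefClass), which is just one of several object systems in R.

```r
# Imperative style: statements run one after another inside { } blocks.
total <- 0
for (i in 1:5) {
  total <- total + i
}

# Functional style: functions are first-class objects.
square <- function(x) x^2               # a function stored in a variable
apply_twice <- function(f, x) f(f(x))   # a function that takes a function
squares <- sapply(1:3, square)          # a function passed as an argument

# Object-oriented style: a reference class bundles data and methods.
Account <- setRefClass(
  "Account",
  fields = list(balance = "numeric"),
  methods = list(
    deposit = function(amount) {
      balance <<- balance + amount
      invisible(.self)
    }
  )
)
acct <- Account$new(balance = 100)
acct$deposit(50)

print(total)
print(apply_twice(square, 3))
print(squares)
print(acct$balance)
```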

Applications of R Programming

Weather services

Weather services use R to predict severe flooding.

Social networking company

To monitor user experience, most of the social networking companies use R.

Newspaper companies

Newspaper companies use R to create infographics and interactive data journalism applications.

Statisticians

Statisticians use R for developing statistical software and for data analysis.

Data scientists

Data scientists use R for predictive analysis.

Health-Care Industry

R is a widely preferred choice in the health-care industry.

Why should you adopt R programming?

R is one of the best tools for data analysis, machine learning, and statistics. The language allows you to create custom objects, functions, and packages. Because an analysis in R is written as code, the steps of the analysis are recorded, which makes it easy to reproduce reports and update them. As a result, you can try out many ideas and tackle practical problems quickly.

R is platform independent; it runs smoothly on all major operating systems. R can integrate with other languages and communicate with multiple data sources, such as ODBC-compliant databases and other statistical packages (e.g., SAS, Stata, SPSS, and Minitab).

Role of R in the Analysis of Data

Data analysis is a systematic, step-by-step procedure that involves multiple stages: programming, transforming, discovering, modeling, and communicating the result. R helps in every phase of data analysis. Let us look at the role of R in each phase.

Programming

Programming is one of the crucial phases of data analysis. For coding, R is a clear and accessible programming tool.

Transforming

Transforming plays a vital role in data analysis. A transformation changes data from one form to another, and it is usually applied to the data several times, so the format or appearance of the data changes at each step. Well-shaped data widens the scope for prediction and helps in the next phase of analysis. R has a rich collection of libraries designed specifically for data science.
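A few typical transformations can be sketched with base R alone. The sales data frame below is made up purely for illustration, and add-on packages offer other ways to express the same steps.

```r
# A small, made-up data frame (values are for illustration only).
sales <- data.frame(
  region = c("north", "south", "north", "east"),
  units  = c(10, 25, 5, 40),
  price  = c(2.5, 2.0, 2.5, 1.5),
  stringsAsFactors = FALSE
)

# Add a derived column.
sales$revenue <- sales$units * sales$price

# Keep only the rows that meet a condition.
large_sales <- subset(sales, units >= 10)

# Summarise: total revenue per region.
revenue_by_region <- aggregate(revenue ~ region, data = sales, FUN = sum)

# Sort by revenue, largest first.
revenue_by_region <- revenue_by_region[order(-revenue_by_region$revenue), ]

print(large_sales)
print(revenue_by_region)
```

Each step takes a data frame and returns a new one, which is why the same data often passes through several transformations in a row.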

Discovering

This phase uses the result of the transformation phase. The first step in data discovery is to investigate the data. You then refine the assumptions you have made about the data and analyze them carefully.
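One way to investigate a dataset is sketched below with base R functions, using mtcars, a sample dataset that ships with R. The question about weight and fuel use is an assumed example of checking an assumption, not something taken from the original text.

```r
# Inspect the structure: column types and a preview of the values.
str(mtcars)

# Look at the first few rows.
print(head(mtcars, 3))

# Summary statistics for one column: minimum, quartiles, mean, maximum.
print(summary(mtcars$mpg))

# Check an assumption: is fuel economy (mpg) related to weight (wt)?
mpg_weight_cor <- cor(mtcars$mpg, mtcars$wt)
print(mpg_weight_cor)

# Compare groups: mean mpg by number of cylinders.
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
print(mpg_by_cyl)
```

A negative correlation here indicates that heavier cars in this dataset tend to travel fewer miles per gallon.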

Modeling

R provides a vast array of modeling tools. These tools are used to find the right, practical model for your data.
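As a minimal sketch of modeling in R, the example below fits a simple linear regression with lm() on the built-in mtcars data. The choice of model and the hypothetical new_car are assumptions made for illustration; a real analysis would also check how well the model fits.

```r
# Fit a simple linear regression: mpg explained by weight.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Predict mpg for a hypothetical car weighing 3000 lbs
# (the wt column is measured in units of 1000 lbs).
new_car <- data.frame(wt = 3)
predicted_mpg <- predict(fit, newdata = new_car)
print(predicted_mpg)
```

Calling summary(fit) prints standard errors, p-values, and R-squared for the same model.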

Communicating the result

To communicate the outcome, either integrate code, output, and graphs into a report with R Markdown, or build an interactive Shiny app to share with the world.

Download and Installation of R for Windows

Downloading and installing R on your machine is a very easy task. Let us see the steps in detail:

Step 1

Open the link https://cloud.r-project.org

https://cloud.r-project.org is the official site, and it provides binary files for all major operating systems.

Step 2

Click the link Download R for Windows in the Download and Install R panel.

Step 3

Click on install R for the first time in the Subdirectories panel.

Step 4

Click on Download R 3.5.2 for Windows in the R-3.5.2 for Windows panel. The version number shown on the site may be newer.

The download of the installer file, R-3.5.2-win.exe, will start.

Step 5

Double-click on the downloaded R-3.5.2-win.exe file.

Step 6

Select the setup language; the default is English.

Click on OK.

Step 7

The installer displays the GNU General Public License.

Click on Next to continue the installation.

Step 8

Select the destination location.

Click on Next to continue.

Step 9

Select only the components that you want to install. By default, all components are selected.

Click on Next to continue the installation.

Step 10

Click on Next to continue the installation.

Step 11

Click on Next to continue; this creates an R folder to store the setup.

If you want to change the destination folder, click on Browse and select the path.

Step 12

Select the options as per your requirement.

Click on Next to continue the installation.

Installation begins.

Step 13

Click on Finish to complete the installation.
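Once the installation has finished, one simple way to confirm that R works is to start it and ask for its version. The sketch below uses only base R; the values printed depend on your machine and on the version you installed.

```r
# Print the version string of the running R installation.
print(R.version.string)

# Show where R is installed and which platform it was built for.
print(R.home())
print(R.version$platform)
```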

RStudio

RStudio is one of the most popular IDEs for R. It is a free and open-source integrated development environment (IDE) designed specifically for R. It was founded by J.J. Allaire and is written in Java, JavaScript, and C++.

RStudio has a well-organized interface that lets the user see graphs, R code, data tables, and output clearly and at the same time. It also offers wizard-like import features, so the user can easily import files such as Excel, SPSS (.sav), SAS (.sas7bdat), CSV, and Stata (.dta) without writing extra code. Download RStudio from rstudio.com and install it. Before installing RStudio, your system must have R installed.
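The import features mentioned above have code equivalents. For CSV files, base R provides read.csv(), sketched below on a small temporary file created just for this example. The other formats typically rely on add-on packages and are not shown here.

```r
# Write a small CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:3, score = c(90, 85, 77)),
  csv_path,
  row.names = FALSE
)

# read.csv() reads the file back into a data frame.
scores <- read.csv(csv_path)
print(scores)
```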

Example: Let us look at an example to understand R programming. Here, we will see how to display a message on the screen, which illustrates the basic syntax of R. You can run a program either from the command prompt or from an IDE. We will run the examples here from the command prompt.

Command Prompt/Terminal

Double-click on the R icon to start the R console.

If your terminal shows the ‘$’ prompt, type R and press Enter. This launches the R interpreter, which shows the ‘>’ prompt.

If the console already shows the ‘>’ prompt, you can start coding.

You can run a program in two ways:

Start coding on the terminal.

Using an R Script file.

Start coding on the Terminal

Example 1

print("HELLO JAVATPOINT!")

Explanation

Use the print function to display the message.

By default, print displays a character string enclosed in double quotes.

The output on R Console

[1] "HELLO JAVATPOINT!"
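The quotes and the [1] prefix come from how print formats a character vector. The short sketch below compares print with two alternatives from base R, in case the message should appear without quotes.

```r
# print() shows a character string with quotes and an index prefix.
print("HELLO JAVATPOINT!")

# quote = FALSE drops the quotes but keeps the [1] prefix.
print("HELLO JAVATPOINT!", quote = FALSE)

# cat() writes the raw text; "\n" adds the trailing newline.
cat("HELLO JAVATPOINT!\n")
```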

Example 2

# Store the message in a variable (names are case-sensitive in R).
msg <- "HELLO JAVATPOINT"
print(msg)

Explanation

In this program, we are storing the message in a variable. After that, we pass the variable to the print function to display the contents of that variable.

The assignment operator ‘<-’ initializes the variable.

Output

[1] "HELLO JAVATPOINT"

Start coding using an R Script file

Step 1

Open a text editor and create a file. Type the code in that file and save it as sample.R.

.R is the extension of the R Script file.

Step 2

Open the terminal and switch to the folder where the file is saved.

Type the following command to run the program.

Rscript sample.R

Once the program has executed, it shows the output.

Let us see an example of this.

Example 1

# Contents of sample.R
msg <- "Welcome to the world of R programming"
print(msg)

The output on R Console

[1] "Welcome to the world of R programming"
