Welcome to R
Introduction to the R Programming Language
R is a programming language, with object-oriented features, designed for data manipulation, data visualization and statistical computing. It provides various operators for matrix and vector manipulation. Much like Python, R provides a command-line interpreter for easy access. Be it primary school math, high school math or higher mathematics, you can do it all in R’s interactive command-line interpreter.
> 3 + 2
[1] 5
> log10(100)
[1] 2
So if you are new to R, you have come to the right place. Let’s dive into the world of R and learn some basics of R programming.
Creating a variable
R lets you build a variable name from letters, numbers and the special symbols dot (.) and underscore (_). A name must start with a letter, or with a dot that is not followed by a digit. There is no need to declare a datatype before assignment.
> data_1 = 3
> data.2 <- 4
> Data_1 = 5
> data_1
[1] 3
> Data_1
[1] 5
As you can see, both = and <- work as assignment operators. R is a case-sensitive language, so a and A are different variables, just as data_1 and Data_1 hold different values above.
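To illustrate that no datatype has to be declared, here is a minimal sketch. The variable names x and X are hypothetical and chosen only for this example; class() is a base R function that reports the type of a value.

```r
# Hypothetical names `x` and `X`, used only for illustration.
x <- 3
print(class(x))   # [1] "numeric"

x <- "three"      # the same name can later hold a value of another type
print(class(x))   # [1] "character"

X <- TRUE         # `X` is a separate variable from `x` (case sensitivity)
print(class(X))   # [1] "logical"
```

The type follows the value that is currently assigned, not the name.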
Basic Arithmetic Operations
Addition, subtraction, multiplication, division and exponentiation are done using their respective operators (+, -, *, /, ^). Apart from these, there are built-in functions for other arithmetic operations.
- Square Root : sqrt()
- Absolute Value : abs()
- Logarithmic Function : log() uses e as the default base. We can change the base with the base argument, as in log(100,base=10). Since log base 2 and log base 10 are common, there are built-in functions for these two: log2() and log10() respectively.
- Exponential Function : exp()
- Trigonometric Functions : sin(), cos(), tan() functions take a radian value as input.
- Combination : choose(n, k) computes the number of ways to choose k items out of n.
> sqrt(9)
[1] 3
> abs(2^3-3^2)
[1] 1
> cos(2*pi) + cos(pi)
[1] 0
> log2(64)
[1] 6
> log(64) # treated as ln(64)
[1] 4.158883
> choose(4,2) # 4!/((4-2)!*2!)
[1] 6
You must have noticed the word pi. It is a built-in constant in R (not a reserved keyword), so it can be used directly. # is used for comments, so anything to the right of # is ignored by the R interpreter.
You can also use the print() function to display data, or simply type the variable name.
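As a small sketch of the functions listed above, written as a script with print() rather than at the interactive prompt. The variable names x and deg are hypothetical; the degree-to-radian conversion is added here only to show that the trigonometric functions expect radians.

```r
# Hypothetical names `x` and `deg`, used only for illustration.
x <- log(100, base = 10)     # same result as log10(100)
print(x)                     # [1] 2

print(exp(1))                # Euler's number e: [1] 2.718282

deg <- 90                    # an angle in degrees
print(sin(deg * pi / 180))   # converted to radians first: [1] 1
```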
So far we have seen some basic arithmetic functions. Now let us explore some more advanced mathematics using R.
Matrix
To create a matrix in R we use the matrix() function. It takes the following (optional) parameters:
- data (the data vector) : default value = NA
- nrow : default value = 1
- ncol : default value = 1
- byrow : default value = FALSE
- dimnames : default value = NULL
The data vector supplies the values of the matrix, passed as a vector. Wait! Vector?
A vector is an array or sequence of data of a single type: numeric, character (string), logical (boolean) etc. But how do we define a vector in R? To answer this question we need to explore a new function called combine, or c().
Combine Function
The combine function is written as c(). It combines a sequence of values into a vector. To understand it better, let us see an example.
> data = c(1,2,3,4)
> data
[1] 1 2 3 4
We pass a sequence of numbers to the function c() and store the result in the variable data. Notice that data is now a vector. So now we know what the combine function does and how a vector is created. Let’s get back to the other parameters of the matrix() function.
After the data vector we have nrow and ncol, which take integers as the number of rows and columns respectively. The byrow parameter tells the interpreter how values from the data vector are to be stored. If this parameter is not set, values are stored in column-first order.
Booleans can be written as TRUE/FALSE or abbreviated as T/F. Note that T and F are ordinary variables that can be reassigned, so TRUE and FALSE are the safer choice.
The dimnames parameter is used to specify names for rows and columns. It takes a list of two vectors: row names first, then column names.
> v = c(1,2,3,4,5,6)
> data = matrix(v,nrow = 3, ncol = 2, byrow = T,dimnames = list(c("I","II","III"),c("A","B")))
> data
    A B
I   1 2
II  3 4
III 5 6
> data_2 = matrix()
> data_2
     [,1]
[1,]   NA
# As you can see, without any arguments the default values were used.
> data_3 = matrix(v)
> data_3
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
[5,]    5
[6,]    6
# byrow is FALSE by default, so values are inserted by column, and since the number of rows and columns is not specified, all values go into a single column.
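To make the effect of byrow easier to see, here is a minimal sketch that fills the same vector both ways. The names by_col and by_row are hypothetical, and the [row, column] indexing is used here only to inspect single elements.

```r
# Hypothetical names `by_col` and `by_row`, used only for illustration.
v <- c(1, 2, 3, 4, 5, 6)

by_col <- matrix(v, nrow = 2)                 # filled column by column (default)
by_row <- matrix(v, nrow = 2, byrow = TRUE)   # same shape, filled row by row

print(by_col)
print(by_row)

# Element in row 2, column 1 of each matrix
print(by_col[2, 1])   # [1] 2
print(by_row[2, 1])   # [1] 4
```

Only nrow is given here, so the number of columns is worked out from the length of the vector.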
Vector
There are many built-in functions for vectors which can come in handy.
- length() : As the name suggests, it returns the number of elements in a vector.
- sum() : It returns the arithmetic sum of all elements in a vector.
- prod() : It returns the product of all elements in a vector.
- cumsum() and cumprod() : They return the cumulative sum and the cumulative product of a vector.
- sort() : It returns the vector sorted in increasing order by default.
- diff() : It returns the sequence of differences between consecutive elements of a vector.
> v = c(1,-1,1)
> length(v)
[1] 3
> prod(v)
[1] -1
> sort(v)
[1] -1  1  1
> cumsum(v) # 1, 1+(-1)=0, 0+1=1 => 1,0,1
[1] 1 0 1
> cumprod(v) # 1, 1*(-1)=-1, -1*(1)=-1 => 1,-1,-1
[1]  1 -1 -1
> diff(v) # (-1)-1=-2, 1-(-1)=2 => -2,2
[1] -2  2
The diff() function has two parameters, lag and differences. By default, each element is subtracted from the one that follows it: the first from the second, the second from the third and so on. This is because the default value of both lag and differences is 1. The lag parameter takes an integer greater than or equal to 1. If lag = 2, diff() subtracts the first element from the third, the second from the fourth and so on. If we change the differences parameter to 2, diff() performs the difference operation twice, first on the original vector and then on the output of the first diff operation. Let’s understand by example.
> v = c(3,5,6,8)
> diff(v) # 5-3,6-5,8-6
[1] 2 1 2
> diff(v, lag=2) # 6-3,8-5
[1] 3 3
> diff(v, differences = 2) # 5-3,6-5,8-6 => 2,1,2 => 1-2,2-1
[1] -1  1
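The following sketch checks the description of differences = 2 by applying diff() twice by hand, and also shows sum() and a decreasing sort, which were not demonstrated above. The name twice is hypothetical.

```r
v <- c(3, 5, 6, 8)

print(sum(v))                        # 3+5+6+8: [1] 22
print(sort(v, decreasing = TRUE))    # [1] 8 6 5 3

# Hypothetical name `twice`: diff() applied to the output of diff()
twice <- diff(diff(v))
print(twice)
print(all.equal(twice, diff(v, differences = 2)))   # [1] TRUE

print(diff(v, lag = 3))              # 8-3: [1] 5
```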
Built-in functions for Matrix
R is well suited to mathematical work, so it comes with built-in functions for some heavy mathematics. Here are the functions for some basic matrix operations that make our life easy.
- dim() : It returns the dimensions of a matrix
- %*% : It is used to multiply two matrices, e.g. A%*%B
- t() : It returns the transpose of a matrix
- det() : It returns the determinant of a square matrix
- solve() : It returns the inverse of a square, invertible matrix
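The functions in this list can be tried out as follows. This is a small sketch; the matrices A and B and the name A_inv are hypothetical and chosen only so the results are easy to check by hand.

```r
# Hypothetical matrices, both filled column by column.
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # 2 x 2: rows are (2, 1) and (1, 3)
B <- matrix(1:6, nrow = 2)             # 2 x 3: rows are (1, 3, 5) and (2, 4, 6)

print(dim(B))       # [1] 2 3
print(dim(t(B)))    # transpose swaps the dimensions: [1] 3 2

print(A %*% B)      # matrix product, a 2 x 3 matrix

print(det(A))       # 2*3 - 1*1: [1] 5

A_inv <- solve(A)   # inverse of A
print(round(A %*% A_inv, 10))   # identity matrix, up to rounding error
```

Note that * multiplies matrices element by element, while %*% performs matrix multiplication.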
Before I leave you to practice all this new information, there’s one more topic which can be helpful when creating vectors or matrices.
Sequences
We have already seen how to create a vector from a sequence of data. However, there are other ways to define a sequence as well.
We can use the colon (:) to define a sequence, as in 1:9, which means the sequence of integers from 1 to 9.
> 1:4
[1] 1 2 3 4
> v = 1:4
> v
[1] 1 2 3 4
An enhancement to this method is the seq() function, which can also set the increment.
> seq(1,4,by=2) # start from 1 and increment by 2
[1] 1 3
> seq(1,5,by=3)
[1] 1 4
Sometimes we know the initial and final values and the length of the vector, but not the increment. seq() has a length.out parameter for such situations; length, as used below, works as an abbreviation of it.
> seq(1,5,length=3)
[1] 1 3 5
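A few more seq() calls, as a minimal sketch using the full argument name length.out and a negative increment. The values are arbitrary and chosen only for illustration.

```r
print(seq(1, 5, length.out = 3))   # [1] 1 3 5
print(seq(10, 1, by = -3))         # counting down: [1] 10  7  4  1
print(seq(0, 1, length.out = 5))   # [1] 0.00 0.25 0.50 0.75 1.00
```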
What if we want to create a sequence of the same number, like 1,1,1,1,1,1? R provides the function rep() for this purpose. This can be useful for creating a zero vector or matrix.
> rep(0,4)
[1] 0 0 0 0
> matrix(rep(1,4),2)
     [,1] [,2]
[1,]    1    1
[2,]    1    1
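rep() can also repeat a whole vector, and a zero matrix can be built without rep() at all. The sketch below shows both; the name zeros is hypothetical, and times and each are arguments of the base R rep() function.

```r
print(rep(c(1, 2), times = 2))   # repeat the whole vector: [1] 1 2 1 2
print(rep(c(1, 2), each = 2))    # repeat each element:     [1] 1 1 2 2

# Hypothetical name `zeros`: a single value is recycled to fill the matrix
zeros <- matrix(0, nrow = 2, ncol = 3)
print(dim(zeros))                # [1] 2 3
print(sum(zeros))                # [1] 0
```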
That is all, folks. I hope you found this material useful. But to digest this information, we need practice.
So Keep Practicing and Happy Coding ! 😄