Chapter 7 Dataframes in R
7.1 Introduction
In this chapter, we will learn to:
- create dataframes
- select columns
- select rows
- use utility functions
7.2 Create dataframes
Use data.frame to create dataframes. Below is the function syntax:
args(data.frame)
## function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,
## fix.empty.names = TRUE, stringsAsFactors = FALSE)
## NULL
Data frames are lists whose elements (the columns) all have the same length. Because they are lists, they can be heterogeneous: each column may hold a different type. Let us create a dataframe:
name <- c('John', 'Jack', 'Jill')
age <- c(29, 25, 27)
graduate <- c(TRUE, TRUE, FALSE)
students <- data.frame(name, age, graduate)
students
##   name age graduate
## 1 John  29     TRUE
## 2 Jack  25     TRUE
## 3 Jill  27    FALSE
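Because a data frame is a list of equal-length columns, the usual list tools work on it. The sketch below rebuilds the students example and checks this. The tryCatch wrapper and the names col_classes and mismatch are only illustrative and are not part of the chapter's example.

```r
# Rebuild the example so this block runs on its own
name <- c('John', 'Jack', 'Jill')
age <- c(29, 25, 27)
graduate <- c(TRUE, TRUE, FALSE)
students <- data.frame(name, age, graduate)

# A data frame is a list under the hood
is.list(students)

# Each column keeps its own type
col_classes <- sapply(students, class)
col_classes

# Columns whose lengths cannot be recycled to a common length raise an error;
# tryCatch captures the message instead of stopping the script
mismatch <- tryCatch(
  data.frame(name, age = c(29, 25)),
  error = function(e) conditionMessage(e)
)
mismatch
```

The last call fails because name has three elements and age has only two, which is the "equal length" requirement in practice.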
7.3 Basic Information
class(students)
## [1] "data.frame"
names(students)
## [1] "name" "age" "graduate"
colnames(students)
## [1] "name" "age" "graduate"
str(students)
## 'data.frame': 3 obs. of 3 variables:
## $ name : chr "John" "Jack" "Jill"
## $ age : num 29 25 27
## $ graduate: logi TRUE TRUE FALSE
dim(students)
## [1] 3 3
nrow(students)
## [1] 3
ncol(students)
## [1] 3
7.4 Select Columns
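A minimal sketch of column selection with base R operators, assuming the students data frame from section 7.2. The `$` and `[[` operators return a single column as a vector, while `[` returns a data frame.

```r
# Rebuild the example so this block runs on its own
name <- c('John', 'Jack', 'Jill')
age <- c(29, 25, 27)
graduate <- c(TRUE, TRUE, FALSE)
students <- data.frame(name, age, graduate)

# single column as a vector
students$age
## [1] 29 25 27
students[['age']]

# single column, kept as a data frame
students['age']

# multiple columns by name
students[c('name', 'graduate')]

# multiple columns by position
students[, 1:2]
```

Which form to use depends on what the next step expects: a vector for computations such as mean(students$age), or a data frame when the result is passed on as a table.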
7.5 Select Rows
# single row
students[1, ]
##   name age graduate
## 1 John  29     TRUE
# multiple rows
students[c(1, 3), ]
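##   name age graduate
## 1 John  29     TRUE
## 3 Jill  27    FALSE
In versions of R before 4.0.0, the name column would have been coerced to type factor, because the stringsAsFactors argument of data.frame defaulted to TRUE. Since R 4.0.0 the default is FALSE, as the function syntax and the str output earlier in this chapter show, so character columns stay as characters. If your code must also run on older versions of R and you do not want strings treated as factors, set the argument to FALSE explicitly: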
students <- data.frame(name, age, graduate, stringsAsFactors = FALSE)
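To see what stringsAsFactors changes, the sketch below builds the same data frame both ways and compares the class of the name column. The names as_chr and as_fct are illustrative.

```r
name <- c('John', 'Jack', 'Jill')
age <- c(29, 25, 27)
graduate <- c(TRUE, TRUE, FALSE)

# strings kept as character vectors
as_chr <- data.frame(name, age, graduate, stringsAsFactors = FALSE)
class(as_chr$name)
## [1] "character"

# strings converted to factors
as_fct <- data.frame(name, age, graduate, stringsAsFactors = TRUE)
class(as_fct$name)
## [1] "factor"
levels(as_fct$name)
## [1] "Jack" "Jill" "John"
```

Only the character column is affected; age and graduate keep their numeric and logical types in both versions.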