Linguistics PhD Student at UC San Diego
data <- read.csv('./sampledata.csv')
head(data, 3)
## Movement Island_Type Island Distance Item
## 1 WH whe non sh 1
## 2 WH whe non sh 2
## 3 WH whe non sh 3
## Sentence Subj_id List Score
## 1 Who thinks that Paul stole the necklace? 1 1 6
## 2 Who thinks that Matt chased the bus? 1 1 2
## 3 Who thinks that Tom sold the television? 1 1 3
suppose that I want to change the values in the dataset such that the value “sh” under the Distance
column is “short”, and the value “lg” is “long”. Here are some ways to achieve this.
#example 1
data$Distance[data$Distance == "sh"] <- "short"
data$Distance[data$Distance == "lg"] <- "long"
#example 2
data = data %>% mutate(Distance = ifelse(Distance == "sh", "short", "long"))
And here’s one way to change the data type of multiple columns (from character to factor).
to_be_converted = c("Movement", "Island_Type", "Island", "Distance")
data[to_be_converted] = lapply(data[to_be_converted], as.factor)
glimpse(data)
## Rows: 512
## Columns: 9
## $ Movement <fct> WH, WH, WH, WH, WH, WH, WH, WH, WH, WH, WH, WH, WH, WH, WH…
## $ Island_Type <fct> whe, whe, whe, whe, whe, whe, whe, whe, whe, whe, whe, whe…
## $ Island <fct> non, non, non, non, non, non, non, non, isl, isl, isl, isl…
## $ Distance <fct> long, long, long, long, long, long, long, long, long, long…
## $ Item <int> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4…
## $ Sentence <chr> "Who thinks that Paul stole the necklace?", "Who thinks th…
## $ Subj_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ List <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ Score <int> 6, 2, 3, 7, 2, 4, 7, 2, 5, 2, 3, 5, 5, 2, 4, 4, 2, 3, 2, 7…
Alternatively, you can change the dtype of all columns that are of the same type like this:
data = data %>% mutate_if(is.character, as.factor) # or %>% mutate_if(is.integer, as.factor)