Recoding/Labeling Variables

T.E.G. · 2018/03/24 · 3 minute read

This is a simple topic, and there are various ways of recoding and (re)labeling variables in R. The rec() function from the sjmisc package (Lüdecke 2018) works best for me. Give it a try:

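library(tidyverse)
library(strengejacke) # loads sjmisc, sjlabelled, sjPlot and related packages

mtcars %>%
  select(disp, mpg, cyl) %>%
  # recode cyl into a new column cyl_r; text in [brackets] becomes the value label
  rec(cyl, rec = "4=1 [low];
                  6=2 [mid];
                  8=3 [high]") %>%
  to_label(cyl_r) # replace the numeric codes with their value labels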
##     disp  mpg cyl cyl_r
## 1  160.0 21.0   6   mid
## 2  160.0 21.0   6   mid
## 3  108.0 22.8   4   low
## 4  258.0 21.4   6   mid
## 5  360.0 18.7   8  high
## 6  225.0 18.1   6   mid
## 7  360.0 14.3   8  high
## 8  146.7 24.4   4   low
## 9  140.8 22.8   4   low
## 10 167.6 19.2   6   mid
## 11 167.6 17.8   6   mid
## 12 275.8 16.4   8  high
## 13 275.8 17.3   8  high
## 14 275.8 15.2   8  high
## 15 472.0 10.4   8  high
## 16 460.0 10.4   8  high
## 17 440.0 14.7   8  high
## 18  78.7 32.4   4   low
## 19  75.7 30.4   4   low
## 20  71.1 33.9   4   low
## 21 120.1 21.5   4   low
## 22 318.0 15.5   8  high
## 23 304.0 15.2   8  high
## 24 350.0 13.3   8  high
## 25 400.0 19.2   8  high
## 26  79.0 27.3   4   low
## 27 120.3 26.0   4   low
## 28  95.1 30.4   4   low
## 29 351.0 15.8   8  high
## 30 145.0 19.7   6   mid
## 31 301.0 15.0   8  high
## 32 121.0 21.4   4   low
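
For comparison, here is a rough base R sketch of the same recode, with no extra packages. It is not how rec() works internally, only one of the "various ways" mentioned above. The data frame name cars is my own choice for illustration.

```r
# Base R sketch of the same recode: map cyl values 4/6/8 to a labeled factor.
cars <- mtcars[, c("disp", "mpg", "cyl")]

# factor() pairs each level with the label in the same position
cars$cyl_r <- factor(cars$cyl,
                     levels = c(4, 6, 8),
                     labels = c("low", "mid", "high"))

# Cross-tabulate original and recoded values to check the mapping
table(cars$cyl, cars$cyl_r)
```

The rec() version is shorter once you recode several variables at once, and it keeps the numeric codes alongside the labels, which a plain factor does not.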

There are other useful functions to transform and recode variables. For example:

mtcars %>%
  select(disp, mpg, cyl) %>%
  rec(cyl, rec = "4=1 [low];
                  6=2 [mid];
                  8=3 [high]") %>%
  split_var(disp, n = 4) %>% # split variable into n equal-sized groups
  dicho(mpg, dich.by = "mean") # dichotomize variable based on a criterion
##     disp  mpg cyl cyl_r disp_g mpg_d
## 1  160.0 21.0   6     2      2     1
## 2  160.0 21.0   6     2      2     1
## 3  108.0 22.8   4     1      1     1
## 4  258.0 21.4   6     2      3     1
## 5  360.0 18.7   8     3      4     0
## 6  225.0 18.1   6     2      3     0
## 7  360.0 14.3   8     3      4     0
## 8  146.7 24.4   4     1      2     1
## 9  140.8 22.8   4     1      2     1
## 10 167.6 19.2   6     2      2     0
## 11 167.6 17.8   6     2      2     0
## 12 275.8 16.4   8     3      3     0
## 13 275.8 17.3   8     3      3     0
## 14 275.8 15.2   8     3      3     0
## 15 472.0 10.4   8     3      4     0
## 16 460.0 10.4   8     3      4     0
## 17 440.0 14.7   8     3      4     0
## 18  78.7 32.4   4     1      1     1
## 19  75.7 30.4   4     1      1     1
## 20  71.1 33.9   4     1      1     1
## 21 120.1 21.5   4     1      1     1
## 22 318.0 15.5   8     3      3     0
## 23 304.0 15.2   8     3      3     0
## 24 350.0 13.3   8     3      4     0
## 25 400.0 19.2   8     3      4     0
## 26  79.0 27.3   4     1      1     1
## 27 120.3 26.0   4     1      1     1
## 28  95.1 30.4   4     1      1     1
## 29 351.0 15.8   8     3      4     0
## 30 145.0 19.7   6     2      2     0
## 31 301.0 15.0   8     3      3     0
## 32 121.0 21.4   4     1      2     1
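
To make clear what these two helpers are doing, here is a minimal base R sketch of the same idea: quartile groups for disp and a split of mpg at its mean. This is an approximation under my own assumptions, not the sjmisc implementation; values that fall exactly on a group boundary may be handled differently by split_var(). The names cars and breaks are only illustrative.

```r
cars <- mtcars[, c("disp", "mpg", "cyl")]

# Quartile break points for disp (0%, 25%, 50%, 75%, 100%)
breaks <- quantile(cars$disp, probs = seq(0, 1, by = 0.25))

# cut() assigns each value to an interval and returns the group number;
# include.lowest = TRUE keeps the minimum value in the first group
cars$disp_g <- cut(cars$disp, breaks = breaks,
                   labels = FALSE, include.lowest = TRUE)

# 1 if mpg is above the sample mean, 0 otherwise
cars$mpg_d <- as.integer(cars$mpg > mean(cars$mpg))

head(cars)
```

For mtcars, the first rows of this sketch agree with the disp_g and mpg_d columns shown above.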

I recommend checking out the other strengejacke packages, especially sjPlot, which provides a collection of table and plotting functions. Here is the link to the package website: sjPlot.

References

Lüdecke, Daniel. 2018. sjmisc: Miscellaneous Data Management Tools. https://CRAN.R-project.org/package=sjmisc.