By default, `R` sorts the levels of a factor alphabetically. When drawing graphs, this results in ‘Alabama First’ graphs, and it’s usually better to sort the elements of a graph by more meaningful principles than alphabetical order. This post illustrates three convenience functions you can use to sort factor levels in `R` according to another covariate, their frequency of occurrence, or manually.

First you’ll need the `dplyr` and `magrittr` packages:
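Loading the two packages might look as follows. The commented-out installation line is an assumption for readers who don't have them installed yet.

```r
# Run once if the packages are not installed yet:
# install.packages(c("dplyr", "magrittr"))

library(dplyr)     # data manipulation verbs
library(magrittr)  # pipe operators such as %>%
```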

You can download the convenience functions from my GitHub page or read them directly into `R`:

### Sorting factor levels by another variable

The code below creates an example dataset with a factor and a covariate:
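A small dataset of this kind could be built as sketched here. The values and the names `df`, `fac` and `covar` are made up for illustration (they are not the post's original data); they are chosen so that the covariate means per level come out in the order a-b-e-c-d.

```r
# Illustrative data: 5 factor levels, 3 observations each (hypothetical values)
df <- data.frame(
  fac   = factor(rep(c("a", "b", "c", "d", "e"), each = 3)),
  covar = c(1, 2, 3,     # a: mean 2
            3, 4, 5,     # b: mean 4
            7, 8, 9,     # c: mean 8
            9, 10, 11,   # d: mean 10
            5, 6, 7)     # e: mean 6
)

levels(df$fac)                              # alphabetical by default
lvl_means <- tapply(df$covar, df$fac, mean) # covariate mean per factor level
lvl_means                                   # a = 2, b = 4, c = 8, d = 10, e = 6
```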

What we want is to sort the levels of the factor by the covariate mean per factor level (i.e., a-b-e-c-d). The function `sortLvlsByVar.fnc` accomplishes this:
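The idea behind this kind of sorting can be sketched in base `R`. This is not the code of `sortLvlsByVar.fnc()` itself, just a minimal illustration on made-up data: compute the covariate mean per level, order the level names by those means, and rebuild the factor with that level order.

```r
# Hypothetical example data (same shape as described above)
fac   <- factor(rep(c("a", "b", "c", "d", "e"), each = 3))
covar <- c(1, 2, 3,  3, 4, 5,  7, 8, 9,  9, 10, 11,  5, 6, 7)

# Mean of the covariate within each factor level
lvl_means <- tapply(covar, fac, mean)

# Level names, ordered from lowest to highest mean
ordered_lvls <- names(lvl_means)[order(lvl_means)]

# Rebuild the factor with the new level order; the data values stay the same
fac_sorted <- factor(fac, levels = ordered_lvls)
levels(fac_sorted)  # "a" "b" "e" "c" "d"
```

Only the order of the levels changes; the observations themselves are untouched.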

By setting the `ascending` parameter to `FALSE`, the factor levels are sorted in descending order of the covariate mean:
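To show how such an `ascending` switch could work, here is a hedged sketch of a helper. The name `sort_levels_by_mean()` and its signature are hypothetical and do not reproduce the convenience function from this post; the data are made up.

```r
# Hypothetical helper: sort factor levels by the mean of a covariate
sort_levels_by_mean <- function(fac, covar, ascending = TRUE) {
  lvl_means <- tapply(covar, fac, mean)
  # decreasing = !ascending flips the direction of the ordering
  ordered_lvls <- names(lvl_means)[order(lvl_means, decreasing = !ascending)]
  factor(fac, levels = ordered_lvls)
}

# Made-up example data
fac   <- factor(rep(c("a", "b", "c", "d", "e"), each = 3))
covar <- c(1, 2, 3,  3, 4, 5,  7, 8, 9,  9, 10, 11,  5, 6, 7)

fac_desc <- sort_levels_by_mean(fac, covar, ascending = FALSE)
levels(fac_desc)  # "d" "c" "e" "b" "a"
```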

This is what it looks like when graphed:
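A plot along these lines can be sketched with base graphics. The data are made up, and `reorder()` from base `R` is used here as a stand-in for the convenience function; by default it orders the levels by the mean of the second argument.

```r
# Made-up example data
fac   <- factor(rep(c("a", "b", "c", "d", "e"), each = 3))
covar <- c(1, 2, 3,  3, 4, 5,  7, 8, 9,  9, 10, 11,  5, 6, 7)

# Order the levels by covariate mean (reorder() uses FUN = mean by default)
fac_sorted <- reorder(fac, covar, FUN = mean)

# Boxes now appear in order of increasing mean instead of alphabetically
boxplot(covar ~ fac_sorted,
        xlab = "factor level (sorted by covariate mean)",
        ylab = "covariate")
```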

You can change the `R` code from the GitHub page so that the levels are sorted by another summary statistic, e.g., the covariate median per factor level.
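As an illustration of why the choice of statistic matters, the sketch below uses base `R`'s `reorder()` (not the convenience function from this post) on made-up data in which one outlier makes the mean-based and median-based orders differ.

```r
# Made-up data: level "b" contains an outlier (30)
fac   <- factor(rep(c("a", "b", "c"), each = 3))
covar <- c(4, 5, 6,   1, 2, 30,   7, 8, 9)

# Sorted by mean: a (5) < c (8) < b (11)
by_mean <- reorder(fac, covar, FUN = mean)
levels(by_mean)    # "a" "c" "b"

# Sorted by median: b (2) < a (5) < c (8)
by_median <- reorder(fac, covar, FUN = median)
levels(by_median)  # "b" "a" "c"
```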

### Sorting factor levels by their frequency of occurrence

Again we’ll first create some data:
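Data of this kind could be created as sketched below. The counts are invented for illustration (they are not the post's original data) and chosen so that the levels, from least to most frequent, run b-a-d-c-e.

```r
# Illustrative factor with unequal level frequencies (hypothetical counts):
# a occurs 2 times, b 1 time, c 4 times, d 3 times, e 5 times
fac <- factor(rep(c("a", "b", "c", "d", "e"), times = c(2, 1, 4, 3, 5)))

levels(fac)          # alphabetical by default
counts <- table(fac) # number of occurrences per level
counts               # a = 2, b = 1, c = 4, d = 3, e = 5
```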

We want to order these factor levels by their frequency of occurrence in the dataset (i.e., b-a-d-c-e). `sortLvlsByN.fnc()` accomplishes this:
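The underlying idea can be sketched in base `R`, again on made-up data and without claiming to reproduce `sortLvlsByN.fnc()`: count how often each level occurs, then order the level names by those counts.

```r
# Made-up data: a = 2, b = 1, c = 4, d = 3, e = 5 occurrences
fac <- factor(rep(c("a", "b", "c", "d", "e"), times = c(2, 1, 4, 3, 5)))

# Count occurrences per level
counts <- table(fac)

# Level names, ordered from least to most frequent
ordered_lvls <- names(counts)[order(as.vector(counts))]

fac_by_n <- factor(fac, levels = ordered_lvls)
levels(fac_by_n)  # "b" "a" "d" "c" "e"
```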

Or in descending order:
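In base `R`, a descending frequency order can be obtained with the common `sort(table(...), decreasing = TRUE)` idiom. This is a sketch on made-up data, not the convenience function's own code.

```r
# Made-up data: a = 2, b = 1, c = 4, d = 3, e = 5 occurrences
fac <- factor(rep(c("a", "b", "c", "d", "e"), times = c(2, 1, 4, 3, 5)))

# Level names from most to least frequent
desc_lvls <- names(sort(table(fac), decreasing = TRUE))

fac_by_n_desc <- factor(fac, levels = desc_lvls)
levels(fac_by_n_desc)  # "e" "c" "d" "a" "b"
```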

When plotted:
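A bar chart of the frequencies could be drawn along these lines with base graphics. The data are made up, and the levels are sorted with plain base `R` rather than with the convenience function.

```r
# Made-up data: a = 2, b = 1, c = 4, d = 3, e = 5 occurrences
fac <- factor(rep(c("a", "b", "c", "d", "e"), times = c(2, 1, 4, 3, 5)))

# Sort the levels from most to least frequent
fac_sorted <- factor(fac, levels = names(sort(table(fac), decreasing = TRUE)))

# table() respects the level order, so the bars are drawn tallest first
sorted_counts <- table(fac_sorted)
barplot(sorted_counts, xlab = "factor level", ylab = "frequency")
```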

### Customising the order of factor levels

If you want to put the factor levels in a custom order, you can use the `sortLvls.fnc()` function.

Let’s say we, for some reason, want to put the current 5th level (e) first, the current 3rd level (c) second, the 4th third, the 2nd fourth and the 1st last:
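In base `R`, the same reordering can be sketched by indexing the current levels with the desired positions. This is an illustration on a made-up factor, not the code of `sortLvls.fnc()`.

```r
# Made-up factor with levels a to e
fac <- factor(c("a", "b", "c", "d", "e"))

# New order: current 5th, 3rd, 4th, 2nd and 1st level
new_order <- levels(fac)[c(5, 3, 4, 2, 1)]

fac_custom <- factor(fac, levels = new_order)
levels(fac_custom)  # "e" "c" "d" "b" "a"
```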

You can also just specify which factor levels need to go up front; the order of the other ones stays the same:
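A base-`R` sketch of this "move to the front" behaviour is shown below. The choice of levels in `front` is hypothetical; `setdiff()` keeps the remaining levels in their existing order.

```r
# Made-up factor with levels a to e
fac <- factor(c("a", "b", "c", "d", "e"))

# Levels that should come first (hypothetical choice)
front <- c("c", "e")

# Remaining levels keep their current relative order
new_order <- c(front, setdiff(levels(fac), front))

fac_front <- factor(fac, levels = new_order)
levels(fac_front)  # "c" "e" "a" "b" "d"
```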

18 August 2016