Circular shift of vector (equivalent to numpy.roll)
Question:
I have a vector:
a <- c(1,2,3,4,5)
And I’d like to do something like:
b <- roll(a, 2) # 4,5,1,2,3
Is there a function like that in R? I’ve been googling around, but "R Roll" mostly gives me pages about Spanish pronunciation.
Answers:
How about using head and tail…
roll <- function( x , n ){
  if( n == 0 )
    return( x )
  c( tail(x,n) , head(x,-n) )
}
roll(1:5,2)
#[1] 4 5 1 2 3
# For the situation where you supply 0 [ this would be kinda silly! :) ]
roll(1:5,0)
#[1] 1 2 3 4 5
One cool thing about using head and tail… you get a reverse roll with negative n, e.g.
roll(1:5,-2)
#[1] 3 4 5 1 2
Here’s an alternative which has the advantage of working even when x is “rolled” by more than one full cycle (i.e. when abs(n) > length(x)):
roll <- function(x, n) {
  x[(seq_along(x) - (n+1)) %% length(x) + 1]
}
roll(1:5, 2)
# [1] 4 5 1 2 3
roll(1:5, 0)
# [1] 1 2 3 4 5
roll(1:5, 11)
# [1] 5 1 2 3 4
FWIW (and not that it’s worth much), it also works on data.frames:
head(mtcars, 1)
#           mpg cyl disp  hp drat   wt  qsec vs am gear carb
# Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4
head(roll(mtcars, 2), 1)
#           gear carb mpg cyl disp  hp drat   wt  qsec vs am
# Mazda RX4    4    4  21   6  160 110  3.9 2.62 16.46  0  1
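The data frame result follows from how `[` treats a data frame given a single index: it selects columns, and `length(x)` is the number of columns, so it is the columns that get rotated. If rotating rows is what you are after, a minimal sketch could index rows instead. The name `roll_rows` is made up for this illustration and does not come from any package:

```r
# Hypothetical helper: rotate the rows of a data frame by n positions.
# Same modular index arithmetic as roll() above, applied to row numbers.
roll_rows <- function(df, n) {
  nr <- nrow(df)
  if (nr == 0L) return(df)  # nothing to rotate, and avoids %% 0
  idx <- (seq_len(nr) - (n + 1)) %% nr + 1
  df[idx, , drop = FALSE]
}

d <- data.frame(id = 1:5, val = letters[1:5])
roll_rows(d, 2)$id
# [1] 4 5 1 2 3
```

Row names travel with the rows, so reset them afterwards if a clean 1..n sequence is needed.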
The package binhf has the function shift:
library(binhf)
shift(1:5, places = 2)
#[1] 4 5 1 2 3
places can be positive or negative.
You can also use the permute package:
require(permute)
a <- c(1,2,3,4,5)
shuffleSeries(a, start = 2)
Output:
[1] 3 4 5 1 2
The numpy roll method supports both directions, forward and backward, and it accepts shift parameters greater than the length of the vector. For example:
Python
import numpy
x=numpy.arange(1,6)
numpy.roll(x,-11)
And we get:
array([2, 3, 4, 5, 1])
Or
x=numpy.arange(1,6)
numpy.roll(x,12)
And we get:
array([4, 5, 1, 2, 3])
We can build an R function that takes into consideration the case where the shift parameter is greater than the length of the vector. For example:
R
custom_roll <- function( x , n ){
  # n is a single number, so use the scalar operator ||
  if( n == 0 || n %% length(x) == 0 ) {
    return(x)
  }
  else if (abs(n) > length(x)) {
    # reduce n to less than one full cycle, keeping its sign
    new_n <- (abs(n) %% length(x)) * sign(n)
    return(c( tail(x,new_n) , head(x,-new_n) ))
  }
  else {
    return(c( tail(x,n) , head(x,-n) ))
  }
}
Let’s see what we get, again using the vector (1,2,3,4,5).
x<-c(1,2,3,4,5)
custom_roll(x,-11)
And we get:
[1] 2 3 4 5 1
Or
x<-c(1,2,3,4,5)
custom_roll(x,12)
And we get:
[1] 4 5 1 2 3
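Because R's `%%` returns a result with the sign of the divisor, `n %% length(x)` already lies between 0 and `length(x) - 1` for any whole-number `n`, negative or not. That suggests the three branches can be collapsed into one. A sketch of that idea, with `roll_mod` as a made-up name:

```r
# Hypothetical compact variant of custom_roll(): normalise n first,
# then reuse the head/tail idea.
roll_mod <- function(x, n) {
  len <- length(x)
  if (len == 0L) return(x)   # avoid n %% 0
  k <- n %% len              # e.g. -11 %% 5 is 4, 12 %% 5 is 2
  if (k == 0) return(x)      # a whole number of full cycles
  c(tail(x, k), head(x, -k))
}

roll_mod(1:5, -11)
# [1] 2 3 4 5 1
roll_mod(1:5, 12)
# [1] 4 5 1 2 3
```

A left roll by 1 and a right roll by 4 are the same permutation on five elements, which is why the negative case needs no special handling.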
rearrr also contains roll_elements_vec() for vectors and roll_elements() for one or more columns in a data frame.
roll_elements() can handle grouped data frames and can find the n setting based on the group members with a given function (e.g. rearrr::median_index() or rearrr::quantile_index()).
Roll a vector -2 positions left (i.e. 2 positions right):
library(rearrr)
library(dplyr)
# Roll vector
roll_elements_vec(1:10, n = -2)
> 9 10 1 2 3 4 5 6 7 8
Roll a column in a data frame -2 positions up:
# Set seed
set.seed(1)
# Create a data frame
df <- data.frame(
  "x" = 1:10,
  "y" = runif(10) * 10,
  "g" = rep(1:2, each = 5)
)
# Roll `x` column
roll_elements(df, cols = "x", n = -2)
> # A tibble: 10 x 4
>        y     g     x .n
>    <dbl> <int> <int> <list>
>  1 2.66      1     9 <dbl [1]>
>  2 3.72      1    10 <dbl [1]>
>  3 5.73      1     1 <dbl [1]>
>  4 9.08      1     2 <dbl [1]>
>  5 2.02      1     3 <dbl [1]>
>  6 8.98      2     4 <dbl [1]>
>  7 9.45      2     5 <dbl [1]>
>  8 6.61      2     6 <dbl [1]>
>  9 6.29      2     7 <dbl [1]>
> 10 0.618     2     8 <dbl [1]>
The .n column contains the n setting applied. This is mostly useful when finding n with a function.
Roll the x column within each group in g:
# Group by `g` and roll `x` within both groups
df %>%
  dplyr::group_by(g) %>%
  roll_elements(cols = "x", n = -2)
> # A tibble: 10 x 4
>        y     g     x .n
>    <dbl> <int> <int> <list>
>  1 2.66      1     4 <dbl [1]>
>  2 3.72      1     5 <dbl [1]>
>  3 5.73      1     1 <dbl [1]>
>  4 9.08      1     2 <dbl [1]>
>  5 2.02      1     3 <dbl [1]>
>  6 8.98      2     9 <dbl [1]>
>  7 9.45      2    10 <dbl [1]>
>  8 6.61      2     6 <dbl [1]>
>  9 6.29      2     7 <dbl [1]>
> 10 0.618     2     8 <dbl [1]>
If we don’t specify one or more columns, the entire data frame is rolled. As mentioned, we can find n with a function, so here we roll by the median index (the index is 1:10, so the median is 5.5, which is rounded to 6 positions up).
# Roll entire data frame
# Find `n` with the `median_index()` function
roll_elements(df, n_fn = median_index)
> # A tibble: 10 x 4
>        x     y     g .n
>    <int> <dbl> <int> <list>
>  1     7 9.45      2 <dbl [1]>
>  2     8 6.61      2 <dbl [1]>
>  3     9 6.29      2 <dbl [1]>
>  4    10 0.618     2 <dbl [1]>
>  5     1 2.66      1 <dbl [1]>
>  6     2 3.72      1 <dbl [1]>
>  7     3 5.73      1 <dbl [1]>
>  8     4 9.08      1 <dbl [1]>
>  9     5 2.02      1 <dbl [1]>
> 10     6 8.98      2 <dbl [1]>
Disclaimer: I am the author of rearrr. It also contains a roll_values() function for rolling the value of elements instead of their positions.
Here is a one-line solution using indexes and modular arithmetic (n is subtracted so that a positive n rolls to the right, as in the question):
roll <- function(v, n) {
  v[(0:(length(v) - 1) - n) %% length(v) + 1]
}
From data.table version >= 1.14.9 (in development; see NEWS item #27), shift supports type = "cyclic", "where pushed out values are re-introduced at the front/back".
# latest development version that has passed all tests:
# data.table::update_dev_pkg()
library(data.table)
shift(1:5, n = 2, type = "cyclic")
# [1] 4 5 1 2 3
shift accepts multiple offset values in n.
shift(1:5, n = -1:1, type = "cyclic")
# [[1]]
# [1] 2 3 4 5 1
#
# [[2]]
# [1] 1 2 3 4 5
#
# [[3]]
# [1] 5 1 2 3 4
The benchmark in NEWS suggests that it is fast.
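If the development version of data.table is not an option, the list-of-rolls result for several offsets can be approximated in base R. This is only a sketch: `roll_many` is a made-up name, and it makes no attempt to match data.table's speed or the other arguments of `shift`.

```r
# Hypothetical base-R stand-in for shift(x, n = offsets, type = "cyclic").
# Returns one rolled copy of x per offset, as a list.
roll_many <- function(x, offsets) {
  len <- length(x)
  lapply(offsets, function(n) {
    if (len == 0L) return(x)  # avoid %% 0 on empty input
    x[(seq_along(x) - (n + 1)) %% len + 1]
  })
}

roll_many(1:5, -1:1)
# [[1]]
# [1] 2 3 4 5 1
#
# [[2]]
# [1] 1 2 3 4 5
#
# [[3]]
# [1] 5 1 2 3 4
```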