R function rep() in Python (replicates elements of a list/vector)

Question:

The R function rep() replicates the elements of a vector:

> rep(c("A","B"), times=2)
[1] "A" "B" "A" "B"

This is like the list multiplication in Python:

>>> ["A","B"]*2
['A', 'B', 'A', 'B']

But with the R rep() function it is also possible to specify the number of repeats for each element of the vector:

> rep(c("A","B"), times=c(2,3))
[1] "A" "A" "B" "B" "B"

Is there such a function available in Python? If not, how could one define it? By the way, I’m also interested in such a function for duplicating rows of an array.

Asked By: Stéphane Laurent

Answers:

Not sure if there’s a built-in available for this, but you can try something like this:

>>> lis = ["A", "B"]
>>> times = (2, 3)
>>> sum(([x]*y for x,y in zip(lis, times)),[])
['A', 'A', 'B', 'B', 'B']

Note that sum() on lists runs in quadratic time, because every addition copies the list built so far, so it’s not the recommended way. A linear-time alternative:

>>> # Python 2 code: on Python 3, use the built-in zip instead of itertools.izip
>>> from itertools import chain, izip, starmap
>>> from operator import mul
>>> list(chain.from_iterable(starmap(mul, izip(lis, times))))
['A', 'A', 'B', 'B', 'B']
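
On Python 3, itertools.izip is gone because the built-in zip is already lazy. Note also that the starmap(mul, ...) version flattens strings such as "AA" character by character, so it only gives the expected result for one-character strings. A minimal sketch that avoids both issues, assuming itertools.repeat is acceptable as the building block (lis and times are the names from above; words is only an illustration):

```python
from itertools import chain, repeat

lis = ["A", "B"]
times = (2, 3)

# repeat(x, n) yields x exactly n times; chain.from_iterable flattens lazily.
result = list(chain.from_iterable(map(repeat, lis, times)))
print(result)  # ['A', 'A', 'B', 'B', 'B']

# Multi-character strings (or any other objects) are kept whole.
words = list(chain.from_iterable(map(repeat, ["ab", "c"], (2, 1))))
print(words)  # ['ab', 'ab', 'c']
```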

Timing comparisons:

>>> lis = ["A", "B"] * 1000
>>> times = (2, 3) * 1000
>>> %timeit list(chain.from_iterable(starmap(mul, izip(lis, times))))
1000 loops, best of 3: 713 µs per loop
>>> %timeit sum(([x]*y for x,y in zip(lis, times)),[])
100 loops, best of 3: 15.4 ms per loop
Answered By: Ashwini Chaudhary

l = ['A','B']
n = [2, 4]

Your example uses strings, which are already iterable and can be repeated with *.
You can therefore produce a result string, which behaves much like a list of characters.

''.join([e * m for e, m in zip(l, n)])
'AABBBB'

Update: the list comprehension is not required here:

''.join(e * m for e, m in zip(l, n))
'AABBBB'
Answered By: tzelleke

Use numpy arrays and the numpy.repeat function:

import numpy as np

x = np.array(["A", "B"])
print(np.repeat(x, [2, 3], axis=0))

['A' 'A' 'B' 'B' 'B']
Answered By: Lukas Graf

Since you say “array” and mention R, you may want to use numpy arrays anyway, and then use:

import numpy as np
np.repeat(np.array([1,2]), [2,3])

EDIT: Since you mention you want to repeat rows as well, I think you should use numpy. np.repeat has an axis argument to do this.
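
To make the row case concrete, here is a small sketch of np.repeat with the axis argument. The array a and the counts are made up for illustration:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

# axis=0 repeats whole rows: the first row twice, the second row three times.
rows = np.repeat(a, [2, 3], axis=0)
print(rows.tolist())  # [[1, 2], [1, 2], [3, 4], [3, 4], [3, 4]]

# axis=1 repeats columns instead; a scalar count applies to every column.
cols = np.repeat(a, 2, axis=1)
print(cols.tolist())  # [[1, 1, 2, 2], [3, 3, 4, 4]]
```

Without an axis argument, np.repeat flattens the input first, so the axis has to be given to keep the rows intact.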

Other than that, maybe:

from itertools import chain, repeat  # the original Python 2 version used itertools.izip instead of zip
list(chain(*(repeat(a, b) for a, b in zip([1,2], [2,3]))))

This doesn’t make the assumption that you have a list or string to multiply. Though I admit, unpacking everything as arguments into chain is maybe not perfect, so writing your own iterator may be better.

Answered By: seberg

What do you think about this way?

To repeat a value:

>>> repetitions = []
>>> torep = 3
>>> nrep = 5
>>> for _ in range(nrep):
...     repetitions.append(torep)
...
>>> repetitions
[3, 3, 3, 3, 3]

To repeat a sequence:

>>> repetitions = []
>>> torep = [1, 2, 3, 4]
>>> nrep = 2
>>> for _ in range(nrep):
...     repetitions = repetitions + torep
...
>>> print(repetitions)
[1, 2, 3, 4, 1, 2, 3, 4]
Answered By: DavidDz

The following might work for you:

>>> [['a','b'],['A','B']]*5
[['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B']]
Answered By: Gopi Krishna Nuti

numpy.repeat has been mentioned, and that’s clearly the equivalent of what you want. But for completeness’ sake, there’s also repeat from the itertools module in the standard library. However, this is intended for iterables in general, so it doesn’t allow repetitions by index (because iterables in general do not have an index defined).

We can use the code given in its documentation as a rough equivalent

def repeat(object, times=None):
    # repeat(10, 3) --> 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object

to define our own generalised repeat:

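def repeat_generalised(object, times=None):
    # repeat_generalised("AB", [2, 3]) --> A A B B B
    if times is None:
        # No counts given: yield the whole object forever, like itertools.repeat.
        while True:
            yield object
    else:
        # zip() stops at the shorter of times and object.
        for reps, elem in zip(times, object):
            for i in range(reps):
                yield elem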

The problem, of course, is that there are a lot of possible edge cases you have to define (what should happen if object and times have a different number of elements?), and that would depend on your individual use case.
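
As one possible answer to the length question, here is a sketch that refuses mismatched inputs instead of silently truncating the way zip() does. The name rep_checked and the choice to raise ValueError are assumptions made for illustration, not part of any library:

```python
def rep_checked(elements, times):
    """Yield each element as many times as the matching count in `times`.

    Hypothetical helper: raises ValueError on mismatched lengths or negative
    counts. Because this is a generator, the checks run when iteration starts,
    not when the function is called.
    """
    elements = list(elements)
    times = list(times)
    if len(elements) != len(times):
        raise ValueError(
            "expected %d counts, got %d" % (len(elements), len(times))
        )
    for elem, reps in zip(elements, times):
        if reps < 0:
            raise ValueError("repeat counts must be non-negative")
        for _ in range(reps):
            yield elem


print(list(rep_checked("AB", [2, 3])))  # ['A', 'A', 'B', 'B', 'B']
```

Both arguments are turned into lists first so their lengths can be compared, which means this version gives up laziness on the inputs in exchange for the check.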

Answered By: BurnNote

Here is my attempt at a clone of R rep:

def rep(x, times = 1, each = 1, length_out = None):
    if not isinstance(times, list):
        times = [times]

    res = ''.join([str(i) * each for i in x])

    if len(times) > 1:   
        res = ''.join(str(i) * m for i, m in zip(x, times))
    else:
        res = ''.join(res * times[0])
    
    if length_out is None:
        return res
    else:
        return res[0:length_out]

It reproduces the R examples (returning the result as a string):

rep(range(4), times = 2)
rep(range(4), each = 2)
rep(range(4), times = [2,2,2,2])
rep(range(4), each = 2, length_out = 4)
rep(range(4), each = 2, times = 3)

with the exception that there is no recycling of shorter vectors/lists (imo this is the worst feature of R).
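
If a list is preferred over a string, the same idea can be sketched without str(). The helper name rep_list is hypothetical, and unlike the function above this sketch recycles the result when length_out is longer than the result. It does not try to match every corner case of R's rep():

```python
from itertools import cycle, islice


def rep_list(x, times=1, each=1, length_out=None):
    """List-returning sketch of R's rep(); not a library function."""
    # Apply `each` first: every element is repeated `each` times in place.
    expanded = [item for item in x for _ in range(each)]
    if isinstance(times, int):
        # Scalar `times`: repeat the whole expanded sequence.
        result = expanded * times
    else:
        times = list(times)
        if len(times) != len(expanded):
            raise ValueError("times must have one count per element")
        result = [item for item, n in zip(expanded, times) for _ in range(n)]
    if length_out is None:
        return result
    # Truncate, or recycle from the start, to reach length_out items.
    return list(islice(cycle(result), length_out))


print(rep_list(["A", "B"], times=[2, 3]))        # ['A', 'A', 'B', 'B', 'B']
print(rep_list(range(4), each=2, length_out=4))  # [0, 0, 1, 1]
```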

Answered By: jsta