How to convert a pandas Series to tuples of index and value
Question:
I’m looking for an efficient way to convert a series to a tuple of its index with its values.
s = pd.Series([1, 2, 3], ['a', 'b', 'c'])
I want an array, list, series, some iterable:
[(1, 'a'), (2, 'b'), (3, 'c')]
Answers:
One possibility is to swap the order of the index elements and the values from s.items() (called s.iteritems() in older pandas; it was removed in pandas 2.0):
res = [(val, idx) for idx, val in s.items()]
EDIT: @Divakar’s answer is faster by about a factor of 2. Building a series of random strings for testing:
import random
import string
import pandas as pd

N = 100000
str_len = 4
ints = range(N)
strs = [None]*N
for i in ints:
    strs[i] = ''.join(random.choice(string.ascii_letters) for _ in range(str_len))
s = pd.Series(ints, strs)
Timings:
%timeit res = zip(s,s.index)
>>> 100 loops, best of 3: 14.8 ms per loop
%timeit res = [(val, idx) for idx, val in s.iteritems()]
>>> 10 loops, best of 3: 26.7 ms per loop
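The timings above look like they were taken on Python 2, where zip returns a list. On Python 3, zip is lazy, so timing zip(s, s.index) alone would measure almost nothing. A minimal sketch of how the comparison could be repeated with the standard timeit module follows; the function names with_zip and with_items, the smaller N, and the run count are assumptions chosen for illustration, and no timing figures are claimed here.

```python
import random
import string
import timeit

import pandas as pd

N = 10_000  # assumption: smaller than the original 100000 so the sketch runs quickly
labels = [
    "".join(random.choice(string.ascii_letters) for _ in range(4))
    for _ in range(N)
]
s = pd.Series(range(N), index=labels)


def with_zip():
    # list() forces the lazy zip so the work is actually measured
    return list(zip(s, s.index))


def with_items():
    # s.items() yields (index, value); swap to (value, index)
    return [(val, idx) for idx, val in s.items()]


t_zip = timeit.timeit(with_zip, number=20)
t_items = timeit.timeit(with_items, number=20)
print(f"zip:   {t_zip:.4f} s for 20 runs")
print(f"items: {t_items:.4f} s for 20 runs")
```

Results will vary with the pandas version and the machine, so the factor of 2 reported above may not hold exactly.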
Well, it seems simply zip(s,s.index) works too!
For Python 3.x, we need to wrap it with list: list(zip(s,s.index))
To get a tuple of tuples, use tuple(): tuple(zip(s,s.index)).
Sample run:
In [8]: s
Out[8]:
a 1
b 2
c 3
dtype: int64
In [9]: list(zip(s,s.index))
Out[9]: [(1, 'a'), (2, 'b'), (3, 'c')]
In [10]: tuple(zip(s,s.index))
Out[10]: ((1, 'a'), (2, 'b'), (3, 'c'))
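As a small illustration of why the list() or tuple() wrapper matters on Python 3: zip returns a one-shot iterator there, so it can only be consumed once. The variable names below are just for this sketch.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

pairs = zip(s, s.index)  # lazy iterator in Python 3, nothing is built yet
first = list(pairs)      # consumes the iterator
second = list(pairs)     # already exhausted, so this is empty

print(first)   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(second)  # []
```

If the pairs are needed more than once, materialize them with list() or tuple() up front and reuse that result.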
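s.items() does this (s.iteritems() in older pandas; it was removed in pandas 2.0). Note that it yields (index, value) pairs, the reverse of the order shown in the question.
(If you want to get the output as a list rather than an iterator, do: list(s.items()))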
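A short sketch of the difference in pair order, using the series from the question; the names index_first and value_first are only for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# s.items() yields (index, value) pairs
index_first = list(s.items())

# swap each pair to get the (value, index) order asked for in the question
value_first = [(val, idx) for idx, val in s.items()]

print(index_first)  # [('a', 1), ('b', 2), ('c', 3)]
print(value_first)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```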