Dictionary keys and values to separate numpy arrays
Question:
I have a dictionary as
Samples = {5.207403005022627: 0.69973543384229719, 6.8970222167794759: 0.080782939731898179, 7.8338517407140973: 0.10308033284258854, 8.5301143255505334: 0.018640838362318335, 10.418899728838058: 0.14427355015329846, 5.3983946820220501: 0.51319796560976771}
I want to separate the keys and values into 2 numpy arrays.
I tried np.array(Samples.keys(), dtype=np.float) but I get an error: TypeError: float() argument must be a string or a number
Answers:
keys = np.array(list(dictionary.keys()))
values = np.array(list(dictionary.values()))
Just assign all of the values to a list, and then convert to a np.array().
import numpy as np
Samples = {5.207403005022627: 0.69973543384229719, 6.8970222167794759: 0.080782939731898179, 7.8338517407140973: 0.10308033284258854, 8.5301143255505334: 0.018640838362318335, 10.418899728838058: 0.14427355015329846, 5.3983946820220501: 0.51319796560976771}
keys = np.array(list(Samples.keys()))
vals = np.array(list(Samples.values()))
Or, if you want to iterate over it:
import numpy as np
Samples = {5.207403005022627: 0.69973543384229719, 6.8970222167794759: 0.080782939731898179, 7.8338517407140973: 0.10308033284258854, 8.5301143255505334: 0.018640838362318335, 10.418899728838058: 0.14427355015329846, 5.3983946820220501: 0.51319796560976771}
# Use two separate lists; "keys = vals = []" would bind both names to the same list
keys, vals = [], []
for k, v in Samples.items():
    keys.append(k)
    vals.append(v)
keys = np.array(keys)
vals = np.array(vals)
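As a possible alternative to the explicit loop, the key/value pairs can be converted in one step and then split by column. This is only a sketch: the shortened `samples` dictionary is illustrative data, and it assumes all keys and values are numeric so that a single float array can hold both.

```python
import numpy as np

# Shortened, illustrative data (not the full dictionary from the question)
samples = {5.2: 0.70, 6.9: 0.08, 7.8: 0.10}

# A list of (key, value) tuples becomes an (n, 2) float array;
# reshape(-1, 2) keeps the column split valid even for an empty dict
pairs = np.array(list(samples.items()), dtype=float).reshape(-1, 2)

keys = pairs[:, 0]  # first column: the keys
vals = pairs[:, 1]  # second column: the values
```

Because both columns come from the same pairs, keys[i] and vals[i] always belong to the same dictionary entry.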
You can use np.fromiter to directly create numpy arrays from the dictionary key and value views:
In Python 3:
keys = np.fromiter(Samples.keys(), dtype=float)
vals = np.fromiter(Samples.values(), dtype=float)
In Python 2:
keys = np.fromiter(Samples.iterkeys(), dtype=float)
vals = np.fromiter(Samples.itervalues(), dtype=float)
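Since the length of a dictionary is known up front, np.fromiter can also be given its optional count argument, which lets it allocate the output array once instead of growing it. A minimal Python 3 sketch, using a shortened illustrative `samples` dictionary:

```python
import numpy as np

# Shortened, illustrative data (not the full dictionary from the question)
samples = {5.2: 0.70, 6.9: 0.08, 7.8: 0.10}

n = len(samples)  # number of items to read from each view

# count=n tells np.fromiter how many items to expect
keys = np.fromiter(samples.keys(), dtype=float, count=n)
vals = np.fromiter(samples.values(), dtype=float, count=n)
```

Whether this makes a measurable difference depends on the dictionary size, so it is worth timing on your own data.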
On Python 3.4, the following simply works:
Samples = {5.207403005022627: 0.69973543384229719, 6.8970222167794759: 0.080782939731898179, 7.8338517407140973: 0.10308033284258854, 8.5301143255505334: 0.018640838362318335, 10.418899728838058: 0.14427355015329846, 5.3983946820220501: 0.51319796560976771}
keys = np.array(list(Samples.keys()))
values = np.array(list(Samples.values()))
The reason np.array(Samples.values()) doesn’t give what you expect in Python 3 is that the values() method of a dict returns an iterable view there, whereas in Python 2 it returns an actual list of the values. keys = np.array(list(Samples.keys())) will also work in Python 2.7, which makes your code more version-agnostic, but the extra call to list() will slow it down marginally.
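To see the difference concretely, the sketch below (Python 3, with a shortened illustrative `samples` dictionary) passes the view straight to np.array and compares it with the list() version. NumPy does not treat the view as a sequence, so it wraps the whole object in a zero-dimensional object array, and asking for dtype=float then fails with the TypeError from the question.

```python
import numpy as np

# Shortened, illustrative data (not the full dictionary from the question)
samples = {5.2: 0.70, 6.9: 0.08, 7.8: 0.10}

# Passing the view directly: the view object itself becomes the single element
wrapped = np.array(samples.values())
print(wrapped.shape, wrapped.dtype)

# Forcing a float dtype on the view raises TypeError
try:
    np.array(samples.keys(), dtype=float)
    raised = False
except TypeError:
    raised = True

# Materializing the view as a list gives the expected 1-D float array
fixed = np.array(list(samples.values()))
print(fixed.shape, fixed.dtype)
```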
In Python 3.7:
import numpy as np
Samples = {5.207403005022627: 0.69973543384229719, 6.8970222167794759: 0.080782939731898179, 7.8338517407140973: 0.10308033284258854, 8.5301143255505334: 0.018640838362318335, 10.418899728838058: 0.14427355015329846, 5.3983946820220501: 0.51319796560976771}
keys = np.array(list(Samples.keys()))
vals = np.array(list(Samples.values()))
Note: It’s important to say that in this Python version dict.keys() and dict.values() return objects of type dict_keys and dict_values, respectively.
If you care about speed (Python 3.7, run in IPython because of %timeit):
import numpy as np
rnd = np.random.RandomState(10)
for i in [10, 100, 1000, 10000, 100000]:
    # Dictionary with i distinct random float keys
    test_dict = {j: j for j in rnd.uniform(-100, 100, i)}
    assert len(test_dict) == i
    print(f"\nFor {i} keys\n-----------")
    %timeit keys = np.fromiter(test_dict.keys(), dtype=float)
    %timeit keys = np.array(list(test_dict.keys()))
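np.fromiter is roughly 2 to 5 times faster in the runs below (first line of each pair is np.fromiter, second is np.array(list(...))):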
For 10 keys
-----------
712 ns ± 4.77 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.65 µs ± 9.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
For 100 keys
-----------
1.87 µs ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
8.02 µs ± 22.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
For 1000 keys
-----------
13.7 µs ± 27.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
70.5 µs ± 251 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
For 10000 keys
-----------
128 µs ± 70.6 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
698 µs ± 455 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
For 100000 keys
-----------
1.45 ms ± 374 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
7.14 ms ± 6.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
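Outside IPython the %timeit magic is not available, so a comparable measurement can be sketched with the standard timeit module. The sizes, the repeat count of 200, and the `results` dictionary below are arbitrary choices for illustration; absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.RandomState(10)
results = {}  # size -> (seconds for np.fromiter, seconds for np.array(list(...)))

for n in [10, 1000]:
    test_dict = {x: x for x in rng.uniform(-100, 100, n)}

    # Total time for 200 conversions with each approach
    t_fromiter = timeit.timeit(
        lambda: np.fromiter(test_dict.keys(), dtype=float), number=200
    )
    t_list = timeit.timeit(
        lambda: np.array(list(test_dict.keys())), number=200
    )

    results[n] = (t_fromiter, t_list)
    print(f"{n} keys: fromiter {t_fromiter:.6f} s, array(list) {t_list:.6f} s")
```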