numpy.union that preserves order
Question:
Let’s say I have two arrays which have been produced by dropping random values of an original array (elements are unique and unsorted):
import numpy as np
import pandas as pd

orig = np.array([2, 1, 7, 5, 3, 8])
Let’s say these arrays are:
a = np.array([2, 1, 7, 8])
b = np.array([2, 7, 3, 8])
Given just these two arrays, how can I merge them (efficiently) so that the dropped values end up in their correct positions?
The result should be:
result = np.array([2, 1, 7, 3, 8])
My attempts:
numpy.union1d is not suitable, because it always sorts:
np.union1d(a, b) # array([1, 2, 3, 7, 8])
Maybe pandas could help?
This attempt is not what I want: it uses the first array in full and then appends the leftover values of the second one:
pd.concat([pd.Series(index=a, dtype=int), pd.Series(index=b, dtype=int)], axis=1).index.to_numpy()
# array([2, 1, 7, 8, 3])
Answers:
Use Index.union with sort=False:
c = pd.Index(a).union(b, sort=False).to_numpy()
print(c)
[2 1 7 8 3]
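Note that the output above keeps all of a and then appends the values found only in b, so 3 lands after 8 rather than between 7 and 8 as the question requests. The following is a minimal sketch of one way to interleave the values instead. It is not the answerer's method and not a NumPy or pandas API: the function name ordered_union is hypothetical, and the approach (a two-pointer merge) assumes both inputs are subsequences of the same original array with unique elements.

```python
import numpy as np


def ordered_union(a, b):
    """Merge two subsequences of one unique-valued sequence, keeping relative order.

    Hypothetical helper: assumes `a` and `b` were both produced by dropping
    elements from the same original array, so shared values appear in the
    same relative order in both inputs.
    """
    a = np.asarray(a).tolist()
    b = np.asarray(b).tolist()
    in_a, in_b = set(a), set(b)

    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Shared value: emit once and advance both pointers.
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] not in in_b:
            # Value present only in `a`: it belongs before the next shared value.
            out.append(a[i])
            i += 1
        elif b[j] not in in_a:
            # Value present only in `b`.
            out.append(b[j])
            j += 1
        else:
            # Both values are shared but appear in a different order.
            raise ValueError("inputs are not subsequences of a common sequence")

    # At most one of these tails is non-empty.
    out.extend(a[i:])
    out.extend(b[j:])
    return np.array(out)


a = np.array([2, 1, 7, 8])
b = np.array([2, 7, 3, 8])
print(ordered_union(a, b))
# [2 1 7 3 8]
```

Two caveats apply to this sketch. A value dropped from both arrays (5 in the example) cannot be recovered. When values unique to a and values unique to b fall between the same pair of shared values, their true relative order cannot be determined from the inputs alone, and this sketch arbitrarily places the values from a first.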