How to transform arrays of strings into a matrix with Python
Question:
What would be the Pythonic way to transform multiple arrays of strings into a matrix, where each input string gets its position in the new matrix based on lexicographical order (or is there a better criterion)?
In the end, I would like to be able to query the strings in the final matrix by a normalized, common criterion, and also find out which input array each string originally came from.
So, for example, if I iterate over a bunch of arrays like these (pseudocode!):
array1 = {'01abc','aabc','cba','xyz','999','zz','ZZ'}
array2 = {'0c','aabc','cc','xz','aZZ'}
array3+n = {'...','...','...','....
I’d like to transform that into something like this:
name    0      9    a     c    x    z   Z
array1  01abc  999  aabc  cba  xyz  zz  ZZ
array2  0c          aabc  cc   xz
array2              aZZ
array3...
I have already spent two hours googling, but I just don’t have the right terminology to describe my problem properly. Any ideas that point me in the right direction would be greatly appreciated.
Answers:
You might want to try numpy.
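If pulling in numpy feels like too much for a first attempt, here is a minimal sketch using only plain dictionaries from the standard library instead. It assumes the "common criterion" is simply the first character of each string, and the names `build_matrix` and `lookup` are made up for illustration. Note that Python sorts by code point, so uppercase `Z` comes before lowercase `a`, which differs from the column order in the question.

```python
from collections import defaultdict


def build_matrix(arrays):
    """Group strings by first character, remembering their source array.

    `arrays` maps an array name to an iterable of strings.
    Returns (columns, rows): the sorted column keys and, per array name,
    a dict mapping column key -> sorted list of strings.
    """
    rows = {}
    keys = set()
    for name, strings in arrays.items():
        buckets = defaultdict(list)
        for s in sorted(strings):
            if not s:
                continue  # assumption: empty strings have no column, skip them
            buckets[s[0]].append(s)
            keys.add(s[0])
        rows[name] = dict(buckets)
    return sorted(keys), rows


def lookup(rows, key):
    """Return (array name, string) pairs for every string filed under `key`."""
    return [(name, s) for name, cols in rows.items() for s in cols.get(key, [])]


arrays = {
    "array1": {"01abc", "aabc", "cba", "xyz", "999", "zz", "ZZ"},
    "array2": {"0c", "aabc", "cc", "xz", "aZZ"},
}
columns, rows = build_matrix(arrays)

# Which strings start with "a", and where did each one come from?
print(lookup(rows, "a"))
```

Each cell holds a list rather than a single string, so two strings from the same array that share a first character (such as `aabc` and `aZZ` in array2) end up in the same cell instead of needing an extra row. If you want a case-insensitive or otherwise normalized criterion, replacing `s[0]` with your own key function should be enough.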