How to create an array from two 2d arrays, based on conditionals and random selection, with Python

Question:

I'm trying to create an array using two 2d arrays and conditionals. The first array, created randomly with numpy, is:

A = [[0 0 0 1 0 1 1 0 0 1 0],
     [0 0 1 1 1 1 1 1 0 1 1],
     [0 0 0 1 0 1 1 1 0 0 0],
     [1 1 0 1 0 0 0 1 1 1 0]]

(hypothetically, every column will contain at least one "1")

and the second array is:

B = ["a","b","c","d"]

I want to build an array by randomly selecting exactly one "1" in each column (which row contains the "1" doesn't matter). When I find a "1", its row index must be used to look up the matching value in "B", and that value is then stored in array "C". For example, in column 0 the only "1" is at A[3,0], so the only possible value is B[3]="d", and this must be the first value of "C". Column 3 has a "1" in every row, so it can take any value from "B".

For example, the full array I'm looking for could be:

C= ["d","d","b","a","b","c","a","d","d","a","b"]

I'm trying to create "C" with the following code:

import numpy as np
A=np.random.randint(2, size=(4,11)) 
A=np.array(A)

C=[] 
var=0

B=["a1","b1","c2","d2"]

for i in range(11):
    C.append(var)
    R=np.random.randint(0,4)             
    if A[R,0+i]==1:
        var=B[R]        
    else:
        var=0
print(C)

The result is:

[0, 0, 'a1', 'a1', 'd2', 0, 'd2', 'd2', 'd2', 0, 0]

This code doesn't do the job: in several columns it fails to find a "1". I've tried different approaches, including coordinates, loops and generators, but I can't find one that really works.

Asked By: Hernan19

Answers:

I might be struggling to understand your question. A big problem here is that there is no guarantee that every column of A contains a 1.
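
To make that concrete, here is a minimal sketch (not part of the original code) that checks whether every column of a randomly generated A contains a 1. The helper name every_column_has_a_one is made up for illustration. Assuming independent, equally likely 0/1 entries, a column of 4 rows is all zeros with probability 1/16, so a 4x11 array has at least one such column with probability 1 - (15/16)**11, which is roughly one half.

```python
import numpy as np

def every_column_has_a_one(A):
    """Return True if each column of A contains at least one 1 (hypothetical helper)."""
    return bool((A == 1).any(axis=0).all())

# Chance that a random 4x11 array of fair 0/1 entries has an all-zero column.
p_empty_column = 1 - (15 / 16) ** 11

A = np.random.randint(2, size=(4, 11))
print(every_column_has_a_one(A), round(p_empty_column, 2))
```

So with the array generated as in the question, the "at least one 1 per column" assumption fails about every second run unless A is checked or regenerated.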

But, assuming you have a special A where this is guaranteed, your second problem is that your random R picks from all rows; it needs to pick only from the rows that have a 1 in the current column.
You want to achieve this in 2 steps:

  1. Find all the valid rows for a given column
  2. Select one of those at random
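
As a small worked example, the two steps can be applied to a single column of the example array from the question. This is only a sketch: column 5 is chosen arbitrarily for illustration, and A and B are copied from the question text.

```python
import numpy as np

# Example array and labels copied from the question.
A = np.array([[0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0],
              [0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0]])
B = ["a", "b", "c", "d"]

column = A[:, 5]                       # column 5 of the example (arbitrary choice)
valid_rows = np.where(column == 1)[0]  # step 1: indices of rows holding a 1
R = np.random.choice(valid_rows)       # step 2: pick one of those rows at random
print(valid_rows, B[R])
```

In column 5 the rows 0, 1 and 2 hold a 1, so the selected value can only be "a", "b" or "c".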

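Lastly, you append to C before you actually compute something, so every result lands one position late: the first entry is always the initial 0 and the result for the last column is never stored. This also means you had to define a var outside the loop, which should have been a big warning sign for you.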

All in all:

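import numpy as np

# Random 0/1 array; note that a column may end up with no 1 at all
A = np.random.randint(2, size=(4, 11))

C = []
B = ["a1", "b1", "c2", "d2"]

# Iterating over the transpose yields the columns of A one at a time
for column in A.transpose():
    # Step 1: indices of the rows that hold a 1 in this column
    valid_rows = np.where(column == 1)[0]
    # Step 2: pick one of them at random (raises ValueError if there are none)
    R = np.random.choice(valid_rows)
    C.append(B[R])

print(C)

Answered By: Mikael Öhman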