Separating unique values within the same dataset

Question:

I have a huge dataset. I am trying to split it by unique name and run a genetic algorithm separately on the rows that share each name. To illustrate:

Assume the following table

Name    price    quantity
a1      100      6
a2      30       20
a1      250      125
a1      5        20
a2      90       200
a2      50       705

So I want to run the genetic algorithm on a1 and a2 separately to get the best solution for x1-x3 in each case. I have already coded the genetic algorithm for the whole dataset, but I am unsure how to run it for a1 and a2 separately within the same dataset.

Note: I have used pandas to import my dataset

Asked By: it's M

Answers:

Thanks to @IgnatiusReilly for adding to the question in the comments.

If you want to slice your DataFrame into chunks for every unique name to perform calculations over them, you may do the following:

# assume there's a function ga() that runs the genetic algorithm on a DataFrame
# and your DataFrame is df
for name in df.Name.unique():
    ga(df.loc[df.Name == name])  # only the rows whose Name equals this name
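
The same split can also be written with groupby, which hands back each name together with its rows. Below is a minimal sketch using the table from the question; summarize() is a hypothetical stand-in for whatever you run per group (your ga() call, for instance), not a real library function.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name": ["a1", "a2", "a1", "a1", "a2", "a2"],
    "price": [100, 30, 250, 5, 90, 50],
    "quantity": [6, 20, 125, 20, 200, 705],
})

def summarize(group):
    """Hypothetical stand-in for the per-group calculation: counts the rows."""
    return len(group)

# groupby yields (name, sub-DataFrame) pairs, one pair per unique Name.
results = {name: summarize(group) for name, group in df.groupby("Name")}
print(results)  # {'a1': 3, 'a2': 3}
```

Collecting the results in a dict keyed by name keeps each group's outcome easy to look up afterwards.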

As for applying this with the geneticalgorithm library, my guess is that it might look like this:

import numpy as np
from geneticalgorithm import geneticalgorithm as ga

for name in df.Name.unique():
    s = df.loc[df.Name == name].to_numpy()  # rows for this name, as an ndarray
    # use columns 1 to 3 for the bounds, assuming 'Name' is column 0
    # and the real data has three numeric columns after it
    varbound = np.array([[np.min(s[:, 1]), np.max(s[:, 1])],
                         [np.min(s[:, 2]), np.max(s[:, 2])],
                         [np.min(s[:, 3]), np.max(s[:, 3])]])
    model = ga(function=equ,  # objective function, assumed to be defined earlier
               dimension=3,
               variable_type='int',
               variable_boundaries=varbound)
    model.run()
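
One thing to watch: the sample table in the question has only two numeric columns, so indexing column 3 would fail on it as posted. Here is a rough sketch of building the bounds from named columns instead, so the number of rows in the bounds array always matches the columns you pick. bounds_for() is a hypothetical helper, and the column list is an assumption based on the sample table.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name": ["a1", "a2", "a1", "a1", "a2", "a2"],
    "price": [100, 30, 250, 5, 90, 50],
    "quantity": [6, 20, 125, 20, 200, 705],
})

def bounds_for(group, columns):
    """Hypothetical helper: one [min, max] row per selected column."""
    values = group[columns].to_numpy(dtype=float)
    return np.column_stack([values.min(axis=0), values.max(axis=0)])

columns = ["price", "quantity"]  # assumed; use your real numeric columns
bounds = {name: bounds_for(group, columns)
          for name, group in df.groupby("Name")}

print(bounds["a1"])  # rows are [min, max] for price, then quantity
```

Each array in bounds could then be passed as variable_boundaries, with dimension set to len(columns), assuming your objective function expects that many variables.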
Answered By: n.shabankin