Distribute value based on distance – pandas
Question:
I have a pandas Series, shown below, that holds distances from a point.
[53.14215611146749,
68.38785810979309,
73.39766970890767,
62.55809054344587,
61.65664511406602,
67.30027137647377,
53.18713806914155,
44.731280834334726,
56.775601188099564,
48.70050608722388,
52.058305375893326,
43.86576801918336,
40.7737929704101,
48.14255989609582,
66.1008274224399,
64.14163441281926,
74.19664861255285,
39.2240226739306,
43.19696483020877,
56.19908015933838,
52.75637095625348,
72.94738557673514,
68.43340406723989,
55.3282335432387,
14.063542002183079,
65.42855719150613,
64.57283331956563,
55.7359195807484,
46.47999444513133,
67.90377608318023,
45.20662678388409,
60.89180227702483,
55.498083875456516,
66.27610947615794,
22.845312320987933,
55.68089486356418,
37.17249970361574,
31.996833503627055,
8.004580252301615,
57.701300278758396,
55.68491218953833,
49.69553182988371,
62.857799532025695,
64.4881255962942,
56.52268246645304,
53.70335480184844,
47.68999060768484,
70.19624652173587,
46.06795699534889,
43.10099872431838,
39.554771098646285,
39.35671580350949,
63.702434120074344,
63.97772732522319,
55.721880827686036,
40.45931808597496,
54.21292609875884,
16.675696857750737,
41.39693209332469,
54.254194394652124,
20.35367440002031,
59.334225047181775,
54.27213837753601,
71.06412120928913,
32.77778238532229,
44.525878012920465,
21.633778938745873,
62.82994949086467,
53.74149464908078,
43.55381629902706,
62.06565329192727,
60.84255535670013,
69.61908591774456,
62.42694740156783,
24.305272369346792,
53.67321288616061,
41.51862527126106,
53.06105426101688,
52.29598013335814,
56.56807611441198,
54.44050075253376,
69.0021067468377,
51.62347224748836,
52.22881959386345,
46.864584528474325,
47.012472881260116,
51.646519289015345,
45.619767144941775,
8.519681182197928,
68.36194018083505,
24.231287660834653,
54.903970823297854,
38.8830383852113,
60.93359558626832,
60.20955222539125,
49.42302250000802,
44.393460251409024,
41.81242566960246,
56.712784485916885,
19.378568603967558]
Now I need to distribute a number (for example, 15) across these 100 numbers.
If the distance is small, it should get a larger proportion of 15. If the distance is large, it should get a smaller proportion of 15.
For example, 14.063542002183079 should get a larger share (since it is closer), whereas 53.14215611146749 should get a smaller one (since it is farther away).
However, the 100 derived numbers should sum to 15 (i.e. the value 15 is distributed based on distance).
What I have tried so far
I applied the softmax function to convert these distances (a list of 100) to values between 0 and 1.
I then multiplied each number in the list by 15. The numbers now sum to 15, but the values are not distributed based on distance.
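The attempt can be reproduced roughly as follows. This is only a sketch: the exact softmax implementation is not shown above, so the NumPy version and the three sample distances here are assumptions. Because softmax grows with its input, the largest distance ends up with the largest share, which is the opposite of what is wanted.

```python
import numpy as np
import pandas as pd

# Three sample distances taken from the series above (illustrative subset)
s = pd.Series([53.14215611146749, 14.063542002183079, 8.004580252301615])

# Softmax; subtracting the max first keeps exp() numerically stable
exp = np.exp(s - s.max())
softmax = exp / exp.sum()

# Scale so the shares add up to 15
shares = softmax * 15

# The farthest point (index 0) receives almost all of the 15
print(shares)
```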
Answers:
Add up all the distances, then divide each distance by that sum. This gives each value's fraction of the total (a value between 0 and 1). Then multiply by 15 and you’re done.
EDIT:
I misunderstood you. Here is a solution that distributes the value linearly.
p_list = [20, 50, 100]
distribution_val = 15
# assuming the point that is the most distant gets the value 0
max_dist = max(p_list)  # compute once, before anything is modified
inverted = [max_dist - p for p in p_list]  # closer points get larger values
total = sum(inverted)  # a separate name avoids shadowing the built-in sum()
p_list = [v / total * distribution_val for v in inverted]
print(p_list)  # approximately [9.23, 5.77, 0.0] => sum = 15
If you want the most distant point to receive a value as well, invert with max(p_list) + min_points instead of max(p_list), where min_points is a small positive offset.
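The same linear idea can be sketched directly on a pandas Series. The function name `distribute_linear` is hypothetical, and treating `min_points` as an offset added before normalizing is one reading of the suggestion above, not a confirmed implementation.

```python
import pandas as pd

def distribute_linear(distances: pd.Series, total: float,
                      min_points: float = 0.0) -> pd.Series:
    """Split `total` so that shares fall linearly as distance grows."""
    # Invert: the farthest point scores `min_points`, closer points score more
    inverted = distances.max() + min_points - distances
    # Normalize so the shares add up to `total`
    return inverted / inverted.sum() * total

s = pd.Series([20, 50, 100])

shares = distribute_linear(s, 15)
print(shares.round(2).tolist())  # [9.23, 5.77, 0.0]

# With an offset, the most distant point gets a non-zero share
shares_offset = distribute_linear(s, 15, min_points=10)
print(shares_offset.tolist())
```

Note that with `min_points=0` and all distances equal, the inverted values sum to zero and the division yields NaN, so that case would need separate handling.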
N = 15
# s is the pandas Series from the question
dists = 1 / (s - N).abs()
weights = dists / dists.sum()
result = weights * N
- take the absolute difference of each value to N and invert it, so that closer values score higher
- normalize these scores by their sum to get the weights
- distribute N according to the weights
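The snippet above measures closeness to N. If the series already holds the distances, as the question states, one variant is to invert the values directly. This is a minimal sketch under that assumption, and it also assumes every distance is strictly positive; the function name `distribute_inverse` and the three sample values are illustrative.

```python
import pandas as pd

def distribute_inverse(distances: pd.Series, total: float) -> pd.Series:
    """Split `total` in proportion to 1 / distance."""
    # Smaller distance => larger inverse => larger weight
    inv = 1.0 / distances
    # Normalize the weights and scale them to `total`
    return inv / inv.sum() * total

# Three sample distances taken from the question (illustrative subset)
s = pd.Series([14.063542002183079, 53.14215611146749, 8.004580252301615])
result = distribute_inverse(s, 15)
print(result)
```

Compared with the linear version, inverse weighting gives very close points a much larger share, so which one fits depends on how steeply the share should fall with distance.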