How can I extract data from a series with square brackets and calculate the average in Python?
Question:
I got a series of data like this:
[1,3]
[1,3]
[2,4]
[3]
[4]
Every row contains 1 or 2 values. I need to extract them and calculate the average of each row.
The expected output is:
2
2
3
3
4
I have no idea how to remove the square brackets and commas so that I can read the numerical values properly and calculate the average.
Answers:
If you have a series called s containing string representations of lists, such as:
import pandas as pd
s = pd.Series(["[1,3]","[1,3]","[2,4]","[3]","[4]"])
0 [1,3]
1 [1,3]
2 [2,4]
3 [3]
4 [4]
You can apply a lambda function to first convert each row into a list of strings (credit goes to this answer), then convert each element of the list into an int, then calculate the mean:
s = s.apply(lambda row: [i.strip() for i in row[1:-1].replace('"',"").split(',')])
s = s.apply(lambda row: [int(n) for n in row])
s.apply(lambda x: sum(x)/len(x))
0 2.0
1 2.0
2 3.0
3 3.0
4 4.0
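As an alternative to slicing and splitting by hand, the standard library's ast.literal_eval can parse a string such as "[1,3]" directly into a list of ints. The sketch below is not part of the original answer; the helper name mean_of_bracketed is hypothetical, and the demo runs on a plain list of strings so it does not need pandas.

```python
import ast

def mean_of_bracketed(text):
    """Parse a string like "[1,3]" into a list and return its mean.

    Hypothetical helper for illustration; assumes every row is a
    non-empty, well-formed list literal of numbers.
    """
    values = ast.literal_eval(text)  # "[1,3]" -> [1, 3]
    return sum(values) / len(values)

rows = ["[1,3]", "[1,3]", "[2,4]", "[3]", "[4]"]
means = [mean_of_bracketed(row) for row in rows]
print(means)  # [2.0, 2.0, 3.0, 3.0, 4.0]
```

With the series s from above, the same function could be passed to apply, as in s.apply(mean_of_bracketed), which calls it once per element.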
You can do this by creating a variable called something like "total", looping through the array, and adding each value to it. At the end, divide total by len(inputArray), that is, the number of items in the array.
For example, if we got this array:
a = [5,4,9]
you can use this code:
a = [5,4,9]
total = 0
for num in a:  # for every item in the array
    total += num  # total will equal 18 at the end of the loop
avg = total / len(a)  # divide to get the average
# print or use the variable avg; expected output: 6.0
Or, if you want a way to convert a string like "[0,3]" to an array, you can use string functions:
a = "[3,5]"  # or any string given
a = a[1:]  # delete the leading "["
a = a[:-1]  # delete the trailing "]"
a = a.split(",")  # get all numbers separated by ","
# finally, turn all strings into numbers
for i in range(len(a)):  # visit every index, including the last one
    a[i] = int(a[i])
Then use the averaging code from the first example.
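The two steps in this answer, parsing the string and then averaging, can be combined and run over every row from the question. This is a minimal sketch; the function names parse_bracketed and average are hypothetical, and it assumes each row is a non-empty bracketed list of integers.

```python
def parse_bracketed(text):
    """Turn a string like "[3,5]" into a list of ints using string functions."""
    inner = text.strip()[1:-1]  # drop the leading "[" and trailing "]"
    return [int(part) for part in inner.split(",")]

def average(values):
    """Sum the values in a loop, then divide by how many there are."""
    total = 0
    for num in values:
        total += num
    return total / len(values)

rows = ["[1,3]", "[1,3]", "[2,4]", "[3]", "[4]"]
averages = [average(parse_bracketed(row)) for row in rows]
print(averages)  # [2.0, 2.0, 3.0, 3.0, 4.0]
```

An empty row such as "[]" would raise an error here, so real data may need a check before parsing.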
If your question is like this:
s=[[1,3],[1,3],[2,4],[3],[4]]
then you can write the code as:
for i in s:
    print(int(sum(i)/len(i)))
The output will be:
2
2
3
3
4
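One thing to be aware of with this approach: int() truncates toward zero, so a row whose average is not a whole number loses its fractional part. The sketch below illustrates the difference; the extra row [1, 2] is a made-up value added for the demonstration and is not in the question's data.

```python
# The last row, [1, 2], is hypothetical and only there to show truncation.
s = [[1, 3], [1, 3], [2, 4], [3], [4], [1, 2]]

truncated = [int(sum(row) / len(row)) for row in s]  # drops the fraction
exact = [sum(row) / len(row) for row in s]           # keeps the fraction

print(truncated)  # [2, 2, 3, 3, 4, 1]
print(exact)      # [2.0, 2.0, 3.0, 3.0, 4.0, 1.5]
```

For the data in the question every average is a whole number, so both versions agree there.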