pop index out of range error in Python when modifying a copy of lists and dictionaries while iterating through the original dictionary
Question:
I have two Python lists. One list, called area = [1500, 500, 1500, 2000, 2500, 2000], stores the areas of homes. Another list, called price = [30000, 10000, 20000, 40000, 50000, 45000], stores the corresponding prices of these homes. I created a dictionary called priceDict, whose keys are the unique areas of homes and whose values are lists of the prices of homes of the same area. I created another dictionary, called priceIndexDict, whose keys are the unique areas of homes and whose values are lists of indices of the prices of homes of the same area as shown in the price list.
I wrote the following Python code to create the two dictionaries.
area = [1200, 1200, 1200, 2000]
price = [15000, 11000, 17000, 25000]
priceDict = {}
priceIndexDict = {}
zipList = list(zip(area, price))
zipListIndex = 0
for k, v in zipList:
    priceDict.setdefault(k, []).append(v)
    priceIndexDict.setdefault(k, []).append(zipListIndex)
    zipListIndex += 1
print(priceDict)
print(priceIndexDict)
After correctly getting priceDict = {1200: [15000, 11000, 17000], 2000: [25000]} and priceIndexDict = {1200: [0, 1, 2], 2000: [3]}, I would like to create a new list newPrice and a new dictionary newPriceDict with the same contents as price and priceDict but with the price outliers among homes of the same size removed, as well as a new list newArea with the same contents as area but with the areas corresponding to those outliers removed.
The following steps were used to determine the outliers:
- Select the home to test.
- Create a list of prices of other homes of the same size. It will be called compList in the examples.
- If there are no other homes of the same size, the house being tested is not an outlier.
- Otherwise:
  - Calculate the mean price, P[m], and the standard deviation, σ, for the homes in compList.
  - If |price[i] - P[m]| > 3 * σ, the house is an outlier.
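The steps above can be sketched as a small helper. This is an illustration rather than the code from the question: the name is_outlier is hypothetical, statistics.mean and statistics.stdev (the sample standard deviation) stand in for the functions that are not shown, and a comparison list with only one price is treated as "not an outlier" because stdev needs at least two values.

```python
from statistics import mean, stdev


def is_outlier(prices, i):
    """Test prices[i] against the other prices of homes of the same size."""
    # Prices of the other homes of the same size.
    compList = prices[:i] + prices[i + 1:]
    # No other homes: not an outlier (from the steps above).
    # Exactly one other home: assumption, also treated as not an outlier,
    # because the sample standard deviation needs at least two values.
    if len(compList) < 2:
        return False
    return abs(prices[i] - mean(compList)) > 3 * stdev(compList)


print(is_outlier([15000, 11000, 17000], 1))  # True
print(is_outlier([25000], 0))                # False
```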
I wrote the following Python code. Note that the definitions of the mean(), variance() and stdev() functions are not shown here.
newArea = area.copy()
newPrice = price.copy()
newPriceDict = priceDict.copy()
for (houseArea, housePrice) in priceDict.items():
    if len(housePrice) == 1:
        continue
    for i in range(len(housePrice)):
        compList = housePrice.copy()
        compList.pop(i)
        if abs(housePrice[i] - mean(compList)) > 3 * stdev(compList):
            newArea.pop(priceIndexDict[houseArea][i])
            newPrice.pop(priceIndexDict[houseArea][i])
            newPriceDict[houseArea].pop(i)
print(newArea)
print(newPrice)
print(newPriceDict)
However, when I executed the code, the IDE displayed the following error:
line 44, in <module>
compList.pop(i)
IndexError: pop index out of range
How can I fix this error?
Answers:
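newPriceDict[houseArea].pop(i) modifies housePrice. newPriceDict is a shallow copy of priceDict, so the values in newPriceDict are the very same list objects as the values in priceDict. Once an element has been popped, housePrice is shorter than the range(len(housePrice)) that was computed when the inner loop started, so a later compList.pop(i) uses an index that no longer exists.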
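A minimal sketch of the difference, using the priceDict from the question. The variable names shallow and deep are only for illustration:

```python
from copy import deepcopy

priceDict = {1200: [15000, 11000, 17000], 2000: [25000]}

# Deep copy: a new dict whose lists are independent copies.
deep = deepcopy(priceDict)
deep[1200].pop(1)
print(priceDict[1200])                   # [15000, 11000, 17000] (unchanged)

# Shallow copy: a new dict, but its values are the same list objects.
shallow = priceDict.copy()
shallow[1200].pop(1)
print(priceDict[1200])                   # [15000, 17000] (changed too)
print(shallow[1200] is priceDict[1200])  # True
```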
Use copy.deepcopy() to make a deep copy:
from copy import deepcopy
from statistics import mean, stdev
area = [1200, 1200, 1200, 2000]
price = [15000, 11000, 17000, 25000]
priceDict = {}
priceIndexDict = {}
zipList = list(zip(area, price))
zipListIndex = 0
for k, v in zipList:
    priceDict.setdefault(k, []).append(v)
    priceIndexDict.setdefault(k, []).append(zipListIndex)
    zipListIndex += 1
print(priceDict)
print(priceIndexDict)
newArea = area.copy()
newPrice = price.copy()
newPriceDict = deepcopy(priceDict)
for (houseArea, housePrice) in priceDict.items():
    if len(housePrice) == 1:
        continue
    for i in range(len(housePrice)):
        compList = housePrice.copy()
        compList.pop(i)
        if abs(housePrice[i] - mean(compList)) > 3 * stdev(compList):
            newArea.pop(priceIndexDict[houseArea][i])
            newPrice.pop(priceIndexDict[houseArea][i])
            newPriceDict[houseArea].pop(i)
print(newArea)
print(newPrice)
print(newPriceDict)
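The deep copy removes the IndexError for this data, but two things may still go wrong on other inputs. Popping from newArea, newPrice and newPriceDict[houseArea] by original index shifts every later position once one element has been removed, and statistics.stdev raises StatisticsError when compList has fewer than two values (a group of exactly two homes). A possible sketch, not part of the original answer, collects the outlier indices first and rebuilds the lists afterwards. The function name remove_outliers, the threshold parameter, and the choice to treat a single comparison home as "not an outlier" are assumptions:

```python
from statistics import mean, stdev


def remove_outliers(area, price, threshold=3):
    """Return (newArea, newPrice, newPriceDict) with price outliers dropped."""
    # Group the original list positions by area.
    indexByArea = {}
    for idx, a in enumerate(area):
        indexByArea.setdefault(a, []).append(idx)

    # First pass: only record which positions are outliers.
    outliers = set()
    for indices in indexByArea.values():
        for idx in indices:
            # Prices of the other homes with the same area.
            compList = [price[j] for j in indices if j != idx]
            # Assumption: stdev needs at least two values, so smaller
            # comparison groups never flag an outlier.
            if len(compList) < 2:
                continue
            if abs(price[idx] - mean(compList)) > threshold * stdev(compList):
                outliers.add(idx)

    # Second pass: rebuild the results from the positions that are kept.
    kept = [i for i in range(len(area)) if i not in outliers]
    newArea = [area[i] for i in kept]
    newPrice = [price[i] for i in kept]
    newPriceDict = {}
    for i in kept:
        newPriceDict.setdefault(area[i], []).append(price[i])
    return newArea, newPrice, newPriceDict


area = [1200, 1200, 1200, 2000]
price = [15000, 11000, 17000, 25000]
newArea, newPrice, newPriceDict = remove_outliers(area, price)
print(newArea)       # [1200, 1200, 2000]
print(newPrice)      # [15000, 17000, 25000]
print(newPriceDict)  # {1200: [15000, 17000], 2000: [25000]}
```

Because nothing is removed while the loops run, the original lists and indices stay valid throughout, and no copy of priceDict is needed.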