Python List is adding my first item twice
Question:
I’ve been trying to debug this problem most of this afternoon and I can’t figure out what I am doing wrong. I have a yaml file read into a list of strings and I am trying to load up that data into some custom objects.
    from typing import Any, List
    import yaml

    class Bar:
        def __init__(self, yml_node:Any):
            self.name = yml_node["name"]
            self.obj_type = yml_node["obj_type"]
        def __str__(self) -> str:
            return f'{{ "name": "{self.name}", "type": "{self.obj_type}" }}'

    class Baz:
        def __init__(self, yml_node:Any):
            self.name = yml_node["name"]
            self.obj_type = yml_node["obj_type"]
        def __str__(self) -> str:
            return f'{{ "name": "{self.name}", "type": "{self.obj_type}" }}'

    class Foo:
        def __init__(self, yml_node:Any):
            self.bars:List[Bar] = []
            self.bazs:List[Baz] = []
            print(f"self1 = {self}")
            for b in yml_node["bars"]:
                print(f"Adding {b}")
                new_bar_obj = Bar(b)
                print(f"new_bar_obj = {new_bar_obj}")
                self.bars.extend([new_bar_obj])
                print(f"self(in for) = {self}")
            print(f"self2 = {self}")
            for b in yml_node["bazs"]:
                print(f"Adding topic {b}")
                new_baz_obj = Baz(b)
                print(f"new_baz_obj = {new_baz_obj}")
                self.bazs.extend([new_baz_obj])
            print(f"self3 = {self}")
        def __str__(self) -> str:
            return f'{{ "bars": {stringify_list(self.bars)}, "bazs": {stringify_list(self.bazs)} }}'

    def stringify_list(l:List[Any]) -> str:
        first = True
        rtn_str = "[\n"
        for i in l:
            if(first):
                rtn_str += f"{i}"
                first = False
            rtn_str += f", {i}"
        rtn_str += "]\n"
        return rtn_str

    def get_obj(file_content:List[str], obj_name:str) -> Foo:
        yml_file = yaml.safe_load("".join(file_content))
        print(f"Yaml Object: {yml_file[obj_name]}")
        rtn = Foo(yml_file[obj_name])
        print(f"Returning Object: {rtn}")
        return rtn

    yml_file_string = """
    ---
    foo1:
      bars:
        - name: foo1_bar_1
          obj_type: bar
        - name: foo1_bar_2
          obj_type: bar
      bazs:
        - name: foo1_baz_1
          obj_type: baz
        - name: foo1_baz_2
          obj_type: baz
    """
    yml_file_lines = yml_file_string.splitlines(True)
    foo1_obj = get_obj(yml_file_lines, "foo1")
    print(f"foo1_obj = {foo1_obj}")
I expect this to work by loading the YAML file lines into a yml_file object, then passing the yml_file["foo1"] part into the Foo constructor. That all happens in the get_obj() function.
I expect the constructor to return foo1_obj, which will contain two lists (bars and bazs), each containing two Bar or Baz objects. What I am getting instead is three Bar or Baz objects in each list: the first item in each list is repeated once. What am I doing wrong?
Here is the output I get instead:
Yaml Object: {'bars': [{'name': 'foo1_bar_1', 'obj_type': 'bar'}, {'name': 'foo1_bar_2', 'obj_type': 'bar'}], 'bazs': [{'name': 'foo1_baz_1', 'obj_type': 'baz'}, {'name': 'foo1_baz_2', 'obj_type': 'baz'}]}
self1 = { "bars": [
]
, "bazs": [
]
}
Adding {'name': 'foo1_bar_1', 'obj_type': 'bar'}
new_bar_obj = { "name": "foo1_bar_1", "type": "bar" }
self(in for) = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }]
, "bazs": [
]
}
Adding {'name': 'foo1_bar_2', 'obj_type': 'bar'}
new_bar_obj = { "name": "foo1_bar_2", "type": "bar" }
self(in for) = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
]
}
self2 = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
]
}
Adding topic {'name': 'foo1_baz_1', 'obj_type': 'baz'}
new_baz_obj = { "name": "foo1_baz_1", "type": "baz" }
Adding topic {'name': 'foo1_baz_2', 'obj_type': 'baz'}
new_baz_obj = { "name": "foo1_baz_2", "type": "baz" }
self3 = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
{ "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_2", "type": "baz" }]
}
Returning Object: { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
{ "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_2", "type": "baz" }]
}
foo1_obj = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
{ "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_2", "type": "baz" }]
}
Try the code here (most of the print()’s in the code were me trying to figure out where the extra object was getting added to the lists):
https://trinket.io/python3/877a72fe9b
Answers:
You’ve only got one copy in each, but your stringify_list function is broken; this code:
    for i in l:
        if(first):
            rtn_str += f"{i}"
            first = False
        rtn_str += f", {i}"
adds i (without a prefix) on the first loop, and also unconditionally adds it with a ", " prefix on every loop. Thus, the first item ends up added twice.
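One way to confirm that the lists themselves are fine is to compare the list's length with what the formatter prints. This is only a minimal sketch: it uses plain strings instead of Bar objects, and buggy_stringify is a hypothetical name for a copy of the loop above.

```python
from typing import Any, List


def buggy_stringify(items: List[Any]) -> str:
    """Copy of the original loop, kept buggy on purpose."""
    first = True
    rtn_str = "[\n"
    for i in items:
        if first:
            rtn_str += f"{i}"
            first = False
        # Runs on every iteration, including the first one.
        rtn_str += f", {i}"
    rtn_str += "]\n"
    return rtn_str


items = ["bar_1", "bar_2"]
text = buggy_stringify(items)

print(len(items))           # the list holds two items
print(text.count("bar_1"))  # but the first one appears twice in the text
print(text.count("bar_2"))  # later items appear once
```

The data is correct; only its string representation repeats the first element.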
Change it to:
    for i in l:
        if first:
            rtn_str += f"{i}"
            first = False
        else:
            rtn_str += f", {i}"
and the problem will go away. That said, manually implementing such a function is kind of silly; I might suggest looking at the pprint module if you want custom formatted list output, rather than reimplementing it from scratch like this.
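As a rough illustration of the pprint suggestion: pprint formats list elements with repr(), so the classes would need a __repr__ rather than only __str__. The Item class below is a hypothetical stand-in for Bar and Baz, not code from the question.

```python
from pprint import pformat


class Item:
    """Hypothetical stand-in for Bar/Baz from the question."""

    def __init__(self, name: str, obj_type: str):
        self.name = name
        self.obj_type = obj_type

    def __repr__(self) -> str:
        # pprint uses repr() for objects it does not know how to lay out.
        return f"Item(name={self.name!r}, obj_type={self.obj_type!r})"


items = [Item("foo1_bar_1", "bar"), Item("foo1_bar_2", "bar")]

# pformat returns the formatted string; pprint.pprint would print it directly.
formatted = pformat(items)
print(formatted)
```

Whether the list is printed on one line or wrapped depends on the width argument, so the exact layout may differ from the hand-rolled format.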
Note that there are much easier and more efficient ways to do this even if pprint isn’t an option. If nothing else, the loop could be replaced with:
    rtn_str += ', '.join(map(str, l))
which converts all the items to strings and joins them in one bulk operation that costs O(n), rather than the O(n²) work that repeated string concatenation can cost. CPython optimizes repeated concatenation a bit, but it’s a very brittle optimization and not wise to rely on.
Your problem is in this function:
    def stringify_list(l:List[Any]) -> str:
        first = True
        rtn_str = "[\n"
        for i in l:
            if(first):
                rtn_str += f"{i}"
                first = False
            rtn_str += f", {i}"
        rtn_str += "]\n"
        return rtn_str
As you can see, you unconditionally add the ", {i}" part of your string, regardless of whether it is the first item or not. Simply putting the "non first" concatenation in an else clause would fix it:
    for i in l:
        if(first):
            rtn_str += f"{i}"
            first = False
        else:
            rtn_str += f", {i}"
However, although the first state variable is an ingenious idea (if slightly misapplied in your case), Python often frees you from this kind of bookkeeping. Here, the simpler way is the .join method of strings: it puts the separator only between items, so no extra logic is needed to skip the separator before the first item. Combined with an f-string, your function can be written as:
    def stringify_list(l:List[Any]) -> str:
        return f"[\n{', '.join(str(i) for i in l)}]\n"
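A quick way to check the join-based version is to run it on a two-item list, an empty list, and a single-item list. This is a small sketch with plain values standing in for the Bar and Baz objects; the parameter is renamed to items here purely for readability.

```python
from typing import Any, List


def stringify_list(items: List[Any]) -> str:
    # str.join puts ", " only between items, so no "first" flag is needed.
    return f"[\n{', '.join(str(i) for i in items)}]\n"


two = stringify_list(["bar_1", "bar_2"])
empty = stringify_list([])
single = stringify_list([1])

print(two, end="")
print(empty, end="")
print(single, end="")
```

The empty-list case produces the same "[" and "]" lines as the original function, so the surrounding Foo.__str__ output keeps its shape.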