Python List is adding my first item twice

Question:

I’ve been trying to debug this problem most of this afternoon and I can’t figure out what I am doing wrong. I have a yaml file read into a list of strings and I am trying to load up that data into some custom objects.

from typing import Any, List
import yaml

class Bar:
    def __init__(self, yml_node:Any):
        self.name = yml_node["name"]
        self.obj_type = yml_node["obj_type"]
    def __str__(self) -> str:
        return f'{{ "name": "{self.name}", "type": "{self.obj_type}" }}'
class Baz:
    def __init__(self, yml_node:Any):
        self.name = yml_node["name"]
        self.obj_type = yml_node["obj_type"]
    def __str__(self) -> str:
        return f'{{ "name": "{self.name}", "type": "{self.obj_type}" }}'

class Foo:
    def __init__(self, yml_node:Any):
        self.bars:List[Bar] = []
        self.bazs:List[Baz] = []

        print(f"self1 = {self}")
        for b in yml_node["bars"]:
            print(f"Adding {b}")
            new_bar_obj = Bar(b)
            print(f"new_bar_obj =  {new_bar_obj}")
            self.bars.extend([new_bar_obj])
            print(f"self(in for) = {self}")

        print(f"self2 = {self}")
        for b in yml_node["bazs"]:
            print(f"Adding topic {b}")
            new_baz_obj = Baz(b)
            print(f"new_baz_obj =  {new_baz_obj}")
            self.bazs.extend([new_baz_obj])
        print(f"self3 = {self}")

    def __str__(self) -> str:
        return f'{{ "bars": {stringify_list(self.bars)}, "bazs": {stringify_list(self.bazs)} }}'

def stringify_list(l:List[Any]) -> str:
    first = True

    rtn_str = "[\n"
    for i in l:
        if(first):
            rtn_str += f"{i}"
            first = False
        rtn_str += f", {i}"

    rtn_str += "]\n"
    return rtn_str

def get_obj(file_content:List[str], obj_name:str) -> Foo:
    yml_file = yaml.safe_load("".join(file_content))

    print(f"Yaml Object: {yml_file[obj_name]}")
    rtn = Foo(yml_file[obj_name])
    print(f"Returning Object: {rtn}")
    return rtn

yml_file_string = """
---
foo1:
    bars:
        - name: foo1_bar_1
          obj_type: bar
        - name: foo1_bar_2
          obj_type: bar
    bazs:
        - name: foo1_baz_1
          obj_type: baz
        - name: foo1_baz_2
          obj_type: baz
"""

yml_file_lines = yml_file_string.splitlines(True)

foo1_obj = get_obj(yml_file_lines, "foo1")
print(f"foo1_obj = {foo1_obj}")

I expect this to work by loading the yaml file lines into a yml_file object, then passing the yml_file["foo1"] part into the Foo constructor. That all happens in the get_obj() function.

I expect the constructor to return foo1_obj, which will contain two lists (bars and bazs), each holding two Bar or Baz objects. What I am getting instead is three Bar or Baz objects in each list: the first item in each list appears twice. What am I doing wrong?

Here is the output I get instead:

Yaml Object: {'bars': [{'name': 'foo1_bar_1', 'obj_type': 'bar'}, {'name': 'foo1_bar_2', 'obj_type': 'bar'}], 'bazs': [{'name': 'foo1_baz_1', 'obj_type': 'baz'}, {'name': 'foo1_baz_2', 'obj_type': 'baz'}]}
self1 = { "bars": [
]
, "bazs": [
]
 }
Adding {'name': 'foo1_bar_1', 'obj_type': 'bar'}
new_bar_obj =  { "name": "foo1_bar_1", "type": "bar" }
self(in for) = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }]
, "bazs": [
]
 }
Adding {'name': 'foo1_bar_2', 'obj_type': 'bar'}
new_bar_obj =  { "name": "foo1_bar_2", "type": "bar" }
self(in for) = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
]
 }
self2 = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
]
 }
Adding topic {'name': 'foo1_baz_1', 'obj_type': 'baz'}
new_baz_obj =  { "name": "foo1_baz_1", "type": "baz" }
Adding topic {'name': 'foo1_baz_2', 'obj_type': 'baz'}
new_baz_obj =  { "name": "foo1_baz_2", "type": "baz" }
self3 = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
{ "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_2", "type": "baz" }]
 }
Returning Object: { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
{ "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_2", "type": "baz" }]
 }
foo1_obj = { "bars": [
{ "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_1", "type": "bar" }, { "name": "foo1_bar_2", "type": "bar" }]
, "bazs": [
{ "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_1", "type": "baz" }, { "name": "foo1_baz_2", "type": "baz" }]
 }

Try the code here (most of the print()’s in the code were me trying to figure out where the extra object was getting added to the lists):
https://trinket.io/python3/877a72fe9b

Asked By: Kris

Answers:

You’ve only got one copy in each, but your stringify_list function is broken; this code:

for i in l:
    if(first):
        rtn_str += f"{i}"
        first = False
    rtn_str += f", {i}"

adds i (without a prefix) on the first loop, and also unconditionally adds it with a , prefix on every loop. Thus, the first item ends up added twice.
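To see the effect in isolation, here is a minimal sketch of the same loop applied to a plain list of strings. The name buggy_join is hypothetical and used only for this illustration; it is not part of the original code.

```python
# Reproduce the loop from the question on a plain list of strings.
def buggy_join(items):
    first = True
    out = ""
    for i in items:
        if first:
            out += f"{i}"
            first = False
        out += f", {i}"  # runs on every iteration, including the first
    return out

print(buggy_join(["a", "b"]))  # a, a, b
```

The list itself still has two elements; only the string built from it contains the duplicate.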

Change it to:

for i in l:
    if first:
        rtn_str += f"{i}"
        first = False
    else:
        rtn_str += f", {i}"

and the problem will go away. That said, manually implementing such a function is kind of silly; I might suggest looking at the pprint module if you want custom formatted list output, rather than reimplementing it from scratch like this.
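As a rough sketch of the pprint route: pprint formats objects with repr() rather than str(), so this assumes you are willing to define __repr__ on your classes. The simplified Bar constructor below is an assumption made for brevity and differs from the one in the question.

```python
from pprint import pformat

class Bar:
    # Simplified, hypothetical constructor (the question's Bar takes a yaml node).
    def __init__(self, name, obj_type):
        self.name = name
        self.obj_type = obj_type

    def __repr__(self):
        # pprint relies on repr(), so __str__ alone would not be used here.
        return f"Bar(name={self.name!r}, obj_type={self.obj_type!r})"

bars = [Bar("foo1_bar_1", "bar"), Bar("foo1_bar_2", "bar")]
text = pformat(bars, width=40)  # narrow width so the list wraps onto separate lines
print(text)
```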

Note that there are much easier and more efficient ways to do this even if pprint isn’t an option. If nothing else, the loop could be removed in favor of:

rtn_str += ', '.join(map(str, l))

which converts all the items to strings and joins them in one bulk operation costing O(n), rather than the O(n²) work that repeated string concatenation costs. CPython optimizes repeated concatenation somewhat, but the optimization is brittle and it is unwise to rely on it.
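Put together, a complete version of the function along these lines might look like the sketch below. The bracket-and-newline delimiters are inferred from the output printed in the question, so adjust them if your format differs.

```python
from typing import Any, List

def stringify_list(l: List[Any]) -> str:
    # str.join places the separator only between items, so no "first" flag is needed.
    return "[\n" + ", ".join(map(str, l)) + "]\n"

result = stringify_list(["a", "b", "c"])
print(result)
```

An empty list needs no special case: join returns an empty string and only the delimiters remain.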

Answered By: ShadowRanger

Your problem is in this function:

def stringify_list(l:List[Any]) -> str:
    first = True

    rtn_str = "[\n"
    for i in l:
        if(first):
            rtn_str += f"{i}"
            first = False
        rtn_str += f", {i}"

    rtn_str += "]\n"
    return rtn_str

As you can see, you unconditionally add the , {i} part of your string, regardless of it being the first item or not. Simply putting the "non first" concatenation in an else clause would fix it:

    for i in l:
        if(first):
            rtn_str += f"{i}"
            first = False
        else:
            rtn_str += f", {i}"

However, although the use of the first state variable is somewhat ingenious, if a bit incorrect in your case, Python often frees you from this kind of bookkeeping. Here, the simpler way is the .join method of strings: it places the separator only between items, so no extra logic is needed to skip the separator before the first item. Combined with an f-string, join lets your function be written simply as:

def stringify_list(l:List[Any]) -> str:
    return f"[\n{', '.join(str(i) for i in l)}]\n"

Answered By: jsbueno