How to edit a file with multiple YAML documents in Python

Question:

I have the following YAML file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs
  namespace: test
  labels:
    app: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  replicas: 100
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: test/first:latest
        ports:
        - containerPort: 80
        resources:
          limits:
            memory: 2500Mi
            cpu: "2500m"
          requests:
            memory: 12Mi
            cpu: "80m"
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs
spec:
  selector:
    app: hello-world
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30082   
  type: NodePort

I need to edit the YAML file using Python. I have tried the code below, but it does not work for a file with multiple YAML documents:

import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.explicit_start =  True

with open(r"D:deployment.yml") as stream:
   data = yaml.load_all(stream)

test = data[0]['metadata']
test.update(dict(name="Tom1"))
test.labels(dict(name="Tom1"))

test = data['spec']
test.update(dict(name="sfsdf"))

with open(r"D:deploymentCopy.yml", 'wb') as stream:
    yaml.dump(data, stream)

For more information, see the related question: Python: Replacing a String in a YAML file

Asked By: microset

||

Answers:

"It is not working" is not a very specific description of the problem.

load_all() yields each document in turn, so you would normally iterate over it:

for data in yaml.load_all(stream):
    # work on the data of each individual document

If you want all the data in an indexable list, as you do, you have to call list() on the generated data:

data = list(yaml.load_all(stream))

If you load a number of documents into the variable data with .load_all(), it is more than likely that you don't want to dump data as a single object (using .dump()), but instead want to use .dump_all(), so that each element of data is dumped as a separate document:

with open(r"D:deploymentCopy.yaml", 'wb') as stream:
    yaml.dump_all(data, stream)

ruamel.yaml cannot distinguish between dumping a data structure that has a list (i.e. a YAML sequence) at its root and dumping a list of data structures that should go into different documents. So you have to make that distinction yourself by calling either .dump() or .dump_all().

Apart from that, the official YAML FAQ on the yaml.org website indicates that the recommended extension for files with YAML documents is .yaml. There are probably some projects that have not been updated since this became the recommendation (16 years ago, i.e. at least since September 2006).

Answered By: Anthon

I have added the full script, based on Anthon's answer.

Please find the script below for reference:

import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.explicit_start = True

# Read every document while the file is still open
with open(r"D:deployment.yml") as stream:
    data = list(yaml.load_all(stream))

# First document: the Deployment
data[0]['metadata']['namespace'] = "namespace"
data[0]['metadata']['labels']['app'] = "namespace"
data[0]['spec']['template']['spec']['containers'][0]['name'] = "test"

# Second document: the Service
data[1]['spec']['selector']['app'] = "test"

# Write each list element back as a separate document
with open(r"D:deploymentCopy.yml", 'wb') as stream:
    yaml.dump_all(data, stream)

Answered By: microset