Python: use PyYAML and keep the format

Question:

Here is a config file. I use PyYAML to change some values in it and then write the config back, but that changes my formatting, which confuses me.

 $ results.yaml 
 nas:
     mount_dir: '/nvr'
     mount_dirs: ['/mount/data0', '/mount/data1', '/mount/data2']

# yaml.py

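import yaml

# safe_load avoids the Loader argument that newer PyYAML versions require for yaml.load()
with open("results.conf", "r") as conf:
    results = yaml.safe_load(conf)

results['nas']['mount_dirs'][0] = "haha"

# The with block closes the file, so no explicit close() is needed
with open('/home/zonion/speedio/speedio.conf', 'w') as conf:
    yaml.dump(results, conf, default_flow_style=False)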

But it changes my format. What should I do?

# cat results.conf
nas:
  mount_dir: /nvr
  mount_dirs:
  - haha
  - /mount/data1
  - /mount/data2
Asked By: haroldT


Answers:

ruamel.yaml implements a round-trip loader and dumper. Try:

import ruamel.yaml
conf = open("results.conf", "r")
results = ruamel.yaml.load(conf, ruamel.yaml.RoundTripLoader)
conf.close()
results['nas']['mount_dirs'][0] = "haha"
with open('/home/zonion/speedio/speedio.conf', 'w') as conf:
  ruamel.yaml.dump(results, conf, ruamel.yaml.RoundTripDumper)
Answered By: flyx

If you use ruamel.yaml ¹, you can achieve this relatively easily by combining this and this answer here on Stack Overflow.

By default ruamel.yaml normalizes to an indent of 2, and drops superfluous quotes. As you don’t seem to want that, you have to either explicitly set the indent, or have ruamel.yaml analyse the input, and tell it to preserve quotes:

import sys
import ruamel.yaml
import ruamel.yaml.util

yaml_str = """
nas:
    mount_dir: '/nvr'
    mount_dirs: ['/mount/data0', '/mount/data1', '/mount/data2']
"""

result, indent, block_seq_indent = ruamel.yaml.util.load_yaml_guess_indent(
    yaml_str, preserve_quotes=True)
result['nas']['mount_dirs'][0] = "haha"
ruamel.yaml.round_trip_dump(result, sys.stdout, indent=indent,
                            block_seq_indent=block_seq_indent)

Instead of the load_yaml_guess_indent() invocation you can do:

result = ruamel.yaml.round_trip_load(yaml_str, preserve_quotes=True)
indent = 4
block_seq_indent = None

If you want haha to be (single) quoted in the output make it a SingleQuotedScalarString:

result['nas']['mount_dirs'][0] = \
       ruamel.yaml.scalarstring.SingleQuotedScalarString("haha")

With that, the output will be:

nas:
    mount_dir: '/nvr'
    mount_dirs: ['haha', '/mount/data1', '/mount/data2']

(Given that your short example input has no block-style sequences, block_seq_indent cannot be determined and will be None.)


When using the newer API you have control over the indent of the mappings and sequences separately:

yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4, sequence=6, offset=3)  # not that that looks nice
data = yaml.load(some_stream)
yaml.dump(data, some_stream)

This will make your YAML formatted consistently if it wasn’t so to begin with, and make no further changes after the first round-trip.


¹ Disclaimer: I am the author of that package.

Answered By: Anthon

Unfortunately, ruamel.yaml does not completely preserve the original format. Quoting its docs:

Although individual indentation of lines is not preserved, you can
specify separate indentation levels for mappings and sequences
(counting for sequences does not include the dash for a sequence
element) and specific offset of block sequence dashes within that
indentation.

I do not know any Python library that does that.

When I need to change a YAML file without touching its format I reluctantly use regexp (reluctantly as it’s almost as bad as parsing XHTML with it).
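
For illustration, here is roughly what such a regexp edit could look like for the file from the question. It is a sketch that assumes this exact layout (a single-line, single-quoted mount_dirs list); the pattern is not a YAML parser and will break on anything else.

```python
import re

yaml_text = """\
nas:
    mount_dir: '/nvr'
    mount_dirs: ['/mount/data0', '/mount/data1', '/mount/data2']
"""

# Capture everything up to the opening bracket, then match the first quoted item.
pattern = re.compile(r"^([ \t]*mount_dirs:[ \t]*\[)'[^']*'", flags=re.MULTILINE)

# Replace only that first item; every other byte of the file is left untouched.
new_text, count = pattern.subn(r"\g<1>'haha'", yaml_text, count=1)

if count != 1:
    raise ValueError("mount_dirs entry not found; the layout may have changed")

print(new_text, end="")
```

Checking the substitution count at least makes the edit fail loudly when the file no longer matches the assumed layout.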

Please feel free to suggest a better solution if you know any, I would gladly learn about it!

Answered By: Greg Dubicki

Try loading it first and then dumping it like this:

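import ruamel.yaml

# A plain string is enough here; nothing is interpolated, so no f-string is needed
yaml_str = """
    nas:
        mount_dir: '/nvr'
        mount_dirs: ['/mount/data0', '/mount/data1', '/mount/data2']"""
yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)

# The with block closes the file, so no explicit close() is needed
with open("test.yaml", 'w') as outfile:
    yaml.dump(data, outfile)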
Answered By: arseniyy123