Preserving a long single-line string as is when round-tripping in ruamel

Question:

In the code below, I’m trying to load a YAML string and write it back, making sure that it retains the spacing as is.

import sys
import ruamel.yaml

yaml_str = """
long_single_line_text: Hello, there. This is nothing but a long single line text which is more that a 100 characters long
"""

yaml = ruamel.yaml.YAML()  # defaults to round-trip
yaml.preserve_quotes = True
yaml.allow_duplicate_keys = True
yaml.explicit_start = True

data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)

And the result is

long_single_line_text: Hello, there. This is nothing but a long single line text which
  is more that a 100 characters long

Here the line breaks at around character 87. I’m not sure whether this is a setting that can be configured, but keeping the long line as is would help me avoid huge diffs when adding new keys.

If I set a larger width via yaml.width, then the multi-line strings become long single-line strings, so I can’t do that.

Is there any way I can keep the string as is for long single-line scalars?

Asked By: thebenman


Answers:

There is only one parameter for wrapping: .width. It determines the wrapping point for scalars (quoted and unquoted) and for flow style collections. If your input was wrapped inconsistently, then that is normalized, just like inconsistent indentation would be.

One of the reasons ruamel.yaml was changed to use a new API is that before (as in PyYAML), introducing new parameters was difficult, since they needed to be passed from Loader instances to all the other instances it invokes (Parser, Constructor, Composer, Resolver, Serializer, etc.) during construction, resulting in code changes in many files.

The .width parameter controls self.best_width within the Emitter. In the answer here, the alternative writer for double-quoted scalars uses that in the if statement:

        if (
            0 < end < len(text) - 1
            and (ch == u' ' or start >= end)
            and self.column + (end - start) > self.best_width
            and split
        ):

If you replace self.best_width in there with self.dumper.dqwidth, and replace yaml.width = 27 with yaml.dqwidth = 27, then only your double quoted scalars will be wrapped at 27.
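The idea of giving one scalar style its own width can be illustrated with a toy stand-in written from scratch. This is not ruamel.yaml's Emitter code: ToyEmitter and _fold are hypothetical names, the folding is a simple greedy wrap at spaces, and only best_width and dqwidth echo the names used above.

```python
class ToyEmitter:
    """Toy illustration only: separate wrap widths for plain and double-quoted scalars."""

    def __init__(self, width=80, dqwidth=None):
        self.best_width = width
        # Assumption for this sketch: dqwidth falls back to the general width.
        self.dqwidth = width if dqwidth is None else dqwidth

    @staticmethod
    def _fold(text, limit, indent="  "):
        """Greedy wrap: start a new line when adding the next word would exceed limit."""
        words = text.split(" ")
        lines, current = [], words[0]
        for word in words[1:]:
            if len(current) + 1 + len(word) > limit:
                lines.append(current)
                current = indent + word
            else:
                current += " " + word
        lines.append(current)
        return "\n".join(lines)

    def write_plain(self, text):
        # Plain scalars keep using the general width.
        return self._fold(text, self.best_width)

    def write_double_quoted(self, text):
        # Only this style consults the separate width.
        return self._fold('"' + text + '"', self.dqwidth)


emitter = ToyEmitter(width=1000, dqwidth=27)
text = "Hello, there. This is nothing but a long single line text"
plain = emitter.write_plain(text)
quoted = emitter.write_double_quoted(text)
print(plain)
print(quoted)
```

In the real emitter the same effect comes from changing which attribute the if condition compares against, as described above.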

In general, in the new API, the instances used during loading and dumping are non-fleeting (and therefore have an explicit initialisation before every use), and within such an instance you can access the YAML() instance using self.loader or self.dumper, respectively.

Answered By: Anthon