Python: loading a YAML file changes values (numbers?) if they contain a colon and fewer than 3 digits after the colon

Question:

Simple example.yml file

Base:
    StartTime: 645:0
    EndTimes: 645:023
    MidTimes: 645:02
    mac: 99:19:b9:fa:37:99
    MissionStartTimestamp: -2037:14522
    MissionEndTimestamp: -2037:14522

When it is loaded into python

import yaml

with open("example.yml", 'r') as file:
    example_ = yaml.safe_load(file)
print(yaml.dump(example_, default_flow_style=False))

results:

Base:
  EndTimes: 645:023
  MidTimes: 38702
  MissionEndTimestamp: -2037:14522
  MissionStartTimestamp: -2037:14522
  StartTime: 38700
  mac: 99:19:b9:fa:37:99

For whatever reason, any "number" value with a single colon followed by 2 or fewer trailing digits gets converted to another "number"…

I also tried:

import yaml

with open("example.yml", 'r') as file:
    example_ = yaml.load(file, Loader=yaml.CLoader)
print(yaml.dump(example_, default_flow_style=False))

Same results (also with Loader=yaml.CSafeLoader, CFullLoader, and CUnsafeLoader).

The remaining loader gives different results:
CBaseLoader turns the affected values into single-quoted strings:

Base:
  EndTimes: 645:023
  MidTimes: '645:02'
  MissionEndTimestamp: -2037:14522
  MissionStartTimestamp: -2037:14522
  StartTime: '645:0'
  mac: 99:19:b9:fa:37:99

CBaseLoader looks like the best option, but the added single quotes aren't great: I would now have to add another step to strip them. Is there any way around this, so that these values load the same way the other values do?

UPDATE#1

Based on @ubaumann's answer, I've added this follow-up.

install ruamel.yaml – conda install -c conda-forge ruamel.yaml or pip install ruamel.yaml

change the file header info

import sys
from ruamel.yaml import YAML
yaml=YAML(typ="rt")

and the open/dump calls

with open("example.yml", 'r') as file:
    example_ = yaml.load(file)
yaml.dump(example_, sys.stdout)

result


Base:
  StartTime: 645:0000
  EndTimes: 645:023
  MidTimes: 645:02
  mac: 99:19:b9:fa:37:99
  MissionStartTimestamp: -2037:14522
  MissionEndTimestamp: -2037:14522

If you change the line yaml=YAML(typ="rt") to yaml=YAML(typ="safe"), all of the values are dumped as quoted strings:


Base: {EndTimes: '645:023', MidTimes: '645:02', MissionEndTimestamp: '-2037:14522',
  MissionStartTimestamp: '-2037:14522', StartTime: '645:0000', mac: '99:19:b9:fa:37:99'}

Asked By: Biaspoint


Answers:

YAML integers can be formatted in different ways, and using : makes the parser interpret the value as sexagesimal (base 60).

https://yaml.org/type/int.html

Using “:” allows expressing integers in base 60, which is convenient for time and angle values
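To make the conversion concrete, here is a minimal sketch of the base-60 rule applied to the values from the question. The regular expression is modeled on the sexagesimal branch of the YAML 1.1 integer pattern, and the helper name resolve_like_yaml11 is made up for illustration; it is not part of PyYAML. Each group after a colon may have at most two digits (0 to 59), which is why 645:023 and -2037:14522 are left alone.

```python
import re

# Modeled on the sexagesimal branch of the YAML 1.1 integer pattern:
# leading digits, then one or more ":" groups of at most two digits (0-59).
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$")


def resolve_like_yaml11(text):
    """Hypothetical helper: base-60 integer if the pattern matches, else the text."""
    if not SEXAGESIMAL_INT.match(text):
        return text
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-").replace("_", "")
    value = 0
    for part in digits.split(":"):
        # Each colon shifts the running total by one base-60 place.
        value = value * 60 + int(part)
    return sign * value


for raw in ["645:0", "645:02", "645:023", "-2037:14522", "99:19:b9:fa:37:99"]:
    print(raw, "->", repr(resolve_like_yaml11(raw)))
```

So 645:0 is read as 645 * 60 + 0 = 38700 and 645:02 as 645 * 60 + 2 = 38702, matching the output in the question.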

Answered By: ubaumann

PyYAML parses a subset of YAML 1.1, and that specification includes sexagesimal numbers, essentially for processing values with minutes and seconds (such as times and arcs). Since this led to a lot of confusion, it was quickly dropped from the YAML 1.2 specification, but PyYAML has never been upgraded to 1.2 since that spec came out in 2009.

You can upgrade to a YAML 1.2 parser such as my ruamel.yaml and get the result you expect.

Answered By: Anthon
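If switching libraries is not an option, one possible workaround is to stay with PyYAML and give a loader subclass an integer pattern without the sexagesimal branch. This is only a sketch: the class name NoSexagesimalLoader and the PLAIN_INT pattern are made up for illustration, and it relies on the yaml_implicit_resolvers class attribute, which is an implementation detail of PyYAML's resolver rather than a documented extension point. Note also that the single quotes seen with CBaseLoader only appear when dumping; the loaded Python strings themselves contain no quotes.

```python
import re

import yaml


class NoSexagesimalLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader variant that keeps values like 645:0 as strings."""


INT_TAG = "tag:yaml.org,2002:int"

# Assumed replacement pattern: binary, octal, decimal and hex integers,
# but no colon-separated base-60 form.
PLAIN_INT = re.compile(
    r"^(?:[-+]?0b[0-1_]+|[-+]?0[0-7_]+|[-+]?(?:0|[1-9][0-9_]*)|[-+]?0x[0-9a-fA-F_]+)$"
)

# Copy SafeLoader's implicit resolvers, swapping in the stricter int pattern.
NoSexagesimalLoader.yaml_implicit_resolvers = {
    first_char: [
        (tag, PLAIN_INT if tag == INT_TAG else regexp) for tag, regexp in resolvers
    ]
    for first_char, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
}

DOC = """\
Base:
    StartTime: 645:0
    EndTimes: 645:023
    MidTimes: 645:02
"""

data = yaml.load(DOC, Loader=NoSexagesimalLoader)
for key, value in data["Base"].items():
    print(key, repr(value))
```

The float pattern in PyYAML also has a sexagesimal form (with a decimal point), which this sketch leaves untouched.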