Python: loading a YAML file changes values (numbers?) if they contain a colon and fewer than 3 digits after the colon
Question:
A simple example.yml file:
Base:
  StartTime: 645:0
  EndTimes: 645:023
  MidTimes: 645:02
  mac: 99:19:b9:fa:37:99
  MissionStartTimestamp: -2037:14522
  MissionEndTimestamp: -2037:14522
When it is loaded into Python:
import yaml
with open("example.yml", 'r') as file:
    example_ = yaml.safe_load(file)
print(yaml.dump(example_, default_flow_style=False))
the result is:
Base:
  EndTimes: 645:023
  MidTimes: 38702
  MissionEndTimestamp: -2037:14522
  MissionStartTimestamp: -2037:14522
  StartTime: 38700
  mac: 99:19:b9:fa:37:99
For whatever reason, any "number" value with a single colon followed by 2 or fewer trailing digits gets converted to another "number".
I also tried:
import yaml
with open("example.yml", 'r') as file:
    example_ = yaml.load(file, Loader=yaml.CLoader)
print(yaml.dump(example_, default_flow_style=False))
This gives the same results, as do Loader=yaml.CSafeLoader, CFullLoader, and CUnsafeLoader.
The remaining loader, CBaseLoader, gives different results: it turns the affected values into single-quoted strings:
Base:
  EndTimes: 645:023
  MidTimes: '645:02'
  MissionEndTimestamp: -2037:14522
  MissionStartTimestamp: -2037:14522
  StartTime: '645:0'
  mac: 99:19:b9:fa:37:99
CBaseLoader looks like the best option, but the added single quotes aren't great: I would now need another step to strip them. Is there any way around this, so that these values load the same way as the other values?
UPDATE#1
Based on @ubaumann's answer, I've added this follow-up.
Install ruamel.yaml with conda install -c conda-forge ruamel.yaml or pip install ruamel.yaml.
Change the imports at the top of the file:
import sys
from ruamel.yaml import YAML
yaml = YAML(typ="rt")
and the open/dump calls:
with open("example.yml", 'r') as file:
    example_ = yaml.load(file)
yaml.dump(example_, sys.stdout)
Result:
Base:
  StartTime: 645:0000
  EndTimes: 645:023
  MidTimes: 645:02
  mac: 99:19:b9:fa:37:99
  MissionStartTimestamp: -2037:14522
  MissionEndTimestamp: -2037:14522
If you change the line yaml = YAML(typ="rt") to yaml = YAML(typ="safe"), all of the values are dumped as quoted strings:
Base: {EndTimes: '645:023', MidTimes: '645:02', MissionEndTimestamp: '-2037:14522',
MissionStartTimestamp: '-2037:14522', StartTime: '645:0000', mac: '99:19:b9:fa:37:99'}
Answers:
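YAML integers can be written in several formats, and a value that uses : is interpreted as sexagesimal (base 60), so 645:02 is read as 645 × 60 + 2 = 38702. Each group after a colon must be a number from 0 to 59, which is why 645:023 and -2037:14522 are left alone as strings. From https://yaml.org/type/int.html:
"Using ":" allows expressing integers in base 60, which is convenient for time and angle values."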
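To make the base-60 rule concrete, here is a small stand-alone sketch that does not use any YAML library. The regular expression is the sexagesimal branch of the integer pattern given on the yaml.org page above; the function name sexagesimal_to_int is made up for this illustration and is not part of PyYAML.

```python
import re

# Base-60 branch of the YAML 1.1 integer pattern (https://yaml.org/type/int.html).
# Every group after a colon must be one or two digits in the range 0-59.
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$")


def sexagesimal_to_int(text):
    """Return the base-60 value of text, or None if it does not match the pattern.

    Hypothetical helper for illustration only.
    """
    if not SEXAGESIMAL_INT.match(text):
        return None
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-").replace("_", "")
    value = 0
    for part in digits.split(":"):
        value = value * 60 + int(part)
    return sign * value


for sample in ["645:0", "645:02", "645:023", "-2037:14522", "99:19:b9:fa:37:99"]:
    print(sample, "->", sexagesimal_to_int(sample))
```

Only 645:0 and 645:02 match the pattern, giving 38700 and 38702 as in the question; the other three values do not match, so a YAML 1.1 loader keeps them as strings.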
PyYAML parses a subset of YAML 1.1, and that specification includes sexagesimal numbers, essentially for processing values with minutes and seconds (such as times and arcs). Since this led to a lot of confusion, it was quickly dropped from the YAML 1.2 specification, but PyYAML has never been upgraded to it since that spec came out in 2009.
You can switch to a YAML 1.2 parser such as my ruamel.yaml and get the result you expect.
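If you have to stay on PyYAML, one possible workaround is a loader subclass whose integer resolver lacks the base-60 branch. This is only a sketch, not something from the PyYAML documentation: the class name NoSexagesimalLoader is made up, the pattern is PyYAML's stock integer pattern with the colon alternative removed, and it relies on the yaml_implicit_resolvers table that PyYAML loaders inherit. Sexagesimal floats such as 1:30.5 are not handled here.

```python
import re

import yaml

INT_TAG = "tag:yaml.org,2002:int"


class NoSexagesimalLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader variant that never reads 'a:b' as a base-60 int."""

    # Copy the inherited resolver table, leaving out the stock int resolver,
    # so that yaml.SafeLoader itself is not modified.
    yaml_implicit_resolvers = {
        first_char: [(tag, regexp) for tag, regexp in resolvers if tag != INT_TAG]
        for first_char, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
    }


# Re-register integers: binary, octal, decimal and hex, but no sexagesimal branch.
NoSexagesimalLoader.add_implicit_resolver(
    INT_TAG,
    re.compile(
        r"""^(?:[-+]?0b[0-1_]+
            |[-+]?0[0-7_]+
            |[-+]?(?:0|[1-9][0-9_]*)
            |[-+]?0x[0-9a-fA-F_]+)$""",
        re.X,
    ),
    list("-+0123456789"),
)

doc = """
Base:
  StartTime: 645:0
  MidTimes: 645:02
  Count: 42
"""

data = yaml.load(doc, Loader=NoSexagesimalLoader)
print(data)
```

With this loader 645:0 and 645:02 should come back as plain strings while ordinary integers such as 42 are still converted. The simpler alternative, if you control the file, is to quote the values in example.yml, for example StartTime: "645:0".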