Brightway2 – Unlinked and missing CFs when importing SimaPro LCIA methods

Question:

I am importing the LCIA methods of the Swiss building database UVEK into Brightway with SimaProLCIACSVImporter().

Code:

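from bw2data import Database
from bw2io import SimaProLCIACSVImporter

# Raw string, so the backslashes in the Windows path are not read as escape sequences
lcia = SimaProLCIACSVImporter(
    r"C:\Users\...\UVEK_Simapro_LCIA_2022.CSV",
    biosphere="biosphere3"
)
lcia.apply_strategies()
lcia.statistics()
print("size biosphere3: {0}".format(str(len(Database("biosphere3")))))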

Results:

Extracted 34 methods in 0.91 seconds
Applying strategy: normalize_units
Applying strategy: set_biosphere_type
Applying strategy: normalize_simapro_biosphere_categories
Applying strategy: normalize_simapro_biosphere_names
Applying strategy: set_biosphere_type
Applying strategy: drop_unspecified_subcategories
Applying strategy: normalize_biosphere_categories
Applying strategy: normalize_biosphere_names
Applying strategy: link_iterable_by_fields
Applying strategy: match_subcategories
Applied 10 strategies in 0.87 seconds
34 methods
18229 cfs
14312 unlinked cfs

size biosphere3: 4427

I then use add_missing_cfs() with the idea of adding the missing flows to the biosphere3 database (so that I can easily import the LCI datasets built on those flows).

Code:

lcia.add_missing_cfs()
lcia.statistics()
print("size biosphere3: {0}".format(str(len(Database("biosphere3")))))

Results:

Vacuuming database 
Writing activities to SQLite3 database:
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01
Title: Writing activities to SQLite3 database:
  Started: 06/13/2022 12:23:24
  Finished: 06/13/2022 12:23:25
  Total time elapsed: 00:00:01
  CPU %: 79.20
  Memory %: 1.85
Added 7156 new biosphere flows
34 methods
18229 cfs
14312 unlinked cfs

size biosphere3: 11583

The results show that the number of unlinked cfs is unchanged (~14000). New flows have been added to the database (~7000), but their number does not equal the number of unlinked cfs. Maybe I misunderstood unlinked flows and missing cfs…

Questions:

What is the relation between biosphere flows, unlinked cfs, and the missing cfs that have been added to the biosphere db?

What is the best way to "complete" the biosphere3 db with the missing flows defined in the imported LCIA methods, so that all the cfs are linked?

Asked By: Mija Frossard


Answers:

An excellent question, but unfortunately not one with an easy answer. This is something I am looking into, but it will take some time, as it needs to be done correctly.

You probably already know this, but in case you don’t – what is matching? We need to link the text attributes which identify a product, flow, or activity, with an object in our relational database. In theory, these attributes should match, and our job is easy. It becomes harder when people use inconsistent or incorrect attributes for what are supposed to be the same objects.
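
As a rough illustration of what matching means, here is a minimal sketch that links characterization factors to biosphere flows by comparing a few text attributes. The toy data, the helpers make_key and link_by_fields, and the choice of fields are all made up for this example; this is not Brightway's actual implementation.

```python
# Toy illustration of matching by text attributes (hypothetical helpers,
# not the code bw2io really uses).

def make_key(obj, fields):
    """Build a case-insensitive lookup key from the chosen text attributes."""
    return tuple(str(obj.get(field, "")).lower() for field in fields)


def link_by_fields(cfs, flows, fields):
    """Add an "input" to every CF whose key matches the key of a flow."""
    index = {
        make_key(flow, fields): (flow["database"], flow["code"])
        for flow in flows
    }
    for cf in cfs:
        key = make_key(cf, fields)
        if key in index:
            cf["input"] = index[key]
    return cfs


# One flow in a made-up biosphere database.
flows = [
    {"database": "biosphere3", "code": "abc", "name": "Carbon dioxide, fossil",
     "categories": ("air",), "unit": "kilogram"},
]

# Two CFs: the first uses the same attributes, the second a different spelling.
cfs = [
    {"name": "Carbon dioxide, fossil", "categories": ("air",),
     "unit": "kilogram", "amount": 1.0},
    {"name": "Carbon dioxide, fossil origin", "categories": ("air",),
     "unit": "kilogram", "amount": 1.0},
]

link_by_fields(cfs, flows, fields=("name", "categories", "unit"))
unlinked = [cf for cf in cfs if "input" not in cf]
print(len(cfs), "cfs,", len(unlinked), "unlinked")
```

The second CF stays unlinked only because its name is spelled differently, which is the kind of inconsistency that makes linking hard in practice.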

The various players in the LCA world do their best to make their data and software easy to use, but in doing so they sometimes change things like names, location identifiers, etc. Moreover, they start from different lists of names.

The default data in Brightway (what gets installed when you call bw2io.bw2setup()) is from ecoinvent version 3.8. This isn’t "correct", it is just a default. The database biosphere3 is from ecoinvent version 3. But this isn’t the same as UVEK, which is based on ecoinvent version 2.

The UVEK database is self-contained and internally consistent, and its LCIA method characterization factors should match the flow names of the UVEK database itself (at least as long as they come from the same source, e.g. SimaPro CSV export). So the best way to use this LCI/LCIA in Brightway would be to use these data in their own set of Brightway databases.

There will be a project to natively implement UVEK and its LCIA factors in Brightway, but this will only happen by the end of July (at the earliest).

Answered By: Chris Mutel

I managed to overcome this problem, but I am not sure whether this is the 100% correct way. Nevertheless, I thought I would share my solution. Maybe it also helps @Chris Mutel with a possible bug fix…

I had the same problem as discussed in this issue. Some background info: in my case, I created an empty biosphere3 database, then imported my specific LCIA method using SimaProLCIACSVImporter, which resulted in 12’621 unlinked characterization factors. I then called add_missing_cfs(), which successfully added the missing biosphere flows to the biosphere3 database. (Note: I checked this by exporting all the biosphere3 flows to an Excel table, which had 12’621 rows. I therefore assume the addition worked.)

However, even after applying the strategies again (apply_strategies()), I still got 12’200 unlinked characterization factors, even though the biosphere flows were now present in the biosphere3 database. Looking at the added biosphere flows, I saw that the CAS number had not been imported by SimaProLCIACSVImporter, although the information was there in the raw CSV file. I therefore suspected that the problem lay in how Brightway attempts to link the data.

The solution (at least in my case) was to tell Brightway to link solely on the code parameter. And it worked! I’m not sure what standard linking procedure Brightway defines. I assume that a combination of the parameters name, CAS_number, categories and unit is used. If that is true, it would make sense that linking fails, because some information such as CAS_number is currently not available, since it is not imported by SimaProLCIACSVImporter.
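
To make this reasoning concrete, here is a small sketch with made-up data. It assumes that each CF carries the same code as the flow that was created for it, and that one of the compared attributes (here a CAS number) is present on only one side. The helper count_unlinked and the field names are hypothetical and only mimic the idea of field-based linking; they are not Brightway's own code.

```python
# Toy comparison of two linking choices (hypothetical helper and data).

def count_unlinked(cfs, flows, fields):
    """Count the CFs whose values for `fields` match no flow."""
    def key(obj):
        # Missing attributes become empty strings; comparison ignores case.
        return tuple(str(obj.get(field, "")).lower() for field in fields)

    known = {key(flow) for flow in flows}
    return sum(1 for cf in cfs if key(cf) not in known)


# Assumption: the CF still has its CAS number, the stored flow does not.
cfs = [{"code": "f1", "name": "Methane", "CAS number": "74-82-8", "unit": "kg"}]
flows = [{"code": "f1", "name": "Methane", "unit": "kg"}]

by_attributes = count_unlinked(cfs, flows, ("name", "CAS number", "unit"))
by_code = count_unlinked(cfs, flows, ("code",))
print("unlinked when comparing attributes:", by_attributes)
print("unlinked when comparing code only:", by_code)
```

Under these assumptions a single differing attribute is enough to break the match, while comparing only the code sidesteps the problem.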

So, how did I tell Brightway to link solely on the code parameter? I did the following:

Import your LCIA method as you would normally:

import brightway2 as bw

my_method = bw.SimaProLCIACSVImporter('SimaPro_LCIA_file.csv', biosphere="biosphere3")
my_method.apply_strategies()
my_method.statistics()

After that, add the missing biosphere flows and apply the strategies again:

my_method.add_missing_cfs()
my_method.apply_strategies()

Linking on the code parameter can be done this way:

import functools
from bw2io.strategies.generic import link_iterable_by_fields

# Re-link the cfs to the biosphere3 flows, comparing only the "code" field
my_method.apply_strategy(functools.partial(
    link_iterable_by_fields,
    other=(obj for obj in bw.Database("biosphere3")),
    kind="biosphere",
    fields=["code"],
))
my_method.statistics()  # check how many cfs are still unlinked
my_method.write()

Hope that helps!

Answered By: Cédric Furrer