Emissions#1754

Open
louispt1 wants to merge 42 commits into master from emissions

Conversation

@louispt1

@louispt1 louispt1 commented May 21, 2026

Member

Summary of changes

Emissions Calculation Framework
  • New DirectEmissions module implementing mass-balance equations for fossil and biogenic CO2
  • Tracks carbon flows: input content (A) + utilization (B) - output content (C) - capture (D) = emissions (E)
  • New MoleculeEmissions module for emissions reporting on molecule nodes
  • Support for CO2 capture (via ccs_capture_rate) and utilization (via co2_utilisation_per_mj)
Carbon Content Tracing
  • RecursiveFactor::DirectEmissions module for tracing CO2 content through supply chains
  • Handles mixed carriers (network gas, crude oil) by recursively calculating weighted composition
  • New emissions_skip_crude_oil_mix edge group for forcing weighted mix calculations
GQL & Dataset Integration
  • New EMISSIONS() GQL function for accessing emissions data from datasets
  • Emissions data loading infrastructure in Dataset::Import and Etsource::Loader
  • Graph-level emissions hash storage
Supporting Infrastructure
  • emissions node group for nodes participating in emissions tracking
  • ccus_captured node group for CO2 removal/capture technologies
    • Uses lazy memoization for ccus_captured? check to avoid Marshal serialization issues
  • Updated biogenic emissions (primary) to use free_co2_factor for capture calculations with consistent results
Other
  • Reporting methods return nil for non-emissions nodes (checked via with_emissions_node)
  • Atlas gem updated to support emissions data structures
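
The mass balance listed under "Emissions Calculation Framework" can be sketched as follows. This is a minimal illustration only: the `CarbonBalance` struct and its field names are hypothetical and not taken from the DirectEmissions module, and the numbers are made up.

```ruby
# Hypothetical, simplified mass balance for a single node (all values in kg CO2).
# The struct and its field names are assumptions, not the engine's actual API.
CarbonBalance = Struct.new(
  :input_content,   # A: CO2 content of the input carriers
  :utilisation,     # B: CO2 utilised at the node
  :output_content,  # C: CO2 content of the output carriers
  :capture,         # D: CO2 captured
  keyword_init: true
) do
  # E = A + B - C - D, as stated in the summary
  def emissions
    input_content + utilisation - output_content - capture
  end
end

balance = CarbonBalance.new(
  input_content: 100.0,
  utilisation: 0.0,
  output_content: 20.0,
  capture: 50.0
)

puts balance.emissions # => 30.0
```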

Note: Atlas reference should be updated once the Atlas emissions branch has been merged.
Goes with:

@louispt1 louispt1 requested a review from noracato May 21, 2026 18:50
@kndehaan

Member

Note that direct_co2_output_content_carriers_biogenic doesn't give the correct results yet, also leading to incorrect results for methods that depend on this one.

This still has to do with the potential_co2_conversion_per_mj attribute missing on carriers: as a consequence, the calculation falls back to recursion in cases where it should not. This is something that we still need to look into.
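
To illustrate why a missing attribute matters, here is a minimal sketch of a recursive weighted-mix calculation. The `Carrier` struct, the `mix` field, and all numbers are hypothetical; only the attribute name potential_co2_conversion_per_mj comes from this thread, and its exact semantics in the engine are assumed here.

```ruby
# Hypothetical carrier model: a carrier either has a known factor
# (kg CO2 per MJ) or is described as a mix of other carriers with shares
# that sum to 1. This structure is an assumption for illustration only.
Carrier = Struct.new(:key, :potential_co2_conversion_per_mj, :mix, keyword_init: true)

# Returns the CO2 content per MJ. Recursion into the mix happens only when
# the carrier itself has no factor, so a missing attribute silently changes
# which branch is taken.
def co2_content_per_mj(carrier)
  factor = carrier.potential_co2_conversion_per_mj
  return factor unless factor.nil?

  (carrier.mix || {}).sum(0.0) do |input, share|
    share * co2_content_per_mj(input)
  end
end

# Illustrative values, not real emission factors.
natural_gas = Carrier.new(key: :natural_gas, potential_co2_conversion_per_mj: 0.06)
green_gas   = Carrier.new(key: :green_gas, potential_co2_conversion_per_mj: 0.0)
network_gas = Carrier.new(key: :network_gas, mix: { natural_gas => 0.5, green_gas => 0.5 })

puts co2_content_per_mj(network_gas)
```

In this sketch, a carrier that should have its own factor but lacks the attribute would be treated like network_gas and resolved through its inputs instead.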

Member Author

@kndehaan I have pushed a fix for the biogenic output content carriers now

@louispt1 louispt1 marked this pull request as ready for review May 27, 2026 14:59

@noracato noracato left a comment

Member

I'm worried that the UPDATE part does not have enough validation.

Can you please add tests that show updating nonexistent keys will not work? I have a feeling we will need the DatasetAttributes module after all (it is still on the emissions-gql branch)

Comment thread app/models/qernel/graph.rb Outdated
# See Qernel::Dataset#assign_dataset_attributes to understand what's going on:
call_on_each_qernel_object(:assign_dataset_attributes)
# Manually assign emissions hash (not a DatasetAttributes object)
@emissions = dataset&.data&.[](:emissions) || {}

Member

Can you explain once more why not to use DatasetAttributes? Like this there is no validation on what users can set as an emissions attribute.

Member Author

Good point, I was not really looking at UPDATE yet, so I thought to keep it as light as possible for read.
However, you are right that it's cleaner to just use the DatasetAttributes approach for consistency and validation.

I reinstated the DatasetAttributes approach and added more tests to cover cases such as nonexistent keys. I expect more specs will be needed based on the changes for 1990, but I think it's best to handle that separately.
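
For readers following along, the behaviour asked for (updating a nonexistent key must fail) could look roughly like the sketch below. `ValidatedEmissions` and `UnknownKeyError` are hypothetical names; the real DatasetAttributes module is not shown in this thread and may work differently.

```ruby
# Hypothetical stand-in for a validated attribute store.
# It is not the DatasetAttributes module; it only illustrates the rule
# that UPDATE may change existing keys but never create new ones.
class ValidatedEmissions
  class UnknownKeyError < KeyError; end

  def initialize(data)
    # Normalise keys once at load time so lookups only deal with symbols.
    @data = data.transform_keys(&:to_sym)
  end

  def [](key)
    @data[key.to_sym]
  end

  # Only keys already present in the dataset may be updated.
  def []=(key, value)
    key = key.to_sym
    raise UnknownKeyError, "unknown emissions key: #{key}" unless @data.key?(key)

    @data[key] = value
  end
end

emissions = ValidatedEmissions.new('agriculture_other_ghg' => 17_888.11512)
emissions[:agriculture_other_ghg] = 18_000.0 # accepted: key exists

begin
  emissions[:not_in_dataset] = 1.0
rescue ValidatedEmissions::UnknownKeyError => e
  puts e.message
end
```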

Comment thread spec/models/gql/runtime/functions/update_spec.rb
@kndehaan

kndehaan commented Jun 1, 2026

Member

Adding myself as reviewer to check the descriptions for the methods (it should be understandable and correct from a modeller's perspective as well).

@kndehaan kndehaan left a comment

Member

I put up some text suggestions and placed one comment about removing a method. Let me know if you have questions @louispt1

Comment thread app/models/qernel/node_api/direct_emissions.rb Outdated
Comment thread app/models/qernel/node_api/direct_emissions.rb
Comment thread app/models/qernel/node_api/direct_emissions.rb Outdated
Comment thread app/models/qernel/node_api/direct_emissions.rb Outdated
Comment thread app/models/qernel/node_api/direct_emissions.rb Outdated
Comment thread app/models/qernel/node_api/direct_emissions.rb Outdated
Comment thread app/models/qernel/node_api/direct_emissions.rb Outdated
Comment on lines +125 to +135
# Total CO2 utilised (consumed as feedstock) at this node.
#
# Currently returns only fossil utilisation, as biogenic utilisation is always 0.
#
# @return [Float, nil] Total CO2 utilised in kg, or nil if node is not in emissions group
def direct_co2_input_utilisation
  with_emissions_node do
    direct_co2_input_utilisation_fossil
    # Potentially in the future: + direct_co2_input_utilisation_biogenic (currently 0)
  end
end

Member

I think we decided to remove this method for now, as we don't currently use it, right? @louispt1

Member Author

Yes this method snuck back in with the csv serialiser merge - I will remove it

Comment thread app/models/qernel/emissions.rb Outdated
Comment on lines +26 to +27
VALID_GHG_TYPES = %w[co2 other_ghg n2o ch4 hfc pfc sf6 nf3].freeze
VALID_GHG_PATTERN = /^(#{VALID_GHG_TYPES.join('|')})(_\d{4})?$/.freeze

Member

Sorry to comment again, but let's not use constants with carriers and patterns. These keys have nothing to do with engine validations, they are part of the dataset pipeline IMO - or at least ETSource. The engine should not carry this knowledge.

@louispt1 louispt1 Jun 2, 2026

Member Author

No need to apologise, you're spot on. I pushed a change removing the validation.
Edit:
However now I need to figure out how to handle validation a little smarter - still working on it :)

Member Author

@noracato what do you think about this approach?

@noracato noracato left a comment

Member

Sorry to keep nitpicking. There is still logic in the module that I feel overcomplicates it.

Comment thread app/models/qernel/emissions.rb Outdated
Comment on lines +70 to +77
# For setters, check if the emission key exists in the dataset
if method_name.to_s.end_with?('=')
  data_key = scoped_method(method_name.to_s.sub(/=$/, ''))
  @emissions.dataset_has_key?(data_key)
else
  # Getters always respond (may return nil if key doesn't exist)
  true
end

Member

Suggested change
- # For setters, check if the emission key exists in the dataset
- if method_name.to_s.end_with?('=')
-   data_key = scoped_method(method_name.to_s.sub(/=$/, ''))
-   @emissions.dataset_has_key?(data_key)
- else
-   # Getters always respond (may return nil if key doesn't exist)
-   true
- end
+ # Getters always respond (may return nil if key doesn't exist)
+ return true unless method_name.to_s.end_with?('=')
+
+ @emissions.dataset_has_key?(
+   scoped_method(method_name.to_s.sub(/=$/, ''))
+ )

General approach is good. This is a bit pythonesque ;)

Member

Actually, what was wrong with:

def respond_to_missing?(method_name, include_private = false)
  data_key = scoped_method(method_name).split('=').first

  @emissions.respond_to?(data_key) || super
end
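
For context, the simpler variant above pairs respond_to_missing? with method_missing so that getters and setters both respond only when the key exists. A self-contained sketch of that pattern follows; `EmissionsProxy` and its flat key scheme are hypothetical and leave out the scoped_method logic from the PR.

```ruby
# Hypothetical wrapper around an emissions hash. The class name and key
# scheme are invented for illustration and are not the PR's implementation.
class EmissionsProxy
  def initialize(data)
    @data = data
  end

  # Dispatches `proxy.co2` and `proxy.co2 = value` to the underlying hash,
  # but only for keys that already exist.
  def method_missing(name, *args)
    key = data_key(name)
    return super unless @data.key?(key)

    if name.to_s.end_with?('=')
      @data[key] = args.first
    else
      @data[key]
    end
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @data.key?(data_key(name)) || super
  end

  private

  # Strips a trailing "=" so getters and setters share one key.
  def data_key(name)
    name.to_s.chomp('=').to_sym
  end
end

proxy = EmissionsProxy.new(co2: 1.0)
proxy.co2 = 2.5
puts proxy.co2                    # => 2.5
puts proxy.respond_to?(:unknown=) # => false
```

Defining both methods together means respond_to? never claims support for a setter that method_missing would reject.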

Comment thread app/models/qernel/emissions.rb Outdated
Comment on lines +102 to +103
# Check both string and symbol keys since datasets may use either
dataset_attributes.key?(key.to_s) || dataset_attributes.key?(key.to_sym)

Member

I don't feel this should be handled here?

@kndehaan kndehaan left a comment

Member

I found a critical bug that needs to be fixed. It concerns using the EMISSIONS() function for setting demand on dataset value molecule nodes. It works for the nl datasets (ETDataset datasets), but I think it does not work as it should for ETLocal datasets.

Example: in a blank nl2023 scenario, querying the three things below gives the expected, desired other GHG emissions (never mind the differences in decimals):

EACH(
  EMISSIONS(agriculture_non_specified, non_energetic, other_ghg),
  MV(agriculture_non_specified_non_energetic_other_ghg, demand),
  MV(agriculture_non_specified_non_energetic_other_ghg, direct_reporting_emissions_other_ghg_emissions)
)

[
  17,888.11512,
  17,888,115,120.0,
  17,888,115,120.0,
]

This is, however, not the case for a blank NO_norway scenario, for example (and for other countries as well):

EACH(
  EMISSIONS(agriculture_non_specified, non_energetic, other_ghg),
  MV(agriculture_non_specified_non_energetic_other_ghg, demand),
  MV(agriculture_non_specified_non_energetic_other_ghg, direct_reporting_emissions_other_ghg_emissions)
)



[
  4,516.46047,
  0.0,
  0.0,
]

So I think something goes wrong when reading ~ demand on the molecule nodes for an ETLocal dataset (derived dataset).

I know that for energy nodes, the derived dataset first looks if graph_methods is specified. If not, it falls back to what is defined with ~, whereas full datasets directly look at ~ and don't look at graph_methods.

Might it be that, for the molecule nodes, something has to be configured so that derived datasets also look at values specified with ~?

@noracato

Member

I'll have a look. Gut feeling is that you are right. There might also be a hidden scaling factor involved.

@kndehaan

Member

Additionally, if I set a hard-coded value on the molecule node like ~ demand = 1000.0, querying the demand for this node for Norway returns zero, whereas it does return the hard-coded value for nl2023. This confirms that setting demand with the ~ method does not work correctly for derived datasets.

@noracato

Member

I found the culprit.

When calculating the initial graph, there is an extra thing applied for Derived datasets (ETLocal datasets). It is called ZeroMoleculeNodes. And it just sets the demand for all molecule nodes to zero.

Here you can see a comment saying "Derived datasets have no molecule flows."

I will try to find out why this was set up like this, and whether we can lift it or create a special case for "floating" nodes.
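
A rough sketch of the behaviour described here, and of what a special case might look like, is given below. Everything in it is an assumption: the `Node` struct, the `floating` flag, and the `keep_floating` option are invented for illustration, and the actual ZeroMoleculeNodes code may be structured quite differently.

```ruby
# Hypothetical node model. The `floating` flag is an invented marker for
# nodes whose demand comes directly from the dataset (e.g. via ~ demand).
Node = Struct.new(:key, :graph, :demand, :floating, keyword_init: true)

# Mimics the described step for derived datasets: every molecule node gets
# a demand of zero. With keep_floating: true, floating nodes are skipped.
def zero_molecule_nodes(nodes, keep_floating: false)
  nodes.each do |node|
    next unless node.graph == :molecules
    next if keep_floating && node.floating

    node.demand = 0.0
  end
end

def sample_nodes
  [
    Node.new(key: :other_ghg_node, graph: :molecules, demand: 1000.0, floating: true),
    Node.new(key: :co2_flow_node, graph: :molecules, demand: 50.0, floating: false),
    Node.new(key: :energy_node, graph: :energy, demand: 10.0, floating: false)
  ]
end

current  = zero_molecule_nodes(sample_nodes)
proposed = zero_molecule_nodes(sample_nodes, keep_floating: true)

puts current.map(&:demand).inspect  # => [0.0, 0.0, 10.0]
puts proposed.map(&:demand).inspect # => [1000.0, 0.0, 10.0]
```

The first result matches the Norway symptom above, where a hard-coded ~ demand = 1000.0 is read back as zero.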

@mabijkerk

Member

In the end, working with the emission methods as they are defined now is not the most intuitive: it is a bit too elaborate in practice. How did you experience this @kndehaan? Not saying this is a must have, but still good to flag.

More straightforward might be:

| Current method label | Proposed method label |
|---|---|
| direct_co2_input_content_carriers_fossil | direct_co2_content_input_fossil |
| direct_co2_input_utilisation_fossil | direct_co2_utilisation_fossil |
| direct_co2_output_content_carriers_fossil | direct_co2_content_output_fossil |
| n.a. | direct_co2_production_fossil |
| direct_co2_output_production_capture_fossil | direct_co2_capture_fossil |
| direct_co2_output_production_emissions_fossil | direct_co2_emissions_fossil |

Also, a direct_reporting_emissions_co2_emissions method would be useful. It is not the most intuitive that I have to use the direct_reporting_emissions_total_ghg_emissions method to get the complete picture of a node.

@kndehaan

Member

@mabijkerk let's review your comment in the next increment. I agree that the naming of the methods has not been most intuitive. Also, being able to query the direct reporting CO2 emissions of a node would be useful.

@kndehaan kndehaan dismissed their stale review June 19, 2026 06:58

Dismissing the requested changes as the required solution has been implemented via Atlas, Transformer, ETSource.

louispt1 and others added 29 commits June 19, 2026 10:57
…co2_output_production_emissions_fossil and spec
…thod as direct_co2_input_utilisation_fossil is sufficient
…_utilised and emissions_lulucf_removals checks from spec
* Expand ConfiguredCSVSerializer with node group expansion, molecule support, and emissions CSV endpoints

* Add ghg_carrier method to DirectEmissions/MoleculeEmissions and update test fixture of the direct_emissions_csv

* Add ghg_carrier to MoleculeEmissions specs and simplify DirectEmissions specs to use faster mock-based approach

* Return symbols from ghg_carrier instead of strings
Co-authored-by: kndehaan <102598197+kndehaan@users.noreply.github.com>
* Add a validation lib spec for node values per dataset

* Add dev and test modes for graph data validation
5 participants