
gcages.cmip7_scenariomip.gridding_emissions#

Handling of gridding emissions

Classes:

Name Description
SpatialResolutionOption

Spatial resolution option

Functions:

Name Description
get_complete_gridding_index

Get the index of complete gridding data

to_global_workflow_emissions

Convert gridding emissions to global workflow emissions

to_global_workflow_emissions_from_stacked

Convert pre-stacked gridding emissions to global workflow emissions

Attributes:

Name Type Description
CO2_BIOSPHERE_SECTORS_GRIDDING tuple[str, ...]

Sectors that come from biospheric CO2 reservoirs (gridding naming convention)

CO2_FOSSIL_SECTORS_GRIDDING tuple[str, ...]

Sectors that come from or go to fossil CO2 reservoirs (gridding naming convention)

COMPLETE_GRIDDING_SECTORS_CDR tuple[str, ...]

Complete set of CDR sectors for gridding

COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR tuple[str, ...]

Complete set of sectors for gridding excluding CDR sectors

COMPLETE_GRIDDING_SPECIES tuple[str, ...]

Complete set of species for gridding

CO2_BIOSPHERE_SECTORS_GRIDDING module-attribute #

CO2_BIOSPHERE_SECTORS_GRIDDING: tuple[str, ...] = (
    "Agriculture",
    "Agricultural Waste Burning",
    "Forest Burning",
    "Grassland Burning",
    "Peat Burning",
)

Sectors that come from biospheric CO2 reservoirs (gridding naming convention)

Not a perfect split with CO2_FOSSIL_SECTORS_GRIDDING, but the best we can do.

CO2_FOSSIL_SECTORS_GRIDDING module-attribute #

CO2_FOSSIL_SECTORS_GRIDDING: tuple[str, ...] = (
    "Aircraft",
    "BECCS",
    "International Shipping",
    "Energy Sector",
    "Industrial Sector",
    "Other CDR",
    "Enhanced Weathering",
    "Direct Air Capture",
    "Ocean",
    "Biochar",
    "Soil Carbon Management",
    "Residential Commercial Other",
    "Solvents Production and Application",
    "Transportation Sector",
    "Waste",
)

Sectors that come from or go to fossil CO2 reservoirs (gridding naming convention)

BECCS is here because the carbon is stored permanently (or assumed to be). The biomass is grown and then removed from the land pool, so BECCS is 'net zero' from the land pool's point of view. Handling this really well would require running a carbon cycle model to determine the possible uptake from the BECCS land use, which isn't how the split between modelling domains works at the moment.

The same issue applies to some non-land CDR, e.g. ocean alkalinity approaches. Again, handling sophisticated enough to capture this properly is beyond the scope of the fossil/biosphere split we're making here.

Not a perfect split with CO2_BIOSPHERE_SECTORS_GRIDDING, but the best we can do.
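As a quick consistency check on the two CO2 tuples, the sketch below copies the constants as listed on this page and confirms that every gridding sector is assigned to exactly one of the fossil or biosphere groups. The variable names `overlap` and `unclassified` are illustrative only and are not part of gcages.

```python
# Constants copied from the values listed on this page,
# so the check runs without gcages installed.
CO2_BIOSPHERE_SECTORS_GRIDDING = (
    "Agriculture",
    "Agricultural Waste Burning",
    "Forest Burning",
    "Grassland Burning",
    "Peat Burning",
)
CO2_FOSSIL_SECTORS_GRIDDING = (
    "Aircraft",
    "BECCS",
    "International Shipping",
    "Energy Sector",
    "Industrial Sector",
    "Other CDR",
    "Enhanced Weathering",
    "Direct Air Capture",
    "Ocean",
    "Biochar",
    "Soil Carbon Management",
    "Residential Commercial Other",
    "Solvents Production and Application",
    "Transportation Sector",
    "Waste",
)
COMPLETE_GRIDDING_SECTORS_CDR = (
    "BECCS",
    "Enhanced Weathering",
    "Direct Air Capture",
    "Ocean",
    "Biochar",
    "Soil Carbon Management",
)
COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR = (
    "Agricultural Waste Burning",
    "Agriculture",
    "Aircraft",
    "Energy Sector",
    "Forest Burning",
    "Grassland Burning",
    "Industrial Sector",
    "International Shipping",
    "Peat Burning",
    "Residential Commercial Other",
    "Solvents Production and Application",
    "Transportation Sector",
    "Waste",
    "Other CDR",
)

fossil = set(CO2_FOSSIL_SECTORS_GRIDDING)
biosphere = set(CO2_BIOSPHERE_SECTORS_GRIDDING)
all_sectors = set(COMPLETE_GRIDDING_SECTORS_CDR) | set(
    COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR
)

# Sectors claimed by both groups (would be double counted)
overlap = fossil & biosphere
# Sectors claimed by neither group (would be rejected downstream)
unclassified = all_sectors - (fossil | biosphere)

print(sorted(overlap), sorted(unclassified))
```

Both sets come out empty for the values listed here, so the split is complete and non-overlapping even though, as noted above, the physical attribution is only approximate.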

COMPLETE_GRIDDING_SECTORS_CDR module-attribute #

COMPLETE_GRIDDING_SECTORS_CDR: tuple[str, ...] = (
    "BECCS",
    "Enhanced Weathering",
    "Direct Air Capture",
    "Ocean",
    "Biochar",
    "Soil Carbon Management",
)

Complete set of CDR sectors for gridding

COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR module-attribute #

COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR: tuple[str, ...] = (
    "Agricultural Waste Burning",
    "Agriculture",
    "Aircraft",
    "Energy Sector",
    "Forest Burning",
    "Grassland Burning",
    "Industrial Sector",
    "International Shipping",
    "Peat Burning",
    "Residential Commercial Other",
    "Solvents Production and Application",
    "Transportation Sector",
    "Waste",
    "Other CDR",
)

Complete set of sectors for gridding excluding CDR sectors

COMPLETE_GRIDDING_SPECIES module-attribute #

COMPLETE_GRIDDING_SPECIES: tuple[str, ...] = (
    "CO2",
    "CH4",
    "N2O",
    "BC",
    "CO",
    "NH3",
    "OC",
    "NOx",
    "Sulfur",
    "VOC",
)

Complete set of species for gridding

SpatialResolutionOption #

Bases: StrEnum

Spatial resolution option

Attributes:

Name Type Description
MODEL_REGION

Data reported at the (IAM) model region level

WORLD

Data reported at the world (i.e. global) level

Source code in src/gcages/cmip7_scenariomip/gridding_emissions.py
class SpatialResolutionOption(StrEnum):
    """Spatial resolution option"""

    WORLD = "world"
    """Data reported at the world (i.e. global) level"""

    MODEL_REGION = "model_region"
    """Data reported at the (IAM) model region level"""

MODEL_REGION class-attribute instance-attribute #

MODEL_REGION = 'model_region'

Data reported at the (IAM) model region level

WORLD class-attribute instance-attribute #

WORLD = 'world'

Data reported at the world (i.e. global) level
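Because the class derives from StrEnum, members can be looked up by value and compared directly with plain strings. The sketch below redefines the enum locally so it runs without gcages; the fallback `StrEnum` stand-in for Python versions before 3.11 is an assumption made for this example and is not taken from the library.

```python
from enum import Enum

try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:
    # Assumption for this sketch only: a minimal stand-in on older Pythons
    class StrEnum(str, Enum):
        """String-valued enum base class"""


class SpatialResolutionOption(StrEnum):
    """Local copy of the enum documented above"""

    WORLD = "world"
    MODEL_REGION = "model_region"


# Look a member up from its string value, e.g. when reading a config file
option = SpatialResolutionOption("model_region")
is_member = option is SpatialResolutionOption.MODEL_REGION

# Members compare equal to their plain string values
equals_plain_string = SpatialResolutionOption.WORLD == "world"
```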

get_complete_gridding_index #

get_complete_gridding_index(
    model_regions: tuple[str, ...],
    world_gridding_sectors: tuple[str, ...] = (
        "Aircraft",
        "International Shipping",
    ),
    world_region: str = "World",
    region_level: str = "region",
    variable_level: str = "variable",
    table: str = "Emissions",
    level_separator: str = "|",
) -> MultiIndex

Get the index of complete gridding data

Parameters:

Name Type Description Default
model_regions tuple[str, ...]

Model regions to use while reaggregating

required
world_gridding_sectors tuple[str, ...]

Sectors that should only be gridded at the world level

('Aircraft', 'International Shipping')
world_region str

The value used when the data represents the sum over all regions

'World'
region_level str

Region level in the data index

'region'
variable_level str

Variable level in the data index

'variable'
table str

Name of the 'table' for emissions

Used to process and create variable names

'Emissions'
level_separator str

Separator between levels in the variable names

'|'

Returns:

Type Description
MultiIndex

Index of complete gridding data

Source code in src/gcages/cmip7_scenariomip/gridding_emissions.py
def get_complete_gridding_index(  # noqa: PLR0913
    model_regions: tuple[str, ...],
    world_gridding_sectors: tuple[str, ...] = (
        "Aircraft",
        "International Shipping",
    ),
    world_region: str = "World",
    region_level: str = "region",
    variable_level: str = "variable",
    table: str = "Emissions",
    level_separator: str = "|",
) -> pd.MultiIndex:
    """
    Get the index of complete gridding data

    Parameters
    ----------
    model_regions
        Model regions to use while reaggregating

    world_gridding_sectors
        Sectors that should only be gridded at the world level

    world_region
        The value used when the data represents the sum over all regions

    region_level
        Region level in the data index

    variable_level
        Variable level in the data index

    table
        Name of the 'table' for emissions

        Used to process and create variable names

    level_separator
        Separator between levels in the variable names

    Returns
    -------
    :
        Index of complete gridding data
    """
    complete_world_variables = [
        level_separator.join([table, species, sectors])
        for species, sectors in itertools.product(
            COMPLETE_GRIDDING_SPECIES, world_gridding_sectors
        )
    ]
    world_required = pd.MultiIndex.from_product(
        [complete_world_variables, [world_region]], names=[variable_level, region_level]
    )

    model_region_sectors_except_cdr = sorted(
        set(COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR) - set(world_gridding_sectors)
    )
    complete_model_region_variables_except_cdr = [
        level_separator.join([table, species, sectors])
        for species, sectors in itertools.product(
            COMPLETE_GRIDDING_SPECIES, model_region_sectors_except_cdr
        )
    ]

    complete_model_region_variables_cdr = [
        level_separator.join([table, "CO2", sectors])
        for sectors in COMPLETE_GRIDDING_SECTORS_CDR
    ]

    model_region_required = pd.MultiIndex.from_product(
        [
            [
                *complete_model_region_variables_except_cdr,
                *complete_model_region_variables_cdr,
            ],
            model_regions,
        ],
        names=[variable_level, region_level],
    )

    res: pd.MultiIndex = world_required.append(model_region_required)  # type: ignore # pandas-stubs out of date

    return res
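To see what the resulting index contains, the enumeration can be sketched without pandas as a list of (variable, region) pairs. This is a simplified stand-in, not the library function: `complete_gridding_pairs` is a hypothetical name, the constants are copied from this page, and the region names are invented for illustration.

```python
import itertools

COMPLETE_GRIDDING_SPECIES = (
    "CO2", "CH4", "N2O", "BC", "CO", "NH3", "OC", "NOx", "Sulfur", "VOC",
)
COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR = (
    "Agricultural Waste Burning", "Agriculture", "Aircraft", "Energy Sector",
    "Forest Burning", "Grassland Burning", "Industrial Sector",
    "International Shipping", "Peat Burning", "Residential Commercial Other",
    "Solvents Production and Application", "Transportation Sector", "Waste",
    "Other CDR",
)
COMPLETE_GRIDDING_SECTORS_CDR = (
    "BECCS", "Enhanced Weathering", "Direct Air Capture", "Ocean", "Biochar",
    "Soil Carbon Management",
)


def complete_gridding_pairs(
    model_regions,
    world_gridding_sectors=("Aircraft", "International Shipping"),
    world_region="World",
    table="Emissions",
    level_separator="|",
):
    """Hypothetical pandas-free sketch: list the (variable, region) pairs"""
    # World-only sectors: every species, reported for the world region only
    world_variables = [
        level_separator.join([table, species, sector])
        for species, sector in itertools.product(
            COMPLETE_GRIDDING_SPECIES, world_gridding_sectors
        )
    ]
    # Remaining non-CDR sectors: every species, for every model region
    regional_sectors = sorted(
        set(COMPLETE_GRIDDING_SECTORS_EXCEPT_CDR) - set(world_gridding_sectors)
    )
    regional_variables = [
        level_separator.join([table, species, sector])
        for species, sector in itertools.product(
            COMPLETE_GRIDDING_SPECIES, regional_sectors
        )
    ]
    # CDR sectors: CO2 only, for every model region
    regional_variables += [
        level_separator.join([table, "CO2", sector])
        for sector in COMPLETE_GRIDDING_SECTORS_CDR
    ]
    return [(variable, world_region) for variable in world_variables] + list(
        itertools.product(regional_variables, model_regions)
    )


# Region names below are invented for illustration
pairs = complete_gridding_pairs(("model_a|Region 1", "model_a|Region 2"))
print(len(pairs))
```

With the default world sectors and two model regions this gives 20 world entries (10 species times 2 sectors) plus 126 regional variables (12 sectors times 10 species, plus 6 CO2-only CDR variables) for each region.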

to_global_workflow_emissions #

to_global_workflow_emissions(
    gridding_emissions: DataFrame,
    time_name: str = "year",
    region_level: str = "region",
    world_region: str = "World",
    global_workflow_co2_fossil_sector: str = "Fossil",
    global_workflow_co2_biosphere_sector: str = "Biosphere",
    co2_fossil_sectors: tuple[
        str, ...
    ] = CO2_FOSSIL_SECTORS_GRIDDING,
    co2_biosphere_sectors: tuple[
        str, ...
    ] = CO2_BIOSPHERE_SECTORS_GRIDDING,
    sectors_level: str = "sectors",
    species_level: str = "species",
    co2_name: str = "CO2",
) -> DataFrame

Convert gridding emissions to global workflow emissions

Parameters:

Name Type Description Default
gridding_emissions DataFrame

Gridding emissions

required
time_name str

Name of the time axis in gridding_emissions

'year'
region_level str

Region level in the data index

'region'
world_region str

The value used when the data represents the sum over all regions

'World'
global_workflow_co2_fossil_sector str

Name of the CO2 'sector' with fossil origins to use in the output

'Fossil'
global_workflow_co2_biosphere_sector str

Name of the CO2 'sector' with biospheric origins to use in the output

'Biosphere'
co2_fossil_sectors tuple[str, ...]

Sectors to assume have an origin in fossil CO2 reservoirs

CO2_FOSSIL_SECTORS_GRIDDING
co2_biosphere_sectors tuple[str, ...]

Sectors to assume have an origin in biospheric CO2 reservoirs

CO2_BIOSPHERE_SECTORS_GRIDDING
sectors_level str

Sectors level in the data index

'sectors'
species_level str

Species level in the data index

'species'
co2_name str

String that indicates emissions of CO2 in variable names

'CO2'

Returns:

Type Description
DataFrame

Global workflow emissions

Source code in src/gcages/cmip7_scenariomip/gridding_emissions.py
def to_global_workflow_emissions(  # noqa: PLR0913
    gridding_emissions: pd.DataFrame,
    time_name: str = "year",
    region_level: str = "region",
    world_region: str = "World",
    global_workflow_co2_fossil_sector: str = "Fossil",
    global_workflow_co2_biosphere_sector: str = "Biosphere",
    co2_fossil_sectors: tuple[str, ...] = CO2_FOSSIL_SECTORS_GRIDDING,
    co2_biosphere_sectors: tuple[str, ...] = CO2_BIOSPHERE_SECTORS_GRIDDING,
    sectors_level: str = "sectors",
    species_level: str = "species",
    co2_name: str = "CO2",
) -> pd.DataFrame:
    """
    Convert gridding emissions to global workflow emissions

    Parameters
    ----------
    gridding_emissions
        Gridding emissions

    time_name
        Name of the time axis in `gridding_emissions`

    region_level
        Region level in the data index

    world_region
        The value used when the data represents the sum over all regions

    global_workflow_co2_fossil_sector
        Name of the CO2 'sector' with fossil origins to use in the output

    global_workflow_co2_biosphere_sector
        Name of the CO2 'sector' with biospheric origins to use in the output

    co2_fossil_sectors
        Sectors to assume have an origin in fossil CO2 reservoirs

    co2_biosphere_sectors
        Sectors to assume have an origin in biospheric CO2 reservoirs

    sectors_level
        Sectors level in the data index

    species_level
        Species level in the data index

    co2_name
        String that indicates emissions of CO2 in variable names

    Returns
    -------
    :
        Global workflow emissions
    """
    stacked: pd.DataFrame = (
        split_sectors(  # type: ignore
            gridding_emissions,
            middle_level=species_level,
            bottom_level=sectors_level,
        )
        .stack()
        .unstack(sectors_level)
    )

    world_locator = stacked.index.get_level_values(region_level) == world_region
    region_sector_df = stacked.loc[~world_locator]
    sector_df = stacked.loc[world_locator].reset_index(region_level, drop=True)

    gw_sector_df, gw_total_df = to_global_workflow_emissions_from_stacked(
        region_sector_df=region_sector_df,
        sector_df=sector_df,
        time_name=time_name,
        region_level=region_level,
        global_workflow_co2_fossil_sector=global_workflow_co2_fossil_sector,
        global_workflow_co2_biosphere_sector=global_workflow_co2_biosphere_sector,
        co2_fossil_sectors=co2_fossil_sectors,
        co2_biosphere_sectors=co2_biosphere_sectors,
        sectors_level=sectors_level,
        species_level=species_level,
        co2_name=co2_name,
    )

    gw_sector_df_input_like = set_index_levels_func(
        combine_sectors(
            gw_sector_df,  # type: ignore # fix when moving to pandas-openscm
            middle_level=species_level,
            bottom_level=sectors_level,
        ),
        {region_level: world_region},
    ).unstack(time_name)
    gw_total_df_input_like = set_index_levels_func(
        combine_species(gw_total_df, bottom_level=species_level),  # type: ignore # fix when moving to pandas-openscm
        {region_level: world_region},
    ).unstack(time_name)

    res = pd.concat(
        [
            df.reorder_levels(gridding_emissions.index.names)
            for df in [gw_total_df_input_like, gw_sector_df_input_like]
        ]
    )
    return res
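The conversion relies on variable names following a table, species, sectors pattern. The sketch below illustrates that naming convention only; `split_variable` and `join_levels` are hypothetical helpers written for this example and are not the `split_sectors`, `combine_sectors` or `combine_species` functions used in the source. It assumes the default `"|"` separator and that output names take the forms shown, which is inferred from the source rather than confirmed.

```python
def split_variable(variable, level_separator="|"):
    """Hypothetical helper: split 'table|species|sectors' into named parts"""
    table, species, sectors = variable.split(level_separator)
    return {"table": table, "species": species, "sectors": sectors}


def join_levels(parts, levels, level_separator="|"):
    """Hypothetical helper: rebuild a variable name from the chosen levels"""
    return level_separator.join(parts[level] for level in levels)


# CO2 keeps a sector level, relabelled to the global workflow sector name
co2_parts = split_variable("Emissions|CO2|Energy Sector")
co2_out = join_levels(
    {**co2_parts, "sectors": "Fossil"}, ("table", "species", "sectors")
)

# Other species are reported as totals, so the sector level is dropped
ch4_parts = split_variable("Emissions|CH4|Agriculture")
ch4_out = join_levels(ch4_parts, ("table", "species"))

print(co2_out, ch4_out)
```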

to_global_workflow_emissions_from_stacked #

to_global_workflow_emissions_from_stacked(
    region_sector_df: DataFrame,
    sector_df: DataFrame,
    time_name: str,
    region_level: str,
    global_workflow_co2_fossil_sector: str,
    global_workflow_co2_biosphere_sector: str,
    co2_fossil_sectors: tuple[str, ...],
    co2_biosphere_sectors: tuple[str, ...],
    sectors_level: str,
    species_level: str,
    co2_name: str,
) -> tuple[
    Series[NP_FLOAT_OR_INT], Series[NP_FLOAT_OR_INT]
]

Convert pre-stacked gridding emissions to global workflow emissions

Parameters:

Name Type Description Default
region_sector_df DataFrame

Data with region and sector levels

required
sector_df DataFrame

Data with sector levels only

required
time_name str

Name of the time axis in the input data

required
region_level str

Region level in the data index

required
global_workflow_co2_fossil_sector str

Name of the CO2 'sector' with fossil origins to use in the output

required
global_workflow_co2_biosphere_sector str

Name of the CO2 'sector' with biospheric origins to use in the output

required
co2_fossil_sectors tuple[str, ...]

Sectors to assume have an origin in fossil CO2 reservoirs

required
co2_biosphere_sectors tuple[str, ...]

Sectors to assume have an origin in biospheric CO2 reservoirs

required
sectors_level str

Sectors level in the data index

required
species_level str

Species level in the data index

required
co2_name str

String that indicates emissions of CO2 in variable names

required

Returns:

Type Description
sectors

Global workflow emissions with a sector level

totals

Global workflow emissions only with totals (no region or sector level)

Source code in src/gcages/cmip7_scenariomip/gridding_emissions.py
def to_global_workflow_emissions_from_stacked(  # noqa: PLR0913
    region_sector_df: pd.DataFrame,
    sector_df: pd.DataFrame,
    time_name: str,
    region_level: str,
    global_workflow_co2_fossil_sector: str,
    global_workflow_co2_biosphere_sector: str,
    co2_fossil_sectors: tuple[str, ...],
    co2_biosphere_sectors: tuple[str, ...],
    sectors_level: str,
    species_level: str,
    co2_name: str,
) -> tuple[pd.Series[NP_FLOAT_OR_INT], pd.Series[NP_FLOAT_OR_INT]]:  # type: ignore # pandas-stubs out of date
    """
    Convert pre-stacked gridding emissions to global workflow emissions

    Parameters
    ----------
    region_sector_df
        Data with region and sector levels

    sector_df
        Data with sector levels only

    time_name
        Name of the time axis in the input data

    region_level
        Region level in the data index

    global_workflow_co2_fossil_sector
        Name of the CO2 'sector' with fossil origins to use in the output

    global_workflow_co2_biosphere_sector
        Name of the CO2 'sector' with biospheric origins to use in the output

    co2_fossil_sectors
        Sectors to assume have an origin in fossil CO2 reservoirs

    co2_biosphere_sectors
        Sectors to assume have an origin in biospheric CO2 reservoirs

    sectors_level
        Sectors level in the data index

    species_level
        Species level in the data index

    co2_name
        String that indicates emissions of CO2 in variable names

    Returns
    -------
    sectors
        Global workflow emissions with a sector level

    totals
        Global workflow emissions only with totals (no region or sector level)
    """
    region_sector_df_region_sum = groupby_except(region_sector_df, region_level).sum()

    sector_df_full = pd.concat([sector_df, region_sector_df_region_sum], axis="columns")

    co2_locator = (sector_df_full.index.get_level_values(species_level) == co2_name) & (
        sector_df_full.index.get_level_values("table") == "Emissions"
    )

    non_co2: pd.Series[NP_FLOAT_OR_INT] = sector_df_full[~co2_locator].sum(  # type: ignore # pandas-stubs out of date
        axis="columns"
    )

    not_used_cols = sorted(
        set(sector_df_full.columns)
        - {
            *co2_biosphere_sectors,
            *co2_fossil_sectors,
        }
    )
    if not_used_cols:
        msg = (
            "For the given inputs, not all CO2 sectors will be used.\n"
            f"{not_used_cols=}\n"
            f"{co2_fossil_sectors=}\n"
            f"{co2_biosphere_sectors=}\n"
        )
        raise AssertionError(msg)

    co2_fossil = set_index_levels_func(
        sector_df_full.loc[co2_locator, list(co2_fossil_sectors)].sum(axis="columns"),
        {sectors_level: global_workflow_co2_fossil_sector},
    )
    co2_biosphere = set_index_levels_func(
        sector_df_full.loc[co2_locator, list(co2_biosphere_sectors)].sum(
            axis="columns"
        ),
        {sectors_level: global_workflow_co2_biosphere_sector},
    )

    totals = non_co2
    sectors: pd.Series[NP_FLOAT_OR_INT] = pd.concat(  # type: ignore # pandas-stubs out of date
        [
            df.reorder_levels(co2_fossil.index.names)
            for df in [co2_fossil, co2_biosphere]
        ]
    )

    return sectors, totals
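The aggregation logic above can be sketched in plain Python for a single time point. This is a simplified illustration under stated assumptions, not the library implementation: `to_global_sketch` is a hypothetical name, data is held in dictionaries instead of pandas objects, and all values are invented.

```python
from collections import defaultdict


def to_global_sketch(
    regional,
    world_only,
    co2_fossil_sectors,
    co2_biosphere_sectors,
    co2_name="CO2",
    fossil_name="Fossil",
    biosphere_name="Biosphere",
):
    """
    Hypothetical dictionary-based sketch of the aggregation

    regional maps (species, sector, region) to a value,
    world_only maps (species, sector) to a value.
    """
    # Sum regional data over regions, then merge in the world-only sectors
    by_sector = defaultdict(float)
    for (species, sector, _region), value in regional.items():
        by_sector[(species, sector)] += value
    for key, value in world_only.items():
        by_sector[key] += value

    # Refuse to continue if any sector is in neither CO2 group
    known = set(co2_fossil_sectors) | set(co2_biosphere_sectors)
    not_used = sorted({sector for _, sector in by_sector} - known)
    if not_used:
        raise AssertionError(f"Not all sectors will be used: {not_used}")

    # CO2 is split into fossil and biosphere, everything else is totalled
    sectors = defaultdict(float)
    totals = defaultdict(float)
    for (species, sector), value in by_sector.items():
        if species == co2_name:
            group = fossil_name if sector in co2_fossil_sectors else biosphere_name
            sectors[(species, group)] += value
        else:
            totals[species] += value

    return dict(sectors), dict(totals)


# Invented values for two regions, R1 and R2
regional = {
    ("CO2", "Energy Sector", "R1"): 10.0,
    ("CO2", "Energy Sector", "R2"): 5.0,
    ("CO2", "Agriculture", "R1"): 2.0,
    ("CO2", "BECCS", "R2"): -1.0,
    ("CH4", "Agriculture", "R1"): 3.0,
    ("CH4", "Energy Sector", "R2"): 1.0,
}
world_only = {("CO2", "Aircraft"): 0.5, ("CH4", "Aircraft"): 0.25}

sectors, totals = to_global_sketch(
    regional,
    world_only,
    co2_fossil_sectors=("Aircraft", "BECCS", "Energy Sector"),
    co2_biosphere_sectors=("Agriculture",),
)
print(sectors, totals)
```

As in the source, the check for unused sectors covers every sector present in the data, not only those that carry CO2, so an unexpected sector name for any species stops the conversion.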