gcages.aggregation
Aggregation helpers
Functions:
| Name | Description |
|---|---|
| aggregate_df_level | Aggregate a level in a pd.DataFrame |
| get_region_sector_sum | Get the sum over regions and sectors |
aggregate_df_level
aggregate_df_level(
indf: DataFrame,
level: str,
on_clash: str = "raise",
component_separator: str = "|",
min_components_output: int = 1,
) -> DataFrame
Aggregate a level in a pd.DataFrame
Here, aggregate means 'walk up the components in the level values'
and create their totals.
For example, if indf has a metadata value like
"Emissions|CO2|Energy|Demand",
then this could walk up the tree to create
"Emissions|CO2|Energy" and "Emissions|CO2"
metadata values too.
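The 'walk up the components' idea can be sketched in plain Python, without pandas. This is an illustration only, not the implementation in gcages: the helper name `ancestor_values` is hypothetical, and it assumes that `min_components_output` means "every value created has at least this many components".

```python
def ancestor_values(value, component_separator="|", min_components_output=1):
    """List the shorter values reached by walking up the components of `value`.

    Hypothetical helper for illustration; not part of gcages.
    """
    components = value.split(component_separator)
    # Drop one trailing component at a time, stopping once the result
    # would have fewer than `min_components_output` components.
    return [
        component_separator.join(components[:n])
        for n in range(len(components) - 1, min_components_output - 1, -1)
    ]


parents = ancestor_values("Emissions|CO2|Energy|Demand", min_components_output=2)
print(parents)  # ['Emissions|CO2|Energy', 'Emissions|CO2']
```

With the default of `min_components_output=1`, the same call would also yield "Emissions".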
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| indf | DataFrame | Data to process | required |
| level | str | Level to aggregate | required |
| on_clash | str | What to do if there is a clash while aggregating. Options: | 'raise' |
| component_separator | str | Separator between components within the values of | '\|' |
| min_components_output | int | Minimum number of components to include in the output. This helps avoid creating aggregates for components you don't care about (e.g. you might not care about an "Emissions" aggregate if you have metadata values like "Emissions|CO2" and "Emissions|CH4"). | 1 |
Returns:
| Type | Description |
|---|---|
| DataFrame | Aggregated data (i.e. both the input and the newly aggregated timeseries) |
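As a rough sketch of what "both the input and the newly aggregated timeseries" could look like, the example below totals values held in a plain dict rather than a pd.DataFrame. The function `aggregate_values` and the sample numbers are hypothetical, each entry is a single number standing in for a timeseries, and the input is assumed to contain only the most detailed values.

```python
from collections import defaultdict


def aggregate_values(data, component_separator="|", min_components_output=1):
    """Return the input plus totals for every shorter value (illustration only)."""
    totals = defaultdict(float)
    for value, amount in data.items():
        components = value.split(component_separator)
        # Add this entry to each aggregate above it in the tree.
        for n in range(min_components_output, len(components)):
            totals[component_separator.join(components[:n])] += amount
    # No clash handling in this sketch: an aggregate that shares a name with
    # an input value would simply overwrite it.
    return {**data, **totals}


data = {
    "Emissions|CO2|Energy|Demand": 1.0,
    "Emissions|CO2|Energy|Supply": 2.0,
    "Emissions|CO2|Industry": 0.5,
}
aggregated = aggregate_values(data, min_components_output=2)
print(aggregated["Emissions|CO2|Energy"])  # 3.0
print(aggregated["Emissions|CO2"])  # 3.5
```

Because `min_components_output=2` here, no "Emissions" total is created. In the real function, the case this sketch ignores (an aggregate colliding with an existing value) is what `on_clash` governs.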
Source code in src/gcages/aggregation.py
get_region_sector_sum
get_region_sector_sum(
indf: DataFrame,
region_level: str = "region",
world_region: str = "World",
split_sectors: Callable[
[DataFrame], DataFrame
] = partial(split_sectors, bottom_level="sectors"),
sectors_level: str = "sectors",
) -> DataFrame
Get the sum over regions and sectors
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| indf | DataFrame | Input data to sum | required |
| region_level | str | Region level in the data index | 'region' |
| world_region | str | The value used when the data represents the sum over all regions | 'World' |
| split_sectors | Callable[[DataFrame], DataFrame] | Callable to use to split sectors from other levels in | partial(split_sectors, bottom_level='sectors') |
| sectors_level | str | Name of the sectors level once the data is split (Should be consistent with | 'sectors' |
Returns:
| Type | Description |
|---|---|
| DataFrame | Region-sector sum of To meet other conventions, the output has a region level with value |
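The idea of a region-sector sum can be illustrated with plain Python. This is a sketch under stated assumptions, not the gcages implementation: the function `region_sector_sum`, the key layout `(species, sectors, region)` and the sample numbers are all hypothetical, and the sectors are assumed to be already separated from the other levels (the job of the `split_sectors` callable).

```python
from collections import defaultdict


def region_sector_sum(rows, world_region="World"):
    """Sum over sectors and regions, labelling the result with `world_region`."""
    totals = defaultdict(float)
    for (species, _sectors, _region), amount in rows.items():
        # The sectors and region entries are dropped; the total is stored
        # under the world region so the output still carries a region entry.
        totals[(species, world_region)] += amount
    return dict(totals)


rows = {
    ("CO2", "Energy", "Europe"): 1.0,
    ("CO2", "Energy", "Asia"): 2.0,
    ("CO2", "Industry", "Europe"): 0.5,
    ("CH4", "Energy", "Asia"): 0.25,
}
summed = region_sector_sum(rows)
print(summed)  # {('CO2', 'World'): 3.5, ('CH4', 'World'): 0.25}
```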