Configuration file for processing
Processing raw datasets often requires adjusting parameters, such as outlier detection, corrections, measurement-specific adjustments or metadata.
When processing data with an individual analysis script,
configuration parameters can be directly passed to the processing function
AQGRawData.process().
This can be done (1) manually, by writing the values directly in the script, or (2) by loading a configuration file and passing its content.
Recommendation: always use a config file
We generally recommend putting configuration parameters in a configuration file, because this provides the following advantages:
- analysis scripts remain general and applicable to other datasets
- it serves as documentation and is human-readable
- it can be archived with the data
- it ensures reproducibility
Passing information: within processing script
This is just an example. All available options and parameters can be passed to the process function this way.
from gravitools.aqg import read_aqg_raw_dataset
rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(
    operator="John Doe",
    comment="This is a comment.",
    dg_syst="123 nm/s²",
)
Some parameters need to be bundled in a single variable. If you do not want to read a config file, store the configuration in a variable first and then pass it to the process() function.
from gravitools.aqg import read_aqg_raw_dataset
rawdata = read_aqg_raw_dataset("20240620_163341.zip")
manual_config = {
    'detect_outliers': {
        'threshold': 3000,
        'sigma_threshold': 5,
        'neighbors': 15,
        'min_gap': '30s',
    },
    'coriolis_correction': False,
}
rawdata.process(**manual_config)
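The call above relies on Python's dictionary unpacking: each top-level key becomes one keyword argument, so a nested dictionary such as detect_outliers arrives as a single value. The following minimal sketch illustrates this with a stand-in function. Note that fake_process and its default values are hypothetical and only mimic the calling convention; they are not part of gravitools.

```python
def fake_process(detect_outliers=None, coriolis_correction=True, **metadata):
    """Hypothetical stand-in that only reports what it received.

    The defaults are arbitrary and do not reflect the real process() function.
    """
    return {
        "detect_outliers": detect_outliers,
        "coriolis_correction": coriolis_correction,
        "metadata": metadata,
    }


manual_config = {
    "detect_outliers": {"threshold": 3000, "sigma_threshold": 5},
    "coriolis_correction": False,
    "operator": "John Doe",
}

# Same as: fake_process(detect_outliers={...}, coriolis_correction=False,
#                       operator="John Doe")
received = fake_process(**manual_config)
print(received["detect_outliers"]["sigma_threshold"])  # 5
print(received["metadata"])  # {'operator': 'John Doe'}
```

Keys that the stand-in does not name explicitly end up in the catch-all metadata dictionary, which is one common way to collect extra keyword arguments.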
Passing information: from config-file
The configuration file must be valid YAML.
Here is an example of a config file:
# Sensor head setup orientation to assume, if not otherwise specified
default_orientation: 0°
syst_uncertainty: 102.4 nm/s²
detect_outliers:
  sigma_threshold: 3
  neighbors: 15
  min_gap: '60s'

# Configuration for an individual instrument
meter AQG-B02:
  syst_uncertainty: 102.4 nm/s²
  dg_syst: 14 nm/s²
  tilt_offset:
    - -2.4269e-06 rad
    - -5.526e-07 rad
  2023-12-01:
    dg_syst: 120 nm/s²

# Configuration for a specific measurement location
point CA:
  pressure_admittance: -3 nm/s²/hPa
  vgg: -3008 nm/s²/m

# Configuration for a specific dataset
dataset 20240620_163341:
  operator: John Doe
  orientation: 180°
  point: CA
  comment:
    This is a comment text.
  outlier_ranges:
    2024-01-01 13:00:00 .. 2024-01-01 13:15:00:
      comment: Earthquake
  change_tilt_offsets:
    - -7.9e-06 rad
    - -6.89e-05 rad
The function read_config() can be used to read the file into a Python
dictionary.
from gravitools.config import read_config
config = read_config("config.yml")
This variable can then be passed to the processing function.
from gravitools.aqg import read_aqg_raw_dataset
rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(**config)
Units are mandatory
Every value must be followed by its unit. See here for the standard units.
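To illustrate the value-plus-unit convention, the sketch below splits strings such as 102.4 nm/s² into a number and a unit and rejects values that lack a unit. The helper split_quantity is hypothetical and written for this example only; how gravitools itself parses units is not shown in this document.

```python
import re

# Signed integer or decimal number with an optional exponent.
_NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")


def split_quantity(text):
    """Split a string such as '102.4 nm/s²' into (magnitude, unit).

    Hypothetical helper: raises ValueError if the number or the unit is missing.
    """
    stripped = text.strip()
    match = _NUMBER.match(stripped)
    # Everything after the leading number is treated as the unit.
    unit = stripped[match.end():].strip() if match else ""
    if not unit:
        raise ValueError(f"expected '<number> <unit>', got {text!r}")
    return float(match.group()), unit


print(split_quantity("102.4 nm/s²"))      # (102.4, 'nm/s²')
print(split_quantity("-2.4269e-06 rad"))  # (-2.4269e-06, 'rad')
print(split_quantity("180°"))             # (180.0, '°')
```

A bare number such as "14" raises a ValueError in this sketch, mirroring the rule that units must not be omitted.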
Subsets of a configuration file
When measurements are conducted regularly at the same site, or even when two measurements are taken on the same day at the same site, some settings differ between them.
Usually, at least the instrument orientation is different.
Other things may also change, such as the height of the AQG setup with respect to the point, or outliers may need to be stored individually.
For this reason, subsets of configuration files can be created and then passed to AQGRawData.process().
Recommended: one config-file per site or survey
We recommend not only keeping all additional information in configuration files but also considering how your measurements are organized. It may make sense to have one large config file that stores everything, or one file per site or per survey. This is entirely up to the operator, but it should fit the structure of the individual measurements.
To create a subset of a loaded configuration file, use combine_dataset_config() to
select the parameters that are relevant for a specific processing workflow or setup.
Each condition passed to the function is matched against the entries of the provided config file and must yield a unique match.
from gravitools.config import combine_dataset_config
dataset_config = combine_dataset_config(
    config,
    dataset="20240620_163341",
    meter="AQG-B02",
    point="CA",
    date="2024-01-01",
)
print(dataset_config)
{
    "comment": "This is a comment text.",
    "default_orientation": 0,
    "dg_syst": 120 nm/s²,
    "operator": "John Doe",
    "orientation": 180,
    "point": "CA",
    "pressure_admittance": -3 nm/s²/hPa,
    "vgg": -3008 nm/s²/m,
    "syst_uncertainty": 102.4 nm/s²,
}
This subset is then passed to the processing function.
from gravitools.aqg import read_aqg_raw_dataset
rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(**dataset_config)
Configuration file structure
Depending on the key under which values are stored in a configuration file, they are applied to the data in different circumstances.
Available keys
Only a limited set of block keys is allowed in a config file; the function for creating subsets (combine_dataset_config) is tailored to these:
- global keys (no name, no indentation)
- meter (indented)
- point (indented)
- dataset (indented)
Global keys
Parameters at the top-level of the configuration structure will apply globally to all datasets.
# This is a global configuration
default_orientation: 0°
meter
Similarly, parameters inside a meter block apply to all datasets taken with
this instrument.
meter [IDENTIFIER]:
  # Systematic bias correction
  dg_syst: 123 nm/s²
point
Parameters that apply to all datasets at a specific location can be placed
inside a point block.
point [IDENTIFIER]:
  # Atmospheric pressure correction admittance factor
  pressure_admittance: -3 nm/s²/hPa
dataset
Parameters inside a dataset block apply to a specific dataset. When combining
parameters, dataset parameters take highest priority.
dataset [IDENTIFIER]:
  orientation: 0°
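The layering of global, meter, point and dataset parameters can be pictured as a sequence of dictionary updates in which later levels overwrite earlier ones. The sketch below is a simplified illustration, not the implementation of combine_dataset_config(). The function merge_levels is hypothetical, date blocks are ignored, and the meter-before-point order is an assumption, since the text only states that dataset parameters take highest priority.

```python
def merge_levels(config, meter=None, point=None, dataset=None):
    """Merge global, meter, point and dataset parameters into one flat dict.

    Hypothetical sketch: later levels overwrite earlier ones. The order
    meter -> point is assumed; only "dataset wins" is stated in the text.
    """
    block_prefixes = ("meter ", "point ", "dataset ")
    # Global parameters: every top-level entry that is not a named block.
    merged = {
        key: value
        for key, value in config.items()
        if not str(key).startswith(block_prefixes)
    }
    for kind, name in (("meter", meter), ("point", point), ("dataset", dataset)):
        if name is not None:
            merged.update(config.get(f"{kind} {name}", {}))
    return merged


config = {
    "default_orientation": "0°",
    "dg_syst": "10 nm/s²",
    "meter AQG-B02": {"dg_syst": "14 nm/s²"},
    "point CA": {"vgg": "-3008 nm/s²/m"},
    "dataset 20240620_163341": {"orientation": "180°", "point": "CA"},
}
result = merge_levels(
    config, meter="AQG-B02", point="CA", dataset="20240620_163341"
)
print(result["dg_syst"])      # 14 nm/s² (meter value replaces the global one)
print(result["orientation"])  # 180°
```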
Time-variable parameters
Parameters that change over time, such as instrument characterization, can be placed in a date block. The following example shows this for the systematic instrumental bias value.
meter [IDENTIFIER]:
  dg_syst: 14 nm/s²
  2023-12-01:
    # Systematic bias correction after 2023-12-01
    dg_syst: 120 nm/s²
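One way to think about date blocks is that the plain parameters of a block are collected first and every date block that has already started at the measurement date then overrides them, oldest first. The sketch below follows that idea. The function resolve_dates is hypothetical, and two details are assumptions: date keys are ISO strings (YYYY-MM-DD), and a date block applies from that day onwards (inclusive).

```python
from datetime import date


def resolve_dates(block, when):
    """Flatten a block whose date-named sub-blocks override earlier values.

    Hypothetical sketch; assumes ISO date keys that apply from that day on.
    """
    resolved, dated = {}, []
    for key, value in block.items():
        try:
            start = date.fromisoformat(str(key))
        except ValueError:
            resolved[key] = value  # ordinary parameter
        else:
            dated.append((start, value))
    # Apply date blocks chronologically so the latest applicable one wins.
    for start, overrides in sorted(dated, key=lambda item: item[0]):
        if start <= when:
            resolved.update(overrides)
    return resolved


meter_block = {
    "dg_syst": "14 nm/s²",
    "2023-12-01": {"dg_syst": "120 nm/s²"},
}
print(resolve_dates(meter_block, date(2023, 6, 1)))  # {'dg_syst': '14 nm/s²'}
print(resolve_dates(meter_block, date(2024, 1, 1)))  # {'dg_syst': '120 nm/s²'}
```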
Available configuration parameters
The following configuration file parameters are available and accepted, i.e. they can be parsed and applied.
Generally, it is possible to pass all options that extend, define or modify the input to the function AQGRawData.process() and/or the metadata attribute of AQGRawData.
Where are other passed variables stored?
All key:value pairs that are passed to the described functions, via config files or within a script, and that DO NOT MATCH any of the function inputs are passed on to the metadata. This means they will also be included in the final dataset.
Entries starting with dg_
All entries starting with dg_ are treated as corrections and are also included in the const_corrections attribute used in AQGRawData calculations.
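The routing described in these two notes can be sketched as a simple three-way split. This is only an illustration of the rule. The function split_parameters and the set of known names are hypothetical, and whether dg_ entries are additionally copied to the metadata is not stated here, so the sketch keeps them separate.

```python
def split_parameters(params, known):
    """Sort parameters into process options, corrections and metadata.

    Hypothetical sketch: `known` is a placeholder for the names that the
    processing function accepts; the real list is not reproduced here.
    """
    options, corrections, metadata = {}, {}, {}
    for key, value in params.items():
        if key.startswith("dg_"):
            corrections[key] = value  # treated as a constant correction
        elif key in known:
            options[key] = value      # matches a function input
        else:
            metadata[key] = value     # everything else becomes metadata
    return options, corrections, metadata


options, corrections, metadata = split_parameters(
    {
        "coriolis_correction": False,
        "dg_syst": "14 nm/s²",
        "operator": "John Doe",
    },
    known={"coriolis_correction", "detect_outliers"},
)
print(options)      # {'coriolis_correction': False}
print(corrections)  # {'dg_syst': '14 nm/s²'}
print(metadata)     # {'operator': 'John Doe'}
```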
The table indicates at which level these keys are valid and if they can be defined only for a certain time period (temporal).
| variable | unit | available for keys | usage |
|---|---|---|---|
| orientation | ° | global, point | Angle of the instrument sensor head orientation towards North |
| dg_* | nm/s² | global, dataset | |
| syst_uncertainty | nm/s² | global, meter, point, time-variable | Instrumental systematic uncertainty |
| vgg | nm/s²/m | global, point | Vertical gravity gradient |
| pressure_admittance | nm/s²/hPa | point | The admittance factor used for atmospheric pressure correction |
| station_height_difference | m | point, dataset | Offset to apply to the measurement reference height |
| tilt_offset | rad | global | A two-tuple of old (=used) tilt offset values to recalculate the tilt angles |
| change_tilt_offset | rad | dataset | A two-tuple of new tilt offset values to recalculate the tilt angles |
| time_period | - | dataset | A two-tuple of a start and an end date to which the data will be cut |
| recalculate_dg_pressure | true/false | global, dataset | Recalculate the atmospheric pressure correction |
| recalculate_dg_polar | true/false | global, dataset | Recalculate the polar motion correction |
| detect_outliers | - | global, dataset | A dictionary of parameters for outlier detection, see detect_outliers(). Leave unspecified (or None) to use default values. Pass False to deactivate outlier detection |
| coriolis_correction | true/false | global, point, dataset | Exclude Coriolis effect correction |
Example of complete configuration file
default_orientation: 0°
detect_outliers:
  threshold: 3000
  sigma_threshold: 3
  neighbors: 15
  min_gap: '30s'
coriolis_correction: false

# Configuration for an individual instrument
meter AQG-B02:
  syst_uncertainty: 102.4 nm/s²
  dg_syst: 14 nm/s²
  tilt_offset:
    - -0.00243 rad
    - -0.00055 rad
  2023-12-01:
    dg_syst: 105 nm/s²

# Configuration for a specific measurement location
point PILLARFG5:
  pressure_admittance: -3 nm/s²/hPa
  vgg: -3220 nm/s²/m

# Configuration for specific datasets
dataset 20200827_120756:
  operator: Marvin Reich
  site: Mueritz
  orientation: 0°
  point: PILLARFG5
  station_height_difference: 0.02 m
  comment: Within survey AQG Germany Tour
  outlier_ranges:
    2020-08-27 12:07:59 .. 2020-08-27 12:08:08:
      comment: ''
    2020-08-27 12:23:55 .. 2020-08-27 12:24:22:
      comment: ''

dataset 20250410_111852:
  operator: Marvin Reich, Stephan Schröder
  site: Mueritz
  orientation: 0°
  point: PILLARFG5
  change_tilt_offset:
    - -0.0024695 rad
    - -0.0006185 rad
  comment: 1 rubber pad used on each foot. Orientation North.

dataset 20250410_135906:
  operator: Marvin Reich, Stephan Schröder
  site: Mueritz
  orientation: 180°
  point: PILLARFG5
  change_tilt_offset:
    - -0.0024695 rad
    - -0.0006185 rad
  comment: 1 rubber pad used on each foot. Orientation South.
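The keys under outlier_ranges in the example above follow the pattern START .. END. As a minimal sketch of how such a key could be checked before it is added to a config file, the hypothetical helper below splits it into two datetime objects. The timestamp format is taken from the example; how gravitools itself interprets these keys is not shown in this document.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_outlier_range(key):
    """Split a 'START .. END' key into two datetime objects.

    Hypothetical helper for validating entries by hand.
    """
    start_text, separator, end_text = key.partition("..")
    if not separator:
        raise ValueError(f"missing '..' separator in {key!r}")
    start = datetime.strptime(start_text.strip(), TIMESTAMP_FORMAT)
    end = datetime.strptime(end_text.strip(), TIMESTAMP_FORMAT)
    if end < start:
        raise ValueError("range end precedes range start")
    return start, end


start, end = parse_outlier_range("2020-08-27 12:07:59 .. 2020-08-27 12:08:08")
print((end - start).total_seconds())  # 9.0
```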