Configuration file for processing

Processing raw datasets often requires adjusting parameters, such as outlier detection, corrections, measurement-specific adjustments or metadata.

When processing data with an individual analysis script, configuration parameters can be directly passed to the processing function AQGRawData.process().

This can be done either (1) manually, by writing the parameters directly in the script, or (2) by loading a configuration file and passing its content.

Recommendation: always use a config file

We generally recommend putting configuration parameters in a configuration file, because this has the following advantages:

  • analysis scripts remain general and applicable to other datasets
  • it serves as documentation and is human-readable
  • it can be archived with the data
  • it ensures reproducibility

Passing information: within processing script

This is just an example. All available options and parameters can be passed to the process() function this way.

from gravitools.aqg import read_aqg_raw_dataset

rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(
    operator="John Doe",
    comment="This is a comment.",
    dg_syst="123 nm/s²",
)

Some parameters need to be passed together in one variable. If you do not want to read a config file, store the configuration in a variable first and then pass it to the process() function.

from gravitools.aqg import read_aqg_raw_dataset

rawdata = read_aqg_raw_dataset("20240620_163341.zip")

manual_config = {
    "detect_outliers": {
        "threshold": 3000,
        "sigma_threshold": 5,
        "neighbors": 15,
        "min_gap": "30s",
    },
    "coriolis_correction": False,
}

rawdata.process(**manual_config)
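
The `**` unpacking used above can be illustrated with a minimal sketch. The function `process` below is a hypothetical stand-in with an assumed signature, not the real AQGRawData.process(); it only shows that unpacking a dictionary is equivalent to writing each keyword argument by hand.

```python
# Hypothetical stand-in for a processing function (assumed signature,
# NOT the real AQGRawData.process()).
def process(detect_outliers=None, coriolis_correction=True, **metadata):
    # Return the received arguments so that they can be inspected.
    return detect_outliers, coriolis_correction, metadata


manual_config = {
    "detect_outliers": {
        "threshold": 3000,
        "sigma_threshold": 5,
        "neighbors": 15,
        "min_gap": "30s",
    },
    "coriolis_correction": False,
}

# Unpacking the dictionary ...
result_a = process(**manual_config)

# ... is equivalent to spelling out every keyword argument.
result_b = process(
    detect_outliers=manual_config["detect_outliers"],
    coriolis_correction=False,
)
```

Nested settings such as detect_outliers stay grouped in one dictionary, which is why they have to be passed as a single variable.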

Passing information: from config-file

The configuration file must be written in YAML format.

Here is an example format of a config-file:

config.yml
# Sensor head setup orientation to assume, if not otherwise specified
default_orientation: 0°
syst_uncertainty: 102.4 nm/s²
detect_outliers:
  sigma_threshold: 3
  neighbors: 15
  min_gap: '60s'

# Configuration for an individual instrument
meter AQG-B02:
  syst_uncertainty: 102.4 nm/s²
  dg_syst: 14 nm/s²
  tilt_offset:
    - -2.4269e-06 rad
    - -5.526e-07 rad
  2023-12-01:
    dg_syst: 120 nm/s²

# Configuration for a specific measurement location
point CA:
  pressure_admittance: -3 nm/s²/hPa
  vgg : -3008 nm/s²/m

# Configuration for a specific dataset
dataset 20240620_163341:
  operator: John Doe
  orientation: 180°
  point: CA
  comment:
    This is a comment text.
  outlier_ranges:
    2024-01-01 13:00:00 .. 2024-01-01 13:15:00:
      comment: Earthquake
  change_tilt_offset:
    - -7.9e-06 rad
    - -6.89e-05 rad

The function read_config() can be used to read the file into a Python dictionary.

from gravitools.config import read_config

config = read_config("config.yml")

This variable can then be passed to the processing function.

from gravitools.aqg import read_aqg_raw_dataset

rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(**config)

Units are mandatory

Every value must be followed by its unit. See here for the standard units.
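
To illustrate what a value with a unit looks like once it is taken apart, here is a minimal sketch that separates the number from the unit string. The helper `split_quantity` is hypothetical and not part of gravitools, which may parse quantities differently; the sketch only shows why a value without a unit can be rejected.

```python
import re

# Matches an optionally signed decimal number with an optional exponent.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def split_quantity(text):
    """Hypothetical helper (not a gravitools function).

    Splits a string such as "123 nm/s²" into (123.0, "nm/s²") and raises
    ValueError if the number or the unit is missing.
    """
    text = text.strip()
    match = _NUMBER.match(text)
    if match is None:
        raise ValueError(f"no numeric value found in {text!r}")
    unit = text[match.end():].strip()
    if not unit:
        raise ValueError(f"no unit given in {text!r}")
    return float(match.group()), unit


value, unit = split_quantity("102.4 nm/s²")
```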

Subsets of a configuration file

When measurements are conducted regularly at the same site, or even when two measurements are made on the same day at the same site, some settings differ between them. Usually, at least the instrument orientation is different. Other things may change too, such as the height of the AQG setup with respect to the point, or outliers may need to be stored individually. For this reason, subsets of a configuration file can be created and then passed to AQGRawData.process().

Recommended: one config-file per site or survey

We recommend not only keeping all additional information in configuration files but also thinking about how your measurements are organized. It may make sense to have one large config file that stores everything, or one file per site or per survey. This is entirely up to the operator, but the layout should fit the structure of the individual measurements.

To create a subset of a loaded configuration file, use combine_dataset_config() to select the parameters that are relevant for a specific processing workflow or setup. Each condition passed to the function must have a unique match among the entries of the provided config file.

from gravitools.config import combine_dataset_config

dataset_config = combine_dataset_config(
    config,
    dataset="20240620_163341",
    meter="AQG-B02",
    point="CA",
    date="2024-01-01"
)

print(dataset_config)
{
    "comment": "This is a comment text.",
    "default_orientation": 0,
    "dg_syst": 120 nm/s²,
    "operator": "John Doe",
    "orientation": 0,
    "point": "CA",
    "pressure_admittance": -3 nm/s²/hPa,
    "vgg": -3008 nm/s²/m,
    "syst_uncertainty": 102.4 nm/s²,
}

This subset is then passed to the processing function.

from gravitools.aqg import read_aqg_raw_dataset

rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(**dataset_config)
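
When one config file describes several datasets, it can help to list the dataset identifiers it contains before creating subsets. The following is a minimal sketch under the assumption that the loaded configuration is a dictionary whose top-level keys look like "dataset <identifier>", as in the YAML examples on this page. The helper `dataset_names` is hypothetical and not part of gravitools.

```python
def dataset_names(config):
    """Hypothetical helper: return the identifiers of all dataset blocks."""
    prefix = "dataset "
    return [
        key[len(prefix):]
        for key in config
        if isinstance(key, str) and key.startswith(prefix)
    ]


# A small dictionary shaped like a loaded configuration (assumed structure).
config = {
    "default_orientation": "0°",
    "meter AQG-B02": {"dg_syst": "14 nm/s²"},
    "dataset 20200827_120756": {"point": "PILLARFG5"},
    "dataset 20250410_111852": {"point": "PILLARFG5"},
}

names = dataset_names(config)
```

Each returned name could then be used as the dataset argument of combine_dataset_config() in a loop.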

Configuration file structure

Depending on the keywords under which values are stored in a configuration file, they are applied to the data in different circumstances.

Available keys

Only a limited set of keys is allowed in a config file. The function that creates subsets (combine_dataset_config) is tailored to these keys.

  • global keys (no name, no indentation)
  • meter (indented)
  • point (indented)
  • dataset (indented)

Global keys

Parameters at the top-level of the configuration structure will apply globally to all datasets.

configuration.yml
# This is a global configuration
default_orientation: 0°

meter

Similarly, parameters inside a meter block apply to all datasets taken with this instrument.

configuration.yml
meter [IDENTIFIER]:
  # Systematic bias correction
  dg_syst: 123 nm/s²

point

Parameters that apply to all datasets at a specific location can be placed inside a point block.

configuration.yml
point [IDENTIFIER]:
  # Atmospheric pressure correction admittance factor
  pressure_admittance: -3 nm/s²/hPa

dataset

Parameters inside a dataset block apply to a specific dataset. When combining parameters, dataset parameters take highest priority.

configuration.yml
dataset [IDENTIFIER]:
  orientation: 
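
The layering of the four levels can be sketched as a series of dictionary updates. This is a simplified illustration, not the implementation of combine_dataset_config(): the function `merge_levels` is hypothetical, date blocks are ignored, and the relative order of meter and point is an assumption. The text above only states that dataset parameters take highest priority.

```python
PREFIXES = ("meter ", "point ", "dataset ")


def merge_levels(config, meter=None, point=None, dataset=None):
    """Hypothetical sketch: merge global, meter, point and dataset blocks."""
    # Global keys are all top-level entries that are not named blocks.
    merged = {
        key: value
        for key, value in config.items()
        if not str(key).startswith(PREFIXES)
    }
    # Later updates override earlier ones, so the dataset block wins.
    # The order meter < point is an assumption made for this sketch.
    for prefix, name in (("meter ", meter), ("point ", point), ("dataset ", dataset)):
        if name is not None:
            merged.update(config.get(prefix + name, {}))
    return merged


config = {
    "default_orientation": "0°",
    "dg_syst": "10 nm/s²",
    "meter AQG-B02": {"dg_syst": "14 nm/s²"},
    "point CA": {"vgg": "-3008 nm/s²/m"},
    "dataset 20240620_163341": {"orientation": "180°", "point": "CA"},
}

merged = merge_levels(
    config, meter="AQG-B02", point="CA", dataset="20240620_163341"
)
```

In this sketch the global dg_syst value is replaced by the meter value, while keys that appear in only one level are carried over unchanged.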

Time-variable parameters

Parameters that change over time, such as instrument characterization, can be placed in a date block. This is shown here as an example for the systematic instrumental bias value.

configuration.yml
meter [IDENTIFIER]:
  dg_syst: 14 nm/s²
  2023-12-01:
    # Systematic bias correction after 2023-12-01
    dg_syst: 120 nm/s²
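
How a date block might be resolved for a given measurement date can be sketched as follows. The function `resolve_dates` is hypothetical and not part of gravitools. The sketch assumes that a date block applies from its date onwards (whether the start date itself is included is an assumption) and that later date blocks override earlier ones.

```python
from datetime import date


def _as_date(key):
    """Return the key as a date, or None if it is not a date."""
    if isinstance(key, date):
        return key
    try:
        return date.fromisoformat(str(key))
    except ValueError:
        return None


def resolve_dates(block, when):
    """Hypothetical sketch: apply all date blocks that started on or before `when`."""
    resolved = {}
    dated = []
    for key, value in block.items():
        start = _as_date(key)
        if start is None:
            resolved[key] = value  # ordinary parameter
        else:
            dated.append((start, value))  # date block
    # Apply date blocks in chronological order so that the latest one wins.
    for start, values in sorted(dated, key=lambda item: item[0]):
        if start <= when:
            resolved.update(values)
    return resolved


meter_block = {
    "dg_syst": "14 nm/s²",
    "2023-12-01": {"dg_syst": "120 nm/s²"},
}

before = resolve_dates(meter_block, date(2023, 6, 1))
after = resolve_dates(meter_block, date(2024, 1, 1))
```

Here `before` keeps the base value of 14 nm/s², while `after` picks up the value of 120 nm/s² from the date block.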

Available configuration parameters

Here is a list of the accepted configuration file parameters. Generally, it is possible to pass any option that extends, defines or modifies the input to the function AQGRawData.process() and/or the attribute metadata of AQGRawData.

Where do other passed variables go?

All key:value pairs that are passed via config files or within a script to the described functions and DO NOT MATCH any of the function inputs are passed to the metadata. This means they are also included in the final dataset.
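
The separation of function inputs from metadata can be sketched with the standard library module inspect. This is only an illustration of the idea. The functions `split_options` and `process` are hypothetical, and the real AQGRawData.process() has a different signature and may sort the options differently.

```python
import inspect


def split_options(func, options):
    """Hypothetical sketch: separate known function inputs from other keys."""
    accepted = set(inspect.signature(func).parameters)
    known = {key: value for key, value in options.items() if key in accepted}
    extra = {key: value for key, value in options.items() if key not in accepted}
    return known, extra


# Stand-in with an assumed signature (NOT the real AQGRawData.process()).
def process(detect_outliers=None, coriolis_correction=True):
    pass


known, extra = split_options(
    process,
    {"coriolis_correction": False, "site": "Mueritz"},
)
```

In this sketch, `known` holds the options that match a function input and `extra` holds everything that would end up in the metadata.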

Entries starting with dg_

All entries whose names start with dg_ are treated as corrections and are also included in the attribute const_corrections of AQGRawData calculations.

The table indicates at which level these keys are valid and if they can be defined only for a certain time period (temporal).

variable | unit | available for keys | usage
--- | --- | --- | ---
orientation | ° | global, point | Defines the angle of the instrument sensor head orientation towards North
dg_* | nm/s² | global, dataset |
syst_uncertainty | nm/s² | global, meter, point, time variable | Instrumental systematic uncertainty
vgg | nm/s²/m | global, point | Vertical gravity gradient
pressure_admittance | nm/s²/hPa | point | The admittance factor used for atmospheric pressure correction
station_height_difference | m | point, dataset | Offset to apply to the measurement reference height
tilt_offset | rad | global | A two-tuple of old (=used) tilt offset values to recalculate the tilt angles
change_tilt_offset | rad | dataset | A two-tuple of new tilt offset values to recalculate the tilt angles
time_period | - | dataset | A two-tuple of a start and an end date to which the data will be cut
recalculate_dg_pressure | true/false | global, dataset | Recalculate the atmospheric pressure correction
recalculate_dg_polar | true/false | global, dataset | Recalculate the polar motion correction
detect_outliers | - | global, dataset | A dictionary of parameters for outlier detection, see detect_outliers(). Leave unspecified (or None) to use default values. Pass False to deactivate outlier detection
coriolis_correction | true/false | global, point, dataset | Exclude Coriolis effect correction

Example of complete configuration file

default_orientation: 0°
detect_outliers: 
  threshold: 3000
  sigma_threshold: 3
  neighbors: 15
  min_gap: '30s'
coriolis_correction: false

# Configuration for an individual instrument
meter AQG-B02:
  syst_uncertainty: 102.4 nm/s²
  dg_syst: 14 nm/s²
  tilt_offset: 
    - -0.00243 rad
    - -0.00055 rad
  2023-12-01:
    dg_syst: 105 nm/s²

# Configuration for a specific measurement location
point PILLARFG5:
  pressure_admittance: -3 nm/s²/hPa
  vgg: -3220 nm/s²/m

# Configuration for specific datasets
dataset 20200827_120756:
  operator: Marvin Reich
  site: Mueritz
  orientation: 0°
  point: PILLARFG5
  station_height_difference: 0.02 m
  comment: Within survey AQG Germany Tour
  outlier_ranges:
    2020-08-27 12:07:59 .. 2020-08-27 12:08:08:
      comment: ''
    2020-08-27 12:23:55 .. 2020-08-27 12:24:22:
      comment: ''

dataset 20250410_111852:
  operator: Marvin Reich, Stephan Schröder
  site: Mueritz
  orientation: 0°
  point: PILLARFG5
  change_tilt_offset:
    - -0.0024695 rad
    - -0.0006185 rad
  comment: 1 rubber pad used on each foot. Orientation North.

dataset 20250410_135906:
  operator: Marvin Reich, Stephan Schröder
  site: Mueritz
  orientation: 180°
  point: PILLARFG5
  change_tilt_offset:
    - -0.0024695 rad
    - -0.0006185 rad
  comment: 1 rubber pad used on each foot. Orientation South.
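
The keys under outlier_ranges in the examples use the form "start .. end". As a minimal sketch of how such a key can be split into two timestamps, consider the following helper. `parse_range` is hypothetical and not part of gravitools, which may read these keys differently.

```python
from datetime import datetime


def parse_range(text):
    """Hypothetical helper: split "start .. end" into two datetime objects."""
    start_text, separator, end_text = text.partition("..")
    if not separator:
        raise ValueError(f"expected 'start .. end', got {text!r}")
    start = datetime.fromisoformat(start_text.strip())
    end = datetime.fromisoformat(end_text.strip())
    if end < start:
        raise ValueError(f"range ends before it starts: {text!r}")
    return start, end


start, end = parse_range("2020-08-27 12:07:59 .. 2020-08-27 12:08:08")
duration_seconds = (end - start).total_seconds()
```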