Utils

Common utility functions

AQG_INTERVAL = 0.54 module-attribute

Measurement frequency of the AQG, in Hz.

STANDARD_HEIGHT = 0.0 module-attribute

Standard height above point marker for absolute gravity measurement.

filename_matches_pattern(path, pattern)

Compare filename against a regex pattern

Examples:

>>> filename_matches_pattern("path/to/test_123.csv", r"^test_\d+\.csv$")
True

Parameters:

Name     Type                               Description     Default
path     str | pathlib.Path | zipfile.Path  Input path.     required
pattern  str                                Regex pattern.  required

Returns:

Type  Description
bool  True if the filename matches the pattern.
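
The example above anchors the pattern with ^ yet still matches a path with leading directories, which suggests that only the final path component is compared. A minimal sketch of that behaviour follows; the name matches_filename and the use of re.search are assumptions for illustration, not the library's implementation.

```python
import re
from pathlib import PurePosixPath


def matches_filename(path, pattern):
    """Hypothetical stand-in: test only the final path component against a regex."""
    name = PurePosixPath(str(path)).name  # drop leading directories
    return re.search(pattern, name) is not None


print(matches_filename("path/to/test_123.csv", r"^test_\d+\.csv$"))  # True
print(matches_filename("path/to/test_abc.csv", r"^test_\d+\.csv$"))  # False
```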

files_in_path(path)

List all files in a directory or zip-archive

Parameters:

Name  Type                               Description                          Default
path  str | pathlib.Path | zipfile.Path  Path to a directory or zip-archive.  required

Returns:

Type                               Description
list[pathlib.Path | zipfile.Path]  List of paths.
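
A rough sketch of how a single listing routine can cover both directories and zip-archives, since pathlib.Path and zipfile.Path share the iterdir() and is_file() methods. The name list_files, the non-recursive listing, and the sorting by name are assumptions for illustration only.

```python
import tempfile
import zipfile
from pathlib import Path


def list_files(path):
    """Hypothetical stand-in: files directly inside a directory or zip archive."""
    path = Path(path)
    # Treat an existing zip file as a container; anything else as a directory.
    if path.is_file() and zipfile.is_zipfile(path):
        root = zipfile.Path(path)
    else:
        root = path
    return sorted((p for p in root.iterdir() if p.is_file()), key=lambda p: p.name)


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "a.csv").write_text("x\n1\n")
    archive = tmp / "data.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("b.csv", "x\n2\n")
    dir_names = [p.name for p in list_files(tmp)]
    zip_names = [p.name for p in list_files(archive)]

print(dir_names)  # ['a.csv', 'data.zip']
print(zip_names)  # ['b.csv']
```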

flatten(d, lvl=0, prefix=None, sep='__')

Flatten a nested dictionary
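
No example is given for this function, so the following is only a sketch of what flattening with a separator typically looks like. The name flatten_dict and the key format (parent and child keys joined by sep) are assumptions; the role of the lvl parameter is not documented and is left out.

```python
def flatten_dict(d, prefix=None, sep="__"):
    """Hypothetical stand-in: join nested keys with `sep` into a single-level dict."""
    flat = {}
    for key, value in d.items():
        full_key = key if prefix is None else f"{prefix}{sep}{key}"
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the joined key along.
            flat.update(flatten_dict(value, prefix=full_key, sep=sep))
        else:
            flat[full_key] = value
    return flat


print(flatten_dict({"a": 1, "b": {"x": 2, "y": {"z": 3}}}))
# {'a': 1, 'b__x': 2, 'b__y__z': 3}
```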

get_const_columns(df, columns)

Check if columns are constant and return their value

Parameters:

Name     Type       Description                     Default
df       DataFrame  Input dataframe.                required
columns  list[str]  List of column names to check.  required

Returns:

Type  Description
dict  Names and values of constant columns.
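
One way to picture the result is a dictionary that contains an entry only for those columns holding a single distinct value. The sketch below assumes that reading; const_columns is a hypothetical name, and how the real function treats NaN or empty columns is not stated here.

```python
import pandas as pd


def const_columns(df, columns):
    """Hypothetical stand-in: map each constant column to its single value."""
    result = {}
    for col in columns:
        values = df[col].unique()  # note: NaN counts as a distinct value here
        if len(values) == 1:
            result[col] = values[0]
    return result


df = pd.DataFrame(
    {"site": ["A", "A", "A"], "height": [0.0, 0.0, 0.0], "g": [1.0, 2.0, 3.0]}
)
constants = const_columns(df, ["site", "height", "g"])
print(sorted(constants))  # ['height', 'site']
```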

interpolate_time_series(series, index, name=None)

Linearly interpolate time series to new time index

Time indices outside the bounds of the input data will return NaN values.

Parameters:

Name    Type           Description                                                              Default
series  Series         Input time series.                                                       required
index   DatetimeIndex  New time index.                                                          required
name    str            Name of the new time series. Taken from the input series if unspecified.  None

Returns:

Type    Description
Series  Linear interpolation of input time series at specified time indices.
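
The behaviour described above (linear interpolation in time, NaN outside the input range) can be sketched with numpy.interp. This is an illustration under assumptions, not the library's code: interpolate_to_index is a hypothetical name, and the input index is assumed to be sorted in ascending order.

```python
import numpy as np
import pandas as pd


def interpolate_to_index(series, index, name=None):
    """Hypothetical stand-in: linear interpolation in time, NaN outside the input range."""
    t0 = series.index[0]
    # Express both time axes as seconds relative to the first input sample.
    x_old = (series.index - t0).total_seconds().to_numpy()
    x_new = (index - t0).total_seconds().to_numpy()
    values = np.interp(
        x_new, x_old, series.to_numpy(dtype=float), left=np.nan, right=np.nan
    )
    return pd.Series(values, index=index, name=series.name if name is None else name)


series = pd.Series(
    [0.0, 10.0],
    index=pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:10"]),
    name="g",
)
new_index = pd.to_datetime(
    ["2023-12-31 23:55", "2024-01-01 00:05", "2024-01-01 00:10"]
)
result = interpolate_to_index(series, new_index)
print(result.tolist())  # [nan, 5.0, 10.0]
```

The first requested time lies before the input data and therefore yields NaN; the name of the result is taken from the input series because name was not given.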

is_date(text)

Check if input is a date or a string that has date format

Accepted formats are: YYYY-mm-dd and YYYY-mm-dd HH:MM.

Only the format is checked, not whether the date is actually valid.
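
A format-only check of this kind can be sketched with a regular expression. The name looks_like_date and the exact pattern are assumptions for illustration; the sketch accepts date objects and strings in the two formats listed above, and deliberately does not validate the calendar date.

```python
import datetime
import re

# YYYY-mm-dd, optionally followed by " HH:MM"; digits only, no range checks.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}( \d{2}:\d{2})?")


def looks_like_date(text):
    """Hypothetical stand-in: True for date objects and date-formatted strings."""
    if isinstance(text, datetime.date):  # datetime.datetime is a subclass of date
        return True
    return isinstance(text, str) and DATE_PATTERN.fullmatch(text) is not None


print(looks_like_date("2024-02-30"))        # True: format only, validity is not checked
print(looks_like_date("2024-02-30 25:61"))  # True
print(looks_like_date("30.02.2024"))        # False
```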

iter_zipfile_path(zippath)

Recursively iterate all files within a zipfile.Path
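
Recursive iteration over a zipfile.Path can be sketched as a small generator built on iterdir() and is_dir(). The name walk_zip is hypothetical, and the order in which the real function yields entries is not documented here.

```python
import io
import zipfile


def walk_zip(zippath):
    """Hypothetical stand-in: yield every file below a zipfile.Path."""
    for entry in zippath.iterdir():
        if entry.is_dir():
            yield from walk_zip(entry)  # descend into sub-directories
        else:
            yield entry


# Build a small in-memory archive with one nested file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("raw_1.csv", "x\n1\n")
    zf.writestr("nested/raw_2.csv", "x\n2\n")

archive = zipfile.ZipFile(buffer)
names = sorted(entry.name for entry in walk_zip(zipfile.Path(archive)))
print(names)  # ['raw_1.csv', 'raw_2.csv']
```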

map_index(ser, index)

Map a time series to a different time index

merge_dicts(d1, d2)

Merge two nested dictionaries

d2 takes precedence over d1.

Examples:

>>> merge_dicts({"a": 1, "b": {"x": 1, "y": 0}}, {"b": {"y": 2, "z": 3}})
{'a': 1, 'b': {'x': 1, 'y': 2, 'z': 3}}
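
The merge shown in the example can be reproduced with a short recursive function. This is a sketch of the general technique rather than the library's implementation; deep_merge is a hypothetical name, and returning a new dictionary instead of modifying d1 is an assumption.

```python
def deep_merge(d1, d2):
    """Hypothetical stand-in: merge d2 into a copy of d1, recursing into nested dicts."""
    merged = dict(d1)
    for key, value in d2.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value  # d2 takes precedence over d1
    return merged


print(deep_merge({"a": 1, "b": {"x": 1, "y": 0}}, {"b": {"y": 2, "z": 3}}))
# {'a': 1, 'b': {'x': 1, 'y': 2, 'z': 3}}
```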

partition_dict(dictionary, condition)

Partition a dictionary by a condition on each key

Parameters:

Name        Type                   Description                                   Default
dictionary  dict                   Input dictionary.                             required
condition   Callable[[str], bool]  Callback function defining condition on key.  required

Examples:

>>> partition_dict({"a": 1, "b": 2}, lambda key: key.startswith("a"))
({'a': 1}, {'b': 2})
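
Judging by the example, the result is a pair of dictionaries: first the entries whose key satisfies the condition, then the rest. A minimal sketch under that assumption, with the hypothetical name split_by_key:

```python
def split_by_key(dictionary, condition):
    """Hypothetical stand-in: split a dict into (matching, non-matching) by key."""
    matching, rest = {}, {}
    for key, value in dictionary.items():
        target = matching if condition(key) else rest
        target[key] = value
    return matching, rest


print(split_by_key({"a": 1, "b": 2}, lambda key: key.startswith("a")))
# ({'a': 1}, {'b': 2})
```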

read_csv(path)

Wrapper around pd.read_csv() to read files within zip-archives
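
pd.read_csv() accepts file-like objects, so one way to read a CSV file that lives inside a zip-archive is to load its bytes through zipfile.Path and hand them over in a buffer. The sketch below shows that idea only; read_csv_from is a hypothetical name and the library's wrapper may work differently.

```python
import io
import zipfile

import pandas as pd


def read_csv_from(path):
    """Hypothetical stand-in: read a CSV from a pathlib.Path or a zipfile.Path."""
    # Both path types provide read_bytes(), so no special-casing is needed.
    return pd.read_csv(io.BytesIO(path.read_bytes()))


# Build a small in-memory archive holding one CSV file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("raw_1.csv", "t,g\n0,9.81\n1,9.82\n")

frame = read_csv_from(zipfile.Path(zipfile.ZipFile(buffer), "raw_1.csv"))
print(frame.shape)  # (2, 2)
```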

read_csv_files(paths)

Read CSV files in parallel
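
How the parallelism is achieved is not stated. As one possible approach, the sketch below uses a thread pool from the standard library; read_many and the max_workers parameter are hypothetical, and the real function may use processes or another mechanism.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import pandas as pd


def read_many(paths, max_workers=4):
    """Hypothetical stand-in: read CSV files concurrently, results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the inputs.
        return list(pool.map(pd.read_csv, paths))


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(3):
        path = Path(tmp) / f"raw_{i}.csv"
        path.write_text(f"value\n{i}\n")
        paths.append(path)
    frames = read_many(paths)

print([int(f["value"].iloc[0]) for f in frames])  # [0, 1, 2]
```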

sort_paths_numerically(paths)

Sort file paths by trailing number

Examples:

>>> sort_paths_numerically(["raw_10.csv", "raw_11.csv", "raw_9.csv"])
['raw_9.csv', 'raw_10.csv', 'raw_11.csv']
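
A plain lexicographic sort would place raw_10.csv before raw_9.csv; sorting on the integer value of the trailing number avoids that. The sketch below is an illustration with assumed details: sort_by_trailing_number is a hypothetical name, the regular expression only handles a single file extension, and paths without a trailing number are sorted first.

```python
import re


def sort_by_trailing_number(paths):
    """Hypothetical stand-in: sort by the number just before the file extension."""

    def trailing_number(path):
        match = re.search(r"(\d+)(?:\.\w+)?$", str(path))
        return int(match.group(1)) if match else -1  # no number: sort first

    return sorted(paths, key=trailing_number)


print(sorted(["raw_10.csv", "raw_11.csv", "raw_9.csv"]))
# ['raw_10.csv', 'raw_11.csv', 'raw_9.csv']
print(sort_by_trailing_number(["raw_10.csv", "raw_11.csv", "raw_9.csv"]))
# ['raw_9.csv', 'raw_10.csv', 'raw_11.csv']
```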

to_path(input_path)

Convert to pathlib.Path or zipfile.Path

Examples:

>>> to_path("path/to/filename.txt")
PosixPath('path/to/filename.txt')
>>> to_path("path/to/archive.zip")
zipfile.Path('path/to/archive.zip', '')
>>> to_path("path/to/archive.zip/testfile.txt")
zipfile.Path('path/to/archive.zip', 'testfile.txt')

Parameters:

Name        Type                               Description  Default
input_path  str | pathlib.Path | zipfile.Path  Input path.  required

Returns:

Type                         Description
pathlib.Path | zipfile.Path  Output path.
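
The examples suggest that a path is split at the first component ending in .zip, with the remainder becoming the location inside the archive. The following sketch implements that reading under stated assumptions: as_path is a hypothetical name, the suffix test is a guess at the detection rule, and the archive must exist because zipfile.Path opens it.

```python
import tempfile
import zipfile
from pathlib import Path


def as_path(input_path):
    """Hypothetical stand-in: zipfile.Path if a component ends in .zip, else pathlib.Path."""
    path = Path(input_path)
    parts = path.parts
    for i, part in enumerate(parts):
        if part.lower().endswith(".zip"):
            archive = Path(*parts[: i + 1])
            inner = "/".join(parts[i + 1:])  # zip members always use forward slashes
            return zipfile.Path(archive, inner)
    return path


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "archive.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("testfile.txt", "hello")
    inner = as_path(archive / "testfile.txt")
    text = inner.read_text(encoding="utf-8")
    plain = as_path(Path(tmp) / "filename.txt")

print(isinstance(inner, zipfile.Path), text)  # True hello
print(isinstance(plain, Path))                # True
```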