
Handling processed AQG data

Once the raw data has been processed and converted into an AQGDataset instance, all further visualization and analysis are carried out on this object.

General routine for processed AQG data

  1. processing of the time series data has been finished
  2. the final result (mean absolute gravity value) can be obtained
  3. additionally, several aspects of the data can be explored and visualized (see the list below)
  4. the data is archived in a reproducible way (as a netCDF file with metadata for reproducibility)

Functionalities offered by the AQGDataset class

  • retrieve processed gravity time series signal
  • calculate final absolute mean value
  • transfer the gravity value to a different height
  • calculate Allan Deviation
  • compare AQG time series to reference data time series
  • resample time series data
  • archive data (as NetCDF)
  • plot Allan Deviation
  • plot full gravity signal
  • calculate statistics (SEM, SD_h, TAV, sensitivity)
  • generate measurement report as PDF
  • plot all available variables (after manual extraction from the class)

Read processed data from an .nc-file

read_aqg_dataset() uses xarray.open_dataset() to create an xarray.Dataset instance, and then wraps it as an AQGDataset.

from gravitools.aqg import read_aqg_dataset
dataset = read_aqg_dataset("../data/test.nc")
dataset
<AQGDataset: 20240620_163341>

Save to file

AQGDataset.to_nc() exports the dataset as NETCDF4, a binary data format. The resulting file contains the full signal of all data columns and all relevant metadata. It is also transparently compressed.

dataset.to_nc("../data/test.nc")

Accessing dataset properties and metadata

AQGDataset is a wrapper around an xarray.Dataset, which is advantageous when reading data from file since individual columns can be loaded on demand. The AQGDataset.ds attribute can also be accessed directly, though this is usually not needed. All metadata is contained in the ds dataset and can be accessed via AQGDataset.ds.attrs, or equivalently through the AQGDataset.metadata property.

dataset.ds.attrs["software"]
'1.7.4'
dataset.metadata["software"]
'1.7.4'

List of properties

For all available class properties, please see the corresponding API documentation. They can be accessed via dataset.<property_name>.

Accessing dataset variables

For individual visualization, extraction or processing, all available time series signals can be extracted from the xarray.Dataset. The list of available variables can be shown:

dataset.ds.data_vars
Data variables:
    gps_timestamp                      (utc) float64 391kB ...
    gps_state                          (utc) int64 391kB ...
    ref_time_mode                      (utc) object 391kB 'S' 'S' ... 'S' 'S'
    g_raw                              (utc) int64 391kB ...
    dg_tilt                            (utc) float64 391kB ...
    dg_pressure                        (utc) float64 391kB ...
    dg_earth_tide                      (utc) int64 391kB ...
    dg_polar                           (utc) float64 391kB ...
    atmospheric_pressure               (utc) float64 391kB ...
    external_temperature               (utc) float64 391kB ...
    sensor_head_temperature            (utc) float64 391kB ...
    vacuum_chamber_temperature         (utc) float64 391kB ...
    tiltmeter_temperature              (utc) float64 391kB ...
    x_tilt                             (utc) float64 391kB ...
    y_tilt                             (utc) float64 391kB ...
    sat_abs_offset_correction          (utc) int64 391kB ...
    sat_abs_offset_correction_applied  (utc) int64 391kB ...
    raw_quartz_frequency               (utc) float64 391kB ...
    cooler_phi                         (utc) float64 391kB ...
    cooler_eta                         (utc) float64 391kB ...
    rep_phi                            (utc) float64 391kB ...
    rep_eta                            (utc) float64 391kB ...
    atoms_number                       (utc) int64 391kB ...
    laser_temperature                  (utc) float64 391kB ...
    is_outlier                         (utc) bool 49kB ...
    g                                  (utc) float64 391kB ...
dataset.get("g_raw")
utc
2024-06-20 16:33:41.790999889    9808372019
2024-06-20 16:33:42.332000017    9808372002
2024-06-20 16:33:42.871999979    9808372150
2024-06-20 16:33:43.411000013    9808372094
2024-06-20 16:33:43.950999975    9808372088
                                    ...    
2024-06-20 23:59:57.690000057    9808372158
2024-06-20 23:59:58.230000019    9808372119
2024-06-20 23:59:58.769999981    9808372203
2024-06-20 23:59:59.309999943    9808372333
2024-06-20 23:59:59.848999977    9808372214
Name: g_raw, Length: 48859, dtype: int64

Viewing the gravity time series

Usually it is not necessary or meaningful to consider individual drops because the variance is very high and, due to the nature of the measurement principle, successive drops are correlated. For this reason, AQGDataset provides two methods for accessing the gravity signal:

AQGDataset.full_g(): The full gravity signal, showing all drops at 0.54 s intervals. Note that this does not follow a white noise distribution.

AQGDataset.g(): The gravity signal downsampled to 10 second interval means. If there are no interruptions to the measurement, a 10 second interval contains 18 or 19 drops. Importantly, on this time scale, the signal does follow a white noise distribution.

# Configure Pandas display float precision
import pandas as pd
pd.options.display.float_format = '{:.1f}'.format
dataset.full_g()
GravityTimeSeries: 20240620_163341
{'vgg': Gradient(value=-3287.0, error=0), 'height': 0.0}
                                         y
utc                                       
2024-06-20 16:33:41.790999889 9808373525.6
2024-06-20 16:33:42.332000017 9808373508.7
2024-06-20 16:33:42.871999979 9808373656.6
2024-06-20 16:33:43.411000013 9808373600.7
2024-06-20 16:33:43.950999975 9808373594.7
...                                    ...
2024-06-20 23:59:57.690000057 9808373558.5
2024-06-20 23:59:58.230000019 9808373519.9
2024-06-20 23:59:58.769999981 9808373603.8
2024-06-20 23:59:59.309999943 9808373733.8
2024-06-20 23:59:59.848999977 9808373615.2

[48690 rows x 1 columns]
dataset.g()
GravityTimeSeries: 20240620_163341
{'vgg': Gradient(value=-3287.0, error=0), 'height': 0.0}
                               y  y_err  y_count
utc                                             
2024-06-20 16:33:40          NaN   22.9        6
2024-06-20 16:33:50 9808373396.7   40.9       19
2024-06-20 16:34:00 9808373739.8   25.0       18
2024-06-20 16:34:10 9808373609.3   52.7       19
2024-06-20 16:34:20 9808373557.8   70.4       19
...                          ...    ...      ...
2024-06-20 23:59:20 9808373165.9   16.3       19
2024-06-20 23:59:30 9808373276.3   28.3       18
2024-06-20 23:59:40 9808373422.4   37.6       19
2024-06-20 23:59:50 9808373267.2   46.0       19
2024-06-21 00:00:00          NaN   86.3        9

[2679 rows x 3 columns]
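The aggregation shown above can be reproduced with plain pandas on any drop-level series. This is an illustrative sketch with synthetic data, not the library's internal code; the column names y, y_err and y_count simply mirror those in the output above.

```python
import numpy as np
import pandas as pd

# Synthetic drop-level gravity signal at the 0.54 s drop rate
rng = np.random.default_rng(0)
t = pd.date_range("2024-06-20 16:33:41", periods=200, freq="540ms")
drops = pd.Series(9808373500 + rng.normal(0, 60, size=len(t)), index=t, name="g")

# Aggregate into 10 s bins: mean, standard error of the mean, and drop count
binned = drops.resample("10s").agg(["mean", "sem", "count"])
binned.columns = ["y", "y_err", "y_count"]
print(binned.head())
```

Note that bins at the edges of the measurement contain fewer drops, just as in the real dataset above.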

You can obtain the gravity signal at a different transfer height by transferring this result (dataset.g().transfer(h=0)) or by providing a height argument (dataset.g(h=0)).
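The transfer itself is a linear correction using the vertical gravity gradient (vgg, here in nm/s² per metre, as shown in the GravityTimeSeries header above): g(h) = g(h₀) + vgg · (h − h₀). A minimal sketch of that arithmetic, assuming this linear convention (illustrative only, not the library's implementation):

```python
def transfer_g(g, h_from, h_to, vgg=-3287.0):
    """Transfer a gravity value (nm/s²) from height h_from to h_to (m),
    assuming a constant vertical gravity gradient vgg in nm/s² per metre."""
    return g + vgg * (h_to - h_from)

# Moving the value 1 m up lowers g by about 3287 nm/s² (vgg is negative)
g0 = 9808373441.9  # value at 0.0 m, as in the example above
print(transfer_g(g0, h_from=0.0, h_to=1.0))
```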

When further downsampling to 10 minute intervals, for instance, the errors (standard error of the mean) should accurately represent the statistical uncertainty.

dataset.g().resample("10min")
GravityTimeSeries: 20240620_163341
{'vgg': Gradient(value=-3287.0, error=0), 'height': 0.0}
                               y  y_err  y_count
utc                                             
2024-06-20 16:30:00 9808373506.7   51.7        7
2024-06-20 16:40:00 9808373455.7   19.5       56
2024-06-20 16:50:00 9808373425.0   22.0       60
2024-06-20 17:00:00 9808373455.7   16.7       60
2024-06-20 17:10:00 9808373465.0   18.3       60
2024-06-20 17:20:00 9808373487.8   17.2       50
2024-06-20 17:30:00 9808373446.3   21.9       60
2024-06-20 17:40:00 9808373438.4   17.1       60
2024-06-20 17:50:00 9808373438.0   14.9       60
2024-06-20 18:00:00 9808373412.2   17.6       60
2024-06-20 18:10:00 9808373418.2   18.3       60
2024-06-20 18:20:00 9808373394.1   19.7       60
2024-06-20 18:30:00 9808373425.3   18.8       60
2024-06-20 18:40:00 9808373485.1   18.7       60
2024-06-20 18:50:00 9808373387.6   16.0       55
2024-06-20 19:00:00 9808373469.7   16.4       60
2024-06-20 19:10:00 9808373406.8   19.3       60
2024-06-20 19:20:00 9808373401.9   17.1       60
2024-06-20 19:30:00 9808373418.5   18.9       60
2024-06-20 19:40:00 9808373448.6   17.4       58
2024-06-20 19:50:00 9808373493.5   17.8       60
2024-06-20 20:00:00 9808373480.7   14.1       60
2024-06-20 20:10:00 9808373462.0   15.6       60
2024-06-20 20:20:00 9808373426.4   17.8       50
2024-06-20 20:30:00 9808373440.2   16.7       60
2024-06-20 20:40:00 9808373514.6   17.7       60
2024-06-20 20:50:00 9808373492.9   18.6       60
2024-06-20 21:00:00 9808373437.1   17.5       58
2024-06-20 21:10:00 9808373420.2   19.0       60
2024-06-20 21:20:00 9808373413.2   18.4       60
2024-06-20 21:30:00 9808373426.9   23.2       60
2024-06-20 21:40:00 9808373435.2   17.2       60
2024-06-20 21:50:00 9808373427.7   16.6       55
2024-06-20 22:00:00 9808373456.1   18.0       60
2024-06-20 22:10:00 9808373422.9   20.9       60
2024-06-20 22:20:00 9808373424.0   18.3       60
2024-06-20 22:30:00 9808373453.9   19.0       60
2024-06-20 22:40:00 9808373486.4   18.2       60
2024-06-20 22:50:00 9808373473.6   15.1       60
2024-06-20 23:00:00 9808373433.7   16.9       60
2024-06-20 23:10:00 9808373433.4   19.1       60
2024-06-20 23:20:00 9808373484.8   18.7       50
2024-06-20 23:30:00 9808373485.6   17.6       60
2024-06-20 23:40:00 9808373393.7   16.1       60
2024-06-20 23:50:00 9808373384.0   17.9       60
2024-06-21 00:00:00 9808373349.8   26.0       30

Naturally, some intervals contain fewer drops and their statistical error can therefore be much higher. You can exclude these intervals by requiring, for instance, at least 50% of the maximum number of drops within an interval.

interval = "10min"
g = dataset.g().resample(interval, min_count=0.5)
g
GravityTimeSeries: 20240620_163341
{'vgg': Gradient(value=-3287.0, error=0), 'height': 0.0}
                               y  y_err  y_count
utc                                             
2024-06-20 16:30:00          NaN   51.7        7
2024-06-20 16:40:00 9808373455.7   19.5       56
2024-06-20 16:50:00 9808373425.0   22.0       60
2024-06-20 17:00:00 9808373455.7   16.7       60
2024-06-20 17:10:00 9808373465.0   18.3       60
2024-06-20 17:20:00 9808373487.8   17.2       50
2024-06-20 17:30:00 9808373446.3   21.9       60
2024-06-20 17:40:00 9808373438.4   17.1       60
2024-06-20 17:50:00 9808373438.0   14.9       60
2024-06-20 18:00:00 9808373412.2   17.6       60
2024-06-20 18:10:00 9808373418.2   18.3       60
2024-06-20 18:20:00 9808373394.1   19.7       60
2024-06-20 18:30:00 9808373425.3   18.8       60
2024-06-20 18:40:00 9808373485.1   18.7       60
2024-06-20 18:50:00 9808373387.6   16.0       55
2024-06-20 19:00:00 9808373469.7   16.4       60
2024-06-20 19:10:00 9808373406.8   19.3       60
2024-06-20 19:20:00 9808373401.9   17.1       60
2024-06-20 19:30:00 9808373418.5   18.9       60
2024-06-20 19:40:00 9808373448.6   17.4       58
2024-06-20 19:50:00 9808373493.5   17.8       60
2024-06-20 20:00:00 9808373480.7   14.1       60
2024-06-20 20:10:00 9808373462.0   15.6       60
2024-06-20 20:20:00 9808373426.4   17.8       50
2024-06-20 20:30:00 9808373440.2   16.7       60
2024-06-20 20:40:00 9808373514.6   17.7       60
2024-06-20 20:50:00 9808373492.9   18.6       60
2024-06-20 21:00:00 9808373437.1   17.5       58
2024-06-20 21:10:00 9808373420.2   19.0       60
2024-06-20 21:20:00 9808373413.2   18.4       60
2024-06-20 21:30:00 9808373426.9   23.2       60
2024-06-20 21:40:00 9808373435.2   17.2       60
2024-06-20 21:50:00 9808373427.7   16.6       55
2024-06-20 22:00:00 9808373456.1   18.0       60
2024-06-20 22:10:00 9808373422.9   20.9       60
2024-06-20 22:20:00 9808373424.0   18.3       60
2024-06-20 22:30:00 9808373453.9   19.0       60
2024-06-20 22:40:00 9808373486.4   18.2       60
2024-06-20 22:50:00 9808373473.6   15.1       60
2024-06-20 23:00:00 9808373433.7   16.9       60
2024-06-20 23:10:00 9808373433.4   19.1       60
2024-06-20 23:20:00 9808373484.8   18.7       50
2024-06-20 23:30:00 9808373485.6   17.6       60
2024-06-20 23:40:00 9808373393.7   16.1       60
2024-06-20 23:50:00 9808373384.0   17.9       60
2024-06-21 00:00:00 9808373349.8   26.0       30
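The effect of the min_count argument can be reproduced with plain pandas by masking the mean of any interval whose drop count falls below the given fraction of the maximum count. This is a sketch under that assumption; the library may implement the filter differently.

```python
import numpy as np
import pandas as pd

# Synthetic 10 min means with two gappy intervals
idx = pd.date_range("2024-06-20 16:30:00", periods=6, freq="10min")
g10 = pd.DataFrame({"y": 9808373400.0 + np.arange(6),
                    "y_count": [7, 19, 18, 19, 5, 19]}, index=idx)

# Mask intervals with fewer than 50 % of the maximum drop count
min_count = 0.5
threshold = min_count * g10["y_count"].max()
g10.loc[g10["y_count"] < threshold, "y"] = np.nan
print(g10)
```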

Obtain mean gravity value

Calculate the dataset's mean gravity value (with a statistical error) at the standard transfer height (gravitools.utils.STANDARD_HEIGHT).

g_mean = dataset.mean()
g_mean
AbsGValue(value=np.float64(9808373441.901157), error=np.float64(19.447983338980816), height=0.0, vgg=Gradient(value=-3287.0, error=0))
str(g_mean)
'9,808,373,441.9 ± 19.4 nm/s² (0.0 m)'
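The returned error is the statistical uncertainty of the mean. One common way to combine interval means with individual errors into such a value is an inverse-variance weighted mean; the sketch below illustrates the idea only and is not necessarily what AQGDataset.mean() does internally.

```python
import numpy as np

def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its standard error."""
    values = np.asarray(values, dtype=float)
    w = 1.0 / np.asarray(errors, dtype=float) ** 2
    mean = np.sum(w * values) / np.sum(w)
    err = np.sqrt(1.0 / np.sum(w))
    return mean, err

# With equal errors this reduces to the plain mean with error sigma / sqrt(N)
m, e = weighted_mean([10.0, 12.0, 14.0], [2.0, 2.0, 2.0])
print(m, e)
```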

The mean gravity at a different height can be obtained by transferring this value or by supplying a custom height to the mean() method.

str(g_mean.transfer(height=0))
'9,808,373,441.9 ± 19.4 nm/s² (0 m)'
str(dataset.mean(h=0))
'9,808,373,441.9 ± 19.4 nm/s² (0 m)'

Calculating statistics of the measurement

Calculate interesting standard statistics, which are also included in the measurement report.

dataset.get_statistics()
(np.float64(2.7663289116715513),
 np.float64(19.447983338980816),
 16.58154602293323,
 np.float64(479.71594261068253))
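Several of these statistics are related to the Allan deviation, which the class can also calculate and plot. For orientation, here is a minimal non-overlapping Allan deviation for an evenly sampled series; this is an illustrative sketch, not the library's implementation.

```python
import numpy as np

def allan_deviation(y, m):
    """Non-overlapping Allan deviation of series y for cluster size m samples:
    sigma^2(tau) = 0.5 * <(ybar_{k+1} - ybar_k)^2> over adjacent cluster means."""
    y = np.asarray(y, dtype=float)
    n = len(y) // m
    means = y[: n * m].reshape(n, m).mean(axis=1)
    return np.sqrt(0.5 * np.mean(np.diff(means) ** 2))

# Alternating signal: adjacent one-sample means differ by 1, so ADEV = sqrt(0.5)
print(allan_deviation([0, 1, 0, 1, 0, 1], m=1))  # ≈ 0.7071
```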

Visualizing the data

Set optional Matplotlib styling parameters

import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = 8, 4
plt.rcParams["date.autoformatter.hour"] = "%H:%M\n%d %b"
plt.rcParams["axes.ymargin"] = 0.2

The GravityTimeSeries class provides a plot() method, which uses pyplot.errorbar() to create the graph.

g.plot()
<ErrorbarContainer object of 3 artists>

png

You can add more elements to the plot and customize it, just like any other matplotlib plot.

plt.figure()
g.plot("x", color="green", label=f"{interval} means", capsize=3)
plt.axhline(float(g_mean) - g.y0, label="Mean", color="k", linestyle="--")
plt.grid(color="lightgrey")
plt.title(dataset.name)
plt.legend(ncols=2)
plt.show()

png

Plot the gravity signal relative to the mean gravity

g.plot(y0=float(g_mean))
plt.show()

png

Create a stacked plot of the gravity signal and another data column

fig, axs = plt.subplots(nrows=2, sharex=True)
fig.subplots_adjust(hspace=0)
axs[0].set_title(dataset.name)
g.plot(ax=axs[0], y0=float(g_mean), label=f"{interval} means")
axs[1].plot(dataset.get("dg_pressure"), label="Pressure correction")
for ax in axs:
    ax.set_ylabel(r"$\Delta g$ [nm/s²]")
    ax.legend(loc="upper left")
plt.show()

png

Create plots of all available time series

Below is a generic way to plot any of the provided time series. The list of available variables can be shown as explained above. A relevant subset of these variable names (any of them can replace the name used in the example below):

  • g
  • g_raw
  • dg_earth_tide
  • dg_tilt
  • dg_pressure
  • dg_polar
  • atoms_number
# obtain data
timeseries_parameter = dataset.get("dg_earth_tide")

# plot data
plt.plot(timeseries_parameter)
[<matplotlib.lines.Line2D at 0x7f494ee857f0>]

png

Create a standardized measurement report as PDF file

The AQGDataset class can generate a standardized PDF report of the measurement. This report is intended to include all relevant information as well as standard plots of many important parameters.

dataset.save_report("../data/AQG_measurementReport.pdf")
<Figure size 1360x600 with 0 Axes>


