API Documentation

PAAT - Physical Activity Analysis Toolbox

The physical activity analysis toolbox (PAAT) is a comprehensive toolbox to analyze raw acceleration data. We developed all code mainly for analyzing ActiGraph data (GT3X files) in large sample study settings where manual annotation and analysis is not feasible.

This package has been developed and is maintained by researchers at UiT - the arctic university of Norway and was supported by the High Northern Population Studies, an interdisciplinary initiative to improve the health of future generations. The purpose of this package is to make our research on raw accelerometry easier accessible to other researchers. Most methods implemented in this package have been described in scientific papers which are usually cited in the function’s description. If you are using any of these methods in your research, we would be grateful if you cite the corresponding original paper(s).

This toolbox is divided into different modules:

Input & Output Module (paat.io)
Calibration Module (paat.calibration)
Preprocessing Module (paat.preprocessing)
Features Module (paat.features)
Wear Time Module (paat.wear_time)
Sleep Module (paat.sleep)
Estimates Module (paat.estimates)

The most important functions are also directly call-able from the module’s top level to increase usability. However, when designing applications based on PAAT, it might be cleaner to call the functions directly from the module.

Input & Output Module

paat.io provides loading and saving functions to process the raw acceleration signals.

paat.io.read_gt3x(file, rescale=True, pandas=True, metadata=False, use_pygt3x=False)[source]

Reads a .gt3x file and returns the tri-axial acceleration values together with the corresponding time stamps and all meta data.

Parameters:

filestring: file location of the .gt3x file
rescaleboolean (optional): boolean indicating whether raw acceleration data should be rescaled to g values
pandasboolean (optional): boolean indicating whether the data should be returned as a pandas DataFrame
metadataboolean (optional): boolean indicating whether the full metadata should be returned
use_pygt3xboolean (optional): boolean indicating whether to use ActiGraph’s Pygt3x library to read the file.

Returns:

dataDataFrame: a DataFrame containg the raw acceleration data
sample_freqint: the sampling frequency in which the data was recorded
timenp.array (n_samples x 1): a numpy array with time stamps for the observations in values
valuesnp.array (n_samples x 3): a numpy array with the tri-axial acceleration values. If rescale is true, data is rescaled to units of g. Note, that this function returns the values in the default order of ActiGraph which is [‘Y’,’X’,’Z’]. Depending on further use, you might want to adjust that order.
metadict: a dict containing all meta data produced by ActiGraph

paat.io.read_metadata(file)[source]

Reads the metadata from a .gt3x file.

Parameters:

filestring: file location of the .gt3x file

Returns:

metadict: a dict containing all meta data produced by ActiGraph

Sensor Calibration Module

paat.calibation provides functions to calibrate the raw acceleration signal.

paat.calibration.calibrate(acc, scale, offset)[source]

Calibrates the acceleration data based on the scale and offset variables.

Parameters:

accarray_like: numpy array with acceleration data
scalearray_like: numpy array with the scale factors
offsetarray_like: numpy array with the offset factors

Returns:

accarray_like: numpy array with calibrated acceleration data

paat.calibration.estimate_calibration_coefficents(acc)[source]

Warning

This function is not implemented yet

Estimates the calibration correction coefficients based on the method proposed by Van Hees et al. (2014)

Parameters:

accarray_like: numpy array with acceleration data

Returns:

scalearray_like: numpy array with the scale factors
offsetarray_like: numpy array with the offset factors

References

Van Hees, V. T., Fang, Z., Langford, J., Assah, F., Mohammad, A., da Silva, I. C. M., Trenell, M. I., White, T., Wareham, N. J., & Brage, S. (2014). Autocalibration of accelerometer data for free-living physical activity assessment using local gravity and temperature: An evaluation on four continents. Journal of Applied Physiology, 117(7), 738–744. https://doi.org/10.1152/japplphysiol.00421.2014

Preprocessing Module

paat.preprocessing provides functions to process the raw acceleration signals.

paat.preprocessing.resample(data, from_hz, to_hz, index, verbose)[source]

Resample data from_hz to to_hz

Parameters:

data: np.array(n_samples, 1): numpy array with single column
from_hz: int: original sample frequency of the data (this is usually the frequency the device was set to during initialization)
to_hz: int: the sampling frequency to convert to.
index: int: column index. Is used when use_parallel is set to True and the index is then used to know which column index is being returned.
verbosebool (optional): if set to True, then output debug messages to console and log file.

Returns:

index: int: column index, see above
new_data: np.array(n_samples, 1): new numpy array with resampled acceleration data

paat.preprocessing.resample_acceleration(data, from_hz, to_hz, use_parallel=False, num_jobs=2, verbose=False)[source]

Resample acceleration data to different frequency. For example, convert 100hz data to 30hz data. Enables upsampling (from lower to higher frequency), or downsampling (from higher to lower frequency)

Uses the resampy python module. see: https://github.com/bmcfee/resampy

Used in this paper: Smith, Julius O. Digital Audio Resampling Home Page Center for Computer Research in Music and Acoustics (CCRMA), Stanford University, 2015-02-23. Web published at http://ccrma.stanford.edu/~jos/resample/.

Parameters:

datanp.array: numpy array with acceleration data, can be more than one dimension
from_hzint: original sample frequency of the data (this is usually the frequency the device was set to during initialization)
to_hzint: the sampling frequency to convert to.
use_parallelBool (optional): if set to True, then individual axis will be processed in parallel to speed up computational time. Defaults to False
num_jobsint (optional): if ‘use_parallel’ is set to True, then ‘num_jobs’ defines how many parallel jobs are executed at the same time. This typically is the number of hyperthreads. Also note that for triaxial data, even if n_jobs > 3 axes, it can only process 3 at the same time.
verbosebool (optional): if set to True, then output debug messages to console and log file.

Returns:

new_datanp.array: new numpy array with resampled acceleration data

paat.preprocessing.rescale(acceleration, acceleration_scale=256.0)[source]

Rescale raw acceleration data to g values

Parameters:

accelerationnp.array(): array with YXZ acceleration data (in integers otherwise no scaling required)
acceleration_scalefloat (optional): value to scale the acceleration

Returns:

scaled_log_datanp.array(): log_data scaled by acceleration scale

Features Module

paat.features provides functions to compute various features from the raw acceleration signal.

paat.features.calculate_actigraph_counts(data, sample_freq, epoch_length)[source]

Wrapper function to create ActiGraph counts. A companion paper has been submitted by ActiGraph and will be referenced here as soon as it is published.

Parameters:

data: array_like: a numpy array containing the uniaxial acceleration data
sample_freq: int: an int indicating at which sampling frequency the data was recorded
epoch_length: str: a string indicating the length of the epochs to calculate

Returns:

counts: array_like: a numpy array containg the ActiGraph counts

paat.features.calculate_brond_counts(data, sample_freq, epoch_length)[source]

Create Brønd counts from uniaxial acceleration data. The algorithm was described in

Brønd, J. C., Andersen, L. B., & Arvidsson, D. (2017). Generating ActiGraph Counts from Raw Acceleration Recorded by an Alternative Monitor, Medicine & Science in Sports & Exercise, doi: 10.1249/MSS.0000000000001344

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
sample_freqint: an int indicating at which sampling frequency the data was recorded
epoch_length: int: an int indicating the length of the epochs to calculate in seconds

Returns:

counts: DataFrame: a DataFrame containg the Brønd counts

paat.features.calculate_enmo(data, dtype=<class 'numpy.float32'>)[source]

Calculate the Euclidean norm minus one from raw acceleration data. This function is a wrapper of calculate_vector_magnitude.

Parameters:

dataarray_like: numpy array with acceleration data
dtypenp.dtype (optional): set the data type of the return array. Standard float 32, but can be set to better precision

Returns:

vector_magnitudenp.array (acceleration values, 1)(np.float): numpy array with the Eucledian Norm Minus One (ENMO) of the acceleration

paat.features.calculate_frequency_features(data, win_len=60, win_step=60, sample_rate=100, nfft=512, nfilt=40)[source]

Calculate frequency features from raw acceleration signal.

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
win_lenint (optional): an int indicating the window length in seconds
win_stepint (optional): an int indicating the step size between windows in seconds
sample_ratefloat (optional): a float indicating the sampling rate in Hz
nfft: int (optional): an int indicating the number of points for the Fourier transform
nfilt: int (optional): an int indicating the number of triangular filters to use

Returns:

timenp.array (n_samples x 1): a numpy array with time stamps for the observations in values
featuresnp.array (n_samples x 160): a numpy array with the 160-dimensional feature vector per time step

paat.features.calculate_mad(data, freq='6s')[source]

Calculate the mean amplitude deviation (MAD) of the raw acceleration signal based on Vähä-Ypyä et al. (2015), by

\[MAD = \frac{1}{n} \sum_i | r_i - \bar{r} |\]

with

\[r = \sqrt{y^2 + x^2 + z^2}\]

Parameters:

dataarray_like: numpy array with acceleration data
freqstr: the sampling frequency on which the MAD values should be calculated

Returns:

madnp.array (acceleration values, 1)(np.float): numpy array with the Mean Amplitude Deviation (MAD) of the acceleration

References

Vähä-Ypyä, H., Vasankari, T., Husu, P., Suni, J., & Sievänen, H. (2015). A universal, accurate intensity-based classification of different physical activities using raw data of accelerometer. Clinical Physiology and Functional Imaging, 35(1), 64–70. https://doi.org/10.1111/cpf.12127

paat.features.calculate_vector_magnitude(data, minus_one=False, round_negative_to_zero=False, dtype=<class 'numpy.float32'>)[source]

Calculate the vector magnitude of the acceleration data.

Parameters:

dataarray_like: numpy array with acceleration data
minus_oneBoolean (optional): If set to True, the calculate the vector magnitude minus one, also known as the ENMO (Euclidian Norm Minus One)
round_negative_to_zeroBoolean (optional): If set to True, round negative values to zero
dtypenp.dtype (optional): set the data type of the return array. Standard float 32, but can be set to better precision

Returns:

vector_magnitudenp.array (acceleration values, 1)(np.float): numpy array with vector magnitude of the acceleration

Notes

The vector magnitude of the acceleration is calculated as the Euclidian Norm.

\[\sqrt{y^2 + x^2 + z^2}\]

if minus_one is set to True then it it is the Euclidian Norm Minus One.

\[\sqrt{y^2 + x^2 + z^2} - 1\]

if round_negative_to_zero all negative values are clipped.

Wear Time Module

paat.wear_time provides functions to infer wear and non wear times from raw acceleration signals.

paat.wear_time.detect_non_wear_time_hees2011(data, sample_freq, min_non_wear_time_window=60, window_overlap=15, std_mg_threshold=3.0, std_min_num_axes=2, value_range_mg_threshold=50.0, value_range_min_num_axes=2)[source]

Estimation of non-wear time periods based on Van Hees et al. (2011, 2013)

Accelerometer non-wear time was estimated on the basis of the standard deviation and the value range of each accelerometer axis, calculated for consecutive blocks of 30 minutes. A block was classified as non-wear time if the standard deviation was less than 3.0 mg (1 mg = 0.00981 m·s−2) for at least two out of the three axes or if the value range, for at least two out of three axes, was less than 50 mg.

Important to note that the default encoding for this function of non-wear time = 0, and that of wear time is 1.

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
sample_freqint: sample frequency in hertz. Indicates the number of samples per 1 second. Default to 100 for 100hz. The sample frequency is necessary to know how many samples there are in a specific window. So let’s say we have a window of 15 minutes, then there are hz * 60 * 15 samples
min_non_wear_time_window: int (optional): minimum window length in minutes to be classified as non-wear time
window_overlap: int (optional): basically the sliding window that progresses over the acceleration data. Defaults to 15 minutes.
std_mg_threshold: float (optional): standard deviation threshold in mg. Acceleration axes values below or equal this threshold can be considered non-wear time. Defaults to 3.0g. Note that within the code we convert mg to g.
std_min_num_axes: int (optional): minimum numer of axes used to check if acceleration values are below the std_mg_threshold value. Defaults to 2 axes; meaning that at least 2 axes need to have values below a threshold value to be considered non wear time
value_range_mg_threshold: float (optional): value range threshold value in mg. If the range of values within a window is below this threshold (meaning that there is very little change in acceleration over time) then this can be considered non wear time. Default to 50 mg. Note that within the code we convert mg to g
value_range_min_num_axes: int (optional): minimum numer of axes used to check if acceleration values range are below the value_range_mg_threshold value. Defaults to 2 axes; meaning that at least 2 axes need to have a value range below a threshold value to be considered non wear time

Returns:

nw_vector: np.array(n_samples,): a numpy array indicating whether the values of the acceleration data are non-wear time

References

Van Hees, V. T., Renström, F., Wright, A., Gradmark, A., Catt, M., Chen, K. Y., Löf, M., Bluck, L., Pomeroy, J., Wareham, N. J., Ekelund, U., Brage, S., & Franks, P. W. (2011). Estimation of Daily Energy Expenditure in Pregnant and Non-Pregnant Women Using a Wrist-Worn Tri-Axial Accelerometer. PLOS ONE, 6(7), 7. https://doi.org/10.1371/journal.pone.0022922

Hees, V. T. van, Gorzelniak, L., León, E. C. D., Eder, M., Pias, M., Taherian, S., Ekelund, U., Renström, F., Franks, P. W., Horsch, A., & Brage, S. (2013). Separating Movement and Gravity Components in an Acceleration Signal and Implications for the Assessment of Human Daily Physical Activity. PLOS ONE, 8(4), e61691. https://doi.org/10.1371/journal.pone.0061691

paat.wear_time.detect_non_wear_time_naive(data, sample_freq, std_threshold, min_interval, use_vmu=False, min_segment_length=1, sliding_window=1)[source]

Calculate non-wear time from raw acceleration data by finding intervals in which the acceleration standard deviation is below a std_threshold value

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
sample_freqint: sample frequency of the data
std_threshold: int or float: the standard deviation threshold in g
min_interval: int or float: minimal interval to consider period as non-wear time
use_vmu: bool: indicates whether the algorithm should runon vector magnitude data
min_segment_length: int (optional): minimum length of the segment to be candidate for non-wear time (default 1 minutes, so any shorter segments will not be considered non-wear time)
sliding_window: int (optional): sliding window in minutes that will go over the acceleration data to find candidate non-wear segments

Returns:

nw_vector: np.array(n_samples,): a numpy array indicating whether the values of the acceleration data are non-wear time

paat.wear_time.detect_non_wear_time_syed2021(data, sample_freq, cnn_model_file=None, std_threshold=0.004, distance_in_min=5, episode_window_sec=7, edge_true_or_false=True, start_stop_label_decision='and', min_segment_length=1, sliding_window=1, verbose=False)[source]

Infer non-wear time from raw 100Hz triaxial data based on the method proposed by Syed et al. (2021). Data at different sample frequencies will be resampled to 100hz.

The steps are described in the paper but are summarized here too:

Detect candidate non-wear episodes:: Perform a forward pass through the raw acceleration signal and calculate the SD for each 1-minute interval and for each individual axis. If the standard deviation is <= 0.004 g for all axes, record this 1-minute interval as a candidate non-wear interval. After all 1-minute intervals have been processed, merge consecutive 1-minute intervals into candidate non-wear episodes and record their start and stop timestamps.
Merge bordering candidate non-wear episodes:: Merge candidate non-wear episodes that are no more than 5 minutes apart and record their new start and stop timestamps. This step is required to capture artificial movement that would typically break up two or more candidate non-wear episodes in close proximity.
Detect the edges of candidate non-wear episodes:: Perform a backward pass with a 1-second step size through the acceleration data from the start timestamp of a candidate non-wear episode and calculate the SD for each individual axis. The same is applied for the stop timestamps with a forward pass and a step size of 1 second. If the standard deviation of all axes is <= 0.004 g, include the 1-second interval into the candidate non-wear episode and record the new start or stop timestamp. Repeat until the standard deviation of the 1-second interval does not satisfy <= 0.004 g. As a result, the resolution of the edges is now recorded on a 1-second resolution.
Classifying the start and stop windows:: For each candidate non-wear episode, extract the start and stop segment with a window length of 3 seconds to create input features for the CNN classification model. For example, if a candidate non-wear episode has a start timestamp of tstart a feature matrix is created as (tstart – w , tstart) x 3 axes with w = 3 seconds, resulting in an input feature with dimensions (300 x 3) for 100Hz data. If both (i.e., logical ‘AND’) start and stop features are classified (through the CNN model) as non-wear time, the candidate non-wear episode can be considered true non-wear time. If tstart is at t = 0, or tend is at the end of the acceleration data—meaning that those candidate non-wear episodes do not have a preceding or following window to extract features from—classify the start or stop as non-wear time by default.

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
sample_freqint: sample frequency of the data. The CNN model was trained for 100Hz of data. If the data is at a different sampling frequency it will be resampled to 100Hz
cnn_model_file: os.path (optional): file location of the trained CNN model. On default, the corresponding pretrained model is used.
std_threshold: float (optional): standard deviation threshold to find candidate non-wear episodes. Default 0.004 g
distance_in_min: int (optional): causes two nearby candidate non-wear episodes not more than ‘distance_in_min’ apart to be grouped/merged together. This results in capturing artificial movement that would otherwise break up a longer candidate non-wear time. Defaults to 5 minutes.
episode_window_sec: int (optional): length of the window to extract features from the start or the end of a candidate non-wear episode. If a non-wear episodes starts at time t, then a feature will be extracted from the raw data t-‘episode_window_sec’ to t. The same happens at the end of a candidate non-wear episode. So t-end untill t-end + ‘episode_window_sec’ Default to 7 seconds. Also note that a different value will need a different trained CNN model.
edge_true_or_false: Bool (optional): default classification if a candidate non-wear episode starts at the start of the acceleration data, so at t=0, or ends at the end of the acceleration data. In such cases, we can’t obtain the feature at t-‘episode_window_sec’ since there is no data before t=0. In these cases, the start or stop of the candidate non-wear episode will be defaulted to True (non-wear time) or False (wear-time). Default value is True
start_stop_label_decision: string (‘or’,’and’) (optional): Logical operator OR or AND to determine if a candidate non-wear episode should be considered non-wear time if only one side, either the start or the stop parts, is inferred as non-wear time, or if both sides need to be inferred as non-wear time for the candidate non-wear time to be considered true non-wear time. Default to AND, meaning that both the start and the stop parts of the candidate non-wear time need to be inferred as non-wear time to allow the candidate non-wear time to be true non-wear time. In all other cases, the candidate non-wear time is then wear-time.
min_segment_length: int (optional): minimum length of the segment to be candidate for non-wear time
sliding_window: int (optional): sliding window in minutes that will go over the acceleration data to find candidate non-wear segments
verbose: Bool (optional): set to True if debug messages should be printed to the console and log file. Default False.

Returns:

nw_vector: np.array(n_samples,): a numpy array indicating whether the values of the acceleration data are non-wear time

Notes

If the data is not 100hz, then it will be resampled to 100hz. However, how the inference of non-wear time is affected by this has not been investigated.
CNN models were trained with a hip worn accelerometer.

References

Syed, S., Morseth, B., Hopstock, L. A., & Horsch, A. (2021). A novel algorithm to detect non-wear time from raw accelerometer data using deep convolutional neural networks. Scientific Reports, 11(1), 8832. https://doi.org/10.1038/s41598-021-87757-z

Sleep Module

paat.sleep provides functions to detect periods of sleep in the raw acceleration signals.

paat.sleep.detect_time_in_bed_weitz2024(data, sample_freq, resampled_frequency='1min', means=None, stds=None, model=None)[source]

Infer time in bed from raw acceleration signal using the method of Weitz et al. (2025).

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
sample_freqint: the sampling frequency in which the data was recorded
resampled_frequencystr (optional): a str indicating to what frequency the data should be resampled. This depends on the model used to predict, defaults to 1min.
meansarray_like (optional): a numpy array with the channel means, will be calculated for the sample if not specified
stdsarray_like (optional): a numpy array with the channel stds, will be calculated for the sample if not specified
modelkeras.Model (optional): a loaded keras custom model.

Returns:

predicted_time_in_bednp.array (n_samples,): a numpy array indicating whether the values of the acceleration data were spent in bed

References

Weitz, M., Syed, S., Hopstock, L. A., Morseth, B., Henriksen, A., & Horsch, A. (2025). Automatic time in bed detection from hip-worn accelerometers for large epidemiological studies: The Tromsø Study. PLOS ONE, 20(5), e0321558. https://doi.org/10.1371/journal.pone.0321558

Estimates Module

paat.estimates provides functions to compute physical activity estimates from the raw acceleration signal.

paat.estimates.calculate_pa_levels(data, sample_freq, mvpa_cutpoint, sb_cutpoint, interval='1s')[source]

Calculate moderate to vigourous physical activity (MVPA) and sedentary behavior based on cutpoints (mvpa_cutpoint and sb_cutpoint) on 1s resolution. The algorithm works by

The Euclidian norm minus one (ENMO) is calculated from the triaxial signal

The ENMO is averaged over 1s epochs

These epochs are compared against the mvpa_cutpoint and the sb_cutpoint

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
sample_freqint: the sampling frequency in which the data was recorded
mvpa_cutpointfloat: a float indicating the cutpoint between light physical activity and moderate-to-vigourous activity
sb_cutpointfloat: a float indicating the cutpoint between light physical activity and sedentary behavior
intervalstr (optional): a str indicating at what frequency the cutpoints are calculated

Returns:

pa_levelsnp.array (n_samples, 2): a numpy array indicating whether the values of the acceleration data are moderate-to-vigourous physical activity (first column) or sedentary behavior (second column)

paat.estimates.create_activity_column(data, columns=['SB', 'MVPA', 'Sleep', 'Non Wear Time'])[source]

Merge the different activity columns into one label column.

Parameters:

dataDataFrame: a DataFrame containg the raw acceleration data
columnsarray_like: a list of activity columns in ascending order of importance. The order of the list implies which activity overrides which. E.g. the first entry would override the second in cases of doubt, etc.

Returns:

activity_vecarray_like: the merged activity vector with the names of columns as entries

References

Brønd, J. C., Andersen, L. B., & Arvidsson, D. (2017). Generating ActiGraph Counts from Raw Acceleration Recorded by an Alternative Monitor. Medicine & Science in Sports & Exercise, 49(11), 2351. https://doi.org/10.1249/MSS.0000000000001344

Syed, S., Morseth, B., Hopstock, L. A., & Horsch, A. (2021). A novel algorithm to detect non-wear time from raw accelerometer data using deep convolutional neural networks. Scientific Reports, 11(1), 8832. https://doi.org/10.1038/s41598-021-87757-z

Van Hees, V. T., Fang, Z., Langford, J., Assah, F., Mohammad, A., da Silva, I. C. M., Trenell, M. I., White, T., Wareham, N. J., & Brage, S. (2014). Autocalibration of accelerometer data for free-living physical activity assessment using local gravity and temperature: An evaluation on four continents. Journal of Applied Physiology, 117(7), 738–744. https://doi.org/10.1152/japplphysiol.00421.2014

Van Hees, V. T., Gorzelniak, L., León, E. C. D., Eder, M., Pias, M., Taherian, S., Ekelund, U., Renström, F., Franks, P. W., Horsch, A., & Brage, S. (2013). Separating Movement and Gravity Components in an Acceleration Signal and Implications for the Assessment of Human Daily Physical Activity. PLOS ONE, 8(4), e61691. https://doi.org/10.1371/journal.pone.0061691

Van Hees, V. T., Renström, F., Wright, A., Gradmark, A., Catt, M., Chen, K. Y., Löf, M., Bluck, L., Pomeroy, J., Wareham, N. J., Ekelund, U., Brage, S., & Franks, P. W. (2011). Estimation of Daily Energy Expenditure in Pregnant and Non-Pregnant Women Using a Wrist-Worn Tri-Axial Accelerometer. PLOS ONE, 6(7), e22922. https://doi.org/10.1371/journal.pone.0022922

Vähä-Ypyä, H., Vasankari, T., Husu, P., Suni, J., & Sievänen, H. (2015). A universal, accurate intensity-based classification of different physical activities using raw data of accelerometer. Clinical Physiology and Functional Imaging, 35(1), 64–70. https://doi.org/10.1111/cpf.12127

Weitz, M., Syed, S., Hopstock, L. A., Morseth, B., Henriksen, A., & Horsch, A. (2025). Automatic time in bed detection from hip-worn accelerometers for large epidemiological studies: The Tromsø Study. PLOS ONE, 20(5), e0321558. https://doi.org/10.1371/journal.pone.0321558