Timewise summary statistics and training FDR from differential analysis (DA) that tests the effect of training on each metabolite within each sex. One data frame per tissue.
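For metabolites measured on more than one platform, the meta-regression consensus result is provided when the platforms agree sufficiently; otherwise the platform-level results are retained (see Details). For metabolites measured on a single platform, these results are identical to METAB_DA.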
Usage
METAB_ADRNL_DA_METAREG
METAB_BAT_DA_METAREG
METAB_COLON_DA_METAREG
METAB_CORTEX_DA_METAREG
METAB_HEART_DA_METAREG
METAB_HIPPOC_DA_METAREG
METAB_HYPOTH_DA_METAREG
METAB_KIDNEY_DA_METAREG
METAB_LIVER_DA_METAREG
METAB_LUNG_DA_METAREG
METAB_OVARY_DA_METAREG
METAB_PLASMA_DA_METAREG
METAB_SKMGN_DA_METAREG
METAB_SKMVL_DA_METAREG
METAB_SMLINT_DA_METAREG
METAB_SPLEEN_DA_METAREG
METAB_TESTES_DA_METAREG
METAB_VENACV_DA_METAREG
METAB_WATSC_DA_METAREG
Format
A data frame with 30 variables:
feature
character, unique feature identifier in the format 'ASSAY_ABBREV;TISSUE_ABBREV;feature_ID' only for training-regulated features at 5% IHW FDR. For redundant differential features, 'feature_ID' is prepended with the specific platform to make unique identifiers. See REPEATED_FEATURES for details.
assay
character, assay abbreviation, one of ASSAY_ABBREV
assay_code
character, assay code used in data release. See MotrpacBicQC::assay_codes.
tissue
character, tissue abbreviation, one of TISSUE_ABBREV
tissue_code
character, tissue code used in data release. See MotrpacBicQC::bic_animal_tissue_code.
feature_ID
character, MoTrPAC feature identifier
dataset
character, metabolomics platform (metab-u-ionpneg, metab-u-lrpneg, metab-u-lrppos, metab-u-hilicpos, metab-u-rpneg, metab-u-rppos, metab-t-amines, metab-t-oxylipneg, metab-t-tca, metab-t-nuc, metab-t-acoa, metab-t-ka) the feature was measured in. 'meta-reg' specifies results from the metabolomics meta-regression for repeated features.
site
character, Chemical Analysis Site (CAS) name
is_targeted
logical, is this a targeted platform?
sex
character, one of 'male' or 'female'
comparison_group
character, time point of trained animals compared to the sex-matched sedentary control animals, one of '1w', '2w', '4w', '8w'
p_value
double, unadjusted p-value for the difference between 'comparison_group' and the group of sex-matched sedentary control animals
adj_p_value
double, adjusted p-value from 'p_value' column. P-values are BY-adjusted (Benjamini-Yekutieli) across all datasets within a given assay/ome.
logFC
double, log fold-change where the numerator is 'comparison_group', e.g., '1w', and the denominator is the group of sex-matched sedentary control animals
logFC_se
double, standard error of the log fold-change
tscore
double, t statistic
zscore
double, z statistic
covariates
character, comma-separated list of adjustment variables or NA
comparison_average_intensity
double, average intensity among the replicates in 'comparison_group'
reference_average_intensity
double, average intensity among the replicates in the group of sex-matched sedentary control animals
metabolite_refmet
character, RefMet name of metabolite
cv
double, feature coefficient of variation in the dataset
metabolite
character, name of metabolite as appears in the CAS's data
control_cv
double, feature coefficient of variation in the dataset
mz
double, mass-to-charge ratio (m/z)
rt
double, retention time
neutral_mass
double, neutral mass
meta_reg_het_p
double, for metabolites with multiple measurements, the meta-regression heterogeneity p-value, where a smaller p-value indicates more disagreement between platforms. One value per feature (not per training group).
meta_reg_pvalue
double, for metabolites with multiple measurements, the meta-regression p-value. One value per feature (not per training group).
selection_fdr
double, adjusted training p-value used to select training-regulated analytes. P-values are IHW-adjusted across all datasets within a given assay with tissue as a covariate.
An object of class data.frame with 9086 rows and 30 columns.
An object of class data.frame with 1888 rows and 30 columns.
An object of class data.frame with 1672 rows and 30 columns.
An object of class data.frame with 9872 rows and 30 columns.
An object of class data.frame with 8920 rows and 30 columns.
An object of class data.frame with 1736 rows and 30 columns.
An object of class data.frame with 10376 rows and 30 columns.
An object of class data.frame with 10264 rows and 30 columns.
An object of class data.frame with 10208 rows and 30 columns.
An object of class data.frame with 912 rows and 30 columns.
An object of class data.frame with 8696 rows and 30 columns.
An object of class data.frame with 8472 rows and 30 columns.
An object of class data.frame with 1584 rows and 30 columns.
An object of class data.frame with 1800 rows and 30 columns.
An object of class data.frame with 1816 rows and 30 columns.
An object of class data.frame with 872 rows and 30 columns.
An object of class data.frame with 1278 rows and 30 columns.
An object of class data.frame with 8808 rows and 30 columns.
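As a minimal sketch of how the columns described in Format might be used, the example below builds a small stand-in table and selects training-regulated rows and meta-regression rows. The table name toy and all of its values are invented for illustration; a real tissue table (e.g., METAB_LIVER_DA_METAREG) has all 30 columns.

```r
# Toy stand-in for one tissue table; every value here is hypothetical.
toy <- data.frame(
  feature = c("METAB;LIVER;glucose", NA, "METAB;LIVER;lactate", NA),
  feature_ID = c("glucose", "alanine", "lactate", "citrate"),
  dataset = c("meta-reg", "metab-t-amines", "metab-u-hilicpos", "meta-reg"),
  sex = c("male", "female", "male", "female"),
  comparison_group = c("8w", "1w", "4w", "2w"),
  logFC = c(0.8, -0.1, -0.6, 0.05),
  selection_fdr = c(0.01, 0.40, 0.03, 0.75),
  stringsAsFactors = FALSE
)

# Training-regulated rows at 5% IHW FDR; 'feature' is only filled in for these.
regulated <- toy[!is.na(toy$selection_fdr) & toy$selection_fdr < 0.05, ]

# Rows whose statistics come from the meta-regression of repeated features.
consensus <- toy[toy$dataset == "meta-reg", ]

print(regulated[, c("feature", "sex", "comparison_group", "logFC")])
```

Filtering on selection_fdr rather than adj_p_value follows the Format description, which names selection_fdr as the value used to select training-regulated analytes.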
Details
The metabolomics data were collected using different platforms from six different sites. Moreover, some platforms were targeted (i.e., quantified a predefined small set of metabolites of interest), whereas other platforms were untargeted. There were 1116 cases in which at least two sites measured the same metabolite in the same tissue. In these cases, we used meta-regression to integrate the differential analysis results, implemented using the metafor R package.
For a given metabolite \(m\), the input to this analysis comprised the timewise effect sizes \(y_{g,p}\) and their variances \(v_{g,p}\), where \(g\) denotes the analysis group (a combination of training time point and sex) for which the summary statistics were computed by the platform-level differential analysis regression models, and \(p \in \{1,\ldots,n_{m}\}\) denotes the platform. If \(m\) had data from at least three platforms, of which at least one was untargeted and at least one was targeted, then we added nested random effects for both the platform and the targeted status. That is, in metafor's notation we used "mods = ~ 0 + analysis_group" and "random = list(~analysis_group|platform, ~analysis_group|is_targeted)". The inner|outer notation defines a blockwise dependence structure for the random effects, where observations with different outer values are assumed to be independent, and observations sharing an outer value may be dependent according to their inner values. If the targeted status was redundant (e.g., we only had two platforms, one targeted and the other untargeted), then we kept the platform-level random effects only.
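The model specification above could be sketched roughly as follows. This is not the package's implementation: the helper fit_meta_reg(), the platform names, and the simulated effect sizes are all assumptions made for illustration, and the fit is attempted only if the metafor package is installed.

```r
# Simulated input for one metabolite: 8 analysis groups (2 sexes x 4 time
# points) on 3 hypothetical platforms, one of which is targeted.
set.seed(1)
groups <- paste(rep(c("female", "male"), each = 4),
                c("1w", "2w", "4w", "8w"), sep = "_")
toy <- expand.grid(analysis_group = groups,
                   platform = c("plat_A", "plat_B", "plat_C"),
                   stringsAsFactors = FALSE)
toy$is_targeted <- factor(toy$platform == "plat_A")
toy$yi <- rnorm(nrow(toy), mean = 0.3, sd = 0.2)    # effect sizes y_{g,p}
toy$vi <- runif(nrow(toy), min = 0.01, max = 0.05)  # variances v_{g,p}

# Hypothetical helper: nested = TRUE adds the targeted-status random effect,
# nested = FALSE keeps the platform-level random effect only.
fit_meta_reg <- function(d, nested = TRUE, control = list()) {
  random <- if (nested) {
    list(~ analysis_group | platform, ~ analysis_group | is_targeted)
  } else {
    list(~ analysis_group | platform)
  }
  metafor::rma.mv(yi = yi, V = vi, mods = ~ 0 + analysis_group,
                  random = random, data = d, control = control)
}

# Settings quoted in the text for cases where the default optimizer fails
# (using them requires the nloptr package).
alt_control <- list(optimizer = "nloptr", algorithm = "NLOPT_LN_SBPLX")

if (requireNamespace("metafor", quietly = TRUE)) {
  # Simulated data may not converge, so errors are caught rather than raised.
  fit <- tryCatch(fit_meta_reg(toy), error = function(e) NULL)
  if (!is.null(fit)) print(c(QMp = fit$QMp, QEp = fit$QEp))
}
```

Passing nested = FALSE corresponds to the redundant-status case, where only the platform-level random effect is kept.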
In practice, the default metafor optimizer failed to converge in 154 cases. In these cases, we changed the optimizer settings to optimizer = "nloptr", algorithm = "NLOPT_LN_SBPLX". Even so, optimization still failed in 61 cases. When both the default and alternative optimizers failed, we opted for a standard fixed-effect model without random effects. Other than these cases, we had 361 models with random effects for both the platform and the targeted status, and 694 models with a platform-only random effect.
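To illustrate what the fixed-effect fallback computes, here is a base-R sketch under the assumption that the platform-level estimates are independent (a diagonal variance matrix). With one coefficient per analysis group and no intercept, each group estimate is then the inverse-variance weighted mean across platforms. The function name fixed_effect_by_group() and the input numbers are hypothetical.

```r
# Fixed-effect meta-regression with one coefficient per analysis group,
# assuming independent estimates with known variances vi.
fixed_effect_by_group <- function(yi, vi, group) {
  w <- 1 / vi                                   # inverse-variance weights
  sum_w <- tapply(w, group, sum)
  est <- tapply(w * yi, group, sum) / sum_w     # weighted mean per group
  se <- sqrt(1 / sum_w)
  fitted <- est[as.character(group)]
  QM <- sum((est / se)^2)                       # Wald test of all coefficients
  QE <- sum(w * (yi - fitted)^2)                # residual heterogeneity
  k <- length(yi)
  p <- length(est)
  list(estimate = est, se = se,
       QMp = pchisq(QM, df = p, lower.tail = FALSE),
       QEp = pchisq(QE, df = k - p, lower.tail = FALSE))
}

# Two groups, each measured on two platforms (invented numbers).
res <- fixed_effect_by_group(
  yi = c(0.5, 0.7, -0.2, 0.0),
  vi = c(0.04, 0.04, 0.01, 0.01),
  group = c("male_1w", "male_1w", "male_8w", "male_8w")
)
print(res$estimate)
```

The QMp and QEp values returned here play the same roles as the quantities of the same name kept from each fitted model.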
As a summary of each model we kept the overall model p-value (i.e., the p-value of the test of moderators) \(QMp\), used as the training p-value, and the residual heterogeneity p-value \(QEp\). We flagged 103 models with \(QEp < 0.001\) as having excessive heterogeneity. Using this definition we partitioned the meta-analysis results into three classes: (1) excessive heterogeneity with a targeted platform (57 cases), (2) excessive heterogeneity without a targeted platform (46 cases), and (3) low heterogeneity (1013 cases). For class (1) we discarded the meta-analysis and kept only the targeted platform-level results for \(m\). For class (2) we discarded the meta-analysis and kept all the platform-level results for \(m\) as-is. For class (3) we kept the meta-analysis results only (i.e., discarded the platform-level results) and used the meta-regression model to calculate the timewise summary statistics, i.e., time- and sex-specific meta-analysis effect sizes, their variances, and their p-values.
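The three-way partition can be written as a short decision rule. The sketch below assumes a per-metabolite summary table with a residual heterogeneity p-value and a flag for whether any targeted platform measured the metabolite; the function classify_meta_reg(), the column names, and the example values are hypothetical.

```r
# Assign each meta-regression model to one of the three classes.
classify_meta_reg <- function(QEp, has_targeted, threshold = 0.001) {
  ifelse(QEp >= threshold, "low_heterogeneity",
         ifelse(has_targeted, "excessive_targeted", "excessive_untargeted"))
}

# Which results are retained for each class, per the description above.
retained <- c(
  excessive_targeted   = "targeted platform-level results only",
  excessive_untargeted = "all platform-level results",
  low_heterogeneity    = "meta-regression results only"
)

# Invented summary rows for three metabolites.
models <- data.frame(
  metabolite = c("m1", "m2", "m3"),
  QEp = c(0.2, 1e-5, 1e-4),
  has_targeted = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)
models$class <- classify_meta_reg(models$QEp, models$has_targeted)
models$retained <- unname(retained[models$class])
print(models)
```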
Reproduce our analysis with MotrpacRatTraining6mo::metab_meta_regression().