Installation
First, download and install R and RStudio.
Then, open RStudio and install the devtools package:
install.packages("devtools")
Finally, install the MotrpacBicQC package:
library(devtools)
devtools::install_github("MoTrPAC/MotrpacBicQC", build_vignettes = TRUE)
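For scripted setups, it can be useful to run the install steps only when a package is actually missing. The following is a minimal sketch in base R: `missing_packages()` is a hypothetical helper (not part of MotrpacBicQC or devtools) that only reports which packages cannot be loaded. The install calls are left commented out so that nothing is installed by accident.

```r
# Hypothetical helper (not part of MotrpacBicQC): return the names of the
# packages that cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Usage sketch, reusing the install commands shown above:
# if (length(missing_packages("devtools")) > 0) {
#   install.packages("devtools")
# }
# if (length(missing_packages("MotrpacBicQC")) > 0) {
#   devtools::install_github("MoTrPAC/MotrpacBicQC", build_vignettes = TRUE)
# }

# Packages shipped with R are always available, so nothing is reported here.
print(missing_packages(c("stats", "utils")))
```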
Usage
Load the library:
library(MotrpacBicQC)
Then run any of the following tests to check that the package is correctly installed and working. For example:
# Just copy and paste into the RStudio console.
check_metadata_samples_olink(df = metadata_metabolites_named)
check_metadata_proteins(df = metadata_metabolites_named)
check_results_olink(df = results_named)
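These calls should generate the following output. The FAIL messages are expected, because the example datasets are metabolomics tables rather than Olink files: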
- (-) `metadata_samples`: Expected COLUMN NAMES are missed: FAIL
The following required columns are not present: `sample_id, sample_type, sample_order, plate_id`
- (-) `sample_id` is missed: FAIL
- (-) `sample_type` column missed: FAIL
- (-) `plate_id` column missed: FAIL
- (-) `metadata_proteins`: Expected COLUMN NAMES are missed: FAIL
The following required columns are not present: `olink_id, uniprot_entry, assay, missing_freq, panel_name, panel_lot_nr, normalization`
- (-) `olink_id`: is missed: FAIL
- (-) `uniprot_entry` column missed: FAIL
- (-) `assay` column missed: FAIL
- (-) {missing_freq} column missed: FAIL
- (-) `panel_name` column missed: FAIL
- (-) `panel_lot_nr` column missed: FAIL
- (-) `normalization` column missed: FAIL
- (-) `olink_id` is missed: FAIL
- (-) `results.txt` contains non numeric columns: FAIL
- metabolite_name
+ ( ) Number of zeros in dataset: NA (out of 5194 values)
+ ( ) Number of NAs in dataset: 95 (out of 5194 values)
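Before running the check functions on your own tables, it can help to confirm that the required columns are present. The sketch below is a base-R illustration, not the package's implementation: `missing_columns()` is a hypothetical helper, the required column names are copied from the FAIL messages above, and the example values are made up.

```r
# Required columns for the samples metadata, as listed in the output above.
required_samples <- c("sample_id", "sample_type", "sample_order", "plate_id")

# Hypothetical helper: which required columns are absent from a data frame?
missing_columns <- function(df, required) {
  setdiff(required, colnames(df))
}

# Made-up example table that lacks the `plate_id` column.
example_samples <- data.frame(
  sample_id = c("S1", "S2"),
  sample_type = c("Sample", "Sample"),
  sample_order = c(1, 2)
)

print(missing_columns(example_samples, required_samples))
```

An empty result means every required column was found; the package checks remain the authoritative test.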
How to test your datasets
Check the full RESULTS_YYYYMMDD folder (recommended). The typical folder and file structure should look like this:
HUMAN/
`-- T02
    `-- PROT_OL
        `-- BATCH1_20210825
            |-- RESULTS_20221102
            |   |-- MOTRPAC_HUMAN_T02_PROT_OL_GER_20210825_metadata-proteins.txt
            |   |-- MOTRPAC_HUMAN_T02_PROT_OL_GER_20210825_metadata-samples.txt
            |   `-- MOTRPAC_HUMAN_T02_PROT_OL_GER_20210825_results.txt
            |-- file_manifest_20240103.csv
            `-- metadata_phase.txt
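The layout above can be checked before running the validation. The following is a minimal sketch in base R, assuming the three file-name suffixes shown in the tree; `find_missing_olink_files()` is a hypothetical helper, not a MotrpacBicQC function, and the demo folder is created in a temporary directory purely for illustration.

```r
# Hypothetical helper: report which of the expected files (identified by
# their file-name suffix) are absent from a RESULTS_YYYYMMDD folder.
find_missing_olink_files <- function(results_folder) {
  suffixes <- c("metadata-proteins.txt", "metadata-samples.txt", "results.txt")
  files <- list.files(results_folder)
  found <- vapply(
    suffixes,
    function(s) any(endsWith(files, paste0("_", s))),
    logical(1)
  )
  suffixes[!found]
}

# Demo with a temporary folder that contains only a results file.
demo_folder <- file.path(tempdir(), "RESULTS_20221102")
dir.create(demo_folder, showWarnings = FALSE)
invisible(file.create(
  file.path(demo_folder, "MOTRPAC_HUMAN_T02_PROT_OL_GER_20210825_results.txt")
))

print(find_missing_olink_files(demo_folder))
```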
To test the full submission, run the following command:
n_issues <- validate_olink(
  input_results_folder = "/full/path/to/HUMAN/T02/PROT_OL/BATCH1_20210825/RESULTS_20221102",
  cas = "broad_rg",
  return_n_issues = TRUE,
  verbose = TRUE
)
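Because `return_n_issues = TRUE` makes the call return the issue count, the result can be used to gate a submission script. The sketch below is hedged: `summarize_issues()` is a hypothetical helper, and it assumes that `n_issues` is a single non-negative number, which is not stated explicitly here.

```r
# Hypothetical helper: turn an issue count into a one-line status message.
summarize_issues <- function(n_issues) {
  stopifnot(is.numeric(n_issues), length(n_issues) == 1, !is.na(n_issues))
  if (n_issues == 0) {
    "PASS: no issues found"
  } else {
    sprintf("FAIL: %d issue(s) to fix before submission", as.integer(n_issues))
  }
}

# With a real run, pass the value returned by validate_olink():
# message(summarize_issues(n_issues))

# Illustration with made-up counts.
print(summarize_issues(0))
print(summarize_issues(3))
```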
A typical output would look like this:
# OLINK QC report
+ Site: broad_rg
+ Folder: `HUMAN/T02/PROT_OL/BATCH1_20210825/RESULTS_20221102`
+ Motrpac phase reported: HUMAN-PRECOVID (info from metadata_phase.txt available): OK
## QC `metadata_proteins`
+ (+) File successfully opened
+ (+) All required columns present
+ (+) `olink_id`: unique values: OK
- ( ) `uniprot_entry` non-unique values detected (n duplications = 3). This is OK
- P01375
- P05231
- P10145
- ( ) `assay` non-unique values detected (n duplications = 3). This is OK
- TNF
- IL6
- CXCL8
+ (+) {missing_freq} all numeric: OK
+ (+) {panel_name} checking available panels:
- Cardiometabolic
- Neurology
- Inflammation
- Oncology
+ (+) {panel_lot_nr} checking available panels:
- B04409
- B04410
- B04407
- B04408
+ (+) {normalization} checking available panels:
- Intensity
## QC `metadata-samples.txt`
+ (+) File successfully opened
+ (+) All required columns present
+ (+) `sample_id` seems OK
+ (+) `sample_type` seems OK
+ (+) `plate_id` is available: OK
+ (+) `sample_order` is numeric: OK
+ (+) All `plate_id` values have unique `sample_order` values: OK
## QC `results.txt`
+ (+) File successfully opened
+ (+) `olink_id` seems OK
+ (+) All columns (except `olink_id`) are numeric: OK
+ ( ) Number of zeros in dataset: 55 (out of 1105472 values)
+ ( ) Number of NAs in dataset: 0 (out of 1105472 values)
## Cross File Validation
+ (+) All samples in `results.txt` are available in `metadata-samples.txt`
+ (+) All `olink_id` from `results.txt` are available in `metadata-proteins.txt`
## QC `file_manifest_YYYYMMDD.csv` (required)
+ (+) `file_name, md5` columns available in manifest file
+ (+) `metadata-proteins` file included in manifest: OK
+ (+) `metadata-samples` file included in manifest: OK
+ (+) `results` file included in manifest: OK
## DMAQC validation
+ ( ) File [`metadata_failedsamples.*.txt`] not found
+ ( ) NO FAILED SAMPLES reported
TOTAL NUMBER OF ISSUES: 0
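The manifest section of the report confirms that the expected files and the `file_name, md5` columns are listed. As a complementary local check, the sketch below recomputes MD5 checksums with `tools::md5sum()` and compares them with the manifest. It assumes that `file_name` holds paths relative to the batch folder, which is an assumption rather than something documented here; `check_manifest_md5()` is a hypothetical helper, and the demo files are made up.

```r
# Hypothetical helper: compare the checksums listed in a manifest with the
# checksums of the files on disk.
check_manifest_md5 <- function(manifest_path, batch_folder) {
  # Read every column as text so that checksums are never parsed as numbers.
  manifest <- read.csv(manifest_path, colClasses = "character")
  stopifnot(all(c("file_name", "md5") %in% colnames(manifest)))
  # Assumption: file_name is relative to the batch folder.
  paths <- file.path(batch_folder, manifest$file_name)
  # md5sum() returns NA for files that do not exist or cannot be read.
  computed <- unname(tools::md5sum(paths))
  data.frame(
    file_name = manifest$file_name,
    exists = file.exists(paths),
    md5_matches = !is.na(computed) & computed == tolower(manifest$md5),
    stringsAsFactors = FALSE
  )
}

# Demo in a temporary folder: one file that is present, one that is absent.
demo_batch <- file.path(tempdir(), "BATCH_DEMO")
dir.create(file.path(demo_batch, "RESULTS_DEMO"),
           recursive = TRUE, showWarnings = FALSE)
demo_file <- file.path(demo_batch, "RESULTS_DEMO", "demo_results.txt")
writeLines(c("olink_id\tS1", "OID1\t1.5"), demo_file)

demo_manifest <- file.path(demo_batch, "file_manifest_demo.csv")
write.csv(
  data.frame(
    file_name = c("RESULTS_DEMO/demo_results.txt", "RESULTS_DEMO/absent.txt"),
    md5 = c(unname(tools::md5sum(demo_file)),
            "00000000000000000000000000000000")
  ),
  demo_manifest,
  row.names = FALSE
)

print(check_manifest_md5(demo_manifest, demo_batch))
```

Rows where `exists` or `md5_matches` is `FALSE` point to files that should be re-exported or re-hashed before submission.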
Help
Additional details for each function can be found by typing, for example:
?validate_olink
Need extra help? Please submit an issue on the MoTrPAC/MotrpacBicQC GitHub repository, providing as many details as possible.