Installation
First, download and install R and RStudio.
Then, open RStudio and install the devtools package:
install.packages("devtools")
Finally, install the MotrpacBicQC package:
library(devtools)
devtools::install_github("MoTrPAC/MotrpacBicQC", build_vignettes = TRUE)
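When the installation is scripted rather than typed interactively, it can help to skip packages that are already present. The following is a minimal sketch built on base R's `requireNamespace()`; the helper name `install_if_missing` and its `installer` argument are illustrative assumptions, not part of MotrpacBicQC.

```r
# Hypothetical helper (not part of MotrpacBicQC): run `installer` only when
# `pkg` cannot be loaded from the current library paths.
# Returns TRUE (invisibly) if the installer was called, FALSE otherwise.
install_if_missing <- function(pkg, installer) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))
  }
  installer()
  invisible(TRUE)
}
```

For example, `install_if_missing("devtools", function() install.packages("devtools"))` followed by `install_if_missing("MotrpacBicQC", function() devtools::install_github("MoTrPAC/MotrpacBicQC", build_vignettes = TRUE))` would reproduce the steps above without reinstalling anything that is already available.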
Usage
Load the library
library(MotrpacBicQC)
Then run any of the following tests to check that the package is correctly installed and working. For example:
# Just copy and paste into the RStudio console.
check_metadata_samples_lab(df = metadata_metabolites_named)
check_metadata_analyte(df = metadata_metabolites_named)
check_results_assays(df = results_named, assay_type = "lab")
These commands should generate the following output:
- (-) `metadata_samples`: Expected COLUMN NAMES are missed: FAIL
The following required columns are not present: `sample_id, sample_type, sample_order, raw_file, extraction_date, acquisition_date`
- (-) `sample_id` column missing: FAIL
- (-) `sample_type` column missing: FAIL
- (-) `sample_order` column missing: FAIL
- (-) `raw_file` column missing: FAIL
- (-) `extraction_date` column missed: FAIL
- (-) `acquisition_date` column missed: FAIL
- (-) `metadata_analytes`: Expected COLUMN NAMES are missed: FAIL
The following required columns are not present: `analyte_name, uniprot_entry, assay_name`
- (-) `analyte_name` column missing: FAIL
- (-) `uniprot_entry` column missing: FAIL
- (-) `assay_name` column missing: FAIL
- (-) `analyte_name` column missing: FAIL
- (-) `results` contains non-numeric columns: FAIL
- metabolite_name
+ ( ) Number of zeros in dataset: 14 (out of 5099 values)
+ ( ) Number of NAs in dataset: 95 (out of 5194 values)
How to test your datasets
Check the full PROCESSED_YYYYMMDD folder (recommended). The typical folder and file structure should look like this:
└── HUMAN
    └── T02
        ├── LAB_CK
        │   ├── BATCH1_20221102
        │   │   ├── PROCESSED_20221102
        │   │   │   ├── metadata_analyte_named_CK_plasma.txt
        │   │   │   ├── metadata_experimentalDetails_named_duke_ClinChem.txt
        │   │   │   ├── metadata_sample_named_CK_plasma.txt
        │   │   │   └── results_CK_plasma.txt
        │   │   ├── metadata_failedsamples_20221102.txt
        │   │   ├── metadata_phase.txt
        │   │   └── file_manifest_20240103.csv
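Before running the full validation, a quick look at the folder contents can catch an obviously incomplete submission. The sketch below uses only base R; the helper name `find_missing_files` and its file-name patterns are assumptions inferred from the example tree above, not the rules the package itself applies.

```r
# Hypothetical helper (not part of MotrpacBicQC): report which expected file
# types are absent from a PROCESSED_YYYYMMDD folder. The patterns are guesses
# based on the example file names in the tree above.
find_missing_files <- function(processed_dir) {
  expected <- c(
    metadata_analyte = "^metadata_analyte_named_.*\\.txt$",
    metadata_sample  = "^metadata_sample_named_.*\\.txt$",
    results          = "^results_.*\\.txt$"
  )
  present <- list.files(processed_dir)
  # One TRUE/FALSE per expected file type: does any file name match?
  found <- vapply(expected, function(p) any(grepl(p, present)), logical(1))
  names(expected)[!found]
}
```

Calling `find_missing_files()` on a complete folder should return an empty character vector; any names it returns point to file types that could not be found.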
To run the tests on the full submission, use the following command:
n_issues <- validate_lab(input_results_folder = "/full/path/to/HUMAN/T02/LAB_CK/BATCH1_20221102/PROCESSED_20221102/",
                         cas = "duke",
                         return_n_issues = TRUE,
                         verbose = TRUE)
A typical output would look like this:
# LAB Assay QC Report
+ Site: duke
+ Folder: `HUMAN/T02/LAB_CK/BATCH1_20221102/PROCESSED_20221102`
+ Motrpac phase reported: HUMAN-PRECOVID (info from metadata_phase.txt available): OK
## QC `metadata_analyte` file
+ (+) File successfully opened
+ (+) All required columns present
+ (+) `analyte_name` unique values: OK
+ (+) `uniprot_entry` unique values: OK
+ Validating `uniprot_entry` IDs with the Uniprot database. Please wait...
+ (+) All `uniprot_entry` IDs are valid: OK
+ (+) `assay_name` unique values: OK
## QC `metadata_sample` file
+ (+) File successfully opened
+ (+) All required columns present
+ (+) `sample_id` unique values: OK
+ (+) `sample_type` values are valid: OK
+ (+) `sample_order` is numeric: OK
+ (+) `raw_file` values are valid: OK
+ (+) `extraction_date`: All dates are valid.
+ (+) `acquisition_date`: All dates are valid.
## QC `results` file
+ (+) File successfully opened
+ (+) `analyte_name` unique values: OK
+ (+) All measurement columns are numeric: OK
+ ( ) Number of zeros in dataset: 0 (out of 1438 values)
+ ( ) Number of NAs in dataset: 0 (out of 1438 values)
## Cross-File Validation
+ (+) All sample IDs match between results and metadata samples: OK
+ (+) All analyte IDs match between results and metadata analytes: OK
## QC Plots
+ (p) Plot QC plots: OK
## QC `file_manifest_YYYYMMDD.csv` (required)
+ (+) `file_name, md5` columns available in manifest file
+ (+) `metadata-proteins` file included in manifest: OK
+ (+) `metadata-samples` file included in manifest: OK
+ (+) `results` file included in manifest: OK
## DMAQC validation
+ ( ) File [`metadata_failedsamples.*.txt`] not found
+ ( ) NO FAILED SAMPLES reported
TOTAL NUMBER OF ISSUES: 0
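Because the call above sets `return_n_issues = TRUE`, the value stored in `n_issues` can be used to decide whether a submission is ready. The sketch below assumes that `n_issues` is a single non-negative number matching the total printed in the report; the helper name `submission_ready` is hypothetical.

```r
# Hypothetical helper (not part of MotrpacBicQC): turn the issue count into a
# pass/fail decision. Assumes `n_issues` is a single non-negative number, as
# suggested by "TOTAL NUMBER OF ISSUES" in the report.
submission_ready <- function(n_issues) {
  stopifnot(is.numeric(n_issues), length(n_issues) == 1, !is.na(n_issues))
  if (n_issues > 0) {
    message("Found ", n_issues, " issue(s); fix them before submitting.")
    return(FALSE)
  }
  message("No issues found.")
  TRUE
}
```

In a script, `if (!submission_ready(n_issues)) stop("QC failed")` would halt processing whenever the report lists at least one issue.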
Help
Additional details for each function can be found by typing, for example:
?validate_lab
Need extra help? Please submit an issue on the MoTrPAC/MotrpacBicQC GitHub repository, providing as many details as possible.