- 📖 Methylation Pipeline Overview
- ⚡️ Quickstart
- Input Paths
- Default Working Directory
- Output Paths
- ⚙️ Executing Methylation CLI
- 🧪 Run the Test Case
- ⚠️ Troubleshooting
Download and install the following packages:
You can automatically install these requirements using /Volumes/CBioinformatics/Methylation/install_requirements.sh
Use the ARM (-arm64.pkg) package downloads for Apple Silicon (M1/M2) Macs and the Intel (-x86_64.pkg) downloads for older, non-Apple-Silicon Macs.
- R 4.3 or higher: https://cran.r-project.org/bin/macosx/
- Gfortran from GitHub (use the dmg installer): https://github.com/fxcoudert/gfortran-for-macOS/releases/
- RStudio 2023.06.0+421 or later: https://www.rstudio.com/products/rstudio/download/#download
- XQuartz: https://www.xquartz.org/
- LaTeX for Mac: direct download from https://www.tug.org/mactex/mactex-download.html, or install with Homebrew:
brew install --cask basictex
- Pandoc: https://pandoc.org/installing.html
- Xcode command line tools for macOS: in iTerm or Terminal, enter
xcode-select --install
- Java 8 JDK: https://www.oracle.com/java/technologies/downloads/#java8-mac (Intel Macs) or Java for M1/M2 Macs
- Homebrew: https://brew.sh/
- Library Magic, Sqlite and Proj:
brew install libmagic sqlite proj tcl-tk
- Compilers+:
brew install llvm aspell gdal autoconf automake gcc libgit2 openssl@3 zlib go pandoc git libffi
- Additional Libraries:
brew install texinfo pango cairo open-mpi poppler-qt5 graphviz libopenmpt java11 libomp libtorch openjdk gmp mpfr pkg-config apache-arrow udunits mariadb-connector-c libtiff
echo 'export PATH="/usr/local/opt/openjdk/bin:$PATH"' >> ~/.zshrc
sudo R CMD javareconf
R CMD config --all
- Additional OpenGL:
brew install --build-from-source glfw3
brew install cmake && brew uninstall glfw
git clone https://github.com/glfw/glfw.git && cd glfw && \
cmake -DCMAKE_OSX_ARCHITECTURES=arm64 . && \
make && \
sudo make install
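The choice between the ARM and Intel downloads, and the Homebrew location that follows from it, can be scripted. The sketch below is a minimal illustration and is not taken from install_requirements.sh: the helper names `pkg_suffix_for_arch` and `brew_prefix_for_arch` are hypothetical, and the prefixes are Homebrew's defaults (/opt/homebrew on Apple Silicon, /usr/local on Intel), which may differ on a customized install.

```shell
# Sketch only: helper names are hypothetical and not part of this repo.

# Map a machine architecture (as printed by `uname -m`) to the installer suffix.
pkg_suffix_for_arch() {
  case "$1" in
    arm64)  echo "-arm64.pkg" ;;
    x86_64) echo "-x86_64.pkg" ;;
    *)      echo "unknown" ;;
  esac
}

# Map the architecture to Homebrew's default install prefix (assumed defaults).
brew_prefix_for_arch() {
  case "$1" in
    arm64) echo "/opt/homebrew" ;;
    *)     echo "/usr/local" ;;
  esac
}

arch="$(uname -m)"
echo "Download packages ending in: $(pkg_suffix_for_arch "$arch")"
echo "Expected openjdk bin dir: $(brew_prefix_for_arch "$arch")/opt/openjdk/bin"
```

If Homebrew is already installed, `brew --prefix openjdk` reports the actual location, which is safer than relying on the default when adding the openjdk directory to your PATH.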
You may need to unlock permissions before installing packages in the Mac's System Preferences Privacy & Security Panel:
https://github.com/NYU-Molecular-Pathology/Methylation/blob/main/Notes/SystemPermissions.md
- You can install all the dependencies above by executing the script on the CBioinformatics shared drive:
/Volumes/CBioinformatics/Methylation/install_requirements.sh
- After you have installed all the required system dependencies in Essential Downloads above, you must install the R packages needed to install and run the classifiers.
- Before running the classifier for the first time, run the R script `all_installer.R` to install any R package dependencies. The script only needs to be run once, when first installing the classifier on a new system.
- For better debugging, paste the raw code from the URL below into RStudio:
https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/R/all_installer.R
- To install and run the pipeline, you must mount the following SMB network shared drives:
- Open Finder and press ⌘ (CMD) + K, then paste each of the directories below, using NYUMC\KerberosID as the login name and your Kerberos password as the password.
smb://research-cifs.nyumc.org/Research/CBioinformatics/
smb://research-cifs.nyumc.org/Research/snudem01lab/snudem01labspace
smb://shares-cifs.nyumc.org/apps/acc_pathology/molecular
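Before running the pipeline, it can help to confirm that the shares are actually mounted. The following is a minimal sketch, assuming the shares mount under /Volumes with the names used elsewhere in this README (/Volumes/CBioinformatics and /Volumes/molecular). The mount point /Volumes/snudem01labspace is an assumption, and `missing_mounts` is a hypothetical helper.

```shell
# Print each given directory that does not exist; return non-zero if any is missing.
missing_mounts() {
  rc=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "$dir"
      rc=1
    fi
  done
  return "$rc"
}

# The third mount point below is assumed; adjust it to match your system.
if missing="$(missing_mounts /Volumes/CBioinformatics /Volumes/molecular /Volumes/snudem01labspace)"; then
  echo "All shared drives are mounted."
else
  echo "Not mounted:"
  echo "$missing"
fi
```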
- You can download runMeth.sh in this repo under Methylation/Meth_Scripts/ or use curl/wget:
curl -# -L https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/Meth_Scripts/runMeth.sh >$HOME/runMeth.sh
Use `nano $HOME/runMeth.sh` to edit the script and paste in your API token:
methAPI="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" #Paste your API Token here
- Note: Your API Token can be found in "All Samples DataBase" on the left-side panel called "API" in REDCap. Explained here
chmod +rwx $HOME/runMeth.sh
- If installation of any package fails, check the Troubleshooting section at the bottom of this page.
Files, including the worksheet and .idat files, are copied to the working directory by their RUNID and YEAR. For example:
- Worksheets
/Volumes/molecular/MOLECULAR LAB ONLY/NYU-METHYLATION/WORKSHEETS/2022/22-MGDM17.xlsm
- .idat files
/Volumes/molecular/Molecular/iScan/
- Input files are copied and report files are generated on the CBioinformatics drive:
/Volumes/CBioinformatics/Methylation/Clinical_Runs/22-MGDM17
- HTML report files saved to the working directory are copied to the Z-drive.
For example, run 22-MGDM17 report files would be output in the following directories:
/Volumes/molecular/Molecular/MethylationClassifier/2022/22-MGDM17
/Volumes/molecular/MOLECULAR LAB ONLY/NYU-METHYLATION/Results/2022/22-MGDM17
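The example paths above follow a pattern: the two digits before the first hyphen of the run ID give the year folder. The following is a minimal sketch of that mapping, assuming every run ID starts with a two-digit year in the 2000s, as in 22-MGDM17. `run_year` and `report_dirs` are hypothetical helpers, not functions from runMeth.sh.

```shell
# Derive the four-digit year from a run ID such as 22-MGDM17 (assumes 20YY).
run_year() {
  echo "20${1%%-*}"
}

# Print the working directory, then the two report output directories for a run.
report_dirs() {
  run_id="$1"
  year="$(run_year "$run_id")"
  echo "/Volumes/CBioinformatics/Methylation/Clinical_Runs/$run_id"
  echo "/Volumes/molecular/Molecular/MethylationClassifier/$year/$run_id"
  echo "/Volumes/molecular/MOLECULAR LAB ONLY/NYU-METHYLATION/Results/$year/$run_id"
}

report_dirs 22-MGDM17
```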
To run the Clinical or Research Methylation pipeline, simply use the locally stored Shell Script in:
/Volumes/CBioinformatics/Methylation/runMeth.sh
- This shell script uses curl to download the files from this repo and passes four positional arguments to methylExpress.R, which it executes in the terminal.
- The bash script stores your REDCap API token locally and only requires the methylation run ID to be entered.
- You can copy runMeth.sh and create an alias or symlink to execute more easily. For example:
alias runmeth='bash $HOME/runMeth.sh'
or
echo "alias runmeth='bash $HOME/runMeth.sh'" >> ~/.bashrc
The shell script takes the following positional arguments:
methAPI='XXXXXXXX' # (hardcoded) Your REDCap API Token
methRun=${1-NULL} # methylation run id e.g. 22-MGDM17
PRIORITY=${2-NULL} # string of prioritized RD-numbers
runPath=${3-NULL} # any custom directory to copy/run the idat files
redcapUp=${4-NULL} # to upload to redcap or not if server down single char i.e. "T" or "F"
runLocal=${5-NULL} # If the run directory should be executed without shared drives locally i.e. "T" or "F"
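The `${1-NULL}` form used above is standard shell parameter expansion: it substitutes the literal string NULL only when the positional argument is unset. Here is a small self-contained sketch of that behavior; `show_args` is a hypothetical function for illustration and does not call methylExpress.R.

```shell
# Mimic the argument defaults used in runMeth.sh and print the resolved values.
show_args() {
  methRun=${1-NULL}
  PRIORITY=${2-NULL}
  runPath=${3-NULL}
  echo "methRun=$methRun PRIORITY=$PRIORITY runPath=$runPath"
}

# Only the run ID is given, so the remaining values fall back to NULL.
show_args 22-MGDM17

# An empty string counts as set, so ${2-NULL} keeps it instead of using NULL.
show_args 22-MGDM17 "" /tmp/custom_run
```

Because an empty string is kept as-is, pass the word NULL explicitly for any argument you want to skip while still supplying a later one.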
The four positional arguments from runMeth.sh are passed to the Rscript methylExpress.R:
- `arg[1]` is the token for the API call ('#######################')
- `arg[2]` is the RunID, which if NULL runs the latest Clinical Worksheet (22-MGDM17)
- `arg[3]` is the selectRds parameter, which is to prioritize samples being run (NULL)
- `arg[4]` is the baseFolder parameter, which is optional if you want to run/save output to a different directory (NULL)
Alternatively, instead of passing the RunID to runMeth.sh, you can download this repository and then edit the args in methylExpress.R locally to run it manually.
After installation, test the pipeline from your terminal by executing the test case:
$HOME/runMeth.sh 21-MGDM_TEST
or if you have not saved the runMeth.sh script locally:
/Volumes/CBioinformatics/Methylation/runMeth.sh 21-MGDM_TEST
You can then check the output to confirm each html report was generated in the output directory:
/Volumes/CBioinformatics/Methylation/Clinical_Runs/21-MGDM_TEST
ls -lha "$HOME/Desktop/html_21-MGDM_TEST/21-MGDM_TEST/"
NOTE: When running the test case (21-MGDM_TEST), you may notice errors in the upload log. This is expected: the REDCap database already contains the data and files for the test run, so the test case html files fail to upload.
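To confirm the reports without reading the directory listing by eye, you can count the html files in the output directory. The following is a minimal sketch, assuming the reports end in .html and sit in or below the run directory. `count_html_reports` is a hypothetical helper.

```shell
# Count .html files under a directory; prints 0 if the directory is missing.
count_html_reports() {
  if [ -d "$1" ]; then
    find "$1" -type f -name '*.html' | wc -l | tr -d ' '
  else
    echo 0
  fi
}

run_dir="/Volumes/CBioinformatics/Methylation/Clinical_Runs/21-MGDM_TEST"
echo "html reports found: $(count_html_reports "$run_dir")"
```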
- For Individual Cases: Execute the script directly with RD-numbers, for example:
Rscript --verbose /Volumes/CBioinformatics/Methylation/Clinical_Runs/Sarcoma_runs/methylExpress_sarcoma.R RD-15-123 RD-16-1234 RD-17-321
- For Several/Bulk Cases: Execute the script by passing the path to a csv file containing a list of RD-numbers in the first column, for example:
Rscript --verbose /Volumes/CBioinformatics/Methylation/Clinical_Runs/Sarcoma_runs/methylExpress_sarcoma.R /Path/To/Desktop/MyListRDs.csv
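Before passing a csv file, a quick check of the first column can catch malformed entries. The sketch below assumes RD-numbers look like the examples above (RD-, two digits, a hyphen, then digits) and that the file has no header row. Both are assumptions, and `bad_rd_numbers` is a hypothetical helper.

```shell
# Print first-column entries that do not look like RD-YY-NNN.
# Returns 0 when every entry matches, 1 when at least one does not.
bad_rd_numbers() {
  if cut -d, -f1 "$1" | tr -d '\r' | grep -Ev '^RD-[0-9]{2}-[0-9]+$'; then
    return 1
  fi
  return 0
}

# Example (path is a placeholder):
#   bad_rd_numbers /Path/To/Desktop/MyListRDs.csv && echo "All RD-numbers look valid"
```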
In the event the shared drive is not accessible, the script without the API token is available here
Pipeline Installation Issues
1. If you have issues with package installation or dependencies, make sure compilers are installed by opening Xcode.app or executing `sudo xcode-select --install`
2. Then, execute the all_installer.R script by copying and pasting the raw contents of the script below into RStudio before running runMeth.sh again: https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/Research/all_installer.R
3. To resolve any problems during automation, you can open methylExpress.R in RStudio, which is downloaded by runMeth.sh to your home directory.
4. Try running `sudo xcode-select -s /Library/Developer/CommandLineTools` and `brew install gdal proj`, then install the package **rgdal** in RStudio.
5. Download the libraries below from their sources:
(a) sqlite-autoconf-3330000.tar.gz from "https://www.sqlite.org/download.html".
(b) tiff-4.1.0.tar.gz from "https://download.osgeo.org/libtiff/"
(c) proj-7.2.0.tar.gz from "https://proj.org/download.html#current-release"
(d) libgeotiff-1.6.0.tar.gz from "https://download.osgeo.org/geotiff/libgeotiff/"
(e) geos-3.8.1.tar.bz2 from "https://trac.osgeo.org/geos"
(f) gdal-3.2.0.tar.gz from "https://gdal.org/download.html"
REDCap errors
1. Once your run completes, check your run directory for any *upload_log.tsv* or *redcaperrors.txt* file. If these files exist, they may note files or data that would have been overwritten in the database.
2. Check with the wet lab whether any RD-numbers were duplicated or previously used for the samples listed in the upload_log.tsv file.
3. Make sure your API token is not NULL and that REDCap is not down for maintenance here: https://redcap.nyumc.org/apps/redcap/
4. Check whether any of the URLs in the notification or API calls have been broken by a new version of REDCap. For example, if the link https://redcap.nyumc.org/apps/redcap/redcap_v13.1.35/API/project_api.php?pid=24752 is broken, modify the URL to match the current REDCap version, i.e. /redcap/**redcap_v13.2.57**/
Additional resources are here: https://redcap.nyumc.org/apps/redcap/index.php?action=help&newwin=1
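The first check above, looking for upload logs in the run directory, can be done from the terminal. The following is a minimal sketch, assuming the log files sit somewhere under the run directory and that their names end in upload_log.tsv or equal redcaperrors.txt. `find_redcap_logs` is a hypothetical helper.

```shell
# List upload logs and REDCap error files under a run directory.
# Prints nothing if the directory does not exist.
find_redcap_logs() {
  [ -d "$1" ] || return 0
  find "$1" -type f \( -name '*upload_log.tsv' -o -name 'redcaperrors.txt' \)
}

run_dir="/Volumes/CBioinformatics/Methylation/Clinical_Runs/22-MGDM17"
logs="$(find_redcap_logs "$run_dir")"
if [ -n "$logs" ]; then
  echo "Review these files:"
  echo "$logs"
else
  echo "No upload log or error file found in $run_dir"
fi
```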
REDCap Email Notification issues
The automatic email notifications are located on the left-side panel called "Alerts & Notifications". If you need to change an output path or the year in the email, click on edit for Alert #1: Research Run Complete or Alert #2: Clinical Run Complete. View the "Applications Overview" video here: https://redcap.nyumc.org/apps/redcap/index.php?action=training
A detailed guide for Alerts is available here: https://www.ctsi.ufl.edu/wordpress/files/2019/06/REDCap-Alerts-Notifications-User-Guide.pdf
Additional resources are here: https://redcap.nyumc.org/apps/redcap/index.php?action=help&newwin=1
How to upload manually to REDCap
1. Log in with your Kerberos ID to https://redcap.nyumc.org/
2. On the left-hand sidebar, scroll all the way down the Reports Bookmarks until you see the folders:
`>>>>CURRENT Runs~~~~~ and 3) >>>>>CLINICAL Current Run`
3. Here, you can click on the RD-number of choice and then select "Upload html file" under the methylation menu
4. Optionally, you can also select "Add / Edit Records" menu in the left sidebar and find your RD-number in the "Search query" field
5. To upload the sample classifier details, such as the values and scores, a csv file named <run_id>_Redcap.csv is saved on the Desktop in a run folder named with the <run_id>. This file can be uploaded in the import tab of REDCap under *Data Import Tool* in the sidebar. The folder also contains the <run_id>_samplesheet.csv file used in the run, derived from the .xlsm file.
Additionally, this file is copied to: `/Volumes/CBioinformatics/Methylation/Clinical_Runs/csvRedcap/<run_id>/<run_id>_Redcap.csv`
Issues with installing or running packages
1. If you are getting compiler errors or all_installer.R fails, try installing additional system dependencies with brew and restart your R session: https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/Development/brewFix.sh
2. If you still have errors with compiling or installing a package, try removing your Makevars file:
`rm -rf $HOME/.R/Makevars`
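If you would rather keep a copy than delete the file outright, you can move it aside instead. This is a minimal sketch: the backup naming scheme and the `backup_makevars` helper are hypothetical.

```shell
# Move a Makevars file to a timestamped backup next to it; do nothing if it is absent.
# With no argument, the default location $HOME/.R/Makevars is used.
backup_makevars() {
  makevars="${1:-$HOME/.R/Makevars}"
  if [ -f "$makevars" ]; then
    mv "$makevars" "$makevars.bak.$(date +%Y%m%d%H%M%S)"
    echo "Moved $makevars aside"
  else
    echo "No Makevars file at $makevars"
  fi
}
```

Run `backup_makevars` with no arguments, restart your R session, and retry the installation. To restore the old settings, rename the backup to Makevars.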
Fix wet lab worksheet
1. In your run path /Volumes/CBioinformatics/Methylation/Clinical_Runs/22-MGDM##, open the RUNID.xlsm file
2. On the Review ribbon, click unprotect worksheet and unprotect tab
3. Right-click the "worksheet" tab at the bottom and select "Unhide..." to show the raw_labels tab
4. If there are any "#ref" errors, either drag the formula down to correct them or type "=", select the cell in the first tab "worksheet", and press return.
For example "=worksheet!B25" references cell B25 in the tab named "worksheet"