Lightning Talk & Poster 28th Annual Lorne Proteomics Symposium 2023

The ProteomeRiver pipeline facilitates reproducible and shareable differential abundance and pathway analyses of proteomics and phosphoproteomics datasets (#37)

Ignatius Pang 1, Ashley J. Waardenberg 2, Nader Aryamanesh 1, Mark Graham 3
  1. Bioinformatics Core Facility, Children's Medical Research Institute, Westmead, NSW, Australia
  2. i-Synapse, Cairns, QLD, Australia
  3. Biomedical Proteomics and Synapse Proteomics, Children's Medical Research Institute, Westmead, NSW, Australia

Identifying phosphorylation sites and how their abundance changes under different environmental conditions is important for elucidating the role of signal regulation in cellular processes. ProteomeRiver is a novel pipeline that facilitates differential abundance analysis of proteins and their phosphorylation events. The pipeline enables batch analysis of many pairwise treatment-versus-control comparisons, followed by pathway over-representation analysis. To enable differential abundance analysis of mono- and multi-phosphorylation events, ProteomeRiver incorporates missing value imputation (PhosR*, 1), removal of unwanted variation (ruv*, 2), between-sample normalization and linear models (limma*, 3), kinase-substrate enrichment (KinSwingR*, 4), and pathway analysis (clusterProfiler*, 5). Changes in the abundance of phosphorylation events are also normalized by changes in host-protein abundance using custom scripts. The pipeline is built from modular components, so modules can be substituted or extended with novel tools as they become available. It also uses a small set of configuration files and scripts to store all instructions necessary for data analysis, which can be shared publicly on code repositories to support reproducibility. We demonstrate the pipeline through the re-analysis of a published dataset of the synapse proteome and phosphoproteome during homeostatic up- and down-scaling (6). ProteomeRiver identified pathways and upstream kinases associated with homeostatic up- and down-scaling, and the number and types of pathways identified were comparable to those in the original publication. ProteomeRiver enables all steps necessary for the analysis to be easily shared with other researchers, who can then independently reproduce the results. The pipeline and a step-by-step tutorial will be available as an R package via https://bitbucket.org/cmri-bioinformatics/proteomeriver.
(*) denotes R or Bioconductor packages.
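
The abstract does not describe how the custom scripts normalize phosphorylation changes by host-protein changes. As a minimal sketch only, one common approach is to subtract the host protein's log2 fold change from each phosphosite's log2 fold change. The function name, site and protein identifiers, and values below are hypothetical and are not taken from ProteomeRiver, which is an R package; Python is used here purely for illustration.

```python
from typing import Dict, Tuple


def normalize_phospho_by_protein(
    phospho_log2fc: Dict[str, Tuple[str, float]],
    protein_log2fc: Dict[str, float],
) -> Dict[str, float]:
    """Adjust phosphosite log2 fold changes by host-protein log2 fold changes.

    ASSUMPTION: simple subtraction on the log2 scale; the actual
    ProteomeRiver scripts may use a different method.
    """
    adjusted = {}
    for site, (protein, site_lfc) in phospho_log2fc.items():
        # Sites whose host protein was not quantified cannot be adjusted,
        # so this sketch leaves them out of the result.
        if protein not in protein_log2fc:
            continue
        adjusted[site] = site_lfc - protein_log2fc[protein]
    return adjusted


# Hypothetical input: site id -> (host protein id, log2 fold change).
phospho = {
    "PROT_A_S10": ("PROT_A", 1.5),
    "PROT_A_S25": ("PROT_A", 0.5),
    "PROT_B_T7": ("PROT_B", -1.0),
    "PROT_C_Y3": ("PROT_C", 2.0),  # host protein not quantified
}
# Hypothetical input: protein id -> log2 fold change.
protein = {"PROT_A": 0.5, "PROT_B": -0.25}

adjusted = normalize_phospho_by_protein(phospho, protein)
print(adjusted)
```

In this sketch, a site whose change merely mirrors its host protein (PROT_A_S25) ends up with an adjusted value of zero, while a site that changes beyond its host protein keeps a non-zero value.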

  1. Kim et al. Cell Rep. 34(8):108771.
  2. Molania et al. 2019 Nucleic Acids Res. 47:6073–6083.
  3. Ritchie et al. 2015 Nucleic Acids Res. 43(7):e47.
  4. Engholm-Keller et al. 2019 PLoS Biol. 17(3):e3000170.
  5. Wu et al. The Innovation 2(3):100141.
  6. Desch et al. 2021 Cell Rep. 36:109583.