Lightning Talk & Poster 28th Annual Lorne Proteomics Symposium 2023

Unbiased plasma proteomics of a large type 1 diabetes cohort in a mass spectrometry facility – potential, perspectives and pitfalls (#19)

Samantha Emery-Corbin 1 2 , Megan Penno 3 , Jumana M Yousef 1 2 , Helena Oakley 3 , Jennifer J Couper 3 , Leonard C Harrison 1 4 , John M Wentworth 1 4 , Laura F Dagley 1 2
  1. Department of Medical Biology, The University of Melbourne, Parkville, VIC, Australia
  2. Advanced Technology and Biology Division, Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
  3. Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, Australia
  4. Population Health and Immunity, Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia

Unbiased plasma proteomics remains a common first step in disease biomarker-discovery pipelines, given that blood-based diagnostics are widely implemented across medical settings. However, clinical translation from discovery-driven proteomic surveys remains poor: many published studies analyse only a few samples in depth, and only a few attempt large-scale cohorts with greater potential for translation. While larger cohort sizes arguably have advantages, technical logistics and inherent variation often limit the practicality and payoff of undertaking such studies in the clinical proteomics field.

The WEHI Proteomics Facility recently undertook the analysis of a large-scale plasma cohort from the Environmental Determinants of Islet Autoimmunity (ENDIA) Study. ENDIA is the largest global study to follow babies through pregnancy to explore factors which may protect against or contribute to the development of type 1 diabetes (T1D). The ENDIA cohort included 931 plasma samples from mothers throughout pregnancy (trimesters 1 to 3; T1-T3), as well as their infants from birth. This cohort was spread across 14 batches (96-well plates), totalling ~1200 samples including controls and QCs (or ~2 months of continuous instrument analysis time). Samples were manually processed using the USP3 method and then acquired on a 30 minute analytical gradient (48 minute total cycle time) using diaPASEF acquisition on a timsTOF Pro, with two 25 m/z precursor isolation windows in each of 16 diaPASEF scans (32 windows in total). Data were then searched with DIA-NN in library-free mode and subjected to in-house analysis pipelines for normalisation, imputation and protein quantitation.
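As a rough sanity check on these logistics, the figures quoted above can be combined in a short back-of-envelope calculation. This is only a sketch: the variable and function names are illustrative, and the result reflects pure acquisition time under the assumption of one injection per sample at the stated cycle time. It does not account for blanks, washes, calibration or downtime, which are not itemised here and would extend the real elapsed time.

```python
# Back-of-envelope logistics check using only figures quoted in the abstract.
# All names are illustrative; one injection per sample is an assumption.

TOTAL_INJECTIONS = 1200   # ~1200 samples including controls and QCs
CYCLE_MINUTES = 48        # total cycle time per injection (30 min gradient)
PLATES = 14               # batches, one 96-well plate each
WELLS_PER_PLATE = 96
DIAPASEF_SCANS = 16       # diaPASEF scans per cycle
WINDOWS_PER_SCAN = 2      # precursor isolation windows per scan


def acquisition_days(injections: int, cycle_minutes: float) -> float:
    """Pure acquisition time in days, ignoring any instrument overhead."""
    return injections * cycle_minutes / 60 / 24


days = acquisition_days(TOTAL_INJECTIONS, CYCLE_MINUTES)
well_capacity = PLATES * WELLS_PER_PLATE
total_windows = DIAPASEF_SCANS * WINDOWS_PER_SCAN

print(f"Pure acquisition time: {days:.1f} days")
print(f"Well capacity across plates: {well_capacity}")
print(f"Fraction of wells used: {TOTAL_INJECTIONS / well_capacity:.2f}")
print(f"Isolation windows per cycle: {total_windows}")
```

Pure acquisition alone comes to roughly 40 days, so the quoted ~2 months of instrument time is plausible once routine overhead is included.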

Herein we discuss the logistics (both expected and realised) of running this large-scale ENDIA cohort through our proteomics facility. We detail aspects of our experimental design, from plate formats (including sample, technical, batch and QC controls) to our sample processing and instrument pipelines, and how data were acquired, assessed and analysed. We evaluate the success of this project from multiple perspectives, including that of a facility, to inform future proteomic analyses of large-scale clinical cohorts.