Genetics-informed proteome-wide association studies (PWAS) provide an effective way to map the molecular landscape of biological mechanisms underlying complex diseases. PWAS relies on an ancestry-matched reference panel to model protein expression, using genetic variants as features, and to determine each protein's impact on a phenotype. However, reference panels from underrepresented populations remain limited. In this study, we developed an analytic framework that borrows information from one or more other ancestries to boost protein abundance prediction accuracy in an underrepresented population. We illustrate the framework's utility and reproducibility through application to PWAS in East Asians: BioBank Japan (BBJ), Korean Genome and Epidemiology Study (KoGES), and Taiwan Biobank (TWB). An ensemble of information-sharing approaches was integrated to build the Multi-Ancestry-based Best-performing Model (MABM). MABM substantially improved prediction performance in both cross-validation and an external validation dataset (Tongji-Huaxi-Shuangliu Birth Cohort). Leveraging BBJ, we identified three times as many significant PWAS associations with MABM as with the baseline Lasso model. Notably, 47.5% of the MABM-specific associations were reproduced in independent East Asian datasets with concordant effect sizes. Furthermore, MABM enhanced gene/protein prioritization for downstream functional validation by (1) confirming a greater number of well-established gene/protein-trait associations and (2) identifying previously uncharacterized trait-associated genes. The benefits of MABM were further validated in additional ancestries and demonstrated in brain tissue-based PWAS, underscoring its broad applicability. Our findings close critical gaps in multi-omics research, provide a new reference resource of genetic models of protein abundance, and facilitate trait-relevant protein discovery in underrepresented populations.
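The abstract does not spell out how MABM combines its component approaches. As a rough illustration of the general idea of choosing a best-performing prediction model per protein by cross-validation, the following minimal Python sketch compares a target-ancestry-only fit with a fit that borrows an effect estimate from another ancestry. Everything here is an assumption made for illustration: the single-variant setup, the fixed shrinkage weight, and the names `fit_slope`, `cv_r2`, `make_candidates`, and `select_best` are hypothetical and are not taken from the authors' implementation.

```python
import random
from statistics import mean


def fit_slope(x, y):
    """Least-squares slope of y on x (no intercept; x assumed centered)."""
    sxx = sum(v * v for v in x)
    return sum(a * b for a, b in zip(x, y)) / sxx if sxx > 0 else 0.0


def cv_r2(x, y, fit, k=5):
    """Predictive R^2 (1 - SSE/SST) from k-fold cross-validation."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        slope = fit([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            preds[i] = slope * x[i]
    y_bar = mean(y)
    sse = sum((a - p) ** 2 for a, p in zip(y, preds))
    sst = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - sse / sst if sst > 0 else 0.0


def make_candidates(aux_slope, weight=0.5):
    """Hypothetical candidate models for one protein.

    'target_only' uses target-ancestry data alone; 'borrowed' shrinks the
    target estimate toward a slope estimated in an auxiliary ancestry.
    The fixed weight is an illustrative assumption, not a tuned value.
    """
    return {
        "target_only": lambda x, y: fit_slope(x, y),
        "borrowed": lambda x, y: weight * aux_slope
        + (1.0 - weight) * fit_slope(x, y),
    }


def select_best(x, y, candidates, k=5):
    """Return the candidate with the highest cross-validated R^2."""
    scores = {name: cv_r2(x, y, fit, k) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Simulated small target-ancestry sample (values are made up).
random.seed(0)
x_sim = [random.choice([-1.0, 0.0, 1.0]) for _ in range(40)]
y_sim = [0.4 * v + random.gauss(0.0, 1.0) for v in x_sim]
best_sim, scores_sim = select_best(x_sim, y_sim, make_candidates(aux_slope=0.35))
print(best_sim, scores_sim)
```

In this toy setup the selection is made separately for each protein, so a protein whose genetic effects differ across ancestries can still fall back to the target-only model.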
Source
Cross-ancestry information transfer framework improves protein abundance pred…
https://www.biorxiv.org/content/10.1101/2025.08.13.670235v1?rss=1