Protein structure prediction has been revolutionized by AlphaFold, yet a key limitation remains: characterizing the multiple conformations adopted by proteins that can switch between different folds. Prevailing approaches sample the multiple sequence alignment (MSA) input, either at random or by clustering, but these methods are statistically inefficient and do not use coevolutionary information. To address this, we introduce an iterative sampling framework that systematically explores the MSA space using residue-specific frequencies and coevolutionary patterns inferred via Markov random fields. We also develop tools to identify a protein's variable regions and subsequently extract representative structures. Together, these components yield a compact, high-quality set of final structural models for downstream analysis, with good coverage of the distinct conformational states. On a benchmark set of fold-switching proteins, our method outperforms existing ones by substantially improving the diversity of the sampled structures. Overall, this work advances our ability to characterize the conformational landscape of proteins capable of adopting distinct folds, a key step towards understanding protein conformational dynamics and enabling the de novo design of protein switches.
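
As a rough illustration of two ingredients named in the abstract, the toy sketch below computes per-column residue frequencies from an MSA and performs the uniform random subsampling that the abstract describes as a baseline. The function names and the four-sequence alignment are hypothetical and chosen only for illustration; the sketch does not reproduce the iterative framework or the Markov random field inference.

```python
import random
from collections import Counter


def column_frequencies(msa):
    """For each alignment column, map residue -> relative frequency."""
    n_seqs = len(msa)
    length = len(msa[0])
    freqs = []
    for j in range(length):
        counts = Counter(seq[j] for seq in msa)
        freqs.append({res: c / n_seqs for res, c in counts.items()})
    return freqs


def random_subsample(msa, k, seed=0):
    """Baseline: keep the query (first row) and draw k - 1 other rows uniformly."""
    rng = random.Random(seed)
    others = rng.sample(msa[1:], k - 1)
    return [msa[0]] + others


# Toy alignment (assumed, equal-length rows; first row is the query).
toy_msa = ["MKVL", "MKIL", "MRVL", "MKVF"]

freqs = column_frequencies(toy_msa)
sub = random_subsample(toy_msa, k=3)
```

Uniform subsampling treats every sequence as interchangeable, whereas the per-column frequencies are the simplest kind of residue-specific statistic that a more targeted sampler could draw on.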
Source
Uncovering distinct protein conformations using coevolutionary information an…
https://www.biorxiv.org/content/10.1101/2025.10.08.681198v1?rss=1