Many microbial species have developed adaptations for coping with life in extreme cold, particularly in the cryosphere. Ice-binding proteins (IBPs) play a critical role in enabling organisms to survive in these environments. IBPs can be divided into two distinct functional classes: antifreeze proteins (AFPs) and ice-nucleation proteins (INPs). The two classes are distinguished by their specific modes of interaction with ice. Here, we introduce PLM-ICE, a computational system designed to predict IBPs with high precision and sensitivity. Leveraging ESM-2 embeddings, which incorporate evolutionary and functional sequence signals more effectively than conventional embeddings, our model employs a frozen ESM-2 encoder coupled to a multilayer perceptron (MLP) prediction head. This architecture accurately identifies AFPs and INPs, surpassing existing methods (e.g., VotePLMs-AFP) in Matthews correlation coefficient (MCC), area under the precision-recall curve (AUPR), and area under the receiver operating characteristic curve (AUROC). Our findings indicate that PLM-ICE performs robustly across broad datasets encompassing bacterial genomic sequences, highlighting its potential for wide-ranging implementation. Notably, the ability of ESM-2 to capture essential sequence patterns gives PLM-ICE advantages in both basic research and industrial settings, where prompt and reliable identification of IBPs remains a priority. Further, the model's strong performance underscores the broader promise of protein language model-based pipelines for decoding complex biological networks and driving innovations in cryopreservation, food technology, and climate studies. Together, these data demonstrate that PLM-ICE provides novel insight into IBP classification and stands poised to advance biotechnology applications focused on freezing tolerance and specialized temperature adaptations.
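Of the metrics named above, MCC is the one most easily checked by hand. The following is a minimal sketch of the standard binary MCC formula, not the evaluation code used for PLM-ICE. The function name `matthews_corrcoef_binary` and the toy labels are hypothetical, chosen only for illustration, with 1 standing for an IBP and 0 for a non-IBP.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Binary Matthews correlation coefficient from 0/1 labels.

    Hypothetical helper for illustration; returns 0.0 when the
    denominator is zero, which is the usual convention.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels (made up for this example): 1 = IBP, 0 = non-IBP.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

# Here TP = 3, TN = 3, FP = 1, FN = 1, so MCC = (9 - 1) / 16 = 0.5.
print(matthews_corrcoef_binary(y_true, y_pred))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it stays informative when positives are rare. That is the usual reason for reporting it alongside AUPR and AUROC.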
Source
PLM-ICE: A Protein Language Model-based Approach for Prediction of Ice nuclea…
https://www.biorxiv.org/content/10.1101/2025.08.13.669380v1?rss=1