Journal article, Open Access

Integration of single-cell proteomic datasets through distinctive proteins in cell clusters

   Koca, Mehmet Burak; Sevilgen, Fatih Erdoğan

The use of mass spectrometry and antibody-based sequencing technologies at the single-cell level has increased the number of single-cell proteomic datasets. Integrating these datasets is crucial for eliminating batch effects, which often arise because of the limited number of molecules sequenced. Although methods for horizontally integrating high-dimensional single-cell transcriptomic datasets can also be applied to single-cell proteomic datasets, an approach tailored explicitly to low-dimensional proteomic datasets may improve the integration. Here, we introduce SCPRO-HI, an algorithm for the horizontal integration of antibody-based single-cell proteomic datasets. It uses a hierarchical cell anchoring technique that matches cells based on the similarity of the distinctive proteins that define cell clusters. A novel variational auto-encoder model corrects batch effects directly on the protein abundances, eliminating the need to map them into a new domain. Moreover, we propose a technique for extending the algorithm to high-dimensional datasets. The performance of SCPRO-HI is evaluated on simulated and real-world single-cell proteomic datasets. The results show that our algorithm outperforms state-of-the-art methods, achieving a 75% higher silhouette score while preserving highly variable proteins (HVPs) 13% better. Furthermore, the algorithm performs competitively on transcriptomic datasets, suggesting potential for integrating high-dimensional mass-spectrometry-based proteomic datasets.
