T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to Cancer Classification

  • Conference paper

Abstract

Cancer is a complex disease marked by uncontrolled cell growth, which can lead to tumors and metastases. Identifying the cancer type is crucial for treatment decisions and patient outcomes. T cell receptors (TCRs) are proteins central to adaptive immunity: they recognize specific antigens and play a pivotal role in immune responses, including those against cancer. The diversity of TCRs makes them promising for targeting cancer cells, and advances in sequencing have revealed potent anti-cancer TCRs and enabled TCR-based therapies. Analyzing these complex biomolecules effectively requires a representation that captures their structural and functional properties. We explore sparse coding for multi-class classification of TCR protein sequences, with cancer categories as the targets. Sparse coding is a machine learning technique that represents data with a small set of informative features, capturing intricate amino acid relationships and subtle sequence patterns. We compute the k-mers of each TCR sequence and apply sparse coding to them to extract key features. Integrating domain knowledge improves the predictive power of the embeddings by incorporating cancer properties such as human leukocyte antigen (HLA) types, gene mutations, clinical traits, immunological features, and epigenetic changes. Applied to a TCR benchmark dataset, our embedding method significantly outperforms the baselines, achieving 99.8% accuracy. Our study underscores the potential of sparse coding for dissecting TCR protein sequences in cancer research.
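The k-mer step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the authors' exact encoding, so everything below is an assumption made for illustration: the choice of k = 3, the 20-letter amino acid alphabet, the helper names (`kmers`, `kmer_index`, `sparse_kmer_vector`), and the example sequence are all hypothetical. The sketch stores only the nonzero k-mer counts of a sequence, which is one simple way to obtain a sparse feature vector; it is not the paper's sparse coding method.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids form the alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmers(sequence, k=3):
    """Return all overlapping k-mers (sliding window, stride 1)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_index(k=3):
    """Map every possible k-mer over the alphabet to a fixed column index."""
    return {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}


def sparse_kmer_vector(sequence, index, k=3):
    """Return {column: count}, keeping only the nonzero entries.

    k-mers containing characters outside the alphabet are skipped.
    """
    counts = Counter(kmers(sequence, k))
    return {index[m]: c for m, c in counts.items() if m in index}


# Hypothetical TCR-like amino acid sequence, used only for illustration.
index = kmer_index(3)
vec = sparse_kmer_vector("CASSLGQGAYEQYF", index)
print(f"{len(vec)} nonzero entries out of {len(index)} possible 3-mers")
```

With k = 3 the feature space has 20^3 columns, while a short receptor sequence touches only a handful of them, which is why a dictionary of nonzero entries is used here instead of a dense array.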

Z. Tayebi and S. Ali—Equal Contribution.

Author information

Authors and Affiliations

  1. Georgia State University, Atlanta, GA, 30302, USA

    Zahra Tayebi, Sarwan Ali, Prakash Chourasia, Taslim Murad & Murray Patterson

Corresponding author

Correspondence to Murray Patterson.

Editor information

Editors and Affiliations

  1. School of Automation, Central South University, Changsha, China

    Biao Luo

  2. Institute of Automation, Chinese Academy of Sciences, Beijing, China

    Long Cheng

  3. Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, China

    Zheng-Guang Wu

  4. School of Automation, Guangdong University of Technology, Guangzhou, China

    Hongyi Li

  5. School of Electrical Engineering and Telecommunications, UNSW Sydney, Sydney, NSW, Australia

    Chaojie Li

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Tayebi, Z., Ali, S., Chourasia, P., Murad, T., Patterson, M. (2024). T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to Cancer Classification. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1964. Springer, Singapore. https://doi.org/10.1007/978-981-99-8141-0_17
