Data Science Strategies for Cancer Immunotherapy Applications

Lead Supervisor
Dr Sophia Tsoka
Reader in Bioinformatics
Informatics/NMS, King’s College London

Co-Supervisor
Professor Sophia Karagiannis
Professor of Translational Cancer Immunology and Immunotherapy
King’s College London

Project Details

The tumour microenvironment is an inflammatory environment that drives tumour growth and suppresses anti-tumour responses, and it is underpinned by a complex network of interactions between immune, stromal and tumour cells. Immunotherapy offers a promising avenue to target cancer cells by disrupting the interactions that sustain this supportive microenvironment. To develop effective and potent immunotherapies, the tumour microenvironment and the relevant signalling events must be fully characterised.

This project will explore antibody-immune cell interactions in cancer. We have previously reported mechanistic analyses in the context of cancer immunotherapy (e.g. [1]). Here, we aim to analyse publicly available gene expression data (including single-cell RNA sequencing) through flexible, accurate and efficient models based on combinatorial optimisation principles to address (i) communities of multiple interaction types, (ii) communities of consistently regulated genes and (iii) multiscale community modelling.

Task 1: Integration of datasets from multiple biological resources: Data management and integration are instrumental in managing the heterogeneous nature of diverse biological data sources, establishing the context for further analysis, and facilitating method development, comparisons and data sharing. Public repositories will be used to construct a unified data resource for analysis. This resource will draw on multiple sources, such as genomic and interaction databases (STRING, BioGRID), metabolic and signalling pathways (Reactome, KEGG) and functional annotations (Gene Ontology). Graph database frameworks (such as Neo4j) will be explored as a scalable option that also handles data heterogeneity well.
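As an illustration of the kind of unified resource described in Task 1, the following minimal sketch merges interaction records and annotations from several sources into one in-memory property graph, keeping track of which source supports each edge. The record layout, the identifiers (G1, G2, ...), the annotation values and the function name build_resource are all hypothetical; real exports from STRING, BioGRID, Reactome or Gene Ontology use their own formats and identifier schemes, and a graph database such as Neo4j would replace the dictionaries used here.

```python
from collections import defaultdict

# Hypothetical toy records: (source, gene_a, gene_b). Identifiers are placeholders.
interactions = [
    ("STRING", "G1", "G2"),
    ("BioGRID", "G2", "G1"),   # same pair reported by a second source
    ("BioGRID", "G1", "G3"),
]

# Hypothetical annotations: (source, gene, annotation kind, value).
annotations = [
    ("Reactome", "G1", "pathway", "P1"),
    ("Gene Ontology", "G1", "function", "F1"),
    ("KEGG", "G3", "pathway", "P2"),
]


def build_resource(interactions, annotations):
    """Merge records into node properties and undirected, source-tagged edges."""
    nodes = {}                # gene -> {annotation kind -> set of (source, value)}
    edges = defaultdict(set)  # (gene_a, gene_b) sorted -> set of supporting sources

    for source, a, b in interactions:
        key = tuple(sorted((a, b)))  # treat interactions as undirected
        edges[key].add(source)
        for gene in (a, b):
            nodes.setdefault(gene, defaultdict(set))

    for source, gene, kind, value in annotations:
        nodes.setdefault(gene, defaultdict(set))[kind].add((source, value))

    return nodes, dict(edges)


nodes, edges = build_resource(interactions, annotations)
for pair, sources in sorted(edges.items()):
    print(pair, sorted(sources))
```

Recording the supporting sources on each edge, rather than discarding duplicates, keeps the provenance needed later to treat each source as a separate network layer.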

Task 2: The detection of composite communities: Integrating multiple data sources into a single clustering algorithm to detect composite modules may capture cellular functions more accurately. Previously, we reported the development of consensus clustering, where multilayer networks corresponding to diverse sources of interactions were combined to determine a single representative partition of composite communities [2]. This work will be extended to include meta-data annotations of network nodes, and to rationalise the choice of data layers in the model through information theory, thereby overcoming limitations of existing work.
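To make the idea of a single partition scored across several network layers concrete, the sketch below evaluates one candidate partition by averaging standard (Newman) modularity over the layers of a small multiplex network. This is only one simple scoring choice and is not claimed to be the formulation of [2]; the layer names, gene identifiers and function names are hypothetical, and no optimisation over partitions is attempted here.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of a partition on one undirected, unweighted layer."""
    m = len(edges)
    internal = defaultdict(int)  # edges with both endpoints in the community
    degree = defaultdict(int)    # summed node degree per community
    for u, v in edges:
        degree[community_of[u]] += 1
        degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def composite_score(layers, community_of):
    """Average modularity of the same partition across all layers."""
    return sum(modularity(e, community_of) for e in layers.values()) / len(layers)


# Hypothetical two-layer network over six placeholder genes.
layers = {
    "physical": [("g1", "g2"), ("g2", "g3"), ("g1", "g3"),
                 ("g4", "g5"), ("g5", "g6"), ("g4", "g6"), ("g3", "g4")],
    "coexpression": [("g1", "g2"), ("g1", "g3"),
                     ("g4", "g5"), ("g4", "g6"), ("g5", "g6")],
}

two_modules = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}
one_module = {gene: 0 for gene in two_modules}

split_score = composite_score(layers, two_modules)
single_score = composite_score(layers, one_module)
print(split_score > single_score)
```

A partition that places every gene in one community always scores zero under this measure, so it serves as a baseline against which the two-module partition can be compared.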

Task 3: The detection of consistently regulated communities: While composite modules are based on link density, the aim here is to find modules that meet additional criteria, in particular consistent regulation of the constituent genes. Various approaches have been taken to analyse gene regulatory networks, e.g. in silico simulation of fluctuations in network components when the network is perturbed. Global qualitative analysis is proposed here, which formalises automatic reasoning where experimentally derived protein interactions are contrasted with a generic network topology from reference data.
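One simple way to picture consistent regulation is an edge-level sign check: in a signed reference network (activation +1, inhibition -1), an edge is consistent with the data when the observed change of the target equals the change of the source multiplied by the edge sign. The sketch below computes the fraction of consistent edges inside a candidate module. It is an illustrative assumption rather than the qualitative reasoning method proposed in the project, and the network, the observed changes and the function names are hypothetical.

```python
def edge_is_consistent(edge_sign, source_change, target_change):
    """True when the target's observed change matches the sign implied by the source."""
    return source_change * edge_sign == target_change


def module_consistency(edges, changes, module):
    """Fraction of within-module edges whose signs agree with observed changes.

    Edges touching a gene with no observed change (0 or missing) are skipped.
    Returns None when no edge in the module can be checked.
    """
    checked = 0
    consistent = 0
    for source, target, sign in edges:
        if source not in module or target not in module:
            continue
        s, t = changes.get(source, 0), changes.get(target, 0)
        if s == 0 or t == 0:
            continue
        checked += 1
        if edge_is_consistent(sign, s, t):
            consistent += 1
    return consistent / checked if checked else None


# Hypothetical signed reference topology: (regulator, target, +1 activation / -1 inhibition).
reference_edges = [
    ("G1", "G2", +1),
    ("G2", "G3", -1),
    ("G3", "G4", +1),
    ("G1", "G4", +1),
]

# Hypothetical observed expression changes: +1 up-regulated, -1 down-regulated.
observed = {"G1": +1, "G2": +1, "G3": -1, "G4": +1}

full_module = module_consistency(reference_edges, observed, {"G1", "G2", "G3", "G4"})
sub_module = module_consistency(reference_edges, observed, {"G1", "G2", "G3"})
print(full_module, sub_module)
```

In this toy case the edge from G3 to G4 contradicts the observations, so removing G4 yields a fully consistent module; a module detection method could use such a score as an additional criterion alongside link density.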

Overall, the proposed work represents a multidisciplinary programme of research with the potential for high-impact biomedical discovery through Bioinformatics, Systems Biology and Cancer Immunotherapy applications.

1. Nakamura M, et al. Immune mediator expression signatures are associated with improved outcome in ovarian carcinoma. Oncoimmunology, 8(6):e1593811, 2019.

2. Bennett, L., Kittas, A., Muirhead, G., Papageorgiou, L. G., Tsoka, S. Detection of composite communities in multiplex biological networks. Sci. Rep. 5, 10345, 2015.


Public data will be used. Where ethical clearance is required, appropriate authorisation is already held by the co-supervisor.


Systems Biology, Bioinformatics, Cancer Immunotherapy, Network Analysis