A neurosymbolic approach to generating trust in Artificial Intelligence innovations in medicine

Lead Supervisor
Dr Zina Ibrahim
Lecturer in Computer Science for Health Informatics
Dept. of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience (IoPPN)
King’s College London
zina.ibrahim@kcl.ac.uk

Co-supervisor
Dr Johnny Downs

Project Details

Overview:
The medical domain is characterised by the generation of massive data streams of patient vital signs, laboratory tests, symptoms, diagnoses, medications, etc. It is therefore a source of high-dimensional time-series data with highly granular observations and large potential for generating useful, actionable insight.
An active area of research is the development of deep learning models to generate prognostic predictions (e.g. of mortality or deterioration) from patient time-series (e.g. [1-4]). However, existing models are generally black-box frameworks that lack explainability and do not meet the user-centric requirements needed to achieve clinical utility. This project builds on the supervisor’s team’s existing work on designing accurate, robust and explainable prognostic algorithms, combining the power of deep learning with reasoning frameworks that capture domain expertise using principled software engineering methodologies. The project extends the state of the art, an existing deep learning architecture developed by the supervisor’s team [4], by achieving the following aims:

Project Aims:

  1. Develop and evaluate a reasoning framework capable of capturing existing clinical knowledge and reasoning pathways, such as clinical guidelines.
  2. Extend the existing deep learning prognostic framework [4] into a Transformer-based architecture capable of handling long-term dependencies in patient time-series data.
  3. Design and implement a prototype of a neurosymbolic early warning framework that integrates the deep learning model and the reasoning framework developed in Aims 1 & 2 (an illustrative sketch of this integration follows the list below).
  4. Work with domain experts to evaluate the prototype’s ability to provide guideline-backed, explainable predictions for clinical use cases. Potential use cases include: a) predicting the onset of ADHD in children, and b) predicting sepsis in ICU patients.
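
To make the intended neurosymbolic integration concrete, the minimal sketch below shows how a risk score produced by a neural model could be paired with symbolic rules derived from a clinical guideline, so that every alert carries clinician-readable evidence. The neural_risk_score stub, the rule thresholds and the escalation policy are illustrative placeholders only, not the published architecture [4] or any specific guideline.

```python
# Minimal illustrative sketch of a neurosymbolic early-warning step.
# The neural component is mocked with a stub; the symbolic component encodes
# example guideline-style rules. All thresholds are placeholders.
from dataclasses import dataclass
from typing import Callable

@dataclass
class GuidelineRule:
    name: str                          # human-readable rule identifier
    applies: Callable[[dict], bool]    # predicate over the latest observations

# Example rules loosely inspired by early-warning criteria (placeholder values).
RULES = [
    GuidelineRule("Tachycardia (HR > 110 bpm)", lambda obs: obs["heart_rate"] > 110),
    GuidelineRule("Hypotension (SBP < 90 mmHg)", lambda obs: obs["systolic_bp"] < 90),
    GuidelineRule("Fever (Temp > 38.3 C)", lambda obs: obs["temperature"] > 38.3),
]

def neural_risk_score(obs: dict) -> float:
    """Stand-in for the deep prognostic model: returns a risk in [0, 1]."""
    # A trained Transformer over the full patient time-series would go here.
    return min(1.0, 0.01 * max(0.0, obs["heart_rate"] - 60) + 0.3 * (obs["systolic_bp"] < 90))

def explain_prediction(obs: dict) -> dict:
    """Combine the neural score with the guideline rules that fired on the same data."""
    score = neural_risk_score(obs)
    fired = [r.name for r in RULES if r.applies(obs)]
    return {
        "risk_score": round(score, 2),
        "guideline_evidence": fired,              # symbolic, clinician-readable support
        "flag": score > 0.5 or len(fired) >= 2,   # example escalation policy
    }

if __name__ == "__main__":
    latest = {"heart_rate": 118, "systolic_bp": 86, "temperature": 38.6}
    print(explain_prediction(latest))
```

In the full project, the stub would be replaced by the Transformer-based model of Aim 2 and the hard-coded rules by the formal guideline representation developed in Aim 1.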

Timeliness and Novelty:
The healthcare system is in dire need of robust, generalisable and trustworthy models that can predict adversity for patients at risk of deterioration [5], such as those in acute settings, where there is currently an unacceptably high rate of mortality and morbidity [6]. There have been successful implementations of high-performing DL-based early warning systems (EWS) [7], including our platform, which has outperformed the prior state of the art [4]. However, all existing platforms uniformly rely on fixed DL architectures, lacking sensitivity to contextual changes and the ability to identify data biases or provide clinically actionable explanations of the predicted outcomes. Such approaches have failed to gain healthcare worker support [8], which warrants a novel, user-centric approach that ensures safety and accountability. This proposal of a neurosymbolic framework is in line with what is being termed the ‘3rd AI spring’, whereby sound, flexible and explainable modular frameworks of specialised interacting components are designed to complement the unprecedented advances in ML with principled knowledge representation and reasoning [9].

Proposed plan of work:
Year 1: The first year will comprise the following activities:

  1. Obtain a King’s Health Partners research passport for the student to enable access to data from partner hospitals. This will be done through an existing governance framework that issues research passports via a KCH honorary contract.
  2. Data Extraction:
    a. Extract data from King’s College Hospital (KCH).
    b. Complete the tasks of data restructuring, normalisation, cleaning and merging, to create data resources that are ready for use in subsequent stages of the project (see the illustrative sketch after this list).
  3. Training: please see the Training section.
  4. Develop the building blocks of the reasoning model (the knowledge representation and knowledge discovery components) through formal analysis and coding, so that a working prototype is available by the end of the second year.
  5. Literature Review: Complete the literature review, building on the readings done as part of the upgrade report and in line with recent developments in the field. This will serve as the initial chapter of the thesis.
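
The sketch below illustrates, under simple assumptions, what the restructuring step in item 2b might look like: pivoting irregular, long-format EHR observations into an hourly patient-by-feature grid with bounded forward-filling. The column names (patient_id, charttime, variable, value) are hypothetical and would be replaced by the actual extraction schema.

```python
# Illustrative sketch of the restructuring step: pivoting irregular, long-format
# EHR observations into an hourly patient-by-feature grid ready for modelling.
# Column names are assumptions, not the actual KCH schema.
import pandas as pd

def to_hourly_grid(events: pd.DataFrame, max_ffill_hours: int = 6) -> pd.DataFrame:
    """Resample long-format observations to an hourly grid per patient."""
    wide = (
        events
        .pivot_table(index=["patient_id", "charttime"],
                     columns="variable", values="value", aggfunc="mean")
        .reset_index()
        .set_index("charttime")
    )
    hourly = (
        wide.groupby("patient_id")
        .resample("1h").mean()                    # hourly bins per patient
        .groupby(level="patient_id")
        .ffill(limit=max_ffill_hours)             # carry values forward, bounded
        .drop(columns="patient_id", errors="ignore")
    )
    return hourly

if __name__ == "__main__":
    demo = pd.DataFrame({
        "patient_id": [1, 1, 1, 2],
        "charttime": pd.to_datetime(
            ["2024-01-01 00:10", "2024-01-01 00:40", "2024-01-01 03:05", "2024-01-01 01:20"]),
        "variable": ["heart_rate", "systolic_bp", "heart_rate", "heart_rate"],
        "value": [92.0, 118.0, 104.0, 75.0],
    })
    print(to_hourly_grid(demo))
```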
Year 2: The second year will comprise the following activities:
  6. Devise an extended, Transformer-based architecture and examine its performance improvement over [4] (see the illustrative sketch after this year's list).
    a. Develop the explainability framework combining the neural (deep) architecture with the symbolic guideline reasoner.
  7. Compile (at most) two sources of external standard datasets for evaluating the algorithms developed throughout this year, before testing them on real hospital data. Suggested data sources include:
    a. The MIMIC-III dataset, freely available after completing an online data privacy course.
    b. The UCLH intensive care database.
  8. Write the methods chapters of the thesis, in light of the developments made during the second year of the project.
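
As a point of reference for item 6, the sketch below shows one way a Transformer encoder could be applied to the hourly patient grids produced in Year 1, using padding masks to handle variable-length stays. It is a minimal PyTorch illustration with placeholder hyperparameters, not the architecture of [4] or the design to be developed.

```python
# Illustrative sketch (not the architecture of [4]) of a Transformer encoder
# over hourly patient feature grids, with padding masks so variable-length
# stays are handled. Hyperparameters are placeholders.
import torch
import torch.nn as nn

class TimeSeriesTransformer(nn.Module):
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4,
                 n_layers: int = 2, max_len: int = 512):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)      # embed each hourly vector
        self.pos_embedding = nn.Embedding(max_len, d_model)   # learned positional encoding
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)                     # risk logit per patient

    def forward(self, x: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features); pad_mask: (batch, time), True where padded
        positions = torch.arange(x.size(1), device=x.device).unsqueeze(0)
        h = self.input_proj(x) + self.pos_embedding(positions)
        h = self.encoder(h, src_key_padding_mask=pad_mask)
        h = h.masked_fill(pad_mask.unsqueeze(-1), 0.0)
        pooled = h.sum(dim=1) / (~pad_mask).sum(dim=1, keepdim=True).clamp(min=1)
        return self.head(pooled).squeeze(-1)                  # deterioration risk logit

if __name__ == "__main__":
    model = TimeSeriesTransformer(n_features=8)
    x = torch.randn(2, 48, 8)                        # two patients, 48 hourly steps
    pad_mask = torch.zeros(2, 48, dtype=torch.bool)  # no padding in this toy example
    print(model(x, pad_mask).shape)                  # torch.Size([2])
```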
Year 3: The third year will comprise the following activities:
  9. Evaluate the models developed on the sepsis use case in KCH, focusing on discrepancies and performance comparisons against the widely used MIMIC-III and UCLH datasets (see the illustrative sketch after this year's list).
  10. Write the analysis chapters of the thesis.
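
The sketch below illustrates the kind of cross-dataset comparison intended in item 9: computing discrimination metrics (AUROC, AUPRC) for the same model on each cohort. The arrays are random stand-ins; real predictions on the KCH, MIMIC-III and UCLH cohorts would be substituted.

```python
# Illustrative sketch of the evaluation step: comparing discrimination metrics
# across datasets. The labels and scores below are random placeholders.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def summarise(name: str, y_true: np.ndarray, y_score: np.ndarray) -> None:
    print(f"{name:10s}  AUROC={roc_auc_score(y_true, y_score):.3f}  "
          f"AUPRC={average_precision_score(y_true, y_score):.3f}")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for cohort in ["KCH", "MIMIC-III", "UCLH"]:
        y_true = rng.integers(0, 2, size=500)                     # placeholder sepsis labels
        y_score = np.clip(y_true * 0.3 + rng.random(500), 0, 1)   # placeholder risk scores
        summarise(cohort, y_true, y_score)
```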
Year 4: The fourth year will comprise the following activities:
  11. Complete writing of the thesis, based on activities carried out through years 2 and 3.
  12. Complete writing of papers based on the accomplished work, and submit them to international journals, with preference for:
    a. Journal of the American Medical Informatics Association (JAMIA)
    b. Artificial Intelligence in Medicine
    c. BMC Medical Informatics and Decision Making

Theme Alignment and Knowledge Building:
By mining and learning from large and heterogeneous EHR sources, the project fits the call’s theme of Learning from Big Data for Health. In addition, the models will serve as building blocks for a potential clinical decision support tool for sepsis prediction, thereby fitting the Knowledge Representation for Clinical Decision Support theme.
The project will extend machine learning models developed by the supervisor [4] to build and evaluate a tool for assessing sepsis risk according to patients’ digital phenotypes.

Datasets

This project will utilise electronic health records data from King’s Health Partners hospitals. The student will apply for a King’s Health Partners research passport to enable access to data from partner hospitals. This will be done through an existing governance framework that issues research passports via a KCH honorary contract.

Keywords

Artificial Intelligence, Machine Learning, Neurosymbolic AI, Explainability, Trust