Using machine learning to understand multimorbidity progression and its prognostic impact in patients with heart failure

Lead Supervisor
Dr Zina Ibrahim
Lecturer in Computer Science for Health Informatics
Department of Biostatistics and Health Informatics, King’s College London

Co-Supervisors
Dr Rosita Zakeri (British Heart Foundation Centre of Excellence, King’s College London)
Dr Rebecca Bendayan (Department of Biostatistics and Health Informatics, King’s College London)

Industry Partner
Dr Andrew Cooper (Director, AstraZeneca)

Project Details


Heart failure (HF) is a major public health problem: 1.4% of the UK population currently has a diagnosis of HF, with the prevalence rising to greater than 10% among individuals over 70 years of age. Despite major advances in medical therapy, rates of morbidity and mortality, and the economic burden of HF, remain unacceptably high.

In recent years, the proportion of HF patients with multimorbidity (i.e. two or more co-existent chronic health conditions) and the average number of comorbidities per person have both increased, adding substantial complexity to the clinical management of HF. However, current HF treatment algorithms, which classify HF by cardiac function (i.e. ejection fraction) or aetiology (ischaemic versus non-ischaemic), fail to account for the heterogeneity in clinical course and outcomes of HF that may be related to multimorbidity. Particular knowledge gaps exist for HF subgroups under-represented in clinical trials, e.g. HF patients with coexistent chronic kidney disease (CKD) and the very elderly, and for patients with HF with preserved ejection fraction (HFpEF), among whom multimorbidity is common and no effective therapies are available.

Previous attempts to study multimorbidity in HF have largely examined baseline patient characteristics without follow-up (longitudinal) data. To design effective interventions that improve quality of life and prognosis among patients with HF, there is a critical need to characterise how complex multimorbidity patterns progress over time and to understand their contribution to the clinical course and outcomes after HF diagnosis. Given the volume of large-scale data contained within electronic health records (EHRs), there is a real opportunity to develop and use powerful machine learning methodologies to obtain previously inaccessible insights that can inform clinical decision making in HF management and risk stratification. Such methods enable the modelling, exploration and prediction of outcomes from the large, high-dimensional and uncertain longitudinal patient treatment timelines that are characteristic of chronic HF care.


This project is motivated by the team’s a) access to large-scale longitudinal data on patient treatment histories, diagnoses, symptoms, clinical observations, laboratory tests and prescriptions in the secondary care institutions associated with King’s Healthcare Partners, together with the availability of digital innovations (CogStack) that extract these data and reformulate them for secondary research; and b) combined expertise in cardiology, multimorbidity research and machine learning, which provides the guidance needed to develop sound models that give real insight into HF prognosis.

Aims and objectives 

The overall aim of this project is to develop and apply novel machine learning models that identify distinct and clinically insightful patterns and longitudinal trajectories of comorbidities in patients with heart failure, and to explore their associations with health outcomes. We will achieve this through the following objectives:

Objective 1: Data Extraction: To validate tools that use natural language processing techniques to extract data on health conditions within the HF cohort.

This will allow characterisation of the HF cohort and development of tools which can be further validated in other HF cohorts. 

Objective 2: Multimorbidity Pattern Identification: To identify temporal multimorbidity patterns in patients with HF using statistical machine learning algorithms.

This will allow us to identify subgroups of HF patients based on their multimorbidity patterns and understand how these individuals progressed from previous health conditions to HF. 
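
As a purely illustrative sketch, and not the method the project will necessarily adopt, one very simple kind of temporal pattern is the order in which conditions were diagnosed before HF. The toy records, the condition names and the helper functions `pre_hf_sequences` and `common_transitions` below are all hypothetical.

```python
from collections import Counter
from datetime import date

# Toy, made-up diagnosis records: (patient_id, condition, diagnosis_date).
RECORDS = [
    ("p1", "hypertension", date(2010, 1, 5)),
    ("p1", "diabetes", date(2013, 6, 1)),
    ("p1", "HF", date(2018, 3, 2)),
    ("p2", "hypertension", date(2011, 2, 1)),
    ("p2", "diabetes", date(2012, 9, 9)),
    ("p2", "CKD", date(2015, 4, 4)),
    ("p2", "HF", date(2019, 7, 7)),
    ("p3", "CKD", date(2014, 1, 1)),
    ("p3", "HF", date(2016, 5, 5)),
    ("p3", "diabetes", date(2017, 8, 8)),  # diagnosed after HF, so excluded
]


def pre_hf_sequences(records, index_condition="HF"):
    """Return {patient_id: conditions diagnosed before HF, in date order}."""
    by_patient = {}
    for pid, condition, when in records:
        by_patient.setdefault(pid, []).append((when, condition))

    sequences = {}
    for pid, events in by_patient.items():
        events.sort()  # chronological order
        hf_dates = [when for when, cond in events if cond == index_condition]
        if not hf_dates:
            continue  # no HF diagnosis: patient is not in the HF cohort
        first_hf = hf_dates[0]
        sequences[pid] = [cond for when, cond in events if when < first_hf]
    return sequences


def common_transitions(sequences):
    """Count consecutive condition pairs (A followed by B) across patients."""
    counts = Counter()
    for seq in sequences.values():
        counts.update(zip(seq, seq[1:]))
    return counts


sequences = pre_hf_sequences(RECORDS)
transitions = common_transitions(sequences)
for (first, second), n in transitions.most_common():
    print(f"{first} -> {second}: {n} patient(s)")
```

Counting pairwise transitions is only a descriptive starting point; grouping patients into subgroups would require a clustering or latent-class style model on top of sequences like these.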

Objective 3: Associations and Prediction (Cardiac and Renal Outcomes in HF): To predict the impact of HF multimorbidity subgroups on adverse outcomes, including worsening renal function (i.e. creatinine trajectories) and incident chronic kidney disease, worsening HF severity (i.e. cardiac function trajectories), all-cause and HF-specific hospitalisation, and death.

This will allow us to identify the high-risk HF subgroups among those identified in objective 2 and explore their predictive value for adverse renal and cardiac outcomes. 
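
To make the idea of a "high-risk subgroup" concrete, the sketch below compares crude adverse-outcome rates between subgroups and expresses them as risk ratios against a reference group. It is an assumption-laden toy: the subgroup labels, the data and the functions `event_rates` and `risk_ratios` are hypothetical, and a real analysis would likely need time-to-event methods and adjustment for confounders, which are omitted here.

```python
from collections import defaultdict

# Toy, made-up data: (subgroup label, whether an adverse outcome occurred).
PATIENTS = [
    ("cardiometabolic", True), ("cardiometabolic", True),
    ("cardiometabolic", False), ("cardiometabolic", False),
    ("renal", True), ("renal", True), ("renal", True), ("renal", False),
    ("low-burden", True), ("low-burden", False),
    ("low-burden", False), ("low-burden", False),
]


def event_rates(patients):
    """Return the crude proportion of patients with an event per subgroup."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for group, had_event in patients:
        totals[group] += 1
        if had_event:
            events[group] += 1
    return {group: events[group] / totals[group] for group in totals}


def risk_ratios(rates, reference):
    """Return each subgroup's event rate divided by the reference rate."""
    return {group: rate / rates[reference] for group, rate in rates.items()}


rates = event_rates(PATIENTS)
ratios = risk_ratios(rates, reference="low-burden")
for group in sorted(ratios, key=ratios.get, reverse=True):
    print(f"{group}: rate={rates[group]:.2f}, risk ratio={ratios[group]:.1f}")
```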

Objective 4: Proof-of-Concept Decision Support Tool: To develop algorithms for a proof-of-concept decision support tool that predicts prognostic outcomes (e.g. mortality and rehospitalisation for HF) from longitudinal treatment histories, using knowledge of multimorbidity patterns to aid prediction.

This will allow us to develop proof of concept for a clinical decision support tool based on machine learning algorithms. 

Findings from the test dataset will be internally and externally validated by leveraging the established collaborative network of the supervisory team. 

Expected value of results 

Knowledge of HF-associated multimorbidity patterns obtained from this project will have broad and immediate relevance to clinical care, by identifying key multimorbidity-related domains that should be considered in HF treatment strategies, as well as potential subgroups of HF patients who may benefit from stratified care or preventive interventions. This information will be particularly relevant for patients with HF with preserved ejection fraction (up to 50% of all HF cases), for whom no effective therapies are currently available, and for HF subgroups under-represented in existing clinical trials (e.g. patients with CKD). For healthcare systems, data from this project will provide vital information regarding immediate and forecasted priorities for HF-related healthcare delivery and resource allocation planning. Finally, demonstrating prognostic value for machine-learning-derived multimorbidity subgroups and proof of concept for clinical decision support would be a major step towards the delivery of personalised care for patients with HF.

Planned training in research methods

The proposal includes a comprehensive training plan for the student to gain expertise in state-of-the-art pattern recognition and machine learning techniques, natural language processing technology, and advanced statistical modelling to examine multimorbidity profiles associated with HF. The project aims and training plan will be delivered through active interdisciplinary collaboration between the British Heart Foundation Centre for Cardiovascular Research Excellence and the Department of Biostatistics and Health Informatics of the Institute of Psychiatry, Psychology and Neuroscience at KCL.