Data-driven analysis of the impact of Universal Credit on mental health usage using a novel data linkage

Lead Supervisor
Dr Sharon Stevelink
Lecturer in Epidemiology
Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience (IoPPN), King’s College London

Dr Sumithra Velupillai (joint first supervisor)
Lecturer in Applied Health Informatics, King’s College London

Professor Nicola Fear (third supervisor)
Professor of Epidemiology, King’s College London

Project Details

The student will have a unique opportunity to contribute to a larger work package exploring the interrelationships between the implementation of Universal Credit (UC) and people’s benefit status, mental health problems and mental health care usage.

The student’s PhD will aim:

  1. To develop data-driven Natural Language Processing (NLP) methods for defining and extracting patient-reported experiences related to benefits and financial situations from the Clinical Record Interactive Search (CRIS) database, including creating reference standards and annotation guidelines.
  2. To explore the impact of these factors and others, such as age, gender, socio-economic status, employment status, type of mental health problem and treatment, among people who attend mental health care services and receive UC, and how this varies over time and by type of UC entitlement.

These aims can be adjusted in line with the interests and expertise of the successful applicant. This PhD will help the applicant to develop skills in NLP, managing big data, longitudinal modelling, epidemiology and informatics.

The proposed PhD is timely and relevant as major changes have taken place in the UK benefit system, including the introduction of Universal Credit (UC). Currently, there are 2.8 million people on UC, and this is set to rise to 6 million when UC is fully rolled out, costing society £60 billion in payments annually. There have been concerns about the impact of these changes on people’s health and wellbeing, especially for patients with mental health problems. In mental health care, clinical encounters are documented in electronic mental health care records in a combination of structured data (e.g. age, diagnosis) and unstructured, free-text notes (e.g. clinical assessments, referral letters, daily ward notes). The extent to which social and behavioural determinants of health, including financial resource strains, are documented during healthcare assessments as structured or unstructured data is not known.

Natural language processing (NLP) algorithms have been successfully applied to extract clinically relevant information from clinical text, such as the records held in CRIS. Examples include extracting symptom profiles, treatments and exposures. However, NLP models for extracting information related to patient-reported financial burdens and the impact of these on patients’ mental health have not been widely studied. Modelling patient-reported information as documented by clinicians involves developing novel NLP approaches that are able to capture complex expressions and idioms. Recent advances in language representation models, e.g. embeddings, have had a huge impact on the wider NLP community, but complex expressions remain a challenge.

The student will be part of a wider interdisciplinary research team consisting of epidemiologists, informaticians, computer scientists and healthcare professionals. The team has received approval to establish a unique data linkage in which electronic mental health care records of people receiving treatment from the South London and Maudsley NHS Trust (the Clinical Record Interactive Search – CRIS – database) are linked with benefit records from the Department for Work and Pensions (DWP). This approved linked dataset will include information about approximately 400,000 working-age adults with mental health problems, covering the years 2007 to 2019, creating the largest cohort of people referred to psychiatric services.

Supervisory team

Dr Stevelink is an epidemiologist and the principal investigator on the DWP/CRIS data linkage. Dr Velupillai has an excellent track record in clinical natural language processing, particularly semantic and contextual analysis. Professor Fear is an epidemiologist with an interest in military mental health and has worked with a variety of linked datasets, including DWP data.


The Department for Work and Pensions and the South London and Maudsley NHS Trust have entered into a data sharing agreement, and work is underway to create a de-identified, linked dataset. Ethical and Health Research Authority Confidentiality Advisory Group approvals (s251) are in place. Internal approval from the CRIS Oversight Committee and the Work and Health Screening Panel will be required to ensure that the research adheres to the agreed standards of research and dissemination.


Keywords: mental health, natural language processing (NLP), big data, benefits, data linkage, electronic health records (EHR)