Date: 2024-08-06

Guest Article by Alejandro Tlaie, Ernst Strüngmann Institute for Neuroscience, and Jaisalmer de Frutos Lucas, European Public Health Alliance.

Large Language Models (LLMs) are advanced AI systems designed to understand and generate human-like text based on the data they have been trained on. These systems have, to the surprise of many, rapidly grown in capability [1], and now handle tasks ranging from translation and summarisation to grammar checking and programming. They also have numerous applications in healthcare. For example, they can be used for obtaining insurance pre-authorisation, managing clinical documents, extracting information from research papers, or answering patients' questions about their health concerns [2].

The use of AI-based medical tools in healthcare is already regulated. However, LLMs are not necessarily designed as medical tools, so they are not required to comply with existing medical regulations. Yet their use for health-related purposes could lead to harm. There is also a well-documented effect [3] by which people tend to trust answers from computers more than answers from humans. Together, this regulatory gap and this trust bias make it more likely that people will treat LLMs as if they were neutral and objective systems. In a recently published pre-print [4], the author explores whether the current training methods for LLMs impart a moral dimension to these models, even when they have not been explicitly trained to handle ethical considerations.

In a nutshell, already deployed LLMs do have moral preferences and biases, as measured by their agreement with different ethical schools of thought when presented with ethical dilemmas. Furthermore, when the models are given a questionnaire from moral psychology [5], their moral profiles turn out to be compatible with a very specific demographic: subjects from WEIRD (Western, Educated, Industrialised, Rich, and Democratic) societies [6]. The vast majority of the tested LLMs, such as GPT-4 or Claude-3-Sonnet, align with the moral schema of a young Western liberal male with a high level of education, engaged in social causes, and with great openness to experience, empathy, and compassion. The only exception is Llama-2-7B, which aligns more closely with an American conservative. Below is an example based on an item from the aforementioned questionnaire, together with the response from Claude-3-Sonnet, the LLM that was found to be the most liberally biased: