Does AI Require a History Lesson?

02 Apr 2024

If the best way to move forward is to learn the lessons of history, is AI already at risk of reinforcing the cultural biases of centuries past?

It has been said that some of the greatest names in history stood on the shoulders of giants. Our background, our beliefs and our vision of the world all influence the way we move forward in it.

As thought-provoking as that might be, now put it in the context of a phenomenon set to be the largest tidal wave of innovation since the advent of the World Wide Web. The influence of artificial intelligence (AI) is already more pervasive than many people realise, and it will play an increasingly significant part in our everyday lives, month on month.

But what are the influences on AI? Through what lens does AI see the world?

By the very nature of how AI is created, particularly in the development of Large Language Models (LLMs) like ChatGPT, the spectre of bias, especially bias rooted in white, Western perspectives, looms large.

LLMs such as ChatGPT, Gemini or Claude are trained on vast datasets compiled from the internet: books, articles, websites and other text-based resources. There is no escaping the fact that these datasets are heavily skewed towards English-language content, which is disproportionately generated by white, Western authors.

This imbalance in source material inherently influences the model's understanding of culture, history and societal norms, embedding a white, Western perspective at the very core of its knowledge base.

Given the central role AI is set to play in all areas of our lives, drawing on the fullest breadth of world experience is the only way for us to enjoy that diversity, and to gain a deeper understanding of who we are as a human race and of the potential our future holds.

Ask an LLM about significant historical moments today and it will often disproportionately highlight events from European and North American history, sidelining equally significant events from Asia, Africa, Latin America and other regions. This Eurocentric view has the potential to further distort the global historical narrative, reinforcing the dominance of white, Western perspectives.

This pattern is also seen in the realm of literature and art, with LLMs likely to prioritise Western authors and artists as references for quality and significance. This not only marginalises non-Western creators but also subtly implies a hierarchy of cultural value, biased towards Western works. There is a genuine risk that this bias not only reflects historical inequalities but also shapes the future of content creation in significant ways.

The key players in AI innovation have introduced initiatives to diversify the training datasets of LLMs to include a broader spectrum of languages, cultures and perspectives, both by translating non-English content into English and by training models to understand and generate content in multiple languages directly. There is also a growing emphasis on ethical AI development, which involves the active participation of diverse groups in the creation and training of LLMs.

By incorporating a wider range of voices and perspectives, the aim is to reduce inherent biases and ensure that future content creation is more inclusive and representative of global diversity.

Learning from the errors of history, the goal must be to ensure that AI technologies become tools for amplifying all voices, not just those from a dominant subset of society.

Now wouldn’t that be a rewriting of history?