11 February 2025

The Memory Machine

How Large Language Models Shape Our Collective Past

The standardization of collective memory refers to the processes through which shared memories and narratives are shaped, maintained, and transmitted within a society via formal procedures, often supported by policy. In the digital age, the advent of artificial intelligence (AI) and Large Language Models (LLMs) like GPT has significantly impacted this phenomenon. While these models offer unprecedented capabilities in processing and generating human-like text, they also pose risks to the diversity and plurality of collective memory. Here, I discuss how LLMs influence the standardization of collective memory, the potential dangers of their widespread use, and the importance of regulatory frameworks to mitigate these risks.

The nature of collective memory and its standardization

Collective memory is a socially constructed phenomenon that enables communities to recall and interpret the past, forming a foundation for shared identity and cultural continuity. Maurice Halbwachs emphasized that memory is deeply embedded in social contexts and interactions, asserting that individual recollections are influenced by societal norms and values. The standardization of collective memory involves establishing frameworks that determine which memories are preserved, how they are represented, and the mechanisms through which they are communicated across generations.

Sociologists have noted that modern societies experience a redefinition of collective memory due to the diversification of social fields, leading to a plurality of specialized memories rather than a unified narrative. This fragmentation challenges traditional normative frameworks and highlights the importance of recognizing multiple perspectives in the construction of collective memory.

The digital transformation of collective memory

The digital age has ushered in an era where traditional collective memories are digitized: digital versions of collections or memories that previously existed in other forms. For example, social media accounts may recontextualize historical events in modern formats, sharing fabricated photos and stories that mimic known narratives in a contemporary setting. While such accounts can raise awareness, perhaps in a more organic and horizontal manner, they also risk distorting historical accuracy.

Beyond digitization, entirely new forms of collective memory have emerged through digital platforms. Wikipedia, as an online encyclopedia, serves as a repository of collective memory reflecting the obsessions and biases of different communities regarding politics, history, and identity. The platform’s pluralistic nature allows multiple aspects of the same event to coexist, such as articles on the September 11 attacks alongside those discussing related conspiracy theories.

Digital platforms offer unique opportunities to study collective memory. Web analytics enable researchers to directly measure and analyze how current events trigger interest in past events. For example, a new plane crash can increase attention to similar past crashes, with the cumulative attention to those earlier events sometimes surpassing the attention given to the current one. Similarly, research has shown how external events such as terrorist attacks can affect collective memory and shared identities, influencing how societies perceive the “us versus them” narrative.

Large Language Models and the standardization of memory

To understand how Large Language Models affect our collective memory, it is important to first understand how they function. LLMs are trained on vast amounts of text data, learning patterns and relationships between words to predict the most likely word sequences in response to prompts. When given a question or prompt, an LLM generates a response by selecting words based on statistical probabilities derived from its training data. Wikipedia's text corpus, in particular, has been shown to feature heavily in the training data of current LLMs.

However, LLMs do not understand content as humans do, and they lack the ability to verify facts or access real-time information. Their responses are linguistic constructions rather than verified information; as a result, they can produce outputs that are coherent and well-structured but not necessarily truthful.
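This word-by-word prediction can be made tangible with a stylized illustration. The following is a toy sketch, not how any production model is implemented: a minimal bigram model (assumed corpus and vocabulary invented for the example) that always picks the statistically most frequent continuation.

```python
from collections import Counter, defaultdict

# Toy illustration: a bigram "model" that predicts the next word purely from
# co-occurrence frequencies in a tiny invented corpus. Real LLMs use neural
# networks trained on vastly larger corpora, but the core principle is the
# same: they select likely continuations, not verified facts.
corpus = (
    "collective memory is shaped by society . "
    "collective memory is transmitted across generations . "
    "collective identity is shaped by narratives ."
).split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def generate(start, length=5):
    """Greedily append the most frequent continuation at each step."""
    words = [start]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(generate("collective"))  # → collective memory is shaped by society
```

The sketch makes the limitation concrete: the output reads fluently because frequent patterns are reproduced, not because anything has been checked against reality.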

Risks of standardization

The reliance on LLMs introduces risks to the plurality of collective memory. Research indicates that while LLMs can mimic average human behavior, they struggle to replicate the diversity inherent in human societies. The iterative use of LLMs, where their outputs become part of future training data, can create a feedback loop that reinforces a narrow, standardized interpretation of events. Over time, this may erode the multifaceted nature of truth and collective memory, leading to a homogenized cultural narrative.
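The feedback loop described above can be sketched with a stylized simulation. The assumptions here are purely illustrative (three competing viewpoints, and a "model" that over-weights frequent views when generating the next corpus); the point is only that homogenization can emerge from the loop itself.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the illustration is reproducible

# Stylized sketch of the feedback loop: a model trained on a corpus of
# viewpoints reproduces majority views slightly more often than their true
# share (bias > 1); its outputs then become the next training corpus.
corpus = ["A"] * 60 + ["B"] * 30 + ["C"] * 10  # initial plurality of views

def retrain_and_generate(corpus, n=100, bias=1.5):
    """Sample a new corpus of size n, over-weighting frequent views."""
    counts = Counter(corpus)
    views = list(counts)
    weights = [counts[v] ** bias for v in views]
    return random.choices(views, weights=weights, k=n)

for generation in range(8):
    print(generation, sorted(Counter(corpus).items()))
    corpus = retrain_and_generate(corpus)
```

Because each round amplifies the majority, the minority views tend to vanish within a few generations even though nothing in the setup suppresses them explicitly.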

This can lead to the marginalization of minority viewpoints or alternative perspectives that are underrepresented in the data. This would be in stark contrast to the promise of Web 2.0 and crowd-based information repositories such as Wikipedia.

Propagating illusions of consensus and pluralistic ignorance

As users interact with LLMs and perceive them as authoritative sources, there is a risk of creating an illusion of consensus: individuals may come to believe that agreement exists on an issue where none does, especially when an LLM's responses lean toward a specific perspective. In particular, the repetition of a single source's claim can be misinterpreted as agreement among multiple independent sources.

The illusion of consensus can combine with the spiral of silence, the mechanism described by Elisabeth Noelle-Neumann in which individuals hesitate to express minority opinions due to a perceived lack of support, further isolating those views and perpetuating a self-reinforcing cycle of silence. Together, these dynamics can lead societies to adopt non-pluralistic perspectives on complex issues. Unlike platforms such as Wikipedia, which uphold principles of neutrality and pluralism through transparent editorial processes, LLMs operate with opaque, proprietary training data, limiting the representation of marginalized opinions.

The widespread use of LLMs in journalism, academia, social media, and other areas amplifies these risks. As LLMs become integrated into content creation and information dissemination, they may inadvertently standardize narratives and influence public perception. This could have profound implications for democratic discourse, cultural diversity, and the preservation of minority histories and perspectives.

The potential for LLMs to shape collective memory necessitates a critical examination of their role in society. Without intervention, there is a danger that LLMs could contribute to the erosion of cultural pluralism and the homogenization of societal narratives.

Regulatory frameworks and mitigation strategies

Regulating the datasets used to train LLMs is crucial. Ensuring that training data includes diverse perspectives, especially from marginalized groups, can help mitigate biases. Transparency in data selection and curation processes is essential to hold developers accountable for the representativeness of their models. Efforts should be made to audit and document the sources of training data.

These requirements are not met by current regulations, such as the EU AI Act.

Promoting digital literacy

Enhancing public understanding of how LLMs work is another crucial step in mitigating risks. Educational initiatives can help users critically evaluate AI-generated content, recognize potential biases, and understand the limitations of LLMs. By fostering digital literacy, societies can reduce the likelihood of misinformation spreading unchecked.

Investing in research to develop LLMs that prioritize diversity and context-awareness is essential. This includes exploring methods to incorporate ethical considerations into model training and developing techniques to preserve the multifaceted nature of collective memory. Collaborative efforts between technologists, social scientists, and ethicists can lead to more responsible AI systems.

In an era where Large Language Models increasingly shape how societies remember and interpret history, it is crucial to recognize their potential impact on the diversity and plurality of collective memory. By implementing regulatory frameworks, fostering digital literacy, and prioritizing ethical AI development, we can ensure that these technologies enhance rather than homogenize our shared narratives, preserving the richness of human history for future generations.


SUGGESTED CITATION  Yasseri, Taha: The Memory Machine: How Large Language Models Shape Our Collective Past, VerfBlog, 2025/2/11, https://verfassungsblog.de/the-memory-machine/.
