Boris Johnson Beats AI or Garbage In, Garbage Out: The Critical Role of Data
- Aug 22, 2023
Updated: Dec 27, 2023
In the realm of artificial intelligence (AI), the saying "garbage in, garbage out" holds more truth than ever. This well-known principle emphasises the direct relationship between the quality of input data and the outcomes produced by AI systems.

The Foundation of AI: Data
Artificial intelligence relies on vast amounts of data to learn and make predictions or decisions. Just as a human brain learns from experience, AI models learn from data. This data can take various forms, including text, images, audio, and more. However, the quality, diversity, and relevance of this data are of paramount importance.
Garbage In, Garbage Out
If the data used to train an AI model is of low quality, the model's performance will mirror that quality. The system will inevitably learn from the flawed inputs, leading to erroneous predictions and outcomes. This phenomenon is succinctly captured by the "garbage in, garbage out" principle.
Consider a scenario where an AI model is trained to recognise cats. If the training dataset primarily consists of images of dogs labelled as cats, the model will struggle to identify cats accurately. It's crucial to understand that AI models don't possess innate reasoning abilities; they make decisions based on patterns learned from data.
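To make the cat-and-dog scenario concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: each "image" is reduced to a single made-up number (near 0.0 for cats, near 4.0 for dogs), the classifier is a simple 1-nearest-neighbour lookup rather than a real image model, and the function names are hypothetical. It trains once on correct labels and once with roughly 80% of the dogs relabelled as cats, then scores both against correctly labelled test data.

```python
import random


def make_animals(n, rng):
    """Return n (feature, label) pairs. The single made-up feature clusters
    near 0.0 for cats and near 4.0 for dogs."""
    samples = []
    for _ in range(n):
        label = rng.choice(["cat", "dog"])
        centre = 0.0 if label == "cat" else 4.0
        samples.append((rng.gauss(centre, 1.0), label))
    return samples


def mislabel_dogs(samples, fraction, rng):
    """Relabel roughly `fraction` of the dogs as cats; features are untouched."""
    return [
        (x, "cat") if label == "dog" and rng.random() < fraction else (x, label)
        for x, label in samples
    ]


def predict(train, x):
    """1-nearest-neighbour: copy the label of the closest training example."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def accuracy(train, test):
    """Fraction of test examples whose predicted label matches the true one."""
    hits = sum(predict(train, x) == label for x, label in test)
    return hits / len(test)


def run_experiment(seed=0):
    """Return (accuracy with clean labels, accuracy with mislabelled dogs)."""
    rng = random.Random(seed)
    train = make_animals(500, rng)
    test = make_animals(500, rng)  # test labels stay correct
    noisy_train = mislabel_dogs(train, 0.8, rng)
    return accuracy(train, test), accuracy(noisy_train, test)


if __name__ == "__main__":
    clean, noisy = run_experiment()
    print(f"trained on clean labels:     {clean:.2f}")
    print(f"trained on mislabelled dogs: {noisy:.2f}")
```

The model and the test data are identical in both runs; only the training labels change. Any drop in the second score therefore comes from the labels alone, which is the "garbage in, garbage out" principle in miniature.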
We at TeachOatcake.com used the Quillbot Paraphrasing AI Tool to try to improve the data from Boris Johnson's notorious Peppa Pig speech. The results offered little more than the original speech.
If we look at this section of Boris Johnson's speech, taken from the UK Government's homepage, it is clear that the data going in is not of the highest quality.
When we compared the original text with the AI-generated "improved" text, there wasn't much difference. One might say that text of this quality going in could only produce this and nothing better.
Data Quality and Bias
One of the most pressing issues related to AI is bias. If the training data contains biases, the AI model will learn and perpetuate those biases in its predictions and decisions. This has real-world implications in various domains, from hiring processes to criminal justice systems. AI models have been known to exhibit racial, gender, and socio-economic biases, reflecting the biases present in the data they were trained on.
Data quality also extends to issues of completeness. If an AI model is trained on a limited dataset, it might fail to generalise well to new situations. For example, a medical diagnostic AI trained on data from a specific demographic might struggle when faced with cases from a different population.
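One simple way to spot this kind of gap is to report accuracy separately for each group rather than as a single overall figure. The sketch below assumes evaluation results are available as (group, predicted, actual) records; the group names and the records themselves are invented purely for illustration and do not come from any real system.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, predicted, actual).
    Returns a dict mapping each group to its accuracy."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical evaluation results, invented for illustration only.
results = [
    ("group_a", "ill", "ill"),
    ("group_a", "healthy", "healthy"),
    ("group_a", "ill", "ill"),
    ("group_a", "healthy", "healthy"),
    ("group_b", "healthy", "ill"),
    ("group_b", "healthy", "healthy"),
    ("group_b", "ill", "healthy"),
    ("group_b", "ill", "ill"),
]

per_group = accuracy_by_group(results)
overall = sum(p == a for _, p, a in results) / len(results)

print(overall)    # 0.75
print(per_group)  # {'group_a': 1.0, 'group_b': 0.5}
```

In this made-up example the overall figure of 0.75 looks acceptable, yet it hides the fact that the model is no better than a coin toss for the second group.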
The Importance of Data Curation
To mitigate the "garbage in, garbage out" dilemma, data curation becomes paramount. Data must be carefully collected, cleaned, and labelled to ensure accuracy and diversity. Human oversight is crucial at this stage to identify and rectify potential biases. Moreover, constant monitoring and retraining of AI models with updated data are essential to adapt to evolving patterns and trends.
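As a rough illustration of what cleaning and checking labels can look like, here is a minimal Python sketch. It assumes the dataset is a list of (text, label) pairs and that the only valid labels are "cat" and "dog"; the `curate` function and the `ALLOWED_LABELS` set are hypothetical names chosen for this example. Rows that fail a check are not silently discarded but set aside with a reason, so that a person can review them.

```python
ALLOWED_LABELS = {"cat", "dog"}  # assumed label set for this example


def curate(rows):
    """rows: list of (text, label) pairs.
    Returns (kept, rejected), where rejected rows carry a reason for review."""
    seen = set()
    kept, rejected = [], []
    for text, label in rows:
        text_clean = " ".join(text.split())   # collapse stray whitespace
        label_clean = label.strip().lower()   # normalise label spelling
        key = text_clean.lower()              # case-insensitive duplicate key
        if not text_clean:
            rejected.append((text, label, "empty text"))
        elif label_clean not in ALLOWED_LABELS:
            rejected.append((text, label, "unknown label"))
        elif key in seen:
            rejected.append((text, label, "duplicate"))
        else:
            seen.add(key)
            kept.append((text_clean, label_clean))
    return kept, rejected


if __name__ == "__main__":
    raw = [
        ("A fluffy  tabby ", "Cat"),
        ("a fluffy tabby", "cat"),        # duplicate once normalised
        ("Barks at the postman", "dgo"),  # mistyped label
        ("   ", "dog"),                   # no content
        ("Fetches sticks", " dog"),
    ]
    kept, rejected = curate(raw)
    print(kept)
    for row in rejected:
        print("needs review:", row)
```

A real pipeline would need checks suited to its own data, but the pattern of normalising, validating, and routing doubtful rows to a human stays the same.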
AI has the potential to revolutionise industries and improve various aspects of our lives. However, this potential can only be realised if we acknowledge the critical role of data quality. The "garbage in, garbage out" principle underscores the importance of providing AI models with high-quality, diverse, and unbiased data. By doing so, we can ensure that AI systems produce reliable, fair, and meaningful outcomes that truly reflect the goals of innovation and progress. Maybe if Boris Johnson had waited a few more years, Peppa Pig might have sounded more Prime Ministerial.