
Flat White

Wikipedia: how safe is crowdsourcing the truth?

17 February 2024

1:00 AM

Have you ever wondered what feeds your internet search results on Google, Yahoo, or Bing?

What about question-answering systems such as Apple’s Siri, Amazon’s Alexa, or AI language models?

In almost all cases, Wikipedia plays a significant role as the primary corpus of information feeding these systems.

Wikipedia’s success story explains why tech giants such as Google transformed their algorithms to selectively extract Wikipedia’s information to populate their knowledge panels. Even journalists and universities consider Wikipedia an extremely useful tool for finding sources and fact-checking information.

However, in the age of information empowerment, the notion of anonymous volunteers (aka ‘the crowd’ of editors) wielding control over the cornerstone of online information should raise serious concerns for society.

Younger generations growing up in a digital age are saturated with information, fact and fiction, with society placing Wikipedia at the forefront of determining what is true. This raises not only Orwellian concerns, but also hints at a genius idea with a somewhat unsettling undertone – free labour and free information with no one to take responsibility for what’s presented.

Wikipedia is one of the most reliable sources on the internet, writes author Amy Bruckman in her 2022 book, Should You Believe Wikipedia? Her rationale runs along the general line that information on a popular Wikipedia page is edited by thousands of people who support their edits with reliable sources, strengthening its trustworthiness. In contrast, a peer-reviewed journal article might have only three researchers verifying the information, and it is then published with no option for change. Bruckman, who is best known for her work in online learning and communities, notes that Wikipedia can be updated at any point in the future, making it ever-changing and adaptable. In this context, controversies are endlessly debated and edit-warring is part of the process required to get to the ‘truth’.

Wikimedia Foundation CEO Maryana Iskander has made similar comments, explaining that the neutrality of the crowd creates a more reliable Wikipedia article – so, by extension, the more editors an article has, the more balanced it becomes.

For this reason, over the course of some 20 years, Wikipedia has become the silent and final arbiter trusted by tech giants and intergovernmental organisations on what is reliable and neutral information on the internet.


Since Google’s Freebase and the Wikimedia Foundation’s Wikidata merged their knowledge bases in 2014, Wikidata has grown to house more than 60 million data items in more than 400 languages. It supports over 5,000 websites, archives, libraries, and databases, making it the go-to place for instant knowledge. Wikidata’s founder, Denny Vrandečić, imagined that Wikidata would become the Rosetta Stone of the web, and it has very much become that: a coordinated, standardised open data warehouse, disseminated and accessed by all, with the core of its data coming from its parent project, Wikipedia.
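To give a sense of how readily that warehouse can be tapped, the sketch below queries Wikidata’s public SPARQL endpoint (query.wikidata.org) for the English label of a single item. The endpoint and the item identifier are real and publicly documented; the script itself is only an illustrative example of the kind of automated access described here, not any particular company’s pipeline.

```python
import requests

# Illustrative sketch: fetch the English label of Wikidata item Q42
# (Douglas Adams) from the public SPARQL endpoint. Any website, archive,
# library, or assistant can pull structured facts in the same way.
ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?label WHERE {
  wd:Q42 rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wikidata-access-example/0.1"},  # polite self-identification
    timeout=30,
)
response.raise_for_status()

for row in response.json()["results"]["bindings"]:
    print(row["label"]["value"])  # -> Douglas Adams
```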

Due to the massive reach of Wikipedia via Wikidata, in 2020 the World Health Organisation and the Wikimedia Foundation collaborated to share information across platforms, expanding the public’s access to the latest information about Covid to combat ‘fake news and misinformation’. Google cites Wikipedia in the captions it adds to YouTube videos it deems conspiracy-theory content. Similar patterns are seen on TikTok and Facebook, with their ‘information buttons’ linking to sources and articles from Wikipedia.

This raises serious concerns for a society in which so much trust is placed in an organisation whose content is created by a crowd of anonymous editors.

Studies show that these editors tend to be a homogeneous group: mostly men, aged 15-49, technically inclined, white-collar workers or students, from majority-Christian countries like the United States and the United Kingdom, of whom a small percentage create 80 per cent of the content. Wikipedia itself says a few thousand users make more than 100 edits per month, and this isn’t necessarily a bad thing. We know who some of the top editors are through highlighted stories, yet these are a mere few who choose to be interviewed and reveal parts of their identities to the public.

Crowds of anonymous online editors become a problem when Hezbollah supporters, for example, want to edit Wikipedia but cannot publicly show their support for the terrorist organisation without violating the terms of service. They can, however, remain as editors as long as they conform to the site’s rules. The same can be said for just about anyone with fringe beliefs, from Flat Earthers to October 7 denialists. Play by the rules and you can stay.

The concept of consensus and authority is important here. In a journal’s peer-review process, consensus emerges from a group of selected experts who collectively acknowledge each other’s expertise. It does not guarantee truth, but it helps. Conversely, in Wikipedia editing, consensus arises from contributions by voluntary editors who gain status through years on the site and a tally of edits, but who may lack genuine expertise. It is the difference between a college student’s essay and a peer-reviewed article published by an expert. Combine this mode of editing with the site’s smaller articles, and bias is more likely to manifest in subtler ways, through tone and through decisions to include or exclude specific viewpoints.

Take two examples: the article on Flat Earth and the article on the Weaponisation of antisemitism. The Flat Earth article, created in 2001, has a total of 5,511 edits and 650 page watchers (editors, often including administrators, who monitor the page for changes), whereas the article on the Weaponisation of antisemitism, created in December 2023, has a total of 196 edits and fewer than 30 page watchers. The former reads as neutral and factual, whereas the latter reads like an argumentative essay. The problem here is that, with millions of articles covering every conceivable topic, the majority of people are likely to read the smaller articles as fact regardless of any increased bias or lack of impartiality.

Larger articles are more likely to be subject to vandalism and manipulation, but with more page watchers and editors it is easier to track such changes and revert them accordingly.

A January 2024 piece in The Times (UK) highlighted how pro-Iranian Wikipedia editors manipulated entries on the English Wikipedia to downplay the Iranian regime’s human rights abuses. The entries were brought to the attention of administrators by one editor who noticed the changes and stated that they appeared to lack impartiality and a neutral point of view. The Wikimedia Foundation subsequently banned some 16 editors from taking part in conflict-of-interest editing, but could not take action to remove or change any content; it decided to remain neutral on the underlying dispute, leaving the correction to the broader community of volunteer editors.

A similar outcome was reached in 2023 in the Holocaust distortion investigation, where the Arbitration Committee opted not to intervene to correct distortion that had taken place over 15 years. Instead, the case resulted in topic bans for several editors, but not global bans, meaning some of the editors who were involved in distorting history are still editing Wikipedia today, just not on the Holocaust or Poland’s involvement in it.

Arabic Wikipedia is another case in point, often debated for its bias in several topic areas pertaining to Israel, Iran, the United States, and terrorism. Wikipedia has mechanisms in place to address issues of bias, such as community-driven policies and processes for verifying information, but the Wikimedia Foundation does not take responsibility for correcting the site’s content. It is hard to gauge how many biased or fringe entries remain on Wikipedia, yet as a whole it remains trusted by billions of readers and is placed on a pedestal as an impartial encyclopedia.

Over 80 per cent of Wikipedia’s traffic comes from web searches. Type in any phrase or concept and you will likely find its Wikipedia entry. As young people become more connected to the internet than the rest of the population, and with Wikipedia’s advance into social media, the concern becomes what young people will think, and base their beliefs on, in 10-20 years.

Google’s algorithms extract data from Wikipedia and divorce it from any ongoing disputes on Talk pages, removing the data from the context in which it originated. If there is potential for bias to go unchecked, it is via these algorithms presenting the data as fact.
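As a purely illustrative sketch of that divorce, the snippet below uses Wikipedia’s documented public REST summary endpoint to pull the bare extract of an article: the response carries the polished summary text, but none of the Talk page debate behind it travels with the data. The endpoint is real; the script is a stand-in for the kind of automated extraction described above, not Google’s actual pipeline.

```python
import requests

# Illustrative sketch: pull the bare summary of an article from Wikipedia's
# public REST API. The response carries the polished extract only; none of
# the Talk-page disputes behind the text come along with it.
API = "https://en.wikipedia.org/api/rest_v1/page/summary/"

def fetch_summary(title: str) -> str:
    response = requests.get(
        API + title,
        headers={"User-Agent": "context-stripping-example/0.1"},  # polite self-identification
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["extract"]  # plain-text summary, stripped of editorial context

print(fetch_summary("Flat_Earth"))
```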

Vrandečić once stated that Wikidata is not about truth: ‘We look at what sources say and share it. It’s up to the user to decide what to believe.’

This is very much a reflection of what transpires on Wikipedia: the site presents what sources say, and the editors effectively choose the sources. Despite the best of intentions, this is perhaps what makes Wikipedia’s power so scary for future generations.
