Scunthorpe problem

From Wikipedia, the free encyclopedia
Problem caused by profanity filters on the Internet

An example of the Scunthorpe problem in Wikipedia because of a regular expression identifying "cunt" in the username

The Scunthorpe problem is the unintentional blocking of online content by a spam filter or search engine because its text contains a string (or substring) of letters that appears to have an obscene or otherwise unacceptable meaning. Names, abbreviations, and technical terms are most often cited as being affected by the issue.

The problem arises because computers can easily identify strings of text within a document, but interpreting such strings requires considerable ability to understand a wide range of contexts, possibly across many cultures, which is an extremely difficult task. As a result, broad blocking rules may produce false positives affecting many innocent phrases.
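The gap between finding a string and interpreting a word can be illustrated with a minimal Python sketch. The blocklist entry and both function names are hypothetical, chosen only for illustration; this is not the code of any real filter.

```python
import re

# Hypothetical one-entry blocklist, for illustration only.
BLOCKED = ["sex"]

def naive_filter(text: str) -> bool:
    """Return True if any blocked string occurs anywhere in the text."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED)

def word_boundary_filter(text: str) -> bool:
    """Return True only if a blocked string appears as a whole word."""
    pattern = r"\b(?:" + "|".join(map(re.escape, BLOCKED)) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# The naive filter flags an innocent domain name; the whole-word filter does not.
print(naive_filter("RomansInSussex.co.uk"))
print(word_boundary_filter("RomansInSussex.co.uk"))
```

Whole-word matching avoids this particular false positive, but it still cannot tell which sense of an ambiguous word is intended, so it reduces the problem rather than solving it.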

Etymology and origin

The problem was named after an incident in April 1996 in which AOL's profanity filter prevented people in the English town of Scunthorpe from creating AOL accounts because the town's name contains the substring "cunt".[1] In the early 2000s, Google's opt-in SafeSearch made the same error, with local services and businesses that included the town in their names or URLs among those mistakenly hidden from search results.[2]

Workarounds

The Scunthorpe problem is difficult to solve completely because it is hard to create a filter capable of understanding words in context.[3][4]

One solution involves creating a whitelist of known false positives. Any word appearing on the whitelist can be ignored by the filter, even though it contains text that would otherwise not be allowed.[5]
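The whitelist approach can be sketched as follows. This is a minimal illustration under stated assumptions: the blocklist, the whitelist entries, and the function name `is_blocked` are all hypothetical, and real filters may tokenise text differently.

```python
import re

# Hypothetical lists, for illustration only.
BLOCKED = ["sex"]
WHITELIST = {"sussex", "essex", "staatsexamen"}  # known false positives

def is_blocked(text: str) -> bool:
    """Check each word; whitelisted words skip the substring test."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        if word in WHITELIST:
            continue  # known false positive, ignored by the filter
        if any(term in word for term in BLOCKED):
            return True
    return False

print(is_blocked("Staatsexamen results"))  # whitelisted word
print(is_blocked("Middlesex"))             # not on the whitelist
```

The second call shows the limitation of this workaround: a whitelist only covers false positives that someone has already noticed and added.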

Other examples


Mistaken decisions by obscenity filters include:

Refused web domain names and account registrations

  • In April 1998, Jeff Gold attempted to register the domain name shitakemushrooms.com, but due to the substring shit, he was blocked by an InterNIC filter prohibiting the "seven dirty words".[6] (Shiitake, also commonly spelled shitake, is the Japanese name for the edible fungus Lentinula edodes.)
  • In 2000, a Canadian television news story on web filtering software found that the website for the Montreal Urban Community (French: Communauté Urbaine de Montréal) was entirely blocked because its domain name was its French acronym CUM (www.cum.qc.ca);[7] "cum" (among other meanings) is an English-language vulgar slang term for semen.
  • In February 2004 in Scotland, Craig Cockburn reported that he was unable to use his surname (pronounced "Coburn", IPA: /ˈkoʊbərn/) with Hotmail because it contains the substring cock, a slang word for the penis. Separately, he had problems with his workplace email because his job title, software specialist, contained the substring Cialis, an erectile dysfunction medication commonly mentioned in spam e-mails. Hotmail initially told him to spell his name C0ckburn (with a zero instead of the letter "o") but later reversed the ban.[8] In 2010, he had a similar problem registering on the BBC website, where again the first four characters of his surname caused a problem for the content filter.[9]
  • In February 2006, Linda Callahan was initially prevented from registering her name with Yahoo! as an e-mail address as it contained the substring Allah. Yahoo! later reversed the ban.[10]
  • In July 2008, Herman I. Libshitz could not register an e-mail address containing his name with Verizon because his surname contained the substring shit, and Verizon initially rejected his request for an exception. In a subsequent statement, a Verizon spokeswoman apologised for not approving his desired e-mail address.[11]

Blocked web searches

  • In the months leading up to January 1996, some web searches for Super Bowl XXX were being filtered, because the Roman numeral for the game is also used to identify pornography.[12]
  • Gareth Roelofse, the web designer for RomansInSussex.co.uk, noted in 2004: "We found many library Net stations, school networks and Internet cafes block sites with the word 'sex' in the domain name. This was a challenge for RomansInSussex.co.uk because its target audience is school children."[2]
  • In German, a linking letter is used in compound words to represent the genitive case of the first of two linked words. For example, Geburtstag, which means "day of birth", is built by connecting the words of the phrase "der Geburt Tag" with a linking s in between. When the second word starts with ex-, this way of compounding produces the substring sex, which is said to trigger search engine blocklists, famously so in the word Staatsexamen.[13]
  • In 2008, the filter of the free wireless service of the town of Whakatāne in New Zealand blocked searches involving the town's own name because the filter's phonetic analysis deemed the "whak" to sound like fuck; the town name is in Māori, and in the Māori language, "wh" is most commonly pronounced /f/. The town subsequently put the town name on the filter's whitelist.[14]
  • In July 2011, web searches in China on the name Jiang were blocked following claims on the Weibo microblogging site that former Chinese Communist Party (CCP) general secretary Jiang Zemin had died. Since the word "Jiang" meaning "river" is written with the same Chinese character (江), searches related to rivers including the Yangtze (Cháng Jiāng) produced the message: "According to the relevant laws, regulations and policies, the results of this search cannot be displayed."[15]
  • In February 2018, web searches on Google's shopping platform were blocked for items such as glue guns, Guns N' Roses, and Burgundy wine. Google had hastily patched its search system after it displayed results for weapons and accessories that violated Google's stated policies.[16]

Blocked emails

  • In 2001, Yahoo! Mail introduced an email filter which automatically replaced JavaScript-related strings with alternative versions, to prevent the possibility of cross-site scripting in HTML email. The filter would hyphenate the terms "JavaScript", "JScript", "VBScript" and "LiveScript", and replace "eval", "mocha" and "expression" with the similar but not quite synonymous terms "review", "espresso" and "statement", respectively. The filters were deliberately broad: no attempts were made to limit these string replacements to script sections and attributes, or to respect word boundaries, in case this would leave some loopholes open. This resulted in such errors as medireview in place of medieval.[17][18][19]
  • In February 2003, Members of Parliament at the British House of Commons found that a new spam filter was blocking emails containing references to the Sexual Offences Bill then under debate, as well as some messages relating to a Liberal Democrat consultation paper on censorship.[20] It also blocked emails sent in Welsh because it did not recognise the language.[21]
  • In October 2004, it was reported that the Horniman Museum in London was failing to receive some of its email because filters mistakenly treated its name as a version of the words horny man.[22]
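The medireview error described in the first item above can be reproduced with a short Python sketch. The replacement pairs come from the text; the function names, the case-sensitive matching, and the whole-word variant are assumptions made for illustration, not a reconstruction of Yahoo!'s actual filter.

```python
import re

# Replacement pairs reported for the filter; matching details are assumed.
REPLACEMENTS = {"eval": "review", "mocha": "espresso", "expression": "statement"}

def replace_anywhere(text: str) -> str:
    """Replace every occurrence, ignoring word boundaries."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text

def replace_whole_words(text: str) -> str:
    """Replace a term only when it stands alone as a word."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b")
    return pattern.sub(lambda m: REPLACEMENTS[m.group(1)], text)

print(replace_anywhere("medieval history"))     # medireview history
print(replace_whole_words("medieval history"))  # medieval history
```

Respecting word boundaries leaves "medieval" intact, but as the text notes, that choice was avoided out of concern that it would leave loopholes open, so the trade-off is between false positives and coverage.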

Blocked for words with multiple meanings

  • In October 2004, e-mails advertising the pantomime Dick Whittington sent to schools in the UK were blocked by school computers because of the use of the name Dick, sometimes used as slang for penis.[23]
  • In May 2006, a man in Manchester in the UK found that e-mails he wrote to his local council to complain about a planning application had been blocked as they contained the word erection when referring to a structure.[24]
  • Blocked e-mails and web searches relating to The Beaver, a magazine based in Winnipeg, caused the publisher to change its name to Canada's History in 2010, after 89 years of publication.[25][26] Publisher Deborah Morrison commented: "Back in 1920, The Beaver was a perfectly appropriate name. And while its other meaning [vulva] is nothing new, its ambiguity began to pose a whole new challenge with the advance of the Internet. The name became an impediment to our growth".[27]
  • In June 2010, Twitter blocked a user from Luxembourg 29 minutes after he had opened his account and posted his first tweet. The tweet read: "Finally! A pair of great tits (Parus major) has moved into my birdhouse!" Despite including the Latin name to point out that the tweet was about birds, any attempts to unblock the account were in vain.[28]
  • In 2011, a councillor in Dudley found an email flagged for profanity by his council's security software after mentioning the Black Country dish faggots (a type of meatball, but also a pejorative term for gay men).[29]
  • Residents of Penistone in South Yorkshire have had e-mails blocked because the town's name includes the substring penis.[30]
  • Residents of Clitheroe (Lancashire, England) have been repeatedly inconvenienced because their town's name includes the substring clit, which is short for "clitoris".[31]
  • Résumés containing references to graduating with Latin honors such as cum laude, magna cum laude, and summa cum laude have been blocked by spam filters because of inclusion of the word cum, which is Latin for 'with' (in this usage), but is sometimes used as slang for semen or ejaculation in English usage.[32]
  • A Reddit moderator alleged in March 2025 that the platform's automatic moderation system has been flagging posts that mentioned the name "Luigi". The moderator, who noted it flagging a post about the video game Luigi's Mansion 3, believed it to be part of Reddit's "automod" feature for forums with few active human moderators, and the software regarded "Luigi" as being among words which "could – but don't necessarily – indicate violating content" in reference to Luigi Mangione, the suspect in the assassination of United Healthcare CEO Brian Thompson.[33]

News articles

  • In June 2008, the news site OneNewsNow run by the anti-LGBT lobby group American Family Association filtered an Associated Press article on sprinter Tyson Gay, replacing instances of "gay" with "homosexual", thus rendering his name as "Tyson Homosexual".[34][35] This same function had previously changed the name of basketball player Rudy Gay to "Rudy Homosexual".[36] Similarly, in 2015, on the 70th anniversary of the atomic bombing of Hiroshima, the Observer Chronicle referred to the Enola Gay as the "Enola Homosexual".[37][38]
  • The word or string "ass" may be replaced by "butt", resulting in "clbuttic" for "classic", "buttignment" for "assignment", and "buttbuttinate" for "assassinate".[39] The phrase "clbuttic mistake" (the result of filtering "ass" in "classic mistake") is used online to humorously point out instances of the Scunthorpe problem.[40][41]
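Both kinds of substitution error in this section can be reproduced with one small Python sketch. The function `censor` and its parameters are hypothetical, and the case handling is an assumption; it is not the code used by any of the sites mentioned.

```python
import re

def censor(text: str, old: str, new: str, whole_word: bool) -> str:
    """Replace `old` with `new`, case-insensitively (illustrative helper)."""
    pattern = re.escape(old)
    if whole_word:
        pattern = r"\b" + pattern + r"\b"

    def swap(match: re.Match) -> str:
        # Carry over a leading capital so "Gay" becomes "Homosexual".
        return new.capitalize() if match.group(0)[0].isupper() else new

    return re.sub(pattern, swap, text, flags=re.IGNORECASE)

# Substring replacement mangles ordinary words.
print(censor("classic", "ass", "butt", whole_word=False))      # clbuttic
print(censor("assassinate", "ass", "butt", whole_word=False))  # buttbuttinate

# Whole-word replacement still rewrites a surname.
print(censor("Tyson Gay", "gay", "homosexual", whole_word=True))  # Tyson Homosexual
```

The last call suggests why word boundaries alone are not enough: the match is a complete word, yet the filter has no way of knowing that it is part of a person's name.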

Video games

  • In 2008, Microsoft confirmed that its policy to prevent the use of words relating to sexual orientation had meant that Richard Gaywood's name was deemed offensive and could not be used in his "gamertag" or in the "Real Name" field of his bio.[42]
  • In 2010, Microsoft banned an Xbox Live player named Josh Moore for stating he lived in Fort Gay, West Virginia, mistakenly deeming his profile offensive. Although he tried to clarify the situation, Xbox initially upheld the decision and warned of a possible permanent ban. The suspension lasted several days, causing him to miss a Call of Duty: Modern Warfare 2 Search and Destroy tournament.[43][44] It took an appeal from Mayor David Thompson and media coverage for the issue to finally be corrected.[45]
  • In 2011, the release of Pokémon Black and White introduced Cofagrigus, which could not be traded online to other players without a nickname because its species name contained the substring fag. The system has since been updated to allow players to trade it without nicknames. The same problem occurred with Nosepass, Probopass and Froslass due to their inclusion of the substring ass.[46]
  • In January 2014, files used in the online game League of Legends were reportedly blocked by some UK ISP filters due to the names "VarusExpirationTimer.luaobj" and "XerathMageChainsExtended.luaobj", which contain the substring sex. This was later corrected.[47]
  • In 2020, the game Genshin Impact made news when it was found to censor words such as "Hong Kong" and "Tibet", as well as words such as "enemies". Game analyst Daniel Ahmad mentioned that "China's laws and games regulator state that games cannot contain 'Anything that threatens China's national unity'".[48] However, this can lead to benign words, parts of words and words in longer strings being censored or failing to send, such as reports of simply "Kong" in the Genshin Impact event, or any word with the string 'hong', as well as words that are swears in romanized Chinese but not in English. In 2017, a paper looked at over 180,000 unique blacklisted keywords.[49] Content policy plans are required to be submitted to regulators,[50] but there is no centralized 'banned' keyword list provided by federal or even provincial authorities, so every developer and publisher may or may not allow different "edge cases".[51] The flagging of innocuous language because of regex-based keyword lists is omnipresent, whether in Latin or Chinese characters. This has prompted a Chinese internet slang culture that leans heavily on puns and intentional misspelling.[52]

Other

  • In 2013, file transfers named for the Swedish city of Falun caused web connection outages at Diakrit, a firm based in China. Diakrit resolved the issue by renaming the files. Fredrik Bergman of Diakrit believes that the file names triggered the Great Firewall's censors used to block discussion of Falun Gong, a banned religious movement founded in China.[53]
  • In November 2013, Facebook temporarily blocked British users for using the word faggot in reference to the traditional dish of the same name.[54]
  • In May 2018, the website of the grocery store Publix would not allow a cake to be ordered containing the Latin phrase summa cum laude. The customer attempted to rectify the problem by including special instructions, but still ended up with a cake reading "Summa --- Laude".[55][56][57]
  • In May 2020, despite extensive media scrutiny, some hashtags directly referring to British political advisor Dominic Cummings were unable to trend on Twitter because the substring cum triggered an anti-porn filter.[58]
  • In October 2020, a paleontology conference's virtual meeting platform blocked various words including "bone", "pubic", and "stream".[59]
  • In January 2021, Facebook apologised for muting and banning users after it had erroneously flagged the Devon landmark Plymouth Hoe as misogynistic.[60]
  • In April 2021, the official Facebook page for the French commune of Bitche was taken down. In response, commune officials created a new page referencing instead the postal code, Mairie 57230. Facebook later apologised and restored the original page. As a precaution, the officials of Rohrbach-lès-Bitche renamed their Facebook page Ville de Rohrbach.[61][62]
  • In March 2025, as part of a purge of diversity, equity and inclusion content ordered by Defense Secretary Pete Hegseth, the United States Department of Defense erroneously removed references to the Enola Gay aircraft because its name contains the substring gay.[63]

References

  1. Clive Feather (25 April 1996). Peter G. Neumann (ed.). "AOL censors British town's name!". The Risks Digest. 18 (7).
  2. McCullagh, Declan (23 April 2004). "Google's chastity belt too tight". CNET. Archived from the original on 16 June 2011.
  3. Oberhaus, Daniel (29 August 2018). "Life on the Internet Is Hard When Your Last Name is 'Butts'". Vice. Retrieved 31 July 2022.
  4. Gellis, Cathy (31 August 2018). "The Scunthorpe Problem, And Why AI Is Not A Silver Bullet For Moderating Platform Content At Scale". Techdirt. Retrieved 31 July 2022.
  5. Veale, Tony (2021). Your Wit Is My Command: Building AIs with a Sense of Humor. MIT Press. p. 231. ISBN 978-0-262-04599-5. OCLC 1221016857.
  6. Festa, Paul (27 April 1998). "Food domain found "obscene"". CNET. Archived from the original on 10 May 2020.
  7. "Foire aux questions". Canadian Broadcasting Corporation. Archived from the original on 21 October 2012. Retrieved 24 February 2011.
  8. Barker, Garry (26 February 2004). "How Mr C0ckburn fought spam". The Sydney Morning Herald. Archived from the original on 3 September 2009.
  9. Cockburn, Craig (9 March 2010). "BBC fail – my correct name is not permitted". blog.siliconglen.com. Archived from the original on 30 September 2020.
  10. "Is Yahoo Banning Allah?". Kallahar's Place. Archived from the original on 14 January 2016. Retrieved 24 February 2011.
  11. Rubin, Daniel. "When your name gets turned against you". The Philadelphia Inquirer. Archived from the original on 5 August 2008. Retrieved 3 August 2008.
  12. "E-Rate And Filtering: A Review Of The Children's Internet Protection Act". Congressional Hearings. General. Energy and Commerce, Subcommittee on Telecommunications and the Internet. 4 April 2001.
  13. "Zum Online-Banking geht es ins Internetcafé". www.nordbayern.de.
  14. "F-Word Town's Name Gets Censored By Internet Filter". Archived from the original on 1 December 2008. Retrieved 27 July 2011.
  15. Chin, Josh (6 July 2011). "Following Jiang Death Rumors, China's Rivers Go Missing". The Wall Street Journal. Archived from the original on 13 August 2011.
  16. Molloy, Mark (27 February 2018). "Wine lovers cannot buy Burgundy tipple on Google as internet giant cracks down on 'gun' searches". The Telegraph. Archived from the original on 2 March 2018. Retrieved 27 February 2018.
  17. "Yahoo admits mangling e-mail". BBC News. 19 July 2002. Archived from the original on 26 January 2021. Retrieved 21 June 2013.
  18. "Hard news". Need To Know 2002-07-12. 12 July 2002. Retrieved 21 June 2013.
  19. Knight, Will (15 July 2002). "Email security filter spawns new words". New Scientist. Archived from the original on 24 September 2020. Retrieved 21 June 2013.
  20. "E-mail vetting blocks MPs' sex debate". BBC News. 4 February 2003. Archived from the original on 4 February 2021.
  21. "Software blocks MPs' Welsh e-mail". BBC News. 5 February 2003. Archived from the original on 4 February 2021.
  22. Kwintner, Adrian (5 October 2004). "Name of museum is confused with porn". News Shopper.
  23. Jones, Sam (13 October 2004). "Panto email falls foul of filth filter". The Guardian. Archived from the original on 4 February 2021.
  24. "E-mail filter blocks 'erection'". 30 May 2006. Archived from the original on 4 February 2021.
  25. "The Beaver mag renamed to end porn mix-up". The Sydney Morning Herald. Agence France-Presse. 13 January 2010. Archived from the original on 9 November 2020. Retrieved 24 February 2021.
  26. Austen, Ian (24 January 2010). "Web Filters Cause Name Change for a Magazine". The New York Times. Archived from the original on 9 November 2020. Retrieved 24 February 2021.
  27. Sheerin, Jude (29 March 2010). "How spam filters dictated Canadian magazine's fate". BBC News. Archived from the original on 16 January 2021.
  28. "Luxemburger Twitter-Neubenutzer nach 29 Minuten blockiert" [Luxembourg new Twitter user blocked after 29 minutes]. Tageblatt (in German). 22 June 2010. Retrieved 12 June 2010.
  29. "Black Country Councillor Caught up in Faggots Farce". Birmingham Mail. 24 February 2011.
  30. Tom Chatfield (17 April 2013). "The 10 best words the internet has given English". The Guardian.
  31. Keyes, Ralph (2010). Unmentionables: From Family Jewels to Friendly Fire – What We Say Instead of What We Mean. John Murray. ISBN 978-1-84854-456-7.
  32. Maher, Kris. "Don't Let Spam Filters Snatch Your Resume". Career Journal. Archived from the original on 23 October 2006. Retrieved 11 February 2008.
  33. Sato, Mia (7 March 2025). "A Reddit moderation tool is flagging 'Luigi' as potentially violent content". The Verge. Retrieved 8 March 2025.
  34. Frauenfelder, Mark (30 June 2008). "Homophobic news site changes athlete Tyson Gay to Tyson Homosexual". Boing Boing. Archived from the original on 4 February 2021.
  35. Arthur, Charles (30 June 2008). "Computer autocorrects surname 'gay' to.. no, you guess". The Guardian. Archived from the original on 13 November 2020.
  36. Mantyla, Kyle (30 June 2008). "The Dangers of Auto-Replace". Right Wing Watch. People for the American Way. Archived from the original on 25 October 2020. Retrieved 24 February 2021.
  37. Williams, Joe (6 August 2015). "US newspaper claims Hiroshima bombing caused by 'homosexual' plane". PinkNews. Retrieved 14 January 2025.
  38. "Hiroshima Atomic Bombing 70th Anniversary Marked with Solemn Ceremony, Calls for Nuclear Disarmament". Observer Chronicle. 6 August 2015. Archived from the original on 11 August 2015.
  39. Moore, Matthew (2 September 2008). "The Clbuttic Mistake: When obscenity filters go wrong". The Telegraph. Archived from the original on 23 February 2020.
  40. Dengsø, Christopher (19 July 2023). "The Clbuttic Mistake: A Thing Of The Past?". Moderation API. Retrieved 25 November 2024.
  41. "Clbuttic mistake". Collins Dictionary. 13 February 2020. Retrieved 25 November 2024.
  42. "Microsoft Confirms "Gaywood" Is An Offensive Surname, Mr. Gaywood Responds". 22 May 2008. Archived from the original on 9 November 2012.
  43. "Xbox apologises over 'gay' suspension". BBC News. 10 September 2010. Retrieved 1 February 2025.
  44. "Town's name confuses Xbox". Spokesman.com. 9 September 2010. Retrieved 1 February 2025.
  45. Whitworth, Dan (10 September 2010). "Xbox apologises over 'gay' suspension". BBC News. Retrieved 17 February 2025.
  46. Keating, Lauren (17 February 2016). "These Are The Words Nintendo Censors From Appearing On The 3DS". Tech Times. Retrieved 14 November 2023.
  47. Gibbs, Samuel (21 January 2014). "UK porn filter blocks game update that contained 'sex'". The Guardian. London. Archived from the original on 11 November 2020.
  48. Walker, Ian (6 October 2020). "Genshin Impact is Censoring Words Like 'Taiwan' and 'Hong Kong'". Kotaku. Retrieved 6 November 2025.
  49. Knockel, Jeffrey; Ruan, Lotus; Crete-Nishihata, Masashi (14 August 2017). "Keyword Censorship in Chinese Mobile Games". The Citizen Lab. Munk School of Global Affairs & Public Policy, University of Toronto. Retrieved 6 November 2025.
  50. Kuhns, Todd (31 October 2019). "Security Assessment Forms Now Required for App Publishers in China". AppinChina. Retrieved 6 November 2025.
  51. Knockel, Jeffrey; Ruan, Lotus; Crete-Nishihata, Masashi (2017). "Measuring Decentralization of Chinese Keyword Censorship via Mobile Games". 7th USENIX Workshop on Free and Open Communications on the Internet (FOCI 17). USENIX Association. Retrieved 6 November 2025.
  52. Chen, Stella (27 July 2022). "Don't You Dare Say "WeChat"". China Media Project. Retrieved 6 November 2025.
  53. Mozur, Paul; Tejada, Carlos (13 February 2013). "China's 'Wall' Hits Business". The Wall Street Journal. Archived from the original on 10 September 2013. Retrieved 25 May 2013.
  54. "Faggots and peas fall foul of Facebook censors". Express & Star. 1 November 2013. Archived from the original on 10 May 2020.
  55. Ferguson, Amber (22 May 2018). "Proud mom orders 'Summa Cum Laude' cake online. Publix censors it: Summa … Laude". The Washington Post. Archived from the original on 22 May 2018. Retrieved 22 May 2018.
  56. Amatulli, Jenna (22 May 2018). "Publix Censors Teen's 'Summa Cum Laude' Graduation Cake". The Huffington Post. Archived from the original on 5 September 2018.
  57. "Carolyn Cooper | Latin cum mistaken for semen". jamaica-gleaner.com. 27 May 2018. Retrieved 2 August 2025.
  58. Hern, Alex (27 May 2020). "Anti-porn filters stop Dominic Cummings trending on Twitter". The Guardian. Archived from the original on 20 February 2021.
  59. Ferreira, Becky (15 October 2020). "A Profanity Filter Banned the Word 'bone' at a Paleontology Conference". Motherboard. Archived from the original on 23 February 2021.
  60. Morris, Steven (27 January 2021). "Facebook apologises for flagging Plymouth Hoe as offensive term". The Guardian. Archived from the original on 29 January 2021.
  61. Kempf, Cédric (12 April 2021). "Insolite : Bitche est censuré par Facebook". Radio Mélodie (in French).
  62. Darmanin, Jules (13 April 2021). "Facebook takes down official page for French town of Bitche". POLITICO. Retrieved 3 July 2021.
  63. "War heroes and military firsts are among 26,000 images flagged for removal in Pentagon's DEI purge". AP News. 7 March 2025.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Scunthorpe_problem&oldid=1320989138"