
Nominations for four new seats on the Funds Dissemination Committee (FDC) have just closed. The FDC is a volunteer WMF body that strongly influences how a high proportion of donors' funds are spent. The Committee was created in 2012 after the Foundation Board stopped all but a few chapters from directly processing donations raised on its behalf. The move brought the review and funding of most "eligible" affiliates under more centralised community review, and the FDC has since been the primary instrument for scrutinising their applications for recurrent operating expenses, and recommending to the Board who should get what. Thus far the Board has accepted all of the FDC's recommendations from the twice-yearly application rounds.
The FDC has been no stranger to controversy. In late 2015, with the Foundation in turmoil over issues of governance, funding, and leadership, the FDC stepped outside its official mandate to publish a scathing critique of the WMF’s performance. This step was well received: the WMF accepted and acted on the feedback (video 14:27), and invited the FDC’s comments in the subsequent funding round.
The FDC's two-year memberships have been designed around a leap-frogging process of yearly alternating Board appointments of four volunteer voting members and community elections of five volunteer voting members. In addition, the Board maintains a close relationship with the FDC through the appointment of two of its own members as non-voting FDC members (currently ex-chair of the FDC, Dariusz Jemielniak, and Guy Kawasaki), and the participation of three non-voting staff members in FDC processes (Katy Love, Winifred Olliff, and Delphine Ménard, herself a former voting member of the FDC).
The current round is for the four Board appointments, and has attracted 13 self-nominees. Nominations closed just over a week ago, after which there was a short public Q&A. FDC staff will now confer with the two FDC board representatives to draw up a shortlist by 5 August; the final four candidates will be decided in consultation with the Board and announced on 2 September. In the last round of Board appointments, in 2014, the factors weighed up by the Board were reported by a Trustee as: "solid Wikimedia contributions (online or offline); complementary non-Wikimedia background, some in finance or budgeting or program evaluation/review (not all though and that's not a must); grantmaking/reviewing experience in Wikimedia; chapter leadership/exec experience or non-chapter contributor; geographical, age, language, wiki diversity; reasoned, analytical responses to the Q&A on Meta and during the interviews."
The FDC's charter explicitly requires membership diversity; in practice, this has clearly been a stumbling block, probably for a complex set of reasons that are difficult to resolve. Of the five ongoing voting members (recommended last year via community election), two are native speakers of English, all are male, and all are from the global north; of the four whose terms are about to become vacant, two are native speakers of English, three are male, and three are from the global north (depending on how the south–north boundary is defined). Among the 13 nominees, six declare themselves to be native speakers of English and nine as male, with roughly eight from the global north.
Questions from Wikimedia community members involved the duties of Wikimedia affiliates regarding paid editing; depth of experience in evaluating grants; diversity (specifically, geographic and gender); the need for innovation, its tension with evaluating success by standard measures, and the value in sharing stories of successes and failures; the nature of what FDC money should fund; and its relation to the committee’s volunteer community.
Two candidates came to Wikimedia through high-profile roles with the WMF (which they have since left): Garfield Byrd as chief financial officer, and Bishakha Datta as trustee. Several have extensive experience in Wikimedia governance, including past service on the FDC itself. Several women, and several candidates from the global south, could increase the diversity of perspectives on the FDC. Some have very little experience with editing wikis, and point to their professional backgrounds for their main qualifications. Along with this elaborate matrix of backgrounds, the staff and Board will need to factor in the relevant expertise and experience required by a task that at certain times during the year will require a full-time effort by members. Former chair of the FDC and now member of the WMF Board, Dariusz Jemielniak, posed a significant question to all candidates on the Q&A page:
“... the FDC requires a lot of past experience in evaluating grants (not just writing grant proposals, which also is a must, but of having had a chance to read and compare 100+ applications for money), or extensive professional background in management, strategy, finance, or auditing.”
Jemielniak asked for nominees' attitudes to this proposition, and how they saw their background in relation to it. There was surprising variety in the responses, revealing something of a clash of cultures between valuing on-the-ground programmatic experience and professional, technocratic expertise, although some nominees emphasised the need for both dimensions to be represented on the Committee. One answer asserted a strongly different view of the importance of grantmaking experience:
“Some of the best grantmakers I've worked with in foundations had none of these skills. But they had other skills needed to make good grants. They had domain knowledge or expertise. They had a vision.”
The other answers echoed one or both of these positions, reflecting a range of views of the relative merits of grantmaking experience and programmatic experience.
The Board and FDC staff face a range of competing needs in their judgment of the nominations. Not only are there issues of diversity and grantmaking-related professional skills; there is the need to prepare the FDC, and grantmaking more broadly, to grapple with deeper issues over time. Among these are the inherent difficulty of predicting and measuring impact-value for money of programmatic activities on WMF sites and their readers; and the likelihood that we are entering a period in which the model for fundraising is under challenge. –T and P


With regret, the Signpost passes on the news that Geoff Brigham finished up on 18 July as general counsel and secretary to the Foundation, after five years of service. Geoff, who came to the WMF from a very senior role at eBay, posted a message to the Wikimedia mailing list expressing his love for "the mission, the Foundation, the Wikimedia communities, and my colleagues at work ... I stand in awe of the volunteer writers, editors, and photographers who contribute every day to the Wikimedia projects. The future of the Foundation under Katherine's leadership is exciting."
Executive director Katherine Maher replied to Geoff:
“You’ve seen the Foundation through a remarkable five years. You’ve built a tremendous team that is critical to helping the Wikimedia projects thrive well into the future. You’ve expertly navigated our challenges, focusing our efforts where we can have the most impact. Through your team, you've empowered the Foundation as fierce advocates for open licensing, privacy, freedom of information, and contributors rights, truly embodying the values of our movement. And as a colleague, you’ve been a counselor and voice of wisdom for our executive team and Board of Trustees.”
Michelle Paulson will be interim head of legal, and Stephen LaPorte will be interim secretary to the Board (pending Board approval). Geoff will take up the position of director of YouTube Trust & Safety, managing global teams for policy, legal, and anti-abuse operations. We wish him well. –T
The Arbitration Committee (ArbCom) recently decided to implement a new type of restriction for pages on certain topics with intractable and long-running disputes, such as the Gamergate controversy. It barred editing by anonymous (IP) users and by registered editors with fewer than 30 days' tenure and 500 edits.
Initially, a series of edit filters enforced the restriction. In January 2016, an editor proposed a new protection level called extended confirmed protection ("ECP" or "30/500", for short) with the same function. Although the proposal received some complaints regarding the instruction creep it presented to new editors, it was eventually approved and technically implemented, with editors being granted the "extendedconfirmed" user right after reaching the requirement. ECP was rolled out on April 5, with ArbCom passing a motion allowing administrators to use ECP to prevent sockpuppetry when less restrictive protection fails to work.
Since then, ECP has occasionally been used outside its ArbCom remit: without raising many eyebrows, it was applied for other reasons, such as preventing BLP violations. Within three months, an administrator made a proposal to allow administrators, subject to community scrutiny, to use ECP for any purpose, not just ArbCom enforcement and sockpuppetry. The RfC gave editors three options:
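As a rough illustration of the "30/500" threshold described above, the check can be sketched in a few lines of Python. This is only a sketch: the `Editor` record and the `meets_30_500` function are hypothetical names invented for this example, not part of MediaWiki or the edit filters, and it assumes that an editor must satisfy both thresholds (30 days and 500 edits) to qualify.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Thresholds taken from the "30/500" shorthand used in the text.
MIN_TENURE_DAYS = 30
MIN_EDITS = 500


@dataclass
class Editor:
    """Hypothetical editor record for illustration; not a MediaWiki structure."""
    registered_on: Optional[date]  # None models an anonymous (IP) editor
    edit_count: int


def meets_30_500(editor: Editor, today: date) -> bool:
    """Return True only if the editor has both enough tenure and enough edits.

    Assumption: both conditions are required; anonymous editors never qualify.
    """
    if editor.registered_on is None:
        return False
    tenure_days = (today - editor.registered_on).days
    return tenure_days >= MIN_TENURE_DAYS and editor.edit_count >= MIN_EDITS
```

Under these assumptions, an account registered three weeks ago with 800 edits would still be excluded, as would a long-standing account with 499 edits.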
The RfC has received a wide range of inputs, with most non-administrators and administrators supporting the third option, and some non-administrators and a few administrators supporting the first and second options. Proponents of the third option believe ECP would be valuable in stopping disruption, while its opponents believe that it would deter newcomers and disenfranchise occasional editors.

Genetically modified organisms (GMOs) have been a controversial topic for years on Wikipedia, and one with a less than peaceful environment: a number of editors have been sanctioned by ArbCom for poor decorum in GMO discussion, and "discretionary sanctions" have been implemented to stabilize GMO articles.
Wikipedia's coverage of the safety of GM foods in particular has been a source of conflict. Many editors believed the then-current wording on GMO safety was inadequate and provided little context:
There is a general scientific agreement that food from genetically modified crops is not inherently riskier to human health than conventional food, but should be tested on a case-by-case basis. No reports of ill effects have been proven in the human population from ingesting GM food. Although labeling of GMO products in the marketplace is required in many countries, it is not required in the United States and no distinction between marketed GMO and non-GMO foods is recognized by the US FDA. In a May 2014 article in The Economist it was argued that, while GM foods could potentially help feed 842 million malnourished people globally, laws such as those being considered by Vermont's governor, Peter Shumlin, to require labeling of foods containing genetically modified ingredients, could have the unintended consequence of interrupting the process of spreading GM technologies to impoverished countries that suffer with food security problems.
– Pre-RfC version of second paragraph of Genetically modified organism#Controversy.
To help settle the question, an RfC to change the current wording was opened. It was moderated under tight conditions, with strict word limits and behavioral restrictions; there were 22 proposals, and nearly 90 editors participated. After one month of discussion, the RfC was closed on July 7, and the first proposal prevailed:
There is a scientific consensus that currently available food derived from GM crops poses no greater risk to human health than conventional food, but that each GM food needs to be tested on a case-by-case basis before introduction. Nonetheless, members of the public are much less likely than scientists to perceive GM foods as safe. The legal and regulatory status of GM foods varies by country, with some nations banning or restricting them, and others permitting them with widely differing degrees of regulation.
– Proposal 1, Wikipedia:Requests for comment/Genetically modified organisms
GMO articles faced a less-than-smooth transition afterwards, as several editors debated the best way to include the new language and replace the old. In the first few days after the RfC was closed, additional text was deleted and replaced while some editors debated whether to change language immediately before and after the RfC-mandated language. Approximately a week later, those disagreements had calmed down.
The Hindu reported on an edit-a-thon on Indian women scientists held on July 16 in Bangalore. Their pre-event article noted that only about 40 women scientists from the country currently have Wikipedia entries, and many of those are incomplete or lack citations.
The paper's followup article reported that about 25 editors participated in the event, creating and updating articles on prominent women scientists in the country. Sandhya Srikant Visweswariah, chair of the Department of Molecular Reproduction, Development and Genetics at the Indian Institute of Science, was among the subjects tackled. One participant noted, however, that "lack of citations online made it hard to validate entries for many women scientists from the country". This, of course, is a persistent concern, as discussed in part in The Atlantic last month. Having content online leads to the production of more content. Creating new material from non-online content – and being able to use that content to defend Wikipedia's processes of validating content and assessing notability – is a much bigger task, although also an essential one. –Milo
Cracked.com featured a critical piece on Wikipedia as "shockingly biased", with input from current administrator Crisco 1492. The piece falls squarely in the sweet spot of modern criticism of any website: (1) it comes from a website that loves Wikipedia; (2) the site has readers who love Wikipedia and use it all the time despite its faults; and thus (3) those readers will read any article that raises "shocking" concerns about Wikipedia. And though the items discussed are mostly old hat to Wikipedia editors (not to discount their importance), such articles are usually popular. This one has already received over 350,000 views and 450 comments.
The topic areas discussed in the article include three common complaints: (1) the lack of diversity in contributors and content, such as the gender gap and systemic biases (see The Hindu edit-a-thon discussed above), and the focus of some editors on niche content areas; (2) the ever-present problem of vandalism, but particularly the feedback loop where inaccuracies are cited in the press – "like a game of telephone, only at the end of the game, the garbled nonsense gets published in a newspaper"; and (3) petty arguments among editors, though this discussion also ends in more discussion of vandalism, such as those quixotic editors who like to change heights and weights.
The article also cites the Wikipediocracy website as one "dedicated to destroying Wikipedia", though such a threat does not seem as existential when described as "less like a public service and more like a bunch of Mensa wannabes trying to high five, only to awkwardly smack each other in the nose". Lastly, the piece concludes that "Wikipedia is dying", citing statistics about declining numbers of "very active" editors and the lack of sufficient administrators.
All of these concerns have degrees of validity, and though not precisely news, the continuing focus on them is no doubt important in finding solutions. When high-profile articles stop being written about Wikipedia's flaws, that would suggest irrelevance, which is a much surer sign of decline. No one complains about the functionality or value of Myspace anymore. –Milo






Seven featured articles were promoted these weeks.
Five featured lists were promoted these weeks.
One featured topic was promoted these weeks.
Fourteen featured pictures were promoted these weeks.
Your Traffic Reports for the weeks of June 26 – July 2, and July 3–9, 2016:
The dominant topic in Wikipedia traffic the week of June 26 to July 2 was sports, and more particularly football, with UEFA Euro 2016 in the top spot for a third straight week. And the improbable victory of Iceland's team (#22 in the WP:TOP25) over England in UEFA Euro 2016 put that country's article at #5. Lionel Messi's (#4) defeat at the Copa América (#12) final, and his subsequent retirement announcement, was also big news. In other news, the hangover from Brexit (#25) kept the European Union (#9) in the top ten for a second week, and put Boris Johnson (#13) and Theresa May (#17) on the Top 25 as well. Game of Thrones also merits a mention, taking slots #2 and #3, with its season finale episode article at #18.
Moving on to the week of July 3–9, sports dominated again, with the traditional return of Wimbledon joining the lead-up to the UEFA Euro 2016 football tournament, the latest UFC event, and an unexpected team change for an NBA superstar. But it was a sport of an entirely modern kind, Pokemon Go, that led the pack, and before you ask, yes, Pokemon is an esport. Traditional summer distractions such as movies and television round out the list, with the inclusion of politicians Donald Trump and Andrea Leadsom after the Top 10 to remind us (barely) of the real world.
For the full top-25 lists (and our archives back to January 2013), see WP:TOP25. See this section for an explanation of any exclusions. For a list of the most edited articles every week, see WP:MOSTEDITED. For the most popular articles that ORES models predict are low quality, see WP:POPULARLOWQUALITY.
The ten most popular articles on Wikipedia for the week of June 26 to July 2, as determined from the WP:5000, were:
| Rank | Article | Views | Notes |
|---|---|---|---|
| 1 | UEFA Euro 2016 | 1,590,000 | A third straight week in the top spot, though with less than half as many views as last week. The Round of 16 commenced on June 25, and the quarter-final rounds were underway when this week's report closed on July 2. The final four teams were Portugal, Wales(!), Germany, and France, with the next match on July 6. |
| 2 | Game of Thrones | 1,126,688 | Last week the Season 6 article was #6, while this general series article was #16 (with 730K views). Why the switch this week? No doubt it is because the season finale on June 26 (The Winds of Winter) (#18) caused more mainstream press coverage, prompting more people unfamiliar with the show to look it up on Wikipedia to see what they were missing. |
| 3 | Game of Thrones (season 6) | 1,103,448 | See #2. Numbers up slightly from last week. |
| 4 | Lionel Messi | 1,060,930 | Up from #21 and 564K views last week. The Argentine forward and "best footballer on the planet"™ faced Chile in the Copa America Centenario final on June 26, and lost on penalty kicks after a 0–0 draw. The 29-year-old Messi announced his retirement after the game. |
| 5 | Iceland | 784,708 | Views spiked on the northern island country's article on June 27 and 28. On June 27, Iceland defeated England 2–1 in their UEFA Euro 2016 Round of 16 match. But alas, Iceland fell to France on July 3 and did not make the semi-finals. |
| 6 | Pat Summitt | 764,584 | The longtime head coach of the Tennessee Lady Volunteers basketball team, who won a record 1,098 games in her tenure, died at age 64. She retired from coaching in 2012 after being diagnosed with early-onset Alzheimer's disease. Her public openness about her condition was widely admired and helped raise awareness of the disease and its impact. |
| 7 | Battle of the Somme | 757,121 | The 100th anniversary of the commencement of this First World War battle fell on July 1. The battle was intended to hasten a victory for the Allies and was the largest battle of the First World War on the Western Front. More than one million men were wounded or killed, making it one of the bloodiest battles in history. |
| 8 | Independence Day: Resurgence | 755,170 | The 20-years-later sequel to Independence Day premiered in the United States on June 24. As of July 4, its worldwide gross is $252 million; the film had a budget of $165 million. It has received mostly negative reviews, though I fully intend to see it. Down slightly from 810K views last week. |
| 9 | European Union | 736,104 | Views previously spiked on June 24 due to the Brexit vote, but traffic remained high (though declining each day) for this entire week as the aftermath of the vote began to be digested. Down from #3 and 1.97 million views last week. |
| 10 | Jesse Williams (actor) | 720,299 | At the BET Awards on June 26, this actor won a humanitarian award, and delivered a speech highlighting racial injustice, police brutality, and cultural appropriation, which drew press attention far beyond anything the BET Awards normally gets. (BET is an acronym for Black Entertainment Television, the most prominent television network targeting African American audiences.) |
The ten most popular articles on Wikipedia for the week of July 3–9, as determined from the WP:5000, were:
| Rank | Article | Views | Notes |
|---|---|---|---|
| 1 | Pokémon Go | 1,371,390 | For most people born before the Clinton administration, Pokémon is about as comprehensible as the religious customs of some lost Pacific island or the codes and shibboleths of an ancient secret society. Which, by the way, is exactly why your kids love it so much. It's too complicated to explain quickly but the latest iteration exploded into the public mind almost overnight (it currently has more users than Tinder in the US, despite only being in release for 5 days) due to its unique, and perhaps uniquely dangerous, gameplay. Thanks to the wonders of augmented reality, Google Maps and GPS, a real-time scavenger hunt has morphed with a video game; it's everywhere you go. Hold up your iPhone to a tree, there's a Pokémon sitting in a crook, waiting to be captured and sent to the death ring, er, I mean "gym". Look down on the pavement, and there's a cute Pokémon staring up at you. And hey look! There's one swimming in that deceptively close and surprisingly deep pond! And there's one across that very busy street! Yes, Pokemon Go-related accidents have already happened, as have muggings, since the game alerts any other players to your current location. Thankfully none of this has proven fatal, though it's only a matter of time before a health official is forced to remind the general public that real people do not get extra lives. |
| 2 | Sultan (2016 film) | 1,152,393 | One big difference between Hollywood and Bollywood is that in Bollywood, stars still matter. And Salman Khan (pictured) rules the roost right now. His last big film, Bajrangi Bhaijaan, dominated Eid al-Fitr weekend and went on to make nearly $100 million. And now he's done it again. His latest, a wrestling drama, was also released on Eid and has taken in nearly ₹1.96 billion ($29 million) in its first six days. |
| 3 | Independence Day (United States) | 1,142,261 | This is the fourth US Independence Day since we started this list, which means it's time to look for patterns, and one that stands out is that while this article's numbers keep climbing year upon year, it has never been the #1 article for its week. Some have speculated that Americans already know enough about their founding holiday and don't need to look it up. |
| 4 | UEFA Euro 2016 | 988,687 | Numbers are down slightly for the quarter- and semi-finals, which saw the darlings of the tournament (Wales and Iceland) predictably knocked out by France and Portugal. This list's timeframe ends before the 10 July final so expect numbers to shoot up again next week. |
| 5 | Juno (spacecraft) | 960,161 | Not all NASA missions need to be glamorous; this one, which began a slow, winding descent towards Jupiter on 4 July, won't be gracing us with grand vistas of the jewels of the Jovian realm. No: this one is hardcore, pick-to-the-cliff science. Have you ever seen a cutaway image of the inside of a gas giant? Well if not, here's one. Thing is, up until now, it's basically educated guesswork. We don't have any hard evidence of what's under those clouds. But we will, thanks to Juno, which will get the info by mapping Jupiter's gravitational field. But to do so, it has to get close. Real close. As in, close enough to be fried by Jupiter's 12,000-Chernobyls-per-second radiation belts. Needless to say, it's a tough little bugger, but its creators don't expect it to be producing useful science for more than 18 months before it's toast. |
| 6 | Nettie Stevens | 896,719 | This pioneering geneticist and discoverer of the XY sex-determination system got a Google Doodle on her 155th birthday on 7 July. |
| 7 | UFC 200 | 872,178 | The latest in the mixed martial arts tournament series was held at the T-Mobile Arena in Las Vegas (pictured) on 9 July. Headliner Amanda Nunes defeated Miesha Tate in the first round. |
| 8 | Serena Williams | 857,452 | The world women's number 1 tennis champion clinched yet another record on 9 July when she beat Angelique Kerber in straight sets to clinch her 22nd major singles title at her natural home, Wimbledon. Two more titles and she will equal Margaret Court's career record. |
| 9 | Antoine Griezmann | 849,627 | Olivier Giroud may have scored two goals for France in the Euro 2016 semi-final, but it was Griezmann who scored the most goals in the tournament. |
| 10 | Kevin Durant | 707,764 | The seven-time NBA All-Star signed with the Western Conference champion Golden State Warriors this week for a reported two-year, $54 million contract. |
On 10 June, arbitration clerk L235 posted an announcement that the clerks were looking for script writers who "will work with the clerk team to automate portions of the clerks' procedures." These procedures include, but are not limited to, vetting new requests, opening and managing open cases, and miscellaneous tasks such as arbitrator retirements. On 7 July, L235 announced that Fred Gandt and Σ would be appointed as the script developers. Best of luck to both on future outings.
If any editor is interested in assisting, you can contact the clerks at clerks-l@lists.wikimedia.org.
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
A short paper presented at the Joint Conference on Digital Libraries titled "Quality Assessment of Wikipedia Articles Without Feature Engineering"[1] uses deep learning to predict the quality of articles in the English Wikipedia. As the paper's title suggests, previous research on article quality has used a specific set of features to represent the articles, whereas the promise of deep learning is that the machine learner will determine the best representation on its own.
Some representation of the articles still has to be chosen, and the paper uses "Doc2Vec", an extension of Word2vec that uses unsupervised machine learning to learn vector representations of the articles. A benefit of this approach is that it is language neutral, whereas other approaches might utilize features that are language-specific. These vectors are learned from a training set based on the Wikimedia Foundation's dataset of 30,000 English articles. A deep neural network using Google's TensorFlow library is then trained using these vectors with the aim of predicting to which of the English Wikipedia's assessment classes an article belongs.
The performance of the classifier is compared to the current state of the art, which at the time of writing is the WMF's own Objective Revision Evaluation Service (ORES) (disclaimer: the reviewer is the primary author of the research upon which ORES' article quality classifier is built). Since the number of articles in each class is fairly balanced, the proportion of correctly classified instances (accuracy) is used as the performance measure. ORES is reported to be 60% accurate (it currently reports 61.9% accuracy), and the deep neural network was found to be 55% accurate. As pointed out in the paper, this work is a first step towards using deep learning for this task, meaning that slightly lower performance is acceptable. The authors describe a couple of changes that will most likely improve the classifier and aim to do so in future work. Deep learning is an area where interesting things are happening, and if it can be used to improve our ability to automatically assess Wikipedia articles, a service that is already useful to many Wikipedians through services like WikiProject X and SuggestBot, that is only for the better!
Dr. Tsung-Ho Liang (梁宗賀)[supp 1] is a systems analyst in the information center at the Tainan City Government's Bureau of Education. He currently studies big data in education, especially dealing with unstructured data and natural language processing techniques. In 2013, he started a project to integrate the contents of Chinese Wikipedia with the Chinese Knowledge and Information Processing (CKIP) technology and established a new search engine for Chinese Wikipedia,[supp 2] WikiSeeker (維基嬉客).
WikiSeeker is a tailor-made search system based on the Wikipedia corpus to leverage search effectiveness by providing structured association graphs with related Wikipedia articles for students' queries in Chinese. First, it produces a knowledge map with clear relationships among each field of knowledge, so students can easily identify the most important keywords among contents. Second, the search bar of WikiSeeker is capable of using natural language to search instead of typing keywords. You can see a tour of WikiSeeker on YouTube.
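To make the performance measure concrete, here is a minimal Python sketch of accuracy as "the proportion of correctly classified instances", together with the accuracy of a trivial majority-class guesser. The function names are invented for this illustration and are not taken from the paper or from ORES; the point is only that with roughly balanced classes the trivial baseline sits near one divided by the number of classes, so accuracy is a reasonable yardstick.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Proportion of instances whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one instance")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def majority_baseline(true_labels):
    """Accuracy obtained by always predicting the most frequent class."""
    if not true_labels:
        raise ValueError("need at least one instance")
    most_common_count = Counter(true_labels).most_common(1)[0][1]
    return most_common_count / len(true_labels)
```

For a perfectly balanced set with six classes, `majority_baseline` returns one sixth, which is the floor against which figures such as 55% or 60% can be read.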
These two features make WikiSeeker intuitive and easy to use for K-12 students. The research essay "WikiSeeker – The Study of the Impact of a Search System with Structured Association Graphs on Learning Effectiveness"[2] by the researcher Sheng-Nan Cheng (鄭盛南) compared two experimental groups: one was asked to use Chinese Wikipedia directly to answer questions, and the other to use the WikiSeeker website to answer the same questions. The results showed that the students who used WikiSeeker were 10.8% more correct in their answers (on average, 15.8 out of 19 questions, compared to 13.73 out of 19 for the group using Chinese Wikipedia directly). Moreover, girls and middle-achieving students showed the greatest learning improvement when using WikiSeeker. The study concludes that WikiSeeker is suitable for students acquiring knowledge from Chinese Wikipedia.
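The reported improvement can be sanity-checked with a few lines of arithmetic. The sketch below assumes that the higher mean (15.8) belongs to the WikiSeeker group and that the 10.8% figure is the difference in mean scores expressed as a share of the 19 questions; neither assumption is stated explicitly in the summary, and the variable names are our own.

```python
TOTAL_QUESTIONS = 19

# Mean number of correct answers per group, as quoted in the text.
# Assumption: the higher mean is the WikiSeeker group.
mean_wikiseeker = 15.8
mean_wikipedia = 13.73

difference = mean_wikiseeker - mean_wikipedia

# Difference as a share of all questions (percentage points).
percentage_points = difference / TOTAL_QUESTIONS * 100

# Difference relative to the direct-Wikipedia group's score.
relative_gain = difference / mean_wikipedia * 100

print(f"{percentage_points:.1f} percentage points")
print(f"{relative_gain:.1f}% relative gain")
```

The percentage-point figure comes out close to the quoted 10.8% (the small gap may reflect rounding of the published means), whereas the relative gain is noticeably larger, which suggests the study reports the former.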
Sentiment analysis – the automated extraction of subjective information expressed in text – has been applied to Wikipedia research in several recent papers.
Four researchers from Stanford University analyzed[3] all (non-neutral) votes in the English Wikipedia's request for adminship process cast from its inception in 2003 until 2013. These form a directed, signed graph with around 11,000 nodes (users) and 160,000 edges (votes). They removed the actual vote text ("support" and "oppose") and tried to reconstruct the vote by applying sentiment analysis to the remaining comment text (where e.g. "I've no concerns, will make an excellent addition to the admin corps" indicates a positive vote). The performance of the resulting prediction model is described as "remarkably high, [...] as a consequence of the highly indicative, sometimes even formulaic, language used in the comments". It performed much better than a model trying to predict votes based on network characteristics alone (patterns of other support/oppose votes, using e.g. ideas from balance theory like "an enemy of my enemy is my friend").
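The network-only idea mentioned above can be illustrated with a toy signed, directed graph. The following Python sketch is a deliberately simplified reading of balance theory, not the model used in the paper: it guesses the sign of a missing vote by multiplying the signs along two-step paths, so that "an enemy of my enemy" (two negative edges) yields a positive prediction. The vote data and the function name are made up for the example.

```python
# Toy signed, directed graph: (voter, candidate) -> +1 (support) or -1 (oppose).
# These votes are invented for illustration and are not from the study.
TOY_VOTES = {
    ("A", "B"): -1,
    ("B", "C"): -1,
    ("A", "D"): +1,
    ("D", "E"): -1,
}


def predict_sign(votes, voter, candidate):
    """Guess the sign of voter -> candidate from two-step paths.

    Each path voter -> w -> candidate contributes the product of its two
    signs; the prediction is the sign of the summed contributions.
    Returns None when there is no path or the evidence cancels out.
    """
    score = 0
    for (source, intermediate), first_sign in votes.items():
        if source != voter:
            continue
        second_sign = votes.get((intermediate, candidate))
        if second_sign is not None:
            score += first_sign * second_sign
    if score == 0:
        return None
    return 1 if score > 0 else -1
```

In the toy data, A opposes B and B opposes C, so the sketch predicts that A supports C; A supports D and D opposes E, so it predicts that A opposes E.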
Is the editing frequency of Wikipedians influenced by negative or positive comments they receive on their user talk pages?
A student course project at the same university[4] tried to examine this question by analyzing the user talk pages of all users (around 620,000) who signed up in 2013 and made at least one article edit on the English Wikipedia, together with "thanks" messages received via the new software feature introduced during that year. They related this data to the number of article edits per week. The authors report that "while we found some predictive value for future behavior in the sentimental content of messages received by Wikipedia editors, we do not have evidence to establish a causal relationship between these variables... we were able to detect macro-level patterns of behavior that appear to discredit the hypothesis that the sentimental content of user talk pages is a main driver of user churn on Wikipedia". As a limitation of their application of sentiment analysis in this situation, they note that "Most messages exchanged through user talk pages are not sentimentally-loaded, but rather talk about the Wikipedia guidelines and policies in a neutral manner", calling for the use of more sophisticated natural language processing techniques.
These results are somewhat in contrast to those of a paper titled "The Impact of Sentiment-driven Feedback on Knowledge Reuse in Online Communities",[5] which investigated "whether affective communication [...] in form of sentiment-driven feedback in discussions between Wikipedia editors motivates collaborative work", by analyzing a complete history dump of the Simple English Wikipedia (until 2011). The researchers focus on the "knowledge reuse" aspect of this collaborative work, quantified for "any two consecutive revisions of the same article page as the ratio of the number of words reused from the previous revision (e.g., copied, moved elsewhere, or restored) to the number of words newly created in the current revision." By relating the positivity or negativity of article talk page comments to editing activity in the article itself, the authors found that:
Besides observing that public positive feedback may have a positive effect on editor motivation, they also note that "non-public negative peer feedback could increase one's likelihood to engage in online social production by correcting inherent problems, behaviors, and attitudes in private peer conversations, which also strongly suggests that mechanisms for providing non-public negative feedback should be designed, incorporated, and tested in collaborative platforms such as wikis."
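The "knowledge reuse" ratio quoted above (words reused from the previous revision divided by words newly created in the current one) can be approximated with a simple bag-of-words comparison. This is a rough sketch under our own assumptions, not the paper's method: it splits on whitespace, ignores word order and provenance, and the function name is hypothetical.

```python
from collections import Counter


def reuse_ratio(previous_text, current_text):
    """Approximate ratio of reused words to newly created words.

    Words are compared as multisets: a word counts as reused up to the
    number of times it appeared in the previous revision, and as new
    beyond that. Returns None when the current revision adds no new
    words, since the ratio is then undefined.
    """
    previous_words = Counter(previous_text.split())
    current_words = Counter(current_text.split())
    reused = sum((previous_words & current_words).values())
    new = sum((current_words - previous_words).values())
    if new == 0:
        return None
    return reused / new
```

For example, going from "the cat sat on the mat" to "the cat sat on the red mat today" reuses six words and adds two, giving a ratio of 3.0 under this approximation.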
See also our earlier coverage of sentiment analysis research, and a current research collaboration of the Wikimedia Foundation and other researchers that aims "to use machine learning and statistics to understand how attacking or 'toxic' language affects the contributor community on Wikipedia. The focus of our analysis is initially on talk page comments that exhibit harassment, personal attacks and aggressive tone."
Wikimania 2016, the annual global Wikimedia conference, took place in June in Esino Lario, Italy. The programme contained various research-related sessions, including the annual "State of Wikimedia Research" presentation highlighting some of the most interesting scholarship from the past year (slides).
See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.
A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.
Other student project writeups from the fall 2015 CS229 course at Stanford (see also above):