we discuss how characteristics of the entries relate to their popularity on UD (.4). Table 8 characterizes the entries by formality and familiarity. A one-way ANOVA test confirms that the differences between the groups are highly significant (F(2, 2988) = 22.07, p < .001).

Discussion and conclusion

In this article, we have studied a complete snapshot (1999–2016) of UD to shed light on the characteristics of its content.

Offensiveness

Online platforms with user-generated content are often susceptible to offensive content, which may be insulting, profane and/or harmful towards individuals as well as social groups [34].
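For readers unfamiliar with the test, the F statistic of a one-way ANOVA like the one reported above can be sketched in a few lines of plain Python. This is only an illustration: the function name `one_way_anova` and the toy ratings are hypothetical and are not the article's data or code. In practice a library routine such as `scipy.stats.f_oneway` would also return the p-value.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    Hypothetical helper for illustration; it does not compute a p-value.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy ratings for three groups (made-up numbers, not from the article).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With three groups the between-groups degrees of freedom are always 2, which matches the first value in the reported statistic.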
We only included headwords with at least three definitions. In the case of Wikipedia, for instance, inaccuracies [5], edit wars and destructive interactions between contributors [6, 7], and biases in coverage and content [8, 9] are only a few of the many undesirable aspects of the project that have been studied in detail.

Offensiveness

We experimented with different pilot setups in which we asked workers to annotate the level and type of offensiveness for individual definitions. Because the number of up and down votes varies widely with the popularity of the headword, we perform the analysis based on the rankings of entries (top ranked, second ranked and random) instead of the absolute number of up and down votes. Figure 3 shows a similar plot for fleek and on fleek, a phrase that went viral in 2014. Indeed, we found that this led to a higher agreement. Various factors could influence the up and down votes an entry receives, including whether the voter thinks the entry is offensive, informative or funny, and whether the voter (dis)agrees with the expressed view.

The number of new definitions for fleek and on fleek and other variations per year (December 1999–July 2016).
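The rank-based grouping described above (top ranked, second ranked and random, restricted to headwords with at least three definitions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's procedure: the function `sample_by_rank`, the tuple layout, ranking by up votes minus down votes, and drawing the random entry from the remaining definitions are all assumptions made here.

```python
import random
from collections import defaultdict


def sample_by_rank(entries, min_definitions=3, seed=0):
    """Pick a top-ranked, second-ranked and random definition per headword.

    entries: iterable of (headword, definition_id, up_votes, down_votes).
    Hypothetical helper; the names and the ranking rule are assumptions.
    """
    rng = random.Random(seed)
    by_headword = defaultdict(list)
    for headword, def_id, up, down in entries:
        by_headword[headword].append((def_id, up, down))

    sample = {}
    for headword, defs in by_headword.items():
        if len(defs) < min_definitions:
            continue  # keep only headwords with at least three definitions
        # Assumption: rank by net score (up minus down votes).
        ranked = sorted(defs, key=lambda d: d[1] - d[2], reverse=True)
        sample[headword] = {
            "top": ranked[0][0],
            "second": ranked[1][0],
            # Assumption: the random entry comes from the remaining definitions.
            "random": rng.choice(ranked[2:])[0],
        }
    return sample


# Made-up vote counts for illustration only.
toy_entries = [
    ("fleek", "a", 10, 1),
    ("fleek", "b", 5, 0),
    ("fleek", "c", 1, 3),
    ("yolo", "d", 2, 0),
    ("yolo", "e", 1, 0),
]
print(sample_by_rank(toy_entries))
```

Grouping by rank position within a headword, instead of by raw vote counts, keeps popular headwords with thousands of votes from dominating the comparison.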