We now compare the number of unique keywords in UD with the number of unique keywords in Wiktionary, another participatory dictionary. Wiktionary follows a different policy from UD. Wiktionary content is created and maintained by administrators (selected by the community), registered users, and anonymous contributors. Unlike UD, Wiktionary has many mechanisms to ensure that content complies with community guidelines. Each page is accompanied by a talk page where users can discuss the content of the page and resolve any conflicts. In addition, Wiktionary has guidelines for the structure and content of entries. Capitalization is consistent, and content or keywords that violate Wiktionary guidelines are removed. For example, while both UD and Wiktionary contain misspelled keywords (e.g., beleive for believe), Wiktionary guidelines state that only common misspellings should be included, while rare spelling mistakes should be excluded.7 No such guidelines exist in UD. Wiktionary entries are therefore subject to a deeper level of curation.
A notable feature of UD is that users can rate the different definitions of each keyword by voting a definition up or down.
There are few or no guidelines on "what is a good definition" in UD, and users are expected to judge the quality of definitions based on their own subjective perception of what an urban dictionary should be. Figure 6a shows the distribution of the number of up and down votes received by each definition, across all definitions of all keywords. A similar trend is evident for both: many definitions received very few votes (both up and down) and few definitions received many votes. Figure 6b shows a scatterplot of the number of down votes versus the number of up votes for each definition. There is a striking correlation between the number of up votes and down votes for each definition, which emphasizes that the number of votes a definition receives reflects its visibility rather than its quality. However, there seems to be a systematic deviation from perfect correlation, with the number of up votes usually exceeding the number of down votes. This becomes clearer in Figure 6c, which shows the distribution of the ratio of up votes to down votes. There are large differences between definitions, with some having more than 10 times as many up votes as down votes and others the reverse.
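The two quantities discussed above, the correlation between up and down votes and the per-definition ratio of up votes to down votes, can be sketched as follows. This is only an illustration under stated assumptions: the vote counts are invented toy data, the function names are hypothetical, and computing the Pearson correlation on log-transformed counts is our assumption (a common choice for heavy-tailed counts), not something the text specifies.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vote_stats(votes):
    """votes: list of (up, down) pairs, one pair per definition.

    Returns (correlation of log10 counts, list of up/down ratios).
    Definitions with a zero count are skipped, because neither the
    logarithm nor the ratio is defined for them (an assumption of
    this sketch).
    """
    pairs = [(up, down) for up, down in votes if up > 0 and down > 0]
    log_up = [math.log10(up) for up, _ in pairs]
    log_down = [math.log10(down) for _, down in pairs]
    ratios = [up / down for up, down in pairs]
    return pearson(log_up, log_down), ratios


# Hypothetical toy data: (up votes, down votes) per definition.
toy_votes = [(10, 5), (100, 40), (1000, 300), (3, 30), (0, 2)]
correlation, ratios = vote_stats(toy_votes)
print(correlation, ratios)
```

A ratio above 1 means a definition received more up votes than down votes; a histogram of these ratios would correspond to the kind of distribution shown in Figure 6c.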
Despite several attempts to contact Urban Dictionary to confirm their data-sharing policies, the authors were unable to confirm that depositing our data in a public repository would not violate their terms and conditions. Because of these concerns, it was not possible to host the data for this paper in a public repository. With this in mind, the authors note that the R analysis code and annotations are available via github.com/alan-turing-institute/urban-dictionary-rsos2018. The authors are happy to provide the original data to researchers who contact us directly. This statement has been agreed with the journal.
Because of the inconsistent capitalization in UD, we experiment with three approaches to matching keywords between the two dictionaries: no preprocessing, lowercasing all characters, and mixed.8 Table 2 shows the result of this matching. The number of unique keywords in UD is much higher, and the lexical overlap is relatively small. Sometimes there is a match at the lexical level (i.e., the keywords match), but UD and Wiktionary cover different or additional meanings. For example, phased is described in UD as “something that happens little by little – in phases,” a meaning that is also covered in Wiktionary. However, UD also describes other meanings, including “a word used when you ask if anyone wants to fight” and “‘hum’ when you're not drunk but not sober.”
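The three matching approaches named above can be sketched as follows. This is a minimal illustration, not the authors' R implementation: the function name and the toy keyword lists are hypothetical, and the rule used for the mixed approach (keep an exact match where one exists, otherwise fall back to a lowercased match) is our assumption, since its exact definition is given in a footnote not reproduced here.

```python
def match_keywords(ud_keywords, wikt_keywords, strategy="none"):
    """Return the set of matched keyword forms shared by the two dictionaries.

    strategy:
      "none"      - exact string match, no preprocessing
      "lowercase" - lowercase all keywords in both dictionaries first
      "mixed"     - ASSUMED rule: keep an exact match if one exists,
                    otherwise fall back to a lowercased match
    """
    ud = set(ud_keywords)
    wikt = set(wikt_keywords)

    if strategy == "none":
        return ud & wikt

    if strategy == "lowercase":
        return {k.lower() for k in ud} & {k.lower() for k in wikt}

    if strategy == "mixed":
        wikt_lower = {k.lower() for k in wikt}
        matched = set()
        for keyword in ud:
            if keyword in wikt:
                matched.add(keyword)          # exact match, case preserved
            elif keyword.lower() in wikt_lower:
                matched.add(keyword.lower())  # fallback: case-insensitive match
        return matched

    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical toy keyword lists, for illustration only.
ud_toy = ["Phased", "phased", "YOLO", "London", "zzzq"]
wikt_toy = ["phased", "London", "yolo"]

for strategy in ("none", "lowercase", "mixed"):
    overlap = match_keywords(ud_toy, wikt_toy, strategy)
    print(strategy, len(overlap), sorted(overlap))
```

Under the assumed mixed rule, proper nouns keep their capitalization when both dictionaries agree on it, while inconsistently capitalized UD keywords can still be matched.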
A little further on, the term “company” is defined as a “group of people; his companions or collaborators; a guest or guests.” In this sense, company refers to the group of people who would be exposed to the information. The Internet enables large-scale collaborative projects, and the emergence of Web 2.0 platforms, where content producers and consumers meet, has radically changed the information market.