Consensus Theory and Relevance Judgment Model: a test case of English Wikipedia

Nema, W. S. (2018) Consensus Theory and Relevance Judgment Model: a test case of English Wikipedia. PhD thesis, University of Reading

The amount of online data in 2016 was estimated at 16 zettabytes (ZB; 10^21 bytes each) and is predicted to reach 44 ZB by 2020 and 163 ZB by 2025 (The IDC Data Age 2025 report, 2017). The advent of the Internet of Things (IoT), social data, and usage tracking will continue to contribute to exponential growth in data. In addition to data growth, the number of people connected to the internet by 2020 is expected to reach 4 billion, half of today's world population (Search Engine Watch, Sep 2017). Data quality is another major challenge at the content, metadata, standards, and semantic levels, making it hard to determine authorship, currency, and content semantics and quality. Beyond this, content is reduced from ideas to text and reconstructed by humans with different expressiveness and interpretation, owing to great variance among people (background, culture, value system, education, persona, etc.) and in the context of their situation, including stress, urgency, and task complexity. What exacerbates the difficulty of finding information online is that people increasingly rely on search but continue to provide queries that are too short, i.e. fewer than 3 words, yielding millions of search results that are, with the exception of the first page or two, ignored. The absence of a librarian-like or ask-a-friend human dimension has reduced Information Seeking to Information Retrieval, which has proven ineffective in recent decades. Therefore, this work takes a user-centred approach, with emphasis on the human role, in which users contribute to page scoring dynamically so that results reflect the happenings of the time and the general historic trends of users. In other words, the focus is on what works, regardless of the reasons, in order to sidestep the complexities of human variance. Thus, the target solution is for the general average user, not for personalisation. This is done via usage stats and explicit user feedback, which are not standardised in today's web world.
Therefore, this work is applicable where these two factors are available, such as in the workplace. The findings, though, are general in nature. This research follows the complete cycle of theory, modelling, algorithm translation, and experimental testing. Even when research contributes to knowledge through negative outcomes, back-tracking the problem to a previous stage is still useful research (e.g. unsuccessful experiment or results > algorithm > model > theory). On the other hand, affirmative results do not necessarily mean acceptance of the theory but rather open another avenue of investigation by testing some other aspect of the theory/model. Either way, theory-driven research provides better context and richer potential for refutation, confirmation, and contribution. The proposed Consensus Theory and Relevance Judgment Model (CT&M) hypothesises how people might be influenced when making relevance judgment decisions. The proposed ConsensusRank (CR) algorithm is derived to test the model in a real-life web experiment that is open to anyone at any time. The real test is whether CR can predict a new user's explicit ranking better than CR's base, Google PageRank. PageRank inspired CT&M through its use of the quantity and quality of inlinks. The quality dimension of the model was also influenced by Rieh's (2002) model of relevance judgment. CT&M adds a quantitative consensus dimension and relates it inversely to Rieh's cognitive dimension. The Ranking Game web experiment provides an open web platform for testing the proposed CT&M theory/model and the corresponding CR algorithm. It is based on the August 2016 English Wikipedia corpus (12.7 million pages) with Page View Statistics (PVS) for May, June, and July 2016. The game uses TREC's pooling method to merge the top 20 results from major search engines and presents an alphabetical list for users' explicit ranking via drag and drop.
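The pooling step described above can be illustrated with a minimal sketch. It assumes each engine returns an ordered list of page titles; the function name `pool_results`, the engine labels, and the sample titles are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List


def pool_results(rankings: Dict[str, List[str]], depth: int = 20) -> List[str]:
    """Merge the top `depth` results of every ranking into one alphabetical pool."""
    pooled = set()
    for ranked_titles in rankings.values():
        # Keep only the top `depth` items of each engine; the set removes duplicates.
        pooled.update(ranked_titles[:depth])
    # Alphabetical order hides each engine's original ordering from the user.
    return sorted(pooled, key=str.casefold)


# Hypothetical result lists for two engines (illustrative data only).
example = {
    "engine_a": ["Jaguar", "Jaguar Cars", "Panthera"],
    "engine_b": ["Jaguar Cars", "Big cat", "Jaguar"],
}
pool = pool_results(example, depth=2)
print(pool)
```

Sorting the pool alphabetically means the list shown to a participant carries no hint of how any individual engine ranked the pages.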
The algorithms are: Google PageRank, Google real-time, Yahoo real-time, PVS, StatsRank (SR), ConsensusRank, real-time meta-search, and user ranking consensus. The same platform can be used for controlled experiments and captures implicit data for future research. This research modified Google's PageRank with two human artefacts: Page View Statistics and explicit user ranking. Both are dynamic factors that reflect the nature of human behaviour and the surrounding events of the time. This dynamism solves the known Power Law monopoly of top search results and addresses PageRank's equal navigation probability assumption, which we prove is not a good assumption. These adjustments bring the IS human dimension into IR, a repeatedly sought-after research objective. Furthermore, this is done in a theoretical context that is congruent with established theories of the field, expanded using modern quantitative statistical approaches from Computational Social Science. Findings show great variability in outlink Page View Stats, which disproves PageRank's uniform navigation assumption. The consistent behaviour of PVS from month to month provides a good alternative for influencing the likelihood of link navigation and further justifies StatsRank as a PageRank variant. Results of typical web search engine tests, NDCG and MAP, confirmed by the statistical Sign Test, support the hypothesis that ConsensusRank improves over its base PageRank when compared against real-time meta-search, from which it is completely independent. The strongest finding is that a new random user's ranking consistently correlates best with user ranking consensus, then with ConsensusRank, then with PageRank. The biggest contribution is probably the theoretical one: Consensus Theory and Relevance Judgment Model. This is because theories invite challenges and refutations, which lead to field development. Contributions also include the proposed algorithms StatsRank (SR) and ConsensusRank (CR).
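To make the contrast between uniform and view-driven link navigation concrete, the following sketch runs a PageRank-style iteration in which the probability of following an outlink is proportional to the target page's view count. This is an illustration only: the abstract does not give the StatsRank formula, so the weighting rule, the function name `weighted_pagerank`, the damping value, and the toy graph are all assumptions. Every link target is assumed to appear as a key of `outlinks`.

```python
from typing import Dict, List


def weighted_pagerank(
    outlinks: Dict[str, List[str]],
    views: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """PageRank-style scores with outlink choice proportional to target page views."""
    pages = list(outlinks)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Random-jump component shared equally by all pages.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = outlinks[page]
            total_views = sum(views.get(t, 0.0) for t in targets)
            if not targets or total_views == 0.0:
                # Dangling page, or no view data: spread its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
                continue
            for t in targets:
                # Assumed rule: navigation probability = target views / total views.
                new_rank[t] += damping * rank[page] * views.get(t, 0.0) / total_views
        rank = new_rank
    return rank


# Toy graph and view counts (illustrative data only).
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
uniform = weighted_pagerank(graph, {p: 1.0 for p in graph})
weighted = weighted_pagerank(graph, {"A": 100.0, "B": 900.0, "C": 100.0})
print(uniform)
print(weighted)
```

With equal view counts the update reduces to the ordinary uniform-navigation PageRank, so pages B and C tie; with the skewed counts, B receives most of A's rank.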
Although not generally applicable, owing to the nature of explicit feedback and stats data capture, CR and SR are usable in the business world and highlight the need to open up this type of data capture. Finally, the Ranking Game open web experiment is a research platform that lends itself to theory and research experimentation.
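For readers unfamiliar with NDCG, one of the two evaluation measures named in the abstract, a minimal sketch follows. It uses the common linear-gain form with a log2 position discount; which NDCG variant the thesis uses is not stated here, so this choice, the function names, and the sample relevance grades are assumptions.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain: relevance at rank i is divided by log2(i + 1)."""
    # enumerate() is zero-based, so rank i corresponds to pos + 1, hence log2(pos + 2).
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Relevance grades of the results in the order a ranker returned them (illustrative).
perfect_order = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
print(perfect_order, reversed_order)
```

A ranking that lists results in descending relevance scores 1.0, and any other ordering of the same results scores lower.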

Item Type: Thesis (PhD)
Thesis Supervisor: Tang, Y. and Liu, K.
Thesis/Report Department: Henley Business School
Identification Number/DOI:
Divisions: Henley Business School > Business Informatics, Systems and Accounting
ID Code: 80635
