


PageRank: Detecting Attraction in a Conversation

In 2011, the Q&A web service Piazza quickly gained popularity, reaching over 330 schools and tens of thousands of students. Piazza combines a homework-help site with social networking, providing an online forum where classmates can trade knowledge.

However, an active forum does not always guarantee an enjoyable learning experience. Suppose that you open Piazza after a relaxing weekend. A huge number of unread posts pop up, and you probably have the uneasy feeling that some important discussions have taken place, but you have no clue where they are hiding. Recent research suggests that this is where the PageRank algorithm can step in and sort out the mess: the researchers use PageRank to signal which threads deserve more discussion.

PageRank can be viewed as a customization of a “random walk” on a Markov chain in which the states are pages and the transitions are the links between pages. Having noticed that relevant utterances are as sparse in a conversation as relevant content is on the web, the research group built a system that uses PageRank to rank the probability that a participant will reply to a given utterance, which serves as a measure of that utterance's importance.
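To make the random-walk view concrete, here is a minimal sketch of textbook PageRank computed by power iteration. The damping factor of 0.85, the iteration count, and the four-page toy graph are illustrative assumptions; none of them come from the paper.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    links maps each node to the list of nodes it links to.
    damping and iterations are conventional defaults, assumed here.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform distribution

    for _ in range(iterations):
        # Every node receives the "teleport" share of the random walk.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node splits its rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: pages A-D and their links.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page C collects links from three other pages and ends up with the highest rank, while D, which nobody links to, keeps only the teleport share.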

This system first takes in a set of important utterances identified by an existing system, together with the original conversations in text format. Then, the system models a conversation as a directed graph of links between utterances. Using PageRank, the system computes the ranks of the utterances based on both explicit links and implicit links. Here, explicit links are references specifying which utterance a chat participant is addressing; implicit links connect utterances that are so strongly related in concept that their authors considered it unnecessary to give an explicit link. In this way, the system can analyze both participant-topic attraction, which measures a participant's drive to get involved in a certain topic, and topic-topic attraction, which measures the probability of a certain topic following another topic.
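The graph model described above can be sketched as a weighted PageRank over utterances. This is only an illustration under stated assumptions: the utterance IDs, the link direction (a reply points to the utterance it addresses), and the two weights `EXPLICIT_WEIGHT` and `IMPLICIT_WEIGHT` are hypothetical and are not taken from the paper.

```python
EXPLICIT_WEIGHT = 1.0  # assumed weight for a link the participant stated
IMPLICIT_WEIGHT = 0.5  # assumed weight for a link inferred from content


def weighted_pagerank(edges, damping=0.85, iterations=100):
    """PageRank where each edge is a (source, target, weight) triple."""
    nodes = sorted({n for source, target, _ in edges for n in (source, target)})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for source, _, weight in edges:
        out_weight[source] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by utterances with no outgoing links is shared by everyone.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, target, weight in edges:
            # Each source splits its rank in proportion to the edge weights.
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        rank = new_rank
    return rank


# Hypothetical conversation: each pair reads (reply, utterance it addresses).
explicit_links = [("u2", "u1"), ("u3", "u1"), ("u4", "u3"), ("u5", "u4")]
implicit_links = [("u4", "u1")]

edges = [(s, t, EXPLICIT_WEIGHT) for s, t in explicit_links]
edges += [(s, t, IMPLICIT_WEIGHT) for s, t in implicit_links]

utterance_ranks = weighted_pagerank(edges)
most_important = max(utterance_ranks, key=utterance_ranks.get)
print(most_important, round(utterance_ranks[most_important], 3))
```

In this made-up conversation, `u1` comes out on top because three other utterances address it, directly or implicitly. Changing the two weights shifts how much of `u4`'s rank flows to `u1` versus `u3`.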

The current system cannot yet achieve the accuracy of existing offline systems. The researchers acknowledge that the explicit links alone are far from sufficient and that they need to find ways to fully exploit the implicit links between utterances; additionally, explicit and implicit links should be given different importance. Apart from this, the utterance chains are usually very short and rarely contain cycles like those on the web, which leads to most utterances receiving a rank of zero.
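One way to see why short, acyclic chains are a problem is to pass rank along the links with no damping and no teleport step. This simplified propagation is an assumption made for illustration and is not necessarily the formulation used in the paper; the three-utterance chain is also hypothetical.

```python
def propagate(links, nodes, steps):
    """Pass rank along links with no damping and no teleport (simplified)."""
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(steps):
        new_rank = {node: 0.0 for node in nodes}
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += rank[source] / len(targets)
        rank = new_rank
    return rank


utterances = ["u1", "u2", "u3"]

# A short reply chain with no cycle: u3 replies to u2, u2 replies to u1.
chain = {"u3": ["u2"], "u2": ["u1"]}
# The same chain closed into a cycle, as often happens between web pages.
cycle = {"u3": ["u2"], "u2": ["u1"], "u1": ["u3"]}

chain_rank = propagate(chain, utterances, steps=3)
cycle_rank = propagate(cycle, utterances, steps=3)
print("chain:", chain_rank)
print("cycle:", cycle_rank)
```

In the chain, rank flows toward `u1` and then has nowhere to go, so every utterance ends at zero after three steps. In the cycle, the rank keeps circulating and each utterance retains its share.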

However, such a system holds the promise of real-time analysis, which is a great advantage over existing offline analysis tools. Existing applications can only work offline and at limited speed because they rely on natural language processing methods such as semantic analysis. The PageRank algorithm, in contrast, should be much faster and more generally applicable because it does not require a learning phase and is not limited to what was learned.

Reference:

Chiru, C. (Dept. of Comput. Sci. & Eng., Univ. Politeh. of Bucharest, Bucharest, Romania); Rebedea, T.; Erbaru, A. Using PageRank for Detecting the Attraction between Participants and Topics in a Conversation. 10th International Conference on Web Information Systems and Technologies (WEBIST 2014), Proceedings, pp. 294-301, 2014.
