


Google’s PageRank Algorithm and Cancer Treatment

Source: Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes

http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002511

Is the PageRank algorithm only for search engines? Definitely not. In fact, Google’s PageRank algorithm can be extended to many other areas, such as analyzing the genomic information of tumors and predicting which factors affect cancer patients’ survival.

As we know, cancer is a disease caused by abnormal cell growth, and the word tumor describes the resulting cancerous growth. The expression of marker genes in the tumor, that is, genes whose activity is associated with how the disease progresses, can help predict the clinical outcome of patients by indicating the status of the cancer in the body and what therapy the patient ought to receive. For instance, with gene expression data the doctor can make a better decision about whether the patient should receive chemotherapy. However, it is difficult to identify a limited set of genes related to tumor aggression, because the genes and proteins in a cell interact and form a complicated network in which numerous changes take place all the time.

Therefore, in recent years researchers have focused on developing powerful tools for picking out the key marker genes relevant to tumor aggression from among thousands of candidates. Since the gene-protein network in the cell resembles the web of pages connected by hyperlinks, the researchers were inspired by the way search engines filter a tremendous number of websites and recommend the few most relevant to the keyword the user enters. They adapted Google’s PageRank algorithm into a new algorithm, NetRank, to search for cancer-related genes.

Similar to PageRank, which assigns authority scores to websites based on the hyperlinks among them, the NetRank algorithm gives a gene a score according to the scores of the neighbors to which it connects. Whereas PageRank typically starts from the same score for every page, NetRank starts by assigning each gene “the absolute correlation of its mRNA expression level with the patient survival time in the dataset”. The procedure that follows is the same for both: perform a sequence of updates to the score of each node (website or gene) and select the nodes with the highest scores.
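The update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `netrank_sketch`, the damping factor of 0.5, the fixed number of iterations, and the four-gene toy network with its correlation values are all made up for the example. It assumes a PageRank-style rule in which each gene keeps a share of its own correlation score and receives the rest from its neighbors, with each neighbor's score divided by that neighbor's number of connections.

```python
def netrank_sketch(adjacency, correlation, damping=0.5, iterations=100):
    """PageRank-style score propagation over a gene network (illustrative only).

    adjacency[i][j] is 1 if genes i and j interact, else 0.
    correlation[j] is the absolute correlation of gene j's expression
    with patient survival time; it serves as the starting score.
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    scores = list(correlation)  # start from the correlation-based scores
    for _ in range(iterations):
        updated = []
        for j in range(n):
            # Each neighbor i passes on its score, split evenly over its links.
            neighbor_sum = sum(
                adjacency[i][j] * scores[i] / degree[i]
                for i in range(n)
                if degree[i] > 0
            )
            updated.append((1 - damping) * correlation[j] + damping * neighbor_sum)
        scores = updated
    return scores


# Hypothetical toy network: gene 2 interacts with every other gene.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
# Hypothetical absolute correlations with survival time.
correlation = [0.1, 0.2, 0.1, 0.6]

scores = netrank_sketch(adjacency, correlation)
# Genes ordered from highest to lowest final score.
ranking = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
print(ranking, scores)
```

With `damping=0` the network is ignored and the ranking falls back to the plain correlation scores, so in this sketch the damping factor controls how much weight the interaction network gets.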

In the experiment, the researchers applied both traditional methods (e.g. Pearson and Spearman correlation coefficients) and the NetRank algorithm to identify relevant marker genes. The results show that prediction accuracy increases by 7% with NetRank. The example of NetRank shows that the PageRank algorithm has the potential to improve methodologies in fields other than search engines.
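To make the traditional baseline concrete, here is a minimal sketch of ranking genes purely by the absolute Pearson correlation between expression level and survival time, without using the network at all. The gene names, expression values, and survival times are invented for illustration and do not come from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical survival times (months) for five patients.
survival_months = [10, 20, 30, 40, 50]
# Hypothetical expression levels of three genes in the same five patients.
expression = {
    "gene_a": [5.0, 4.0, 3.0, 2.0, 1.0],
    "gene_b": [1.0, 3.0, 2.0, 3.0, 1.0],
    "gene_c": [1.0, 2.0, 2.5, 4.0, 5.0],
}

# Score each gene by the absolute value of its correlation with survival.
baseline = {
    gene: abs(pearson(levels, survival_months))
    for gene, levels in expression.items()
}
ranked = sorted(baseline, key=baseline.get, reverse=True)
print(ranked)
```

Taking the absolute value means that a gene whose expression falls as survival rises counts as just as informative as one whose expression rises with it.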
