Please revise your five questions and send them back so that I can post them. This assignment will be graded pass/fail and is due Monday.
Question of the Day (will we finally answer it?:):
This reading describes how Google uses the "PageRank" algorithm to determine the importance or value of a webpage (and hence where it falls in the search results for a particular topic).
This is a good illustration of how graphs can be used in a dynamic situation (changing in time, like the graphs representing Europe we studied previously).
Strogatz asserts that "A page is good if good pages link to it," then discusses this self-referential definition of a good page. (p. 193)
The question is this: who decides which pages are good in the first place? As Strogatz describes, the network does!
"Worrying about content turned out to be an impractical way to rank webpages." (p. 192) We let people vote with their feet (or rather, with their clicks).
Graphs provide a useful way of illustrating how pages interact. If one page links to another, a directed arrow from the first page to the second indicates it. Here's the (directed) graph of the "toy web" Strogatz considers, with the final rankings:
He justifies this ranking in a series of graphs and a set of equations on page 195. Let's see how these equations work (we'll use this Excel spreadsheet in the end, but let's do a few by hand first...).
He starts by assigning all pages equal weight: in this case, if we call the total weight 1, each page starts with weight 1/3.
Notice that I've written the equations with a stage index rather than with the primes. That's because we keep updating the values: to get them at the next stage, we update based on the previous stage's values.
We just "do it again", over and over....
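To see what doing it "over and over" looks like, here is a minimal Python sketch of the update. It assumes a three-page toy web in which X links to Y and Z, Y links to Z, and Z links to X. That link structure, and the names `links`, `update`, and `iterate`, are assumptions made for illustration, so check them against the graph and the equations on p. 195 before trusting the numbers.

```python
# Assumed link structure for a three-page toy web (an assumption, not taken
# from these notes): X links to Y and Z, Y links to Z, Z links to X.
links = {"X": ["Y", "Z"], "Y": ["Z"], "Z": ["X"]}


def update(weights, links):
    """One stage: every page splits its current weight equally among the
    pages it links to. Each page must link to at least one page."""
    new = {page: 0.0 for page in weights}
    for page, targets in links.items():
        share = weights[page] / len(targets)
        for target in targets:
            new[target] += share
    return new


def iterate(links, steps):
    """Start all pages at equal weight (total 1) and apply `update` repeatedly."""
    weights = {page: 1.0 / len(links) for page in links}
    for _ in range(steps):
        weights = update(weights, links)
    return weights


first = iterate(links, 1)    # weights after one update
ranks = iterate(links, 100)  # weights after many updates
print(first)
print(ranks)
```

Under that assumed link structure, the first update gives X = 1/3, Y = 1/6, Z = 1/2, and after many stages the weights settle near 2/5, 1/5, and 2/5. Changing `links` lets you experiment with other small webs, as long as every page links to at least one other page.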
This "system of equations" is an example from the field of mathematics called "linear algebra". If you loved algebra, wait -- there's more! :)
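One way to see the linear algebra: the whole update is a single matrix multiplied by a vector of weights. The sketch below is an illustration, not Strogatz's own presentation. The matrix `M` encodes an assumed toy web (X links to Y and Z, Y links to Z, Z links to X), and `mat_vec` is a helper written here for the example.

```python
# Column j of M says how page j hands out its weight (each column sums to 1).
# Rows and columns are ordered X, Y, Z. The link structure is an assumption.
M = [
    [0.0, 0.0, 1.0],  # X receives all of Z's weight
    [0.5, 0.0, 0.0],  # Y receives half of X's weight
    [0.5, 1.0, 0.0],  # Z receives half of X's weight and all of Y's
]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


r = [1 / 3, 1 / 3, 1 / 3]  # equal starting weights
for _ in range(100):
    r = mat_vec(M, r)      # one update is one matrix-vector product

# At the final ranking, another update changes (almost) nothing: M r = r.
residual = max(abs(a - b) for a, b in zip(mat_vec(M, r), r))
print(r, residual)
```

The final ranking is a vector that the matrix leaves unchanged, which is the kind of object linear algebra studies (an eigenvector with eigenvalue 1).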
Now here's the big question: How do you improve the value of your website, given that you understand the PageRank algorithm?
Side note: "Google's plan to prioritize facts ticks off climate deniers": The strategy isn't being implemented yet, but the paper presented a method for adapting algorithms such that they would generate a "Knowledge-Based Trust" score for every page. To do this, the algorithm would pick out statements and compare them with Google's Knowledge Vault, a database of facts. It would also attempt to assess the trustworthiness of sources -- for example, a reputable news site versus a newly created WordPress blog....