Google's weighting parameter list leaked
Highlights:
Change history: Google apparently keeps a copy of every version of every page it has ever indexed. Meaning, Google can "remember" every change ever made to a page. However, Google only uses the last 20 changes of a URL when analyzing links.
Google stores author information associated with content and tries to determine whether an entity is the author of the document.
Google measures the average weighted font size of terms in documents (avgTermWeight) and anchor text.
It's likely the internal documents were accidentally included in a code review and pushed live from Google's internal code base, where they were then discovered.
(Score: 2) by SomeRandomGeek on Friday May 31, @10:13PM
So, we have a list of 14,000 variables that are used in page ranking, but not how those variables are weighted. As a software developer myself, I immediately suspect that the vast majority of those factors aren't actually used, or are used only in highly situational cases. They probably built algorithms to score a page on all these dimensions, and then chose a few dimensions that are actually useful in providing a good result. But did they rip out all of the other dimensions? No, of course not. Better to leave them in, because you never know what might be helpful in conjunction with some new dimension that you are building next year.
So, when the article says that Google is considering something, its authors have no way of knowing that. They know that Google has the capability to rank searches on that something. But maybe Google doesn't actually use it, for any of a hundred reasons.
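
To make the point concrete, here is a minimal sketch of what I mean. Everything in it is hypothetical: the dimension names, the weights, and the way the score is combined are made up for illustration and say nothing about how Google actually does it. The only thing it shows is that a scorer can exist for every dimension while the weight table decides which ones matter.

```python
from typing import Callable, Dict

Page = Dict[str, float]

# Hypothetical dimensions: every one has a scorer, used or not.
SCORERS: Dict[str, Callable[[Page], float]] = {
    "avg_term_weight": lambda p: p.get("avg_term_weight", 0.0),
    "has_author": lambda p: p.get("has_author", 0.0),
    "some_old_experiment": lambda p: p.get("some_old_experiment", 0.0),
}

# Hypothetical weights: a weight of 0.0 means the dimension is still
# computed and still shows up in a list of variables, but has no
# effect on the final score.
WEIGHTS: Dict[str, float] = {
    "avg_term_weight": 0.5,
    "has_author": 0.0,
    "some_old_experiment": 0.0,
}


def score_all(page: Page) -> Dict[str, float]:
    """Score the page on every known dimension."""
    return {name: scorer(page) for name, scorer in SCORERS.items()}


def rank_score(page: Page) -> float:
    """Combine the per-dimension scores using the weight table."""
    features = score_all(page)
    return sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())


page = {"avg_term_weight": 0.8, "has_author": 1.0, "some_old_experiment": 1.0}
print(score_all(page))   # all three dimensions are present
print(rank_score(page))  # only avg_term_weight contributes
```

Somebody who only sees the keys of SCORERS would conclude that all three dimensions are "ranking factors". Without the weight table, you can't tell the one that counts from the two that don't.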
