
SoylentNews is people

posted by LaminatorX on Tuesday November 18 2014, @08:47AM
from the Where's-John-Katz? dept.

Every year the works of thousands of authors enter the public domain, but only a small percentage of these end up being widely available. So how do organizations such as Project Gutenberg choose which works to focus on? Allen Riddell has developed an algorithm that automatically generates an independent ranking of notable authors for any given year. It is then a simple task to pick the works to focus on or to spot notable omissions from the past. Riddell’s approach is to look at what kind of public domain content the world has focused on in the past and then use this as a guide to find content that people are likely to focus on in the future.

Riddell’s algorithm begins with the Wikipedia entries of all authors in the English language edition (PDF), more than a million of them. His algorithm extracts information such as the article length, article age, estimated views per day, time elapsed since last revision, and so on. This produces a “public domain ranking” of all the authors that appear on Wikipedia. For example, the author Virginia Woolf has a ranking of 1,081 out of 1,011,304 while the Italian painter Giuseppe Amisani, who died in the same year as Woolf, has a ranking of 580,363. So Riddell’s new ranking clearly suggests that organizations like Project Gutenberg should focus more on digitizing Woolf’s work than Amisani’s. Of the individuals who died in 1965 and whose work will enter the public domain next January in many parts of the world, the new algorithm picks out TS Eliot as the most highly ranked individual. Others highly ranked include Somerset Maugham, Winston Churchill, and Malcolm X.
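To make the idea concrete, here is a minimal sketch of turning per-article features into a ranking. It is not Riddell's actual model: the summary does not say how the features are combined, so the weighted sum of log-scaled features, the weights themselves, the `AuthorPage` type, and the function names are all assumptions made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class AuthorPage:
    """Features named in the summary; the field names are hypothetical."""
    name: str
    article_length: int            # size of the article, e.g. in characters
    article_age_days: int          # days since the article was created
    views_per_day: float           # estimated page views per day
    days_since_last_revision: int  # days since the most recent edit


def score(page: AuthorPage) -> float:
    """Assumed scoring rule: a weighted sum of log-scaled features.

    The weights are invented for this sketch. Longer, older, more-viewed
    articles score higher; articles nobody has edited recently score lower.
    """
    return (
        1.0 * math.log1p(page.article_length)
        + 0.5 * math.log1p(page.article_age_days)
        + 2.0 * math.log1p(page.views_per_day)
        - 0.5 * math.log1p(page.days_since_last_revision)
    )


def rank_authors(pages: list[AuthorPage]) -> dict[str, int]:
    """Return a mapping of author name to rank, where rank 1 is the highest score."""
    ordered = sorted(pages, key=score, reverse=True)
    return {page.name: position for position, page in enumerate(ordered, start=1)}
```

Restricting `pages` to authors who died in a given year before calling `rank_authors` would give the kind of per-year list the summary describes.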

 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 0) by Anonymous Coward on Tuesday November 18 2014, @10:39PM (#117433)

    The Sonny Bono Copyright Act doesn't extend copyright forever; it extends it for a finite time. I don't doubt that Disney will press for yet another extension, but for copyright not to last for "a limited time" would be plainly unconstitutional. But you know, 10,000 years is "a limited time".

    So what we actually have is that US works do enter the public domain; however, they are older works than those which enter the public domain in other parts of the world.

  • (Score: 0) by Anonymous Coward on Wednesday November 19 2014, @12:14AM (#117460)

    Nothing newer than 1923 has entered the public domain in the US, thanks to Sonny Bono. And if Disney et al. have anything to say about it, they'll just keep paying Congress to create laws that extend this further and further. Before 2018 I'm willing to bet that we'll see yet another copyright term extension act at the behest of these bastards. A copyright term that lasted 10,000 years might be argued as being practically unlimited though: 10,000 years is almost twice as long as all of recorded history! A copyright term of 100 years is nearly half the age of the United States as a country, and ought to be untenable on those grounds as well, if the government weren't so beholden to these vested interests.