https://web.law.duke.edu/cspd/publicdomainday/2020/
Here are some of the works that will be entering the public domain in 2020. (To find more material from 1924, you can visit the Catalogue of Copyright Entries.)
[...] Unfortunately, the fact that works from 1924 are legally available does not mean they are actually available. After 95 years, many of these works are already lost or literally disintegrating (as with old films and recordings), evidence of what long copyright terms do to the conservation of cultural artifacts. In fact, one of the items we feature below, Clark Gable's debut in White Man, apparently no longer exists. For the works that have survived, however, their long-awaited entry into the public domain is still something to celebrate. (Under the 56-year copyright term that existed until 1978, we would really have something to celebrate – works from 1963 would be entering the public domain in 2020!)
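The date arithmetic in the summary can be checked with a short sketch. It assumes a copyright term runs through the end of its final calendar year, so a work enters the public domain on the following January 1. The function name `public_domain_year` is made up for illustration.

```python
def public_domain_year(publication_year: int, term_years: int) -> int:
    """Return the year on whose January 1 a work enters the public domain.

    Assumption: the term runs through December 31 of its final year.
    """
    return publication_year + term_years + 1


# 95-year term: works from 1924 open up in 2020.
print(public_domain_year(1924, 95))
# 56-year term (the pre-1978 rule mentioned above): works from 1963 would open up in 2020.
print(public_domain_year(1963, 56))
```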
(Score: 4, Interesting) by ElizabethGreene on Thursday January 02 2020, @02:40AM (8 children)
So if I can scan books with a copyright date of 1924 or earlier, I can now legally upload and share them?
I'm ~10 miles from a 108 year old university library, so this is not a hypothetical question.
(Score: 5, Informative) by canopic jug on Thursday January 02 2020, @05:02AM (2 children)
Yes, though you should do it in coordination, or at least consultation, with Project Gutenberg or one of the other corresponding projects. They'll have a lot of advice about efficient methods.
Realize that the "easiest" way completely destroys the book and the library might not be so keen on that happening. Same for the next easiest way. Instead, you'll need some non-reflective glass plates and a frame which can hold the book open at a 90° angle while pressing the pages flat so they can be photographed with a digital camera.
Then once you have all the pages photographed, run them through OCR in bulk. You can then publish the photos above the dirty OCR inside HTML documents as you work through the tedious correction process to fix the OCR mistakes. An alternative to OCR is to send the digital copies to two separate impoverished regions for re-keying at very low wages. Then just do a diff of the two; presumably the text is good wherever the two agree, which leaves very little additional manual intervention.
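The "diff of the two" step might look roughly like this. It is a minimal sketch using Python's standard `difflib`, comparing word by word; the function name `disagreements` and the sample strings are invented for illustration, and a real project would probably work page by page on files rather than on in-memory strings.

```python
import difflib


def disagreements(key_a: str, key_b: str):
    """Return (word_index_in_a, words_from_a, words_from_b) for every
    span where the two independent keyings do not agree."""
    a, b = key_a.split(), key_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # replace, insert, or delete
            spans.append((i1, a[i1:i2], b[j1:j2]))
    return spans


# Hypothetical sample: the second typist mis-keyed one word.
first = "the quick brown fox"
second = "the quiek brown fox"
for index, from_a, from_b in disagreements(first, second):
    print(f"word {index}: {from_a} vs {from_b}")
```

Only the flagged spans need a human look. Note that this catches nothing where both typists made the same mistake, which is the "presumably" in the comment above.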
If you are going to do this with valuable manuscripts or handwritten books, then OCR is not relevant and you should, for other reasons, work in coordination with the preservation department. In that case, you would probably take fine-grained film photos first, then digitize those. The reason is that the negative, especially from a Hasselblad or similar large-format camera, has much higher resolution than any of today's digital cameras and can be kept on file and used to make research copies on demand for most use cases without further distressing the original artifact. That is a good idea because increased visibility will also mean increased demand. Lastly, if the scan just ends up being a shopping list for thieves, there is still a somewhat usable surrogate on file.
Money is not free speech. Elections should not be auctions.
(Score: 0) by Anonymous Coward on Thursday January 02 2020, @04:15PM
http://scantailor.org/ [scantailor.org] - using this to preprocess the page images greatly reduces the OCR errors afterwards, even with a free OCR tool such as Tesseract.
For best OCR results, the page images should be at 600 DPI or thereabouts, and in any case, at no less than 300 DPI.
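If you are photographing pages rather than using a flatbed, you can estimate the effective DPI from the pixel size of the cropped page image and the physical page size. A rough sketch, with made-up function names and the 300/600 thresholds taken from the figures above:

```python
def effective_dpi(pixel_width: int, pixel_height: int,
                  page_width_in: float, page_height_in: float) -> float:
    """Estimate dots per inch of a cropped page image.

    Takes the smaller of the horizontal and vertical figures, since the
    weaker axis limits OCR quality.
    """
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


def ocr_suitability(dpi: float) -> str:
    """Classify a DPI figure against the 300 and 600 DPI guidelines."""
    if dpi >= 600:
        return "good"
    if dpi >= 300:
        return "acceptable"
    return "too low"


# Hypothetical example: a 6 x 9 inch page cropped to 3600 x 5400 pixels.
dpi = effective_dpi(3600, 5400, 6.0, 9.0)
print(dpi, ocr_suitability(dpi))
```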
(Score: 2) by krishnoid on Friday January 03 2020, @09:39PM
It looks like Google does a lot of this [google.com]. They already have arrangements with some libraries to bring in expensive scanning equipment and digitize old and rare books; maybe they could hook up with yours too.
(Score: 2) by maxwell demon on Thursday January 02 2020, @02:41PM (3 children)
Yes. However, note that the owner of the physical book might place restrictions on what you can do with it, including not allowing you to scan it in the first place.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by mcgrew on Thursday January 02 2020, @06:23PM (2 children)
Not likely at a university or a library, who are probably scanning it themselves, anyway. I found a copy of Only Yesterday, required reading in a college history class I had at SIU, at the University of Virginia's web site.
mcgrewbooks.com mcgrew.info nooze.org
(Score: 2) by maxwell demon on Thursday January 02 2020, @09:47PM (1 child)
That depends on the state of the book. Scanning it the wrong way may break it. You probably don't have an expensive non-destructive book scanner.
(Score: 2) by mcgrew on Friday January 03 2020, @12:35PM
Your phone is a non-destructive book scanner, good enough for OCR software to convert the pages to text. The Vatican is scanning thousand-year-old documents.
(Score: 2) by mcgrew on Thursday January 02 2020, @06:21PM
Yes, you can, and if you find any good science fiction from then let me know and I'll post it on my book site.