SoylentNews is people

posted by chromas on Sunday September 20 2020, @08:08AM
from the double-team-supreme dept.

Wayback Machine and Cloudflare team up to archive more of the Web:

The Internet Archive and Cloudflare have teamed up to archive the content of websites that use Cloudflare's Always Online service, increasing the odds that users will be able to view a recent version of a website during outages. The partnership will increase the number of webpages scanned by the Internet Archive, making the organization's Wayback Machine more useful to Internet users in general.

"Websites that enable Cloudflare's Always Online service will now have their content automatically archived, and if by chance the original host is not available to Cloudflare, then the Internet Archive will step in to make sure the pages get through to users," said an announcement by Mark Graham, director of the Internet Archive's Wayback Machine.

[...] The Internet Archive integration is available to Cloudflare's free users but will only back up the website every 30 days. Cloudflare's paying customers will get more frequent backups, every 15 days for Pro users and every five days for Business and Enterprise users.
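
As a rough illustration of the schedule and fallback described above, here is a minimal Python sketch. The plan names and intervals come from the summary; the function names, the `origin_up` and `has_snapshot` flags, and the returned strings are hypothetical and do not reflect any actual Cloudflare or Internet Archive API.

```python
from datetime import date, timedelta

# Archive intervals in days, as quoted in the summary above.
ARCHIVE_INTERVAL_DAYS = {
    "free": 30,
    "pro": 15,
    "business": 5,
    "enterprise": 5,
}


def next_archive_date(plan: str, last_archived: date) -> date:
    """Return the date the next snapshot would be due for a given plan."""
    return last_archived + timedelta(days=ARCHIVE_INTERVAL_DAYS[plan.lower()])


def choose_source(origin_up: bool, has_snapshot: bool) -> str:
    """Hypothetical fallback: prefer the origin, else an archived copy."""
    if origin_up:
        return "origin"
    if has_snapshot:
        return "archive"
    return "error"
```

For example, under these assumptions a Pro site last archived on September 20 2020 would be due again 15 days later, and a visitor would only see the archived copy when the origin is unreachable and a snapshot exists.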



 
  • (Score: 3, Insightful) by Anonymous Coward on Sunday September 20 2020, @09:36AM (2 children)

    by Anonymous Coward on Sunday September 20 2020, @09:36AM (#1053884)

    Too bad JavaScript frameworks are going to destroy that. Fetching resources programmatically, dynamically updating pages, SPAs for everyone, etc. In many ways, we were just on the cusp of really being able to save huge swaths of the Internet. And then we decided to throw that ability away. And that doesn't even get into the whole paywall and closed-platform issues that abound in the most traversed areas of the web.

    • (Score: 4, Insightful) by Booga1 on Sunday September 20 2020, @11:01AM (1 child)

      by Booga1 (6333) on Sunday September 20 2020, @11:01AM (#1053892)

      What kills me is pages that autoload stuff only when you scroll or click. Sure, the base page loads fast, but then it hiccups and jostles again as you scroll down and it fetches things. Then you get to the point where everything's hidden behind a "read more" button where I'm thinking, "Uh, I made it this far, why the hell do I have to click this barrier?" Then it has to download the next part of the article to put in the box that opens up. It's craziness.
      Worse yet is pages that autoload wholly new URLs as you scroll. My local news site just added that monstrosity of a feature. Now you can't even get back to where you were from the page URL if you keep scrolling. How can you possibly archive something that never stops loading things and loads them dynamically based on what the current news cycle is?

      • (Score: 3, Insightful) by Anonymous Coward on Sunday September 20 2020, @02:59PM

        by Anonymous Coward on Sunday September 20 2020, @02:59PM (#1053947)

        > How can you possibly archive something that never stops loading things and loads them dynamically based on what the current news cycle is?

        Perhaps the cooperation between Cloudflare and IA will come up with a solution to this problem? The librarians that I know are very resourceful people...and the Internet Archive is full of smart people with that mindset.

        I don't give a lot to charity, but I do support the IA because:

        Progress, far from consisting in change, depends on retentiveness. When change is absolute there remains no being to improve and no direction is set for possible improvement: and when experience is not retained, as among savages, infancy is perpetual. Those who cannot remember the past are condemned to repeat it.

        https://en.wikiquote.org/wiki/George_Santayana [wikiquote.org]

  • (Score: 3, Interesting) by Rosco P. Coltrane on Sunday September 20 2020, @10:59AM (4 children)

    by Rosco P. Coltrane (4757) on Sunday September 20 2020, @10:59AM (#1053891)

    This company is even scarier than Google, because they're almost as overreaching, but they mostly stay out of the limelight.

    And now they can reach for your data in the past. Better start being careful about what you say or do online if you don't want them to tell on you at the merest NSL.

    • (Score: 4, Insightful) by Anonymous Coward on Sunday September 20 2020, @11:07AM

      by Anonymous Coward on Sunday September 20 2020, @11:07AM (#1053893)

      This is no worse than what CloudFlare can already do. Internet Archive is a public service.

    • (Score: 1, Interesting) by Anonymous Coward on Sunday September 20 2020, @05:45PM (2 children)

      by Anonymous Coward on Sunday September 20 2020, @05:45PM (#1054008)

      So, if Cloudflare is scary, how do you feel about Akamai...one of the pioneers (or the pioneer) in distributed web content?

      https://en.wikipedia.org/wiki/Akamai_Technologies [wikipedia.org]
      https://www.cdnplanet.com/compare/akamai/cloudflare/ [cdnplanet.com]

      • (Score: 0) by Anonymous Coward on Monday September 21 2020, @05:27AM (1 child)

        by Anonymous Coward on Monday September 21 2020, @05:27AM (#1054252)

        Akamai is where the Internet Archive got its initial batch of archived webpages and where its founder made the wealth he used to start archive.org.

        As you can tell, neither then nor now was he particularly fond of copyright laws. I've noticed that a lot with people who become wealthy. They are not often in favor of laws until those laws protect their most valued assets...

        • (Score: 0) by Anonymous Coward on Monday September 21 2020, @09:34AM

          by Anonymous Coward on Monday September 21 2020, @09:34AM (#1054313)

          > Akamai is where the Internet Archive got its initial batch of archived webpages and where its founder made the wealth he used to start archive.org.

          Not according to https://en.wikipedia.org/wiki/Brewster_Kahle [wikipedia.org] :

          After graduation, he joined Thinking Machines team, where he was the lead engineer on the company's main product, the Connection Machine, for six years (1983–1989).[citation needed] There, he and others developed the WAIS system, the first Internet distributed search and document retrieval system, a precursor to the World Wide Web. In 1992, he co-founded, with Bruce Gilliat, WAIS, Inc. (sold to AOL in 1995 for $15 million), and, in 1996, Alexa Internet[11] (sold to Amazon.com in 1999[12]). At the same time as he started Alexa, he founded the Internet Archive, which he continues to direct. In 2001, he implemented the Wayback Machine, which allows public access to the World Wide Web archive that the Internet Archive has been gathering since 1996. Kahle was inspired to create the Wayback Machine after visiting the offices of Alta Vista, where he was struck by the immensity of the task being undertaken and achieved: to store and index everything that was on the Web.

          Perhaps you have confused https://en.wikipedia.org/wiki/Alexa_Internet [wikipedia.org] with Akamai?
