SoylentNews is people

posted by janrinok on Saturday January 28 2023, @07:52PM

https://inventlikeanowner.com/blog/the-story-behind-asins-amazon-standard-identification-numbers/

During Amazon's earliest days (1994-1995), CTO Shel Kaphan and Software Engineer Paul (then) Barton-Davis had to write all the software needed to power Amazon.com on the day it offered its website to the world to sell books (official launch date was July 16, 1995). The book catalog was online, and it needed an index (well, it needed several indexes, but that's another story); specifically, it needed a unique key for each item in the catalog. Because the databases they were using to create the catalog were indexed by 10-character-long ISBN (International Standard Book Number), Shel and Paul decided to use ISBN as their key.

Unfortunately — and Shel was well aware of this very quickly, but of course by that time, it was too late — ISBNs are terribly abused in the United States. The company that issues ISBNs, Bowker, charges a lot of money for ISBNs (from the perspective of small publishers, anyway), and publishers don't necessarily read all the rules. Small publishers were re-using ISBNs, and they also took their range of ISBNs and numbered through the entire range, rather than respecting the rule that the final character is actually a checksum, and you can only iterate through some of the digits. (It's actually worse than just not using the last digit, but I'm not getting into that here.)

Shel very quickly removed all 'checksum software checks' (which would have made sure it was a legal ISBN), but Amazon was still stuck with a code base that stored the key value in 10-character strings, and which also stored those keys in other databases with similar constraints.
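For readers unfamiliar with the checksum rule mentioned above, here is a minimal sketch of the standard ISBN-10 check-character calculation. It is not Amazon's code; the function names are made up for illustration, and hyphen stripping is an assumption about input format.

```python
def isbn10_check_char(first_nine: str) -> str:
    """Return the check character for the first nine digits of an ISBN-10."""
    if len(first_nine) != 9 or not (first_nine.isascii() and first_nine.isdigit()):
        raise ValueError("expected exactly nine ASCII digits")
    # Weights run 10, 9, ..., 2 across the nine digits.
    total = sum(w * int(d) for w, d in zip(range(10, 1, -1), first_nine))
    # The check value makes the full weighted sum divisible by 11.
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(isbn: str) -> bool:
    """True if the string is a well-formed ISBN-10 with a correct check character."""
    isbn = isbn.replace("-", "").upper()
    if len(isbn) != 10:
        return False
    body, last = isbn[:9], isbn[9]
    if not (body.isascii() and body.isdigit()):
        return False
    return isbn10_check_char(body) == last
```

This shows why numbering straight through a block goes wrong: if `0306406152` is valid, then simply adding one to get `0306406153` changes the last character without recomputing it, so the result fails the check.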

Read on to see how the problem was finally resolved - but it wasn't as simple as you might first have thought...


This discussion was created by janrinok (52) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 3, Interesting) by Anonymous Coward on Saturday January 28 2023, @09:04PM (2 children)

    by Anonymous Coward on Saturday January 28 2023, @09:04PM (#1289121)

    One of our engineering reference books was mis-categorized; this was in the late 1990s. Not sure if I found out by looking at Amazon or perhaps Barnes & Noble Online. The newer book was listed as a paperback version of an earlier book, when in fact it was a supplement to the earlier book (totally different content). Btw, the ISBNs are completely different.

    At any rate, I asked how to fix it and the answer came back that it had to be fixed at the Library of Congress--that was the book catalog the seller was using.

    Turned out that dealing with Library of Congress, loc.gov, was very straightforward and a pleasant experience. Not sure how it works now, but back then publishers gave one copy of every book published to LoC. My email was acknowledged and then a couple of weeks later the librarians had done the search and updated their database. Using the traditional "you only get two of three" rating system the LoC was cheap (free) and good...but a little slow.

    Some time later (forgot how long, maybe a month?) the change rippled through the system and that book has been correctly categorized since then. Both books are still in print.

    • (Score: 3, Informative) by istartedi on Saturday January 28 2023, @09:48PM

      by istartedi (123) on Saturday January 28 2023, @09:48PM (#1289131) Journal

      That is awesome, the kind of citizen-government interaction that gives you the warm fuzzies. I'd like to add that back when I lived in the DC area, I had an actual LoC card and used it for some research into my ancestry. They were not only able to help me find a city that didn't exist on modern maps, but assisted me in printing out an enlarged version of the 19th century map that had it. I still have that print-out in a folder somewhere.

      --
      Appended to the end of comments you post. Max: 120 chars.
    • (Score: 3, Informative) by Freeman on Monday January 30 2023, @04:27PM

      by Freeman (732) on Monday January 30 2023, @04:27PM (#1289302) Journal

      This is a good write-up on the ins and outs of getting a book published, which goes over what role the LoC plays in this process.

      https://selfpublishingwithdale.com/index.php/2021/04/08/copyright-and-isbn-registration/ [selfpublishingwithdale.com]

      You don't have to do any of the steps to publish a book (as you can just self-publish, self-print, self-advertise, etc.). The biggest thing with getting your book in the LoC is that it provides proof that you published X book at X time. It also provides an easy way for libraries to get a record of your book (copy cataloging takes a few seconds to a few minutes), as opposed to having to originate a record (which can easily take hours of a cataloger's time).

      --
      Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
  • (Score: 5, Insightful) by darkfeline on Saturday January 28 2023, @11:31PM

    by darkfeline (1030) on Saturday January 28 2023, @11:31PM (#1289140) Homepage

    I'm sure anyone who has had to build a database learned this lesson the hard way. Always keep your own ID for all records, no matter how "reliable" an external ID source may be. You can mark the external ID column as unique, but always key off of your own ID. Even if the external ID never causes issues, you may need to create your own records to handle specific use cases.

    It's the data equivalent of wrapping dependency library APIs.

    --
    Join the SDF Public Access UNIX System today!
  • (Score: 1) by shrewdsheep on Sunday January 29 2023, @10:11AM (2 children)

    by shrewdsheep (5215) on Sunday January 29 2023, @10:11AM (#1289167)

    What I didn't get was how ISBNs can be embedded into ASINs if the first letter is fixed (as "B" or "A"), thus only leaving 9 letters for the ISBN.

    • (Score: 1, Informative) by Anonymous Coward on Sunday January 29 2023, @09:36PM

      by Anonymous Coward on Sunday January 29 2023, @09:36PM (#1289210)

      I think what you are getting at is that tfl wasn't very well written, which was also my impression after reading it. But then it's an Amazon programmer, so my expectations were pretty low going in.

    • (Score: 1, Informative) by Anonymous Coward on Tuesday January 31 2023, @03:26AM

      by Anonymous Coward on Tuesday January 31 2023, @03:26AM (#1289425)
      Maybe there'll be a future story on how they discovered that problem and fixed it. 😉
  • (Score: 2) by mcgrew on Monday January 30 2023, @06:20PM

    by mcgrew (701) <publish@mcgrewbooks.com> on Monday January 30 2023, @06:20PM (#1289326) Homepage Journal

    The more ISBNs you buy, the cheaper they are, to an insane degree, and you need the ISBN to get in a bookstore. IIRC from my last purchase of ten, a single number is $125, ten for $250, 100 for $1000. Meanwhile in non-Fascist yet still capitalist Canada the ISBN is provided when you register the copyright.

    --
    mcgrewbooks.com mcgrew.info nooze.org