
SoylentNews is people

posted by Fnord666 on Friday July 19 2019, @05:08PM
from the interesting dept.

Over the years I have viewed many a video on YouTube. I quickly noticed an "ID" string that appeared in each video URL. Here's an example: https://www.youtube.com/watch?v=ShvnDSgjfXw -- see that string "ShvnDSgjfXw"? What characters are permitted? How long is it?

Along the way, I came upon an amazingly useful utility: youtube-dl. I accidentally discovered that it will happily download a YouTube video given just the Video ID. (Don't let the name of the utility mislead you; it seems to work fine with Instagram, Twitter, SoundCloud... it's amazing!)

Now with my curiosity suitably piqued, I started a genuine search for what the parameters were that defined a valid YouTube Video ID. This question on "Web Applications Stack Exchange" was most helpful. Especially this response.

It appears that the Video ID (and the Channel ID) are modified base64 encodings of 64-bit (and 128-bit) integers. The primary change is that standard base64 encoding produces two characters that are verboten in URLs. A generated "+" is replaced with "-" and a generated "/" is replaced with "_".

There is no official documentation guaranteeing that the IDs will always be 11 or 22 characters long, but empirical evidence suggests that is the current de facto standard.

There is even mention of "the maximally-constrained regular expression (RegEx) for the videoId" being:

[0-9A-Za-z_-]{10}[048AEIMQUYcgkosw]
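The final character class in that expression follows from bit arithmetic: 11 characters at 6 bits each carry 66 bits, so if the ID encodes a 64-bit integer, the last two bits are always zero and the last character can only be every fourth symbol of the alphabet. Here is a minimal Python sketch, assuming the URL-safe alphabet order A-Z, a-z, 0-9, "-", "_". The names ALPHABET, LAST_CHARS, VIDEO_ID_RE and is_plausible_video_id are made up for illustration, and a match only means the string has the right shape, not that the video exists.

```python
import re
import string

# URL-safe base64 alphabet (assumed order: A-Z, a-z, 0-9, "-", "_").
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"

# Symbols whose 6-bit value has its two low bits clear: every fourth one.
LAST_CHARS = ALPHABET[::4]

# The expression quoted above, compiled unchanged.
VIDEO_ID_RE = re.compile(r"[0-9A-Za-z_-]{10}[048AEIMQUYcgkosw]")


def is_plausible_video_id(candidate: str) -> bool:
    """Return True if candidate has the shape of an 11-character Video ID."""
    return VIDEO_ID_RE.fullmatch(candidate) is not None


if __name__ == "__main__":
    # The derived set of last characters should equal the one in the RegEx.
    print(sorted(LAST_CHARS) == sorted("048AEIMQUYcgkosw"))
    print(is_plausible_video_id("ShvnDSgjfXw"))
```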

Things get even more interesting if you are using Windows. Under NTFS, file names default to being case-preserving, but case-insensitive. Say I create a file called "Foo.txt" and then get a directory listing. Sure enough, I see: "Foo.txt" displayed. The fun comes if I do "DIR foo.txt" or "DIR FOO.TXT" or any other variation... they all find the same file: "Foo.txt"; this is counter to Unix where filenames are case-sensitive and each of those variations would be treated as separate and distinct files. Though it is possible to make an NTFS volume case-sensitive, it is not for the faint of heart! The upshot is that two Video IDs that differ only in letter case would collide if used as file names on such a volume.

One could, therefore, reverse-engineer the integer that produced the Video ID and use that in addition (or for the adventuresome: instead of) the Video ID.
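As a rough sketch of that idea in Python: pad the 11-character ID to a multiple of four characters, decode it with the standard library's URL-safe base64 functions, and read the resulting 8 bytes as an integer. The big-endian byte order and the function names video_id_to_int and int_to_video_id are my own assumptions for illustration; nothing in the linked discussion says how YouTube itself interprets the bytes.

```python
import base64


def video_id_to_int(video_id: str) -> int:
    """Decode an 11-character Video ID into a 64-bit integer.

    Assumption: the 8 decoded bytes are read as a big-endian unsigned integer.
    """
    if len(video_id) != 11:
        raise ValueError("expected an 11-character Video ID")
    # 11 characters need one "=" to reach a multiple of four.
    raw = base64.urlsafe_b64decode(video_id + "=")
    return int.from_bytes(raw, "big")


def int_to_video_id(value: int) -> str:
    """Encode a 64-bit integer back into an 11-character Video ID."""
    raw = value.to_bytes(8, "big")  # raises OverflowError if out of range
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


if __name__ == "__main__":
    number = video_id_to_int("ShvnDSgjfXw")
    print(number)
    print(int_to_video_id(number))  # round-trips to the original ID
```

A decimal or hexadecimal rendering of that integer has no upper/lower-case ambiguity, so it could serve as a file name on a case-insensitive volume, at the cost of being longer than the ID itself.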

The whole discussion was well worth the read and is highly recommended for anyone who would like more information on where the ID format came from and how it came about.


Original Submission

 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Interesting) by DannyB on Friday July 19 2019, @06:34PM (2 children)

    by DannyB (5839) Subscriber Badge on Friday July 19 2019, @06:34PM (#869100) Journal

    Windows has screwball file names.

    10. Create a new folder.
    20. Rename the folder to "God Mode.{ED7BA470-8E54-465E-825C-99712043E01C}".
    30. Folder becomes a new control panel shortcut. One that you didn't have before.
    40. Profit! (or cause havoc)

    --
    People today are educated enough to repeat what they are taught but not to question what they are taught.
  • (Score: 0) by Anonymous Coward on Friday July 19 2019, @07:02PM (1 child)

    by Anonymous Coward on Friday July 19 2019, @07:02PM (#869118)

    God Mode.{ED7BA470-8E54-465E-825C-99712043E01C}

    Well obviously... I think I prefer case-sensitive file systems though. I've also been encoding integers as base62 for use in web URLs for about 15 years now, extended to the file names for a flat HTML caching system that would be impossible to manipulate using a DOS shell.

    • (Score: 2) by DannyB on Monday July 22 2019, @02:05PM

      by DannyB (5839) Subscriber Badge on Monday July 22 2019, @02:05PM (#869948) Journal

      Why base62 instead of base64? Or did you mean base32?
