Journal by turgid

Subtitle: Angry Old Man Shakes Fist and Shouts at the Sky/Get off my Lawn

Alas I am not young enough to know everything. Fortunately I am surrounded at work by people who are so I am not completely lost.

We had a very confident young hotshot who left some time ago for a very well-paid job "doing AI." He knew much. He knew that Bitbucket was the way to go. And so we adopted Bitbucket, and we pay a subscription.

Bitbucket is pretty cool. It's very similar to GitLab. In a previous life I set up and ran my own GitLab server and had my own continuous integration pipeline. I really liked using it.

Now to the present. I have been doing PHB duties; then I was given emergency bug fixes on Critical Projects(TM) and all sorts of other stuff; and, because reasons, I am writing code again for Critical Projects(TM) with tight deadlines, meanwhile trying to do all sorts of other things, including teaching the young ones about C (everything's Python nowadays).

We had a crazy head of projects from the headless chicken school of management, and some months ago I was given a fortnight to write a suite of command line utilities to process some data in a pipeline from the latest Critical Project(TM). Specifications? Requirements? A rough idea of what might be in the system? Ha! Fortunately the crazy head of projects got a new job and left.

I wrote this code, in C, from scratch, on my own, and in four days flat I had three working command line utilities, written using test-driven development (TDD), with an additional layer of automated tests run by shell scripts, all at the beck and call of make. I cheated and wrote some scripts to help me write the code.
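
For the curious, the shell-script layer is nothing clever. Each test is roughly the sketch below (the tool and file names are made up, the real ones are duller), and make runs the lot from a check-style target.

    #!/bin/sh
    # Regression test sketch: run the tool on a known input and compare the
    # output with a blessed copy. Tool and file names are invented.
    set -e
    ./mytool --input tests/sample.dat > /tmp/mytool.actual
    diff -u tests/sample.expected /tmp/mytool.actual
    echo "PASS: mytool sample"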

As you can imagine, these utilities are pretty small. We gave them to the new guy to finish off. Six weeks and lots of hand-holding later, I took them back to fix.

However, we have this "continuous integration" setup based on Bitbucket. It's awfully like GitLab, which I used some years ago, so there are no great surprises.

Now we come to the super fun part. We build locally, not on Bitbucket's cloud, which is good. The strange thing is that since I got old, Docker has come along.

The young hotshot who knew everything decided that we needed to do all our builds in these Docker containers. Why?

A Docker container is just one of these LXC-style container things: a form of operating-system-level virtualisation (kernel namespaces and cgroups), somewhere between chroot jails and a proper VM, where the kernel presents an interface that looks like a whole system on its own (see Solaris 10 Containers). That means you can run "arbitrary" Linux userlands (with their own hostnames and IP addresses) on top of a single kernel. Or can you? Doesn't the userland have to be compatible with (integrated with) the kernel version and build that the host is running?
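
You can see the kernel-sharing part for yourself in about ten seconds, assuming Docker is installed (alpine:3.18 below is just an example image): the container can have its own hostname, but uname inside it reports the host's kernel, because that is the only kernel there is.

    # A container is not a VM: it shares the host's kernel even though it has
    # its own hostname and userland. alpine:3.18 is just an example image.
    uname -r
    docker run --rm --hostname pretend-box alpine:3.18 sh -c 'hostname; uname -r'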

This is a cool feature. You can have many lightweight pretend virtual hosts on a single machine without having a hypervisor. You can also use it to have a user-land environment with a specific configuration nailed down (set of tools, applications, libraries, user accounts). It might be a great way to set up a controlled build environment for software.
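
Done with a little discipline, that's roughly one line: pin a single toolchain image and run every build inside it, so everyone compiles with the same compiler. Something like the sketch below, where the gcc:12.2.0 tag and the make targets are examples rather than what we actually run.

    # Build inside a pinned toolchain image instead of whatever happens to be
    # on the laptop. The image tag and make targets are illustrative only.
    docker run --rm -v "$PWD:/src" -w /src gcc:12.2.0 make clean all check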

For the last hundred years or so, anyone who knows anything about making stuff (engineering) has understood that you need to eliminate as much variation from your process as possible, for quality, reliability and predictability.

So here's the thing - do you think our young hotshot and his successors have this sorted out? Think again!

I needed to set up some build pipelines and I was shocked. Apparently we are using a plethora of diverse Docker containers from the public Internet for building various software. But that's OK, they're cached locally...

Never mind that this stuff has to work properly.
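
If anyone cared, they could at least find out what a "cached" tag really resolves to and pin the pipelines to that digest instead of a moving target. Roughly this, where alpine:3.18 is an example and not our real base image:

    # Ask Docker which exact image a floating tag currently points at...
    docker pull alpine:3.18
    docker image inspect --format '{{index .RepoDigests 0}}' alpine:3.18
    # ...then refer to alpine@sha256:<the digest printed above> in the
    # pipeline instead of the tag.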

Everyone in the team is developing their code on a different configuration. We have people using WSL (seriously) and others running various versions of Ubuntu in VMs. So we have these build pipelines running things like Alpine (because the image is small) which may or may not be anywhere near WSL or Ubuntu versions X to Y.

It gets better. Everything we do, every piece of software we build has its own Docker container. And then it goes onto a VM which gets "spun up" in the Microsoft(R) Azure(TM) cloud.

My little command line utilities, a few hundred kilobytes each when compiled, each get built in their own Docker container. That's hundreds and hundreds and hundreds of megabytes of random junk to compile a few thousand lines of C. When I type make on my command line (in my Ubuntu VM), each one takes under a second to build and run against the unit tests, rebuild, and run the automated regression tests.

The final thing, the one that takes the cake, is that I have to release these tools to another department (which will then put them in a "pipeline" on "the cloud"), and after about a year of having this amazing set-up for continuous integration, the young folk can't tell me (and they haven't figured it out yet) how to get the built binaries out of the build system.

Because the builds are done in Docker containers, the build artifacts are in the containers and the container images are deleted at the end of the build. So tell it not to delete the image? Put a step in the build script to copy the artifacts out onto a real disk volume?
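
Neither suggestion is rocket science. A rough sketch of both, with made-up image and binary names:

    # Option 1: create a container from the built image and copy the binaries out.
    mkdir -p artifacts
    docker create --name extract my-build-image
    docker cp extract:/usr/local/bin/mytool artifacts/
    docker rm extract

    # Option 2: mount a host directory into the build and have it drop the
    # output there (assuming the image has a shell and cp).
    docker run --rm -v "$PWD/artifacts:/out" my-build-image cp /usr/local/bin/mytool /out/

And if the sticking point is the Bitbucket layer rather than Docker itself, Pipelines has an artifacts setting for passing files from one step to the next, as far as I remember.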

"We don't know how."

There's a reason human beings haven't set foot on the Moon in over 50 years, and the way things are going our own stupidity will be the end of us. Mark my words.

  • (Score: 2) by RS3 on Wednesday June 07, @05:27PM (3 children)

    I've never used it, but I believe stunnel [archlinux.org] will work for any TCP port.

    Also Red Hat's docs / how-to: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/security_guide/sec-configuring_stunnel_wrapper [redhat.com]
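
    A minimal wrapper is only a few lines of config. Something like this sketch, where the cert path, ports and service name are placeholders:

        # Minimal stunnel wrapper sketch; cert path, ports and service name
        # are placeholders.
        #
        # /etc/stunnel/wrap.conf:
        #   cert = /etc/stunnel/stunnel.pem
        #   [wrapped-service]
        #   accept  = 0.0.0.0:9999
        #   connect = 127.0.0.1:8080
        stunnel /etc/stunnel/wrap.conf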

    The approach I usually use for these kinds of things is to limit the IP access range, so the containerized thing would only allow NFS mounts from local IP numbers, or just one IP (10.x.x.x, 192.168.x.x, etc.). If an Oscar loser gets into the host or one of the VMs, they'll wreak havoc, break statutes, statues, but it will be exciting and make for good news stories.
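
    For NFS specifically it's a one-line export plus a reload; the path and subnet here are just examples:

        # Limit the export to one private subnet (example path and range).
        echo '/srv/build-artifacts 192.168.1.0/24(ro,root_squash)' >> /etc/exports
        exportfs -ra    # re-export everything after editing /etc/exports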

    Young people usually learn all about finger on their own, especially when it's against protocol.
     

  • (Score: 2) by turgid on Wednesday June 07, @09:21PM

    Excellent, more new toys to play with!

  • (Score: 1, Interesting) by Anonymous Coward on Thursday June 08, @06:24AM (1 child)

    I like your idea, but there is a risk of mounting at the wrong level. The guests should be as similar as possible, which means SPMD is usually the right choice. It can be done on the guest, but then you can end up duplicating complexity unnecessarily. Also, when you mount individual shares in each guest it can be tricky to prevent them from clobbering each other, especially if you have to worry about a container misbehaving. Doing it on the host can also simplify key/secret management to a degree. Those considerations mean that using a single share on the host, and then mapping a subdirectory of it (with the proper name) to the same well-known location in each guest, is one approach that used to be common. However, I do believe that Docker/Moby (and some other container/VM systems) now push that complexity down automatically using "automatic arguments," so that concern may be moot in their situation.
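
    Concretely, that single-share pattern is just one bind mount per guest, each pointing its own subdirectory at the same well-known path inside the container; the paths and image names here are made up:

        # One shared tree on the host; each guest sees only its own
        # subdirectory, always mounted at the same place inside.
        docker run --rm -v /srv/shared/guest-a:/data build-image-a make all
        docker run --rm -v /srv/shared/guest-b:/data build-image-b make all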

    • (Score: 1, Informative) by Anonymous Coward on Sunday June 11, @05:04AM

      You all may not know what SPMD is. In this context, it stands for Single Program Multiple Data. It is an approach to parallelism where the "programs" are all the same; only something in their inputs differs. There are a number of ways to accomplish that. The most commonly used implementation sends a different message to each instance. However, most people start by changing something about the environment directly, or by building on an understanding of fork(), since those are the easiest to visualize.
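
      A shell caricature of the "same program, different data" idea would be something like this, with invented program and file names:

          # SPMD sketch: every worker runs the same program; only the input differs.
          for chunk in chunk-0.dat chunk-1.dat chunk-2.dat; do
              ./process "$chunk" > "result-${chunk%.dat}.txt" &
          done
          wait    # the workers run in parallel; collect them all here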