Microsoft's Windows Subsystem for Linux is coming to all Windows 10 users (archive):
You won't have to be a tester to try Windows 10's new, built-in Linux kernel in the near future. Microsoft has confirmed that Windows Subsystem for Linux 2 will be widely available when Windows 10 version 2004 arrives. You'll have to install it manually for a "few months" until an update adds automatic installs and updates, but that's a small price to pay if you want Linux and Windows to coexist in peace and harmony. It'll be easier to set up, at least -- the kernel will now be delivered through Windows Update instead of forcing you to install an entire Windows image.
Embrace, Extend... Excite!
Windows blog post.
Previously: Windows 10 Will Soon Ship with a Full, Open Source, GPLed Linux Kernel
> I did some fairly serious Cygwin work in the 2003 timeframe, and I was really impressed with how "close to the metal" it was, I'd get 98%+ performance out of a Cygwin application as compared to the same code running on a dedicated Linux shell
I'm going to assert that your workload did not involve forking. We had an application stack that ran on Solaris and Linux. A decision came down that there should be a Windows port, and to get it done with minimal fuss, some parts would be properly ported, while the supporting scripts and less runtime-critical bits would run under Cygwin. Our experience was that Cygwin was at least 100 times slower when forking processes. That has a lot to do with Windows itself being slower than shit at spawning processes -- which was Microsoft's motivation for the shit ton of Git patches they dropped that allow a bunch of operations to be performed without a zillion forks. Forks on Linux are about as fast as threads on Windows; forks on Windows are [almost] as slow as swapping floppy disks.
Windows doesn't support fork() or exec() natively. To emulate the behavior of fork(), Cygwin has to jump through all sorts of hoops. It is a surprisingly complex operation, and the context switches, mutexes, etc. produce a lot of overhead. Then if you call exec(), you trigger another massive behavior-emulation routine.
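You can see the cost difference directly with a tiny microbenchmark. This is a hedged sketch, not anything from the original discussion: the iteration count is arbitrary, and it assumes a POSIX system where os.fork() exists. Run the same loop natively on Linux and under Cygwin and compare the per-fork time.

```python
# Rough fork/wait microbenchmark (assumes a POSIX system).
# On native Linux each cycle is typically well under a millisecond;
# under Cygwin's fork emulation the same loop is orders of magnitude slower.
import os
import time

N = 50  # arbitrary iteration count for illustration

start = time.perf_counter()
for _ in range(N):
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping interpreter cleanup.
        os._exit(0)
    os.waitpid(pid, 0)
elapsed = time.perf_counter() - start

print(f"{N} fork/wait cycles in {elapsed:.4f}s "
      f"({elapsed / N * 1e3:.3f} ms per fork)")
```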
Yep, no forks in my little Cygwin toys.
I did a similar exercise just last month -- ported a C++/Qt app to Python/PyQt. The first part of the port went swimmingly, about 96% of the original performance. Then I got to the "core" of what the C++ was doing: image pixel value manipulations, and Python tied itself in a knot: something like a 700x slowdown.
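That kind of slowdown is what you'd expect when a per-pixel inner loop runs through the interpreter instead of compiled code. A minimal illustrative sketch (my own toy example, not the app from the comment): the same brightness clamp done with a pure-Python loop and with NumPy.

```python
# Same operation two ways: per-pixel Python loop vs. vectorized NumPy.
# The loop pays interpreter overhead on every element access; the NumPy
# version runs the whole operation in compiled C.
import time
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)

def brighten_loop(a, delta):
    # Pure-Python nested loop over every pixel.
    out = a.tolist()
    for row in out:
        for x in range(len(row)):
            row[x] = min(row[x] + delta, 255)
    return np.array(out, dtype=np.uint8)

def brighten_numpy(a, delta):
    # Widen to avoid uint8 overflow, add, clamp, narrow back.
    return np.minimum(a.astype(np.int16) + delta, 255).astype(np.uint8)

t0 = time.perf_counter(); slow = brighten_loop(img, 10); t1 = time.perf_counter()
fast = brighten_numpy(img, 10); t2 = time.perf_counter()
print(f"loop: {t1 - t0:.5f}s  numpy: {t2 - t1:.5f}s")
```

Both produce identical results; only the time differs, and the gap grows with image size.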
> Then I got to the "core" of what the C++ was doing: image pixel value manipulations, and Python tied itself in a knot: something like a 700x slowdown.
Did you try having Pillow, NumPy, or another image or signal processing library do the inner loops?
Or was the nature of the problem such that it would require so many calls in and out of the library that the overhead of marshaling arguments between the Python and C environments would dominate runtime? Because I ran into a Python-to-C call rate bottleneck a few years ago when I was trying to get Pillow and PyPy to work together. I wanted to take the sum of each 16x16-pixel area in a difference image, and in Pillow, that requires making a cropped copy of each 16x16-pixel image. In PyPy, calls to extensions that use the traditional Python C extension API [blogspot.com] or ctypes are slow, whereas calls to extensions that use CFFI [readthedocs.io] (the Python port of LuaJIT FFI) are fast. Guess what Pillow wasn't using.
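For what it's worth, the per-block crop calls can be avoided entirely by keeping the whole operation inside NumPy: one reshape splits each axis into (block index, within-block offset), and a single sum collapses the within-block axes. A sketch with a made-up stand-in for the difference image:

```python
# Sum every 16x16 block of an image with one reshape + sum, instead of
# h/16 * w/16 separate crop() calls crossing the Python/C boundary.
import numpy as np

B = 16
diff = np.arange(64 * 64, dtype=np.int64).reshape(64, 64)  # stand-in difference image

h, w = diff.shape
assert h % B == 0 and w % B == 0

# Axes become (block_row, row_in_block, block_col, col_in_block);
# summing axes 1 and 3 yields one sum per 16x16 block.
block_sums = diff.reshape(h // B, B, w // B, B).sum(axis=(1, 3))

print(block_sums.shape)  # one entry per block
```

Since the reshape is just a view, no pixel data is copied until the sum runs, so the call count from Python is constant regardless of how many blocks there are.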
The real trick is learning time. I have a half-dozen rather straightforward problem statements to implement, and it's obvious (to me) how to implement them all in Qt/C++, but they constitute a pretty steep learning curve for me in Pillow/NumPy/OpenCV/what-have-you. Convert RGB to HSV or HSL, run statistics on the color channel values at each pixel location (things like mean, median, mode), make a transform which maps one set of images to have identical color-channel histograms to another set of images at each pixel location, copy-paste arbitrary ellipses from one image into another with alpha-blend margins, etc. etc. When the library has a function pre-built (like the RGB to HSL transform) then, brilliant, say the magic words and it just happens.
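The first item on that list is one of the pre-built cases: Pillow's Image.convert accepts an "HSV" mode, and NumPy handles the per-channel statistics. A hedged sketch under those assumptions (the random test image is mine, and I've left out mode, which has no one-line NumPy equivalent):

```python
# RGB -> HSV via Pillow, then per-channel statistics via NumPy.
import numpy as np
from PIL import Image

rng = np.random.default_rng(1)
pixels = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
rgb = Image.fromarray(pixels, "RGB")

# Pillow ships an RGB->HSV conversion as a mode change.
hsv = np.asarray(rgb.convert("HSV"))  # shape (32, 32, 3), dtype uint8

for i, name in enumerate(("hue", "saturation", "value")):
    chan = hsv[..., i].astype(np.float64)
    print(f"{name}: mean={chan.mean():.1f} median={np.median(chan):.1f}")
```

The histogram-matching item is also closer to "say the magic words" than it looks: scikit-image ships it as skimage.exposure.match_histograms, though discovering that is exactly the learning-time problem described above.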
My greatest frustration is when a library doesn't do something, because then the documentation tends to be silent on that point (hard to confirm a negative....), and when a library/language can't be made to do something in a practical manner, the documentation tends to be doubly silent on that point.