A journal entry that simply restates a comment I just wrote.
Memory closely integrated with processors at the chip level makes sense. You would upgrade memory and processing power together.
Another thing I think will eventually happen, but that will be controversial.
Hardware assisted GC
Note that all modern languages of the last 2 freaking decades have garbage collection. Remember "lisp machines" from the 1980's? Like Symbolics? Their systems didn't execute Lisp especially fast, but what they did provide was hardware-level assistance for GC, which made GC amazingly fast.
I look at the amazing things the JVM (Java Virtual Machine) has done with GC. If only the JVM's GC could benefit all other languages (Python, JavaScript, Go, Lisps, etc). Of course, those languages could use the JVM as a runtime. And GraalVM _might_ make something like that happen, where lots of different languages run in the same runtime, can transparently call each other's functions and classes, and share a common set of underlying data types. Red Hat's Shenandoah and Oracle's open source ZGC are amazing garbage collector technology: terabytes of memory with 1 ms GC pause times. Now imagine if you had hardware assistance for GC. (BTW, why is Red Hat investing so much into Java development? I thought they were a Linux company? Could Red Hat, which is a publicly traded company, have an economic reason, i.e., Java is making them lots of money?)
Rationale: GC is an economic reality. Ignore the whining of the C programmers in the peanut gallery for a moment. They'll jump up and down and accuse other professionals of not knowing how to manage memory. Ignore it. Why do we use high level languages (like C) instead of assembly language? Answer: human productivity! Our code would be so much more efficient if we wrote EVERYTHING, including this SN board, directly in assembly language!!! So why don't we??? Because, as C programmers are simply unwilling to admit, the economic reality is that programmers are vastly more productive in higher and ever higher level languages. Sure, there is an efficiency cost to this. But we're optimizing for dollars, not for bytes and CPU cycles. Hardware is cheap; developer time is expensive.
Slight aside: ARM processors already have some hardware provision for executing JVM bytecodes (gasp! omg!).
I'm surprised that modern Intel or AMD designs haven't introduced some hardware assistance for GC.
Symbolics hardware, IIRC, had extra bits in each memory word (36-bit words, I think) to "tag" the type of information in every word. Then a way to efficiently find all words that happened to be a "pointer", a way to tag all words that were "reachable" or "marked" from the root set, etc.
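To make the tag-bit idea concrete, here is a toy Python sketch. The 36-bit word size comes from the description above; the tag layout itself is invented for illustration. The point is that finding every pointer word becomes a single linear scan, which hardware could do very fast:

```python
# Toy model of tagged memory words. The 36-bit word size matches the
# Symbolics description above; the tag layout itself is made up.
POINTER_TAG = 1 << 36  # imaginary tag bit sitting above a 36-bit word

def tag_pointer(addr):
    """Mark a word as containing a pointer."""
    return addr | POINTER_TAG

def is_pointer(word):
    """Hardware could test this bit without decoding the word."""
    return bool(word & POINTER_TAG)

# A tiny "heap": two pointer words mixed with two plain data words.
memory = [tag_pointer(0o100), 42, tag_pointer(0o200), 7]

# The GC's pointer-finding pass becomes a single linear scan.
pointers = [w & ~POINTER_TAG for w in memory if is_pointer(w)]
print(pointers)  # [64, 128]
```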
Maybe this can happen if memory and processing elements become highly integrated and interconnected. Hardware design will follow the money just as programming languages and technology stacks do.
Others will believe that system design will stand still to conform to a romantic idealism that was the major economic reality once upon a time.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Nobody's crystal ball is perfect. But I did expect in the early 90's that most new languages would start having GC, and that did begin to happen about 2000.
Senile Senior software developers look at the business case beyond how personally amusing all this fun technology is.
(Score: 0) by Anonymous Coward on Wednesday February 26 2020, @04:09PM (3 children)
I heard that Jython to run python in a JVM exists. Is it usable and performant? I never tried it.
(Score: 2) by DannyB on Wednesday February 26 2020, @04:27PM (1 child)
I'm thinking about languages that compile to JVM bytecode.
According to this [wikipedia.org], "Jython compiles Python source code to Java bytecode (an intermediate language) either on demand or statically. "
I assume they mean JVM bytecode. Not sure what they mean by "either on demand or statically".
I've never tried it either. Sorry I don't have a better answer for you.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 1, Interesting) by Anonymous Coward on Wednesday February 26 2020, @06:57PM
It also seems they are not making the Python 3 move.
(Score: 2) by krishnoid on Wednesday February 26 2020, @10:38PM
The bottom of the jython.org homepage shows ~10 reasonably-sized projects that use Jython, presumably suitable for production use. It's been around for 20 years, so if they haven't gotten it right against two slowly-moving targets ... ?
(Score: 1) by khallow on Wednesday February 26 2020, @04:21PM (5 children)
How can you speed up garbage collection via hardware? What sort of tasks would you target, if you were to design such a beast?
(Score: 2) by DannyB on Wednesday February 26 2020, @04:42PM
The only real detail I know about Symbolics hardware is that every word of memory had tag bits. I was told in the '80s, by someone who had used both Symbolics machines and Lisp on conventional equipment (at a previous job on an unnamed Boeing military project), that the big advantage of Symbolics wasn't general execution speed but GC performance.
Now at that time, there probably wasn't the decades of GC research that there is today. So maybe GC hardware is unnecessary?
How to speed up GC? I hinted at it in the post. If you had tag bits on words, the ability to (somehow) rapidly "locate" pointer words and to "mark" words reachable from the root set would be a huge performance boost for GC.
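As a sketch of what that mark phase actually does (a toy Python model with invented object names, not Symbolics specifics):

```python
# Toy mark phase: each object maps to the pointers it contains.
# Tag-bit hardware would accelerate exactly this pointer-finding loop.
heap = {
    "root": ["a", "b"],  # the root set points at a and b
    "a": ["c"],
    "b": [],
    "c": [],
    "orphan": ["a"],     # holds a pointer, but nothing points at it
}

def mark(heap, root):
    """Mark every object reachable from the root set."""
    marked, stack = set(), [root]
    while stack:
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            stack.extend(heap[obj])
    return marked

reachable = mark(heap, "root")
print(sorted(reachable))  # ['a', 'b', 'c', 'root'] -- "orphan" is garbage
```

Everything not marked at the end ("orphan" here) can be swept or compacted away.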
A few years ago, I read about Azul's "Zing" GC. It is a JVM for Linux only, and only for very specific models of Intel processors. It uses the hardware's ability to indicate which pages have been written to, so it knows that marked objects in unmodified pages haven't been changed.
Zing predates the now current ZGC (Oracle) and Shenandoah (Red Hat), which are similarly amazingly impressive, and both open source and part of OpenJDK.
I did buy a Garbage Collection textbook [amazon.com] in the late 90's for my own amusement. I'm sure it is out of date by now with all the work that has been done.
So your questions. How? I don't know the details, nor whether they are currently relevant.
What sort of tasks? Anything that uses GC, which is a lot of software written in a lot of modern languages. We're talking about everything written in at the very least: Java, JavaScript, Python, C#, any Lisp, and many others.
(Score: 3, Interesting) by DannyB on Wednesday February 26 2020, @04:45PM (3 children)
I would also add:
If anyone knows anything about Symbolics or other hardware-assisted GC technology, it is a topic I would find interesting to hear about.
(Score: 2) by krishnoid on Wednesday February 26 2020, @10:41PM (2 children)
You could ask their current incarnation [symbolics-dks.com]. I bet they have old architectural manuals and stuff that they could make available for the enthusiast.
(Score: 0) by Anonymous Coward on Thursday February 27 2020, @04:17AM
Before starting Symbolics (and also Lisp Machine Inc, LMI), the core group of hackers/wizards were at the MIT AI Lab, where they had a PDP-10 (or two?). This was also 36 bit architecture. Note the short reference to GC here, https://en.wikipedia.org/wiki/PDP-10#Registers [wikipedia.org]
Perhaps this will lead you back to something interesting? RMS hung out at LMI, some of his writings might be useful as well.
(Score: 2) by DannyB on Thursday February 27 2020, @03:12PM
It is a topic I find interesting, because I expected the rise of GC since about 1990. And then watched it happen. And watched the C programmers say, basically, that the entire industry is too stupid for going this direction.
(Score: 2) by takyon on Wednesday February 26 2020, @05:02PM (9 children)
Doesn't plentiful memory lessen the need for garbage collection?
A single large pool of universal memory may also change the game.
The ReRAM included on 3DSoC [darpa.mil] is non-volatile, but might not be particularly dense (they are targeting 4 GB to start). On the other hand, its characteristics and fabrication may allow it to eventually scale up to a surprising amount, e.g. hundreds of gigabytes. Performance loss should be minor even if it takes longer to access upper layers.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2) by DannyB on Wednesday February 26 2020, @07:43PM (2 children)
In the long run, no.
It's like a swimming pool with a faucet filling it (the application), and a drain emptying it (the GC). A multi threaded GC could be thought of as multiple drains, but that doesn't effectively matter. A bigger consideration is GC pause times. The current state of the art GCs use threads and have low GC pause times at the expense of increased cpu cost -- but we've got lots of cores, right?
With a bigger pool, it takes longer to fill up, but eventually a heap overflow will flood the back yard and possibly the den and living room.
I suppose I could mention that the rate of allocation can also be a factor to consider. And a program that terminates before allocating enough to require a full GC could avoid ALL the expense of GC. The JVM has selectable GCs, including the Epsilon GC. [java.net]
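For a loose analogy in another GC'd language (this is CPython's gc module, which only covers cycle collection, not the JVM's Epsilon): a short-lived program can simply opt out of collection and ride on allocation alone.

```python
import gc

# Loose analogy to a "no-op collector" strategy: disable CPython's
# cycle collector and let a short-lived program run on allocation alone.
gc.disable()
assert not gc.isenabled()

data = [list(range(1000)) for _ in range(100)]  # allocate freely

gc.enable()  # a long-running program would want this back on
assert gc.isenabled()
```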
(Score: 2) by hendrikboom on Wednesday February 26 2020, @09:57PM (1 child)
So another good hardware feature would be proper small-scale locking between a running program and the garbage collector in another core.
(Score: 2) by DannyB on Wednesday February 26 2020, @10:23PM
One thing JVM does is "safepoints". Certain major GC operations wait for all "productive" threads to stop at a safepoint. I wonder what something at a finer level could achieve?
Another concept in GC is "read barrier" and "write barrier". Reading a pointer, just before dereferencing what it points to crosses the "read barrier". Writing a new pointer into a pointer slot (say within an object) crosses the "write barrier".
Some GC designs (at least in the 80s or 90s) would implement the read/write barriers as bits of extra code around these read/write operations on pointers.
If the application modifies a pointer in an object, then the GC needs to re-check that object, specifically that modified pointer, to see if what it points to has previously been marked as "reachable".
Some GC designs that relocated objects used an idea of "address forwarding". An object header would be marked with a "forwarding address" to where the object now actually lives. In the "read barrier" the system could invisibly find the new object from the stale pointer, without the application code being aware.
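A minimal sketch of that forwarding-address trick (class names invented; real collectors keep the forwarding address in the object header, not a Python attribute):

```python
# Sketch of a forwarding-pointer read barrier: when the collector moves
# an object, the old copy records where the object now lives, and every
# dereference transparently follows that forwarding address.
class Obj:
    def __init__(self, value):
        self.value = value
        self.forward = None  # set by the "collector" when relocating

def read_barrier(obj):
    """Follow forwarding pointers so stale references still work."""
    while obj.forward is not None:
        obj = obj.forward
    return obj

old = Obj("stale copy")
new = Obj("relocated copy")
old.forward = new  # the collector relocated the object

# Application code holding the stale pointer never notices the move.
print(read_barrier(old).value)  # relocated copy
```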
I have heard of at least one modern GC (Zing) that, as I understand it, uses the hardware's ability to detect modified pages that have been written to. All live objects in that page could now be considered to have had pointers potentially modified.
I don't work on GCs. But JVM will be getting "pluggable" GCs to solve the maintenance problems of having multiple GCs to choose from right now. So more people might try new approaches.
(Score: 2) by DannyB on Wednesday February 26 2020, @07:55PM (1 child)
Non volatile memory could radically change OS design. More specifically, no longer separating "disk" (eg "storage") and "memory" could radically change OS design.
What we today think of as "booting" would be called "reformat / reinstall". (eg, load OS from other media, initializing memory from scratch)
What we today think of as "low power mode" could be "booting from cold start to OS already fully initialized and suspended in memory".
What we today think of as "launching an application" could be "reinstalling the application".
Files might be data structures in memory, in some cases, rather than byte sequences. Imagine the possibilities for file formats. Files would then need to have some sort of "serialization" mechanism by which file "structures" could be turned into byte sequences for transfer to another system, or to backup devices.
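Python's pickle module is one existing example of exactly that serialization step (structure in, byte sequence out):

```python
import pickle

# A "file" that is really an in-memory structure still needs a
# serialization step before it can travel to another system or a backup.
document = {"title": "notes", "sections": [("intro", "GC"), ("hw", "tags")]}

wire_bytes = pickle.dumps(document)   # structure -> byte sequence
restored = pickle.loads(wire_bytes)   # byte sequence -> structure

assert restored == document
```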
(Score: 2) by takyon on Thursday February 27 2020, @03:24PM
HP's "The Machine" was supposed to do this stuff.
While it has been vaporware due to their reliance on memristors or another post-NAND/DRAM technology, they did try to create the feel of it by using 160 TB of DRAM.
HP Labs’ “Machine” dissolves the difference between disk and memory [arstechnica.com]
A closer look at HPE's 'The Machine' [theregister.co.uk]
RIP HPE's The Machine product, 2014-2016: We hardly knew ye [theregister.co.uk]
HP unveils 'The Machine': a big data computer with 160TB of RAM [digitaljournal.com]
Looking back at these articles, you can already see a problem. It connects cores to memory using an optical interconnect, with data traveling orders of magnitude more distance than 3DSoC layers. 3DSoC's ReRAM is non-volatile. The amount will start small but there are probably plenty of scenarios where 4-16 GB is enough. 3DSoC is like 2 or more revolutions in one concept. I'm expecting at least one update on the project this year and I will be submitting that in a heartbeat.
(Score: 2) by barbara hudson on Wednesday February 26 2020, @10:52PM (3 children)
SoylentNews is social media. Says so right in the slogan. Soylentnews is people, not tech.
(Score: 2) by DannyB on Friday February 28 2020, @09:48PM (2 children)
You don't need a special CPU to not have Swap files.
I have a nice PC at home running Linux Mint. It only has 32 GB of memory, and two SSD drives. I simply didn't set it up with a swap partition. It has no swap at all.
If I were to suddenly need swap, I could create a swap file, which is not as efficient as a swap partition. I have tested the procedure to create and activate a swap file, and then deactivate and remove it, all without a reboot.
Without any swap, I run Eclipse and other memory hungry Java software. I keep an eye on my memory use. I never get close to any danger of running out of memory.
Nothing special about the CPU here. It's all about the OS. I don't want the OS swap to put additional wear on the SSDs.
(Score: 2) by barbara hudson on Saturday February 29 2020, @03:51AM (1 child)
(Score: 2) by DannyB on Monday March 02 2020, @03:21PM
It seems radical to me. Virtual memory was such a godsend. But as I start to think about it, I realize that I don't use swap myself, even though I have the hardware for it.
I would be reluctant to let go of it immediately. But I can see myself coming to embrace that point of view about unnecessary hardware.
(Score: 1, Touché) by Anonymous Coward on Wednesday February 26 2020, @07:11PM (2 children)
Since the first shaman made the first rattle and bamboozled the first idiot out of his things, it is the state of the world.
Experience is observing the unending stream of idiots getting bamboozled, year after year after year.
Wisdom is learning something from that observation.
(Score: 2) by DannyB on Wednesday February 26 2020, @07:44PM
Clarify. What is it you refer to that is thought to be magic? GC? Or the possibility of hardware support?
(Score: 2) by DannyB on Friday February 28 2020, @09:50PM
It is amusing that:
* Java has been at the top of language popularity for 15 years, never dropping below the number 2 spot until very recently (behind JavaScript and Python)
* With few exceptions, all new languages in the last quarter century have GC
So just who is the idiot getting bamboozled?
(Score: 2) by shortscreen on Wednesday February 26 2020, @07:43PM (15 children)
Transistor budgets keep expanding, and the engineers evidently ran out of ideas for filling that space a while ago. Any weird stuff they decide to put in there now wouldn't surprise me. But can they do it without creating an additional avenue for side-channel exploits?
(Score: 2) by DannyB on Wednesday February 26 2020, @07:47PM (12 children)
I might be naive, but I would think a better use of transistor budgets would be simpler instruction sets, with more cores.
CPU speeds aren't getting massively faster. But you can keep adding more cores. And even more boxes. If software is architected properly.
(Score: 2) by shortscreen on Wednesday February 26 2020, @08:09PM (10 children)
They're already throwing more cores into everything just because they can. How many do you need? Or should I say, how many do you want to pay for?
It's nice when software that has the performance need can utilize additional threads properly. But like you said at the beginning, doing things properly costs too much.
(Score: 2) by DannyB on Wednesday February 26 2020, @09:41PM
The cost is an excellent point to raise. In the comment this journal entry came from, it started like this: look back from the Altair 8800 to today; now imagine from today to someday, with CHEAP boards with massive numbers of cores and memory.
As for "need", going from the prior paragraph, software will grow to consume all available resources. And that's a GOOD thing. One man's "bloat" is another man's "features". Yes Microsoft Word consumes massively more resources than "Notepad". But it also does a lot more, including once-amazing things like interactively checking your spelling and grammar, as you type! Things that cost real resources, yet we take for granted and are happy to have.
(Score: 3, Touché) by hendrikboom on Wednesday February 26 2020, @09:52PM (1 child)
They're already throwing more cores into everything just because they have no idea what else to do with all that chip area.
(Score: 1, Insightful) by Anonymous Coward on Tuesday March 03 2020, @06:27AM
It's a brain dead way of "increasing performance".
(Score: 3, Interesting) by takyon on Wednesday February 26 2020, @11:00PM (6 children)
https://www.theregister.co.uk/2016/09/09/intel_soft_machines/ [theregister.co.uk]
https://en.wikipedia.org/wiki/Automatic_parallelization [wikipedia.org]
https://youtu.be/XW_h4KFr9js?t=482 [youtu.be]
Now that 8 cores is becoming the new minimum and 16 cores is "mainstream", you will be seeing more effort put into using many cores.
If programmers remain bad at writing multi-threaded code, compilers or machine learning might help pick up the slack. There is also a possibility (discussed in the video above) of hardware optimizing for more thread utilization in real time.
(Score: 2) by DannyB on Friday February 28 2020, @09:57PM (5 children)
It's about time! I've been thinking about this for over a decade. I already think about how to break problems into parallel code that can run on multiple cores. I like that a few years ago Java got new frameworks for making parallel computation a lot easier.
Speculation: someday we will have enough primary CPU cores that GPUs will be irrelevant. Yes, I know some will laugh. But they also laugh at GC and Java, which are both big industry trends, because they know better than everyone else. They probably also never could have imagined that things would progress much beyond the Altair 8800 in 1975 to the kind of computers we have today.
Back in the 1970s, as Pascal started to become popular, it not only brought strong typing with compile time checking, but also "structured programming", eg, no GOTO statement! (OMG, gasp!) Yes, people thought it might not be possible to write code without GOTO! Yet most languages designed since the 1980s have no GOTO, and you don't seem to hear about it. And the advantages of strong typing are well known, despite early skeptics. I really don't get that one: all weak typing does is turn a compile time error into a runtime error -- one you don't get until six months after going to production.
Programmers will learn. At least ones that are any good and not tied to some status quo.
I'm old, but not set in my ways when it comes to tech.
(Score: 2) by takyon on Friday February 28 2020, @10:24PM (4 children)
I was just thinking about this earlier when I was pondering where RPi would go in the future, and 3DSoC.
It's entirely possible, but if dedicated cores are more efficient, I think they will stick around.
What you will see in the near term is more SoC-like chips and APUs starting to kill off discrete GPUs more often. For example, the GPU found in AMD's Renoir is stronger than Intel/Nvidia expected and makes stuff like Nvidia's MX250 and MX350 laptop GPUs useless. The upcoming next-gen Xbox with 12 teraflops of GPU performance? That's an APU.
(Score: 2) by DannyB on Friday February 28 2020, @10:32PM (3 children)
I don't suggest that GPUs would go away overnight, and not soon. But at some point, if you had, say hundreds of CPU cores, or more, then the case for a specialized GPU might become questionable. I can envision the possibility. But I don't know that it will happen, and if so, not soon. But look at how much things can change in a couple decades.
I expect we're going to see a lot more SoC's. And more small board computers. And they'll continue to get cheaper. At some point, just like netbooks did over a decade ago, the economics cause new things to happen.
Suppose, something like an iPad costs $20. In what ways would this affect the world and our way of life?
What if Raspberry Pis were more powerful and only $1 each, what ideas would people have?
(Score: 2) by takyon on Saturday February 29 2020, @01:41PM (2 children)
Maybe. The scenario could be that you want your all-in-one manycore CPU (with RAM layers, of course) to continue to have more general-purpose cores (for what purpose? we don't know yet), and don't want to reduce that space/volume to use it for graphics-specific cores. And you get acceptable or perfect gaming/graphics performance with the CPU cores.
I will be keen to see these large core counts become useful for ordinary users. There could be "mainstream" 24-core on Zen 4 within a couple of years, and 32-core after that. Others will have 96-128 cores. Many users are fine with dual and quad-core, so software and games will have to think outside of the box to exploit (or waste) all of the available performance. Never underestimate the bloat!
Probably already true for some values of "iPad" (offbrand or refurbed), or entry-level smartphones which are arguably more useful.
Those have started a mobile payment revolution in Africa, among other things.
Well, you can look at Raspberry Pi Zero, which at $5-10 is sold at a slight profit, and has sparked many ideas. Something could be made smaller and cheaper, but it might not have any useful ports (fully wireless peripherals and charging?). Unlike a $300 CPU, materials, shipping, and packaging are dominant costs. Well, and the RAM I guess.
It will be interesting to see how RPi0 evolves. It's a slightly troublesome product for the Foundation, and they have tried to limit the amount that each customer can get, although you can order larger batches at a higher price. You could imagine a future version of that form factor hitting 2 or 4 cores, eventually hitting the RPi4's desktop level performance, or even surpassing it with a 3D chip. The two Micro-USB ports will be replaced with USB-C, and the mini-HDMI with micro-HDMI.
(Score: 2) by DannyB on Monday March 02 2020, @03:23PM (1 child)
That is the idea I was hoping for about ten years ago. We started having 4 cores, 8 threads. I was hoping it would pick up and increase.
As for the raspberry pi 0, what I would really like to see is a "mother" board that has sockets for a bunch of Pi zeros. A modestly priced, small cluster. That could go in a simple box.
(Score: 3, Interesting) by takyon on Tuesday March 03 2020, @04:35AM
Well, they made a HAT that you can plug four 0s into:
https://magpi.raspberrypi.org/articles/clusterhat-review-cluster-hat-kit [raspberrypi.org]
https://www.pishop.us/product/cluster-hat-kit-includes-4-x-raspberry-pi-zero-w/ [pishop.us]
Here's a board with up to 16:
https://hackaday.com/2016/01/25/raspberry-pi-zero-cluster-packs-a-punch/ [hackaday.com]
That one was never sold AFAIK, just a publicity stunt.
The problem you are going to run into is the performance of a Zero cluster. That could be partly because of the slow I/O, and partly because of bad performance per dollar at $5-15 per naked Zero (prices rise with wireless, presoldered headers, and/or bulk ordering). Slap a cheap x86 chip like the $50 Athlon 3000G into a cheap NUC kit, and it will kill the Zero cluster on performance/$ and maybe overall price (I think the truly cheap ones [notebookcheck.net] have yet to materialize).
Zero is getting old. It was first released in late 2015, using a faster clocked version of the SoC used in the RPi1 from 2012-2014. I would be surprised if it didn't get an update at some point, but that point could be years from now.
https://www.servethehome.com/aoa-analysis-marvell-thunderx2-equals-190-raspberry-pi-4/ [servethehome.com]
(Score: 0) by Anonymous Coward on Tuesday March 03 2020, @06:22AM
With all the excess transistors they have nowadays, they could have a separate stack for parameters. That way, if "stuff happens" (an overflow), things get the wrong parameters rather than the wrong return address (which often means bigger pwnage, aka the attacker runs code of their choice).
But it's simpler and safer (job security-wise) to just add more cores than add a new way of doing things.
The other stuff to improve might be things like "getting the current monotonic time counter" and better ways of locking (multithreaded, multiprocessor, or even multicomputer).
(Score: 2) by takyon on Wednesday February 26 2020, @10:43PM (1 child)
For many users, performance matters more than security. If you want both, take the machine offline.
(Score: 2) by DannyB on Friday February 28 2020, @10:00PM
For an online machine, just be sure you own the hardware. Don't share it with someone else's code.
You get performance, and some peace of mind about attacks which exploit the processor flaws. Assuming you can trust all your code, and libraries.
(Score: 2) by The Mighty Buzzard on Wednesday February 26 2020, @10:00PM (28 children)
Reference counting is not garbage collection in the sense that you mean garbage collection and it has been used in the past ten years. It's simply automatically deallocating memory the instant it is unused, which is what all the people who look down on GC aim for in their code.
My rights don't end where your fear begins.
(Score: 2) by DannyB on Wednesday February 26 2020, @10:32PM (27 children)
It is a worthy aim. Been there, done that.
Problem is, as programs become sufficiently complex, you inevitably get circular references.
Another problem is that the reference count thinking is inherently a "single thread" way of thinking about computation. One way to speed up the application is to have none of the GC operation be part of a thread that, say, services a web request. For example, the operation of servicing that request would not burden itself with checking reference counts, updating reference counts when pointers are copied, etc. Those cycles can be spent on other cores doing proper GC. Even if the GC takes more overall cycles, the reduction of the production workload's cycle count is the economic payoff.
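The circular-reference problem is easy to demonstrate in CPython, which uses reference counting plus a backup cycle detector:

```python
import gc

# Reference counting alone cannot reclaim a cycle: each node keeps the
# other's refcount above zero even when nothing else can reach them.
class Node:
    def __init__(self):
        self.other = None

a, b = Node(), Node()
a.other, b.other = b, a   # circular reference
del a, b                  # refcounts never drop to zero on their own

collected = gc.collect()  # CPython's cycle detector cleans up the pair
assert collected > 0      # the tracing pass found garbage refcounts missed
```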
(Score: 0) by Anonymous Coward on Wednesday February 26 2020, @11:42PM
A disciplined use of weak references disagrees with your assertion about inevitable circular references.
But your point about reference counting makes no sense. If the thread using the object doesn't touch the reference count, then there is no way for the collector to know when something is done using it. There would also be no way for that thread to allocate new objects. What you seem to be advocating for is a separate allocator/deallocator thread, and there are good reasons that design pattern fell out of use.
For what its worth, there is also no requirement that the "service" thread finalize the objects immediately after destroying all references, or at all for that matter, in a reference counting scheme.
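The disciplined-weak-reference point, sketched with Python's weakref module (class names invented): the owning direction is strong, the back-pointer is weak, so pure reference counting can still reclaim the pair.

```python
import weakref

# Parent owns Child strongly; Child points back only weakly, so no
# cycle exists as far as reference counting is concerned.
class Child:
    def __init__(self, parent):
        self.parent = weakref.ref(parent)  # non-owning back-pointer

class Parent:
    def __init__(self):
        self.child = Child(self)

p = Parent()
c = p.child
assert c.parent() is p     # back-pointer works while the parent lives

del p                      # refcount hits zero; CPython frees Parent now
assert c.parent() is None  # the weak reference reports the death
```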
(Score: 2) by The Mighty Buzzard on Thursday February 27 2020, @11:48AM (25 children)
You need to spend some time getting familiar with Rust. It puts a lie to that entire way of thinking without adding dev time overhead beyond the initial learning curve. It's both more efficient and safer than any GC language because it uses RC from the start, without sacrificing multi-threaded design.
(Score: 2) by DannyB on Thursday February 27 2020, @02:56PM
Pascal introduced compile time type safety to help the programmer.
My understanding of Rust is that it introduces concepts to assist the programmer, and detect at compile time problems with memory management.
But Rust is NOT GC, and not a substitute for GC where GC is appropriate. Please correct me here if I am wrong, but isn't Rust a tool to help developers manage the allocation, ownership and deallocation of memory? The point of GC is to completely avoid thinking about deallocation or management. You don't need any compile time tools to help you manage object lifetime, because you don't think about object lifetime. It becomes just one more low level detail that vanishes away, just like calculating the address of jump instruction targets in the machine code disappears from view.
If you think that all programming problems should be solved without GC, then we must disagree there. At certain levels of abstraction, use GC. Examples would include Lazy Evaluation, Theorem Provers, Computer Algebra Systems, Expert Systems, Prolog (programming logic) systems including MiniKanren.
The mental model with GC is to remove thinking about memory management at all. Not to make that management easier, but to get rid of it as one more bookkeeping detail the developer must consider. Just as C removes many bookkeeping details one would deal with in assembler. A language is too low level when it forces you to deal with the irrelevant. If you're building a unification matcher, then thinking about memory management is an irrelevant detail far removed from the problem domain.
You didn't mention my point about cpu cycles, but I'm going to put that in a separate reply.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by DannyB on Thursday February 27 2020, @03:04PM (23 children)
I want to emphasize a point about cpu cycles. Especially since C programmers are obsessed with this.
The thread that services a customer request should spend zero cpu cycles on memory management. Not one single cycle in a dealloc / dispose operation. Not one single cycle incrementing / decrementing reference counters. GC systems make allocation as close to a simple pointer increment as possible. You can't improve memory management performance much beyond zero cycles.
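The "allocation is close to a simple pointer increment" claim can be sketched with a toy bump allocator (illustrative only; real collectors hand each thread its own allocation buffer, and the class and numbers here are made up):

```python
class BumpAllocator:
    """Toy bump-pointer allocator: each allocation is one add and one compare."""
    def __init__(self, capacity):
        self.heap = bytearray(capacity)
        self.top = 0                     # next free offset

    def alloc(self, size):
        if self.top + size > len(self.heap):
            raise MemoryError("region full: trigger GC / request a new region")
        addr = self.top
        self.top += size                 # the entire cost of allocation
        return addr

a = BumpAllocator(1024)
p1 = a.alloc(16)
p2 = a.alloc(32)
print(p1, p2)   # 0 16
```

No free list to search, no coalescing, no per-object headers to update on the hot path; reclaiming and compacting the region is the collector's problem, on other threads.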
OTHER threads deal with collecting unused objects, cleaning up the heap, and compacting objects to occupy contiguous space. Even if the cycles spent on GC exceed the cost of manual memory management. (Although I believe it has been shown that, at large scale, the aggregate cost of GC can be cheaper than all of that inline malloc/free memory management code.)
The GC is overhead. The customer still pays the freight for that overhead. But the customer's requests to the system do not incur the cost of any of the GC cpu cycles. The money-making thread servicing the request just plows ahead allocating memory as if it magically comes from an infinite heap. No concern about keeping the heap organized. The money made by that thread pays the cost of the GC overhead, but the execution of that money-making thread sees zero cpu cycles of memory management instructions in its stream of execution.
Does that point make sense?
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by DannyB on Thursday February 27 2020, @03:09PM (10 children)
It's an economic argument.
It's also an insulting argument. Arguments that are universally against GC for any purpose are arguments against the design of most languages of the last two decades. It's an argument that language designers, and users, are too dumb to know what they are doing.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by hendrikboom on Wednesday March 04 2020, @04:17AM (9 children)
There are situations where garbage-collection is intolerable.
Those are very rare.
(Score: 2) by DannyB on Wednesday March 04 2020, @03:24PM (8 children)
I agree. And even if they aren't completely rare, I would agree GC is not for everyone all the time.
I also said: if there were one perfect programming language we would all already be using it.
However I think GC (in most modern languages of the last quarter century) is the right move for most uses.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by hendrikboom on Wednesday March 04 2020, @11:14PM (7 children)
But GC would be good for almost everybody almost all of the time.
(Score: 2) by DannyB on Thursday March 05 2020, @08:29PM (6 children)
I agree with that. I tend to think that "most" software at or above the command line level could be written in higher level languages than C. Especially when you get to GUI level programs.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by hendrikboom on Friday March 06 2020, @04:58AM (5 children)
The guys working on Gambit, a dialect of Scheme, have even managed a version that boots straight into Gambit.
They even wrote their disk driver in Scheme!
-- hendrik
(Score: 2) by DannyB on Friday March 06 2020, @05:41PM (2 children)
Do you mean a disk image, on a hypervisor, boots directly into Scheme? That would be cool.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by hendrikboom on Friday March 06 2020, @08:40PM (1 child)
No. It boots directly on the metal. It's an old enough machine that hypervisors probably weren't relevant.
The hardware loads a boot track, and the boot track contains enough Scheme system for it to load the rest of the system using the language's module facility.
(Score: 2) by DannyB on Friday March 06 2020, @10:21PM
There is this thing nowadays called a "Library OS". One of these is called OSv [osv.io] (and on GitHub [github.com]).
In a nutshell:
* It is open source
* It is NOT Linux
* It has a Linux-like userspace API -- minus things like forking, creating new processes
* It runs only one process
* That process CAN create threads
* The build procedure is that you "link" your workload with this "library" and produce a new binary (using tools they provide)
* The built binary just happens to be a bootable image, as if it were a "bootable disk" with a VHD suffix
* It is NOT designed to run on bare metal, but under a hypervisor
* The VHD is way, way smaller than a VHD containing a real OS, even a stripped down Linux
* There is zero overhead for calls into the kernel
* The userspace is really kernel space, fully privileged; yes the userspace could do anything directly to the VM "hardware"
* This is way more efficient and requires very few resources
* You manage workloads by managing "VMs"
They provide tooling to make it easy to link with several different types of workloads such as Java, especially Java+Tomcat with your application(s) installed into Tomcat.
I was thinking this Gambit Scheme might be in the same vein as OSv and similar.
Aside, about Java / Tomcat:
* Apache Tomcat is a "Java Application Server"
* That is a web server, but so much more.
* There are multiple vendors that provide their own "Java Application Servers", of which Apache Tomcat is one of many.
* Similarly Sunbeam and Black & Decker both make toasters compatible with your bread and electrical outlet.
* You build applications for a Java Application Server.
* Your built application is a WAR file. (Really a ZIP file with restrictions on compression algorithms allowed, and a manifest)
* You can get Java Application Servers even on IBM mainframes. Or Raspberry Pis.
* You install these applications into a Java Application Server (eg, Tomcat) similarly to the fact that applications install into a Smartphone. You don't care what brand/model of smartphone. (ignoring differences between iOS/Android)
* Every application server has its own mechanisms for how you install and manage these applications
* Apache Tomcat has several ways, but one of them is a web based control panel that lets you install applications, turn them on and off, set each application's base URL, and remove them.
You can see why Java+Tomcat is a popular workload for OSv.
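Since a WAR really is just a ZIP with a prescribed layout, here is a minimal sketch of one being built (the file names `myapp.war` and `index.jsp` and the trivial `web.xml` are made up for illustration; a real application would carry classes under WEB-INF/classes and a proper deployment descriptor):

```python
import zipfile

# Minimal WAR-like archive. A WAR is a ZIP with a prescribed layout:
# a manifest, a WEB-INF/ directory, and the web resources themselves.
with zipfile.ZipFile("myapp.war", "w", zipfile.ZIP_DEFLATED) as war:
    war.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    war.writestr("WEB-INF/web.xml",
                 '<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"/>\n')
    war.writestr("index.jsp", "<html><body>hello</body></html>\n")

with zipfile.ZipFile("myapp.war") as war:
    names = sorted(war.namelist())
print(names)
```

Drop a file with that layout into Tomcat's deployment directory (or upload it through the manager webapp) and the server unpacks and runs it.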
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by DannyB on Friday March 06 2020, @05:43PM (1 child)
I see on their main page, under portability: " There are no external library dependencies, and OS API dependencies can be removed so as to run directly on the bare metal."
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by hendrikboom on Friday March 06 2020, @08:42PM
Yes, that's it. And yes, it's cool.
(Score: 0) by Anonymous Coward on Thursday February 27 2020, @09:35PM (11 children)
It still doesn't make sense. Ignore the idea of whether or not a dedicated thread should clean up/finalize/destruct/etc. unused objects. If the working thread(s) doesn't spend any time tracking how many references it has to an object, how is any sort of GC supposed to know that the object is unused?
(Score: 3, Informative) by NickM on Friday February 28 2020, @04:35AM (1 child)
Here is a naive stop-the-world algorithm: https://www.geeksforgeeks.org/mark-and-sweep-garbage-collection-algorithm/ [geeksforgeeks.org]
Once you master this algorithm have a look at the concurrent mark and sweep version, then move on to G1GC and then ZGC.
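For reference, the naive algorithm fits in a few lines; this is a sketch over a toy object graph (the `Obj` class and the a/b/c graph are invented for the demo), not how a production collector is written:

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references
        self.marked = False

def mark(obj):
    """Recursively mark everything reachable from obj."""
    if obj.marked:
        return
    obj.marked = True
    for child in obj.refs:
        mark(child)

def mark_and_sweep(roots, heap):
    for obj in heap:
        obj.marked = False
    for root in roots:                     # mark phase: trace from the roots
        mark(root)
    return [o for o in heap if o.marked]   # sweep phase: keep only marked

a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b)              # a -> b; c is unreachable garbage
live = mark_and_sweep(roots=[a], heap=[a, b, c])
print([o.name for o in live])   # ['a', 'b']
```

Note the answer to the question upthread: nothing ever counts references. Liveness is discovered by tracing from the roots; anything the trace never reaches is, by definition, garbage.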
I a master of typographic, grammatical and miscellaneous errors !
(Score: 0) by Anonymous Coward on Friday February 28 2020, @08:57PM
Well, I look dumb. I just realized my brain must have started a process running `sed 's/reference count/references/g'` continuously somehow. I somehow got it into my head that you were advocating for objects not tracking what they link to, or something weird like that, rather than the actual idea of not requiring the target objects to track how many references point to them. Hopefully that makes my misunderstanding clearer. But for the record, here is the comment I was writing when I realized we've been talking past each other.
I already know how Garbage Collectors work and mark-and-sweep in particular. But to quote:
Built into that is the idea that one object references another. If the service thread doesn't keep track of references at all, how can GC know they are unused? Here are the first several lines of some code; assume the language assigns by reference instead of by copying values, and that these are the only assignments to said values.
Now somewhere in the rest of the code the reference to "d" (along with the rest, but ignore that for now) needs to be destroyed; otherwise the objects created are all reachable from the root and will live for the life of the process. The memory allocated for the integer 2, for example, is never freed. If the thread doesn't deal with references, how is the GC supposed to know it is done with them? Even then the original alloca [about here I realized my error].
(Score: 2) by DannyB on Friday February 28 2020, @03:59PM (8 children)
Your question is based on not understanding how GC works.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 0) by Anonymous Coward on Friday February 28 2020, @09:44PM (7 children)
It isn't. It is based on an error in my understanding of what you are advocating for, which I didn't realize until way too late.
(Score: 2) by DannyB on Friday February 28 2020, @10:04PM (6 children)
I think you get it.
The money-making thread pays the dollar cost for the other cores that run GC.
The money-making thread sees zero cpu cycles for memory management -- which happens on other cores. So request/response time is minimized. The money making thread sees no memory management instructions in its stream of execution. It just allocates (cheaply) as much as it needs. The GC system keeps up.
The very idea of reference counting is an antiquated "single-thread" way of thinking.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 0) by Anonymous Coward on Friday February 28 2020, @11:41PM
Thanks for the vote of confidence. I was just twisted around trying to model some weird completely-reference-free, by-value system or a centrally-managed memory system or something even weirder I thought you were putting forward. In the confusion, I wasn't articulating anything clearly, which made the whole thing worse. I had to break my brain, reboot, and reread the whole thread over before I got it.
But yes, I get what you mean now. Especially on our many-core processors, where you don't have to preempt the worker threads to make room for the GC threads, and the GC may even be able to wait for workers cooperatively. We've come a long way from when stopping the world was the best approach to GC, thanks to the many-core effect on concurrent GC. Heck, with enough RAM and cores you can use esoteric methods like dynamic escape analysis, thread tracking, CbV, or combination systems.
(Score: 2) by hendrikboom on Wednesday March 04 2020, @11:18PM (4 children)
But there is some coordination needed between the money process and the GC processes. There are a variety of mechanisms for this, but even the best I've seen does occasionally take some effort from the money process.
(Score: 2) by DannyB on Thursday March 05 2020, @08:28PM (3 children)
Can you elaborate? The money process simply allocates memory without worrying about keeping track of it.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by hendrikboom on Friday March 06 2020, @04:56AM (2 children)
Well, technically, if all it does is allocate memory, it may still have to wait for garbage collection if the collector isn't finished when it runs out.
But that wasn't what I meant.
I meant that it has to coordinate with the collector for the case that it modifies a pointer that the garbage collector has already traced. Then the garbage collector has to be informed to trace that new pointer too.
-- hendrik
(Score: 2) by DannyB on Friday March 06 2020, @05:39PM (1 child)
The modern collectors use hardware to detect this. If you modify one of an object's members, in this case a pointer, then you are doing a WRITE operation to that page of memory. The modern collectors can later determine that pages of memory they had already fully traced have since been modified (a write flag on the page, set in hardware). Other collectors use techniques like "read barriers" and/or "write barriers".
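The page-dirtying idea can be sketched in software as a card table: every pointer store marks its "card", and after the initial trace the collector re-scans only the dirty cards. (This is a toy sketch; the card size and addresses are invented, and in the hardware-assisted version the marking comes free from the page table's dirty bit rather than from an explicit barrier call.)

```python
CARD_SIZE = 512                # bytes of heap covered per card (toy value)
cards = {}                     # card index -> dirty flag

def write_barrier(addr):
    """Run on every pointer store: mark the card holding addr as dirty."""
    cards[addr // CARD_SIZE] = True

def store_pointer(heap, addr, value):
    heap[addr] = value
    write_barrier(addr)        # the mutator's only extra cost

heap = {}
store_pointer(heap, 40, "objA")      # lands in card 0
store_pointer(heap, 1300, "objB")    # lands in card 2

dirty = sorted(i for i, d in cards.items() if d)
print(dirty)   # [0, 2]: the collector re-traces only these cards
```

The mutator pays one cheap marking operation per pointer store (or nothing, with hardware dirty bits), and the collector catches up on exactly the regions that changed behind its back.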
The collector keeps up by cleaning up faster than the rate of allocation. Even if more cores must run GC threads. In practice, in large production systems, it works. It's worked for years. Modern collectors make it even better.
While other languages have GC, I don't know of any that have had the decades of GC research that have gone into the JVM. Multiple GCs to choose from, each with various knobs and dials for tuning. With Red Hat's Shenandoah and Oracle's ZGC (both open source as part of OpenJDK) you can have terabytes of memory and only 1 ms GC pauses. I heard (and posted a link in a comment on one of my own recent journal entries about this) that Java 14, due in a couple of weeks, raises the maximum heap size to 16 terabytes of memory. Obviously, due to popular demand.
If you think a fertilized egg is a child but an immigrant child is not, please don't pretend your concerns are religious
(Score: 2) by hendrikboom on Friday March 06 2020, @08:34PM
Which, of course, is the exact reason this discussion is about hardware-assisted GC.