[AMD APUs for 2021] ?
Cezanne-H | 45W | Zen3 | Vega 7 | 7nm
Cezanne-U | 15W | Zen3 | Vega 7 | 7nm
Lucienne-U | 15W | Zen2 | Vega 7 | 7nm
VanGogh | 9W | Zen2 | Navi2 | 7nm | LPDDR5
Pollock | 4.5W | Zen | Vega | 14nm
[AMD APUs for 2022] ?
Rembrandt-H | 45W | Zen3+ | Navi2 | 6nm | LPDDR5/DDR5
Rembrandt-U | 15W | Zen3+ | Navi2 | 6nm | LPDDR5/DDR5
Barcelo-U | 15W | Zen3 | Vega 7 | 7nm
DragonCrest | 9W | Zen2 | Navi2 | 7nm | LPDDR5
Pollock | 4.5W | Zen | Vega | 14nm
(Assuming it's correct)
1. AMD is doing a lot of refreshes, starting with Lucienne as a Renoir replacement. Barcelo-U looks like a refresh of Cezanne-U, and Dragon Crest is a refresh of Van Gogh. They can bump clocks, shift prices around, get rid of old stock, etc.
2. Dragon Crest is a disappointment. I thought it could use Zen 3 and RDNA 2+/3. It will no doubt top out at 4 cores, the same as Van Gogh. Now we'll have to see how good and expensive Van Gogh is.
3. No use of "5nm" anytime soon. There was some speculation that some APU products could use it sooner. TSMC "6nm" is a refined "7nm" with only a somewhat smaller die area.
4. Pollock is probably made due to contractual obligations with GlobalFoundries. It's probably a dual-core. It will be found in dirt cheap devices.
None of the existing Zen/Zen+ APUs target that low of a TDP:
https://en.wikipedia.org/wiki/Zen_(first_generation_microarchitecture)#Mobile_APUs
https://en.wikipedia.org/wiki/Zen%2B#Mobile
It should replace the likes of the Excavator-based A6-9220C in $100 laptops. It will outperform that chip while using less power (4.5 Watt vs. 6 Watt TDP).
Years ago I thought it would be fun to do some 3D graphics completely from scratch in C, using SDL for output. I wrote some code to do basic vector and matrix functions (single- and double-precision). I wrote functions to draw lines and so on, to project vertices from 3D to 2D, to scale them and to render them on the screen.
I got a 64-bit computer and it still worked.
A couple of years ago I bought an AMD Ryzen 7 2700U laptop which is running Slackware. I had all sorts of trouble with it, particularly with the integrated graphics which was causing it to hang.
I tried my old code on it, though, and was pleased to see it run. The second time I ran it, it didn't work. My code checks for pixels that are off screen and doesn't try to plot them, and outputs an error message to the console. I was getting thousands of them. So I figured that there was something wrong with the machine. Computers are supposed to be deterministic. You should get exactly the same results each time you run a program.
This summer I built a new PC with an AMD Ryzen 5 3600 CPU and an nVidia GTX 1650 graphics card, running Slackware. Today I thought I'd try my old code. It ran perfectly first time. The second time, I got thousands of error messages about the pixels being out of range.
What's going on? What subtle bug has it revealed in my (very simple) code? On other machines it didn't have this problem.
Could it be that the ancient version of SDL that I'm using does something weird to the state of the hardware?
Update: Fixed it. It was an uninitialised local variable which just happened to be zero, as it was supposed to be, only some of the time.
I had a line of code at the end of the loop which updated the camera (view point) position so that I could have it "flying forward" each frame, but I had decided to stop the flying at one point many years ago. I forgot to assign (0,0,0) to the increment vector. Sometimes it was getting random garbage in it and when the vertices were getting translated, they were getting ludicrous values.
It wasn't my multi-threading (I commented that out).
It's funny how a change of CPU revealed that bug.
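The fix described above can be sketched as follows. This is a minimal illustration, not the original code: the `vec3` type and the `vec3_add` and `fly_camera` names are hypothetical, since the post does not show its own types or functions.

```c
/* Hypothetical 3-component vector; the original post's types are not shown. */
typedef struct { double x, y, z; } vec3;

/* Component-wise addition, used to move the camera by a per-frame step. */
static vec3 vec3_add(vec3 a, vec3 b)
{
    vec3 r = { a.x + b.x, a.y + b.y, a.z + b.z };
    return r;
}

/* Run `frames` iterations of a render loop and return the final camera position. */
static vec3 fly_camera(vec3 camera, int frames)
{
    /* Buggy form: `vec3 step;` leaves the contents indeterminate, so the
     * camera may stay still on one run and jump by garbage on the next.
     * Fixed form: initialise explicitly to (0,0,0) so the "flying" stops. */
    vec3 step = { 0.0, 0.0, 0.0 };

    for (int i = 0; i < frames; i++) {
        /* ... project and draw the vertices relative to `camera` here ... */
        camera = vec3_add(camera, step);
    }
    return camera;
}
```

Reading an uninitialised local is undefined behaviour in C, so whether it "works" depends on whatever happened to be on the stack, which can change with the CPU, compiler, or libraries. Building with warnings enabled (for example gcc's -Wall, which includes -Wuninitialized) may flag this kind of mistake, though not in every case.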
Industry insiders reveal that Intel's current 10 nm yields are nowhere near viable for “full production” and even with the SuperFin tech, 10 nm production may never match that of the 14 nm nodes. Plus we get more details about the performance issues with the upcoming server-grade CPUs, even more delays for the 7 nm nodes and new details about the product outsourcing plans.
[...] Thanks to a transcript of a recent Susquehanna International Group call provided by Reddit user uzzi38, we now get even more details regarding delays, unsatisfactory yields and production outsourcing.
[...] Upcoming SuperFin 10 nm nodes are “unquestionably far better than base 10nm. Better in just about every way. Yields are better (50+%), but still not as good as the 14 nm nodes.”
Cannon Lake (the first processors to be produced on the base 10 nm node launched 2 years ago) had yields lower than 25% even with iGPU disabled.
[...] Backporting the upcoming Rocket Lake processors was not a good move: the new models ended up too big and are expected to be power hogs.
[...] The delays are problematic: “Ice Lake was supposed to compete with [AMD’s] Rome (and it would still lose even with the on-paper and significantly better specs than reality). [AMD’s] Milan is going to likely beat Sapphire Rapids over a year earlier and before Ice Lake rolls out. By the time Sapphire Rapids releases, [AMD’s] Genoa will be here or right around the corner. Genoa was supposed to compete with Granite Rapids which is late 2023 now.”
“If both companies iterate perfectly on their roadmaps as planned (much easier on AMD's side right now), Intel can not catch up to AMD until late 2024 or early 2025.”
Susquehanna International Group
Susquehanna International Group, LLP (SIG) is a privately held, global trading and technology firm. SIG comprises a number of affiliated entities specializing in trading and proprietary investments in equities, fixed income, energy, commodity, index and derivative products, private equity and venture capital, research, customer trading and institutional sales. Susquehanna is probably best known for expertise in derivatives pricing and trading, especially equity options.
This has been known for a while, but there are more details. The i9-10900K ("Comet Lake") is a 10-core CPU with 20 MB of cache, while the i9-11900K ("Rocket Lake") will have just 8 cores and 16 MB of cache. Given a 10-15% IPC increase, the 11900K will have slightly worse multi-threaded performance. Rocket Lake will support PCIe 4.0, however, and it increases memory support to DDR4-3200 from 2666/2933.
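The "slightly worse" estimate can be checked with some back-of-the-envelope arithmetic. The sketch below assumes equal clock speeds and perfect scaling across cores, neither of which is guaranteed; `relative_throughput` is a name made up for this illustration.

```c
/* Relative multi-threaded throughput of a new chip versus an old one.
 * ASSUMPTIONS: equal clocks and perfect scaling with core count.
 * ipc_gain is a fraction, e.g. 0.10 for a 10% IPC increase. */
static double relative_throughput(int old_cores, int new_cores, double ipc_gain)
{
    return (new_cores * (1.0 + ipc_gain)) / old_cores;
}
```

Going from 10 cores to 8 with an IPC gain of 0.10 to 0.15 gives roughly 0.88 to 0.92, so under these assumptions the 11900K would land about 8-12% behind the 10900K in fully threaded workloads.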
Rocket Lake can probably beat AMD's Zen 3 in single-threaded and gaming performance, although AMD could try another "XT" refresh to stay on top. One advantage of doing so is that it forces tech reviewers to look at essentially the same CPUs again, except with newer drivers and a small performance bump. It would also allow AMD to maintain the elevated MSRPs for the new CPUs while letting the older ones drop a bit.
Intel's Alder Lake CPUs are scheduled to come later in 2021, on a "10nm" node, with DDR5 memory support. These will use a hybrid architecture with up to 8 big cores and 8 small Atom cores. Intel may advertise the top model as having "8 cores, 24 threads", since Atom only has 1 thread per core and is weaker. "16 cores, 24 threads" would be the less scrupulous possibility.
I'll be keeping an eye out to see which CPUs become the first to ship with Microsoft Pluton, since that will probably mark the last generation of x86 CPUs some of you will consider buying.
The NVIDIA GeForce RTX 3080 mobile outed with 16 GB of VRAM, a GA104 GPU and 6,144 CUDA cores
The first details of the GeForce RTX 3080 mobile have appeared online. Not only do they offer an insight into what to expect from high-end gaming laptops next year, but they also raise the question about the continued existence of merely a 10 GB edition of the desktop variant of the RTX 3080.
[...] The apparent confirmation of NVIDIA equipping the mobile version of the RTX 3080 with 16 GB of VRAM raises questions about its desktop counterpart, though. Currently, NVIDIA only sells a 10 GB edition, but there had been rumours about it releasing a 20 GB version. NVIDIA had allegedly cancelled this SKU, but a recent EEC registration by MSI suggests otherwise. It would be strange for NVIDIA to sell a mobile GPU with 60% more VRAM than its desktop namesake, so maybe a 20 GB variant of the RTX 3080 is on the way after all.
Coming to a $2,500 laptop near you.
(Previous mobile flagship was the NVIDIA GeForce RTX 2080 Super Max-Q with 8 GB of VRAM. Edit: Max-Q is the same thing but with a lower TDP of 80 Watts instead of 150 Watts, still great for warming the lap.)
Politicians urge people to buy Australian wine in defiance of China
Politicians from various Western countries have asked consumers to buy Australian wine in order to fight back against China's punitive tariffs on the beverage.
China is the most important export market for Australian wine, and winemakers are scrambling to find new markets following Beijing's decision to slap tariffs of up to 212% on Australian wine imports amid an escalating trade war between the two nations.
China rejects Australian PM's call to apologise for 'repugnant' tweet
China’s foreign ministry has rejected calls from the Australian prime minister to apologise over an inflammatory tweet over war crimes allegations, insisting it is Australia that should be saying sorry for the loss of life in Afghanistan.
The prime minister, Scott Morrison, had demanded the Chinese government apologise and take down a “repugnant” foreign ministry tweet that depicted an Australian soldier cutting the throat of a civilian in Afghanistan.
[...] The tweet was accompanied by an inflammatory image that appears to depict an Australian soldier cutting the throat of a young civilian holding a sheep, together with the words “Don’t be afraid, we are coming to bring you peace!”
[...] At the ministry of foreign affairs’ regular press conference on Monday, Zhao did not appear. Instead Hua Chunying, director of the ministry’s department of information, addressed media, doubling down on Zhao’s tweet.
“The Australian side has been reacting so strongly to my colleague’s tweet. Why is that? Do they think that their merciless killing of Afghan civilians is justified but the condemnation of such ruthless brutality is not? Afghan lives matter,” Hua said.
Hua said Australian soldiers committed “heinous crimes”, detailing some of the more graphic findings from the Brereton report, and the Australian government owed the Afghan people a formal apology.
WeChat blocks Australian Prime Minister in doctored image dispute
The Chinese social media platform WeChat blocked a message by Australia Prime Minister Scott Morrison...
China's WeChat blocks Australian PM in doctored image dispute
China’s WeChat social media platform blocked a message by Australia Prime Minister Scott Morrison amid a dispute between Canberra and Beijing over the doctored tweeted image of an Australian soldier.
[...] Morrison took to WeChat on Tuesday to criticise the “false image”, while offering praise to Australia’s Chinese community.
In his message, Morrison defended Australia’s handling of a war crimes investigation into the actions of special forces in Afghanistan, and said Australia would deal with “thorny issues” in a transparent manner.
But that message appeared to be blocked by Wednesday evening, with a note appearing from the “Weixin Official Accounts Platform Operation Center” saying the content was unable to be viewed because it violated regulations, including distorting historical events and confusing the public.
Hong Kong Activists Sentenced For Their Role In Anti-Government Protest
A trio of young Hong Kong opposition activists have been sentenced after pleading guilty to organizing a demonstration last year as part of a larger protest against Hong Kong's receding autonomy.
Their sentencing on Wednesday is the latest blow to the region's opposition movement, which seeks to preserve Hong Kong's limited autonomy from Beijing.
The three — Joshua Wong, Agnes Chow and Ivan Lam — have been held without bail since pleading guilty in late November for organizing and participating in the protest last year that surrounded police headquarters. Wong, Chow and Lam, all in their 20s, are also founding members of the now-disbanded Demosisto opposition political party.
One of the biggest changes in the Ryzen 7 5800U is the cache redesign which now packs 16 MB of L3 cache versus the 8 MB of L3 cache on Ryzen 7 4800U. The L2 cache will still be 4 MB or 512 KB per core. This would allow for reduced latency and faster inter-core interconnect bandwidth.
The doubling of L3 cache to 16 MB is interesting, and could have been necessary to reproduce Zen 3's benefits in Cezanne. From a Renoir review:
AMD’s Mobile Revival: Redefining the Notebook Business with the Ryzen 9 4900HS (A Review)
For Renoir, AMD decided to minimize the amount of L3 cache to 1 MB per core, compared to 4 MB per core on the desktop Ryzen variants and 4 MB per core for Threadripper and EPYC. The reduction in the size of the cache does three things: (a) makes the die smaller and easier to manufacture, (b) makes the die use less power when turned on, but (c) causes more cache misses and accesses to main memory, causing a slight performance per clock decrease.
With (c), normally doubling (2x) the size of the cache gives a square root of 2 decrease in cache misses. Therefore going down from 4 MB on the other designs to 1 MB on these designs should imply that there will be twice as many cache misses from L3, and thus twice as many memory accesses. However, because AMD uses a non-inclusive cache policy on the L3 that accepts L2 cache evictions only, there’s actually less scope here for performance loss. Where it might hurt performance most is actually in integrated graphics, however AMD says that as a whole the Zen2+Vega8 Renoir chip has a substantial uplift in performance compared to the Zen+Vega11 Picasso design that went into the Surface Laptop 3.
[...] It’s important to note that even though the chip has 8 MB of L3 total across the two CCX domains, each core can only access the L3 within its own CCX, and not the L3 of the other CCX domain. So while the chip is correct in saying there is 8 MB of L3 total, no core has access to all the L3. This applies to the desktop and enterprise chips as well (in case it wasn’t explicitly stated before).
It sounds like each core of an 8-core Cezanne APU should be able to access up to 16 MB of L3, up from 4 MB on Renoir.
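The square-root rule of thumb from the review can be written out as a short worked equation. This is only a sketch: m (the L3 miss rate) and C (the L3 capacity a single core can reach) are symbols introduced here, and the last line, which applies the rule to Cezanne, is an extrapolation that ignores the non-inclusive cache policy the review mentions.

```latex
% Rule of thumb: doubling the cache cuts misses by a factor of sqrt(2)
m(C) \propto C^{-1/2}

% Desktop Zen 2 (4 MB per core) down to Renoir (1 MB per core): twice the misses
\frac{m(1\,\mathrm{MB})}{m(4\,\mathrm{MB})} = \sqrt{\frac{4}{1}} = 2

% Renoir (4 MB reachable per core) up to Cezanne (16 MB reachable): half the misses
\frac{m(16\,\mathrm{MB})}{m(4\,\mathrm{MB})} = \sqrt{\frac{4}{16}} = \frac{1}{2}
```

If the rule held exactly, the unified 16 MB L3 would roughly halve L3 misses relative to Renoir's 4 MB per CCX, which may be part of why the larger cache was needed.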
The Ryzen 7 5800U is reportedly equipped with an enhanced Vega GPU featuring 8 CUs or 512 cores, clocked in at 2000 MHz.
That's a 14% higher clock than the 8 Vega CUs in the 4800U, which run at 1750 MHz. Probably on the same "7nm" node. AMD is currently delivering only modest graphics improvements on its top APUs, since many of them will be paired with discrete mobile graphics chips. AMD's lower-powered Van Gogh should have up to a Zen 2 quad-core paired with RDNA 2 graphics (faster graphics than Cezanne seems likely), and Cezanne's successor "Rembrandt" will have RDNA 2 graphics (or RDNA 2+).
AMD will apparently also be releasing a Renoir (Zen 2) refresh known as "Lucienne", using the same 5000-series naming. Nobody liked that.
The Ryzen 7 5700U (Zen 2/Lucienne) will outperform the Ryzen 7 4800U, using the same 8 cores and 16 threads with a slightly higher boost clock and better graphics clock speeds, and will probably be found at cheaper price points than the 4800U.
6-core Ryzen 5 5600U (Zen 3/Cezanne) will have 12 MB of L3 cache, still an improvement over Renoir. Ryzen 3 5400U (Zen 3/Cezanne) will have 8 MB of L3 cache, usable in full by 4 cores.
Rembrandt will have some significant improvements over Cezanne, landing in 2022. It looks like every APU with RDNA 2 graphics in it will have "CVML", which I take to mean machine learning acceleration using the graphics cores. Rembrandt will also have PCIe4 and LPDDR5, on TSMC's "6nm" process, which may only offer a density increase (no performance or efficiency improvements).
The same chart in that article points to all Zen 4 desktop CPUs having a graphics chiplet, which would be a nice change.
Saudi Arabia denies crown prince held 'secret meeting' with Israeli PM
Saudi Arabia's foreign minister has denied that Israeli Prime Minister Benjamin Netanyahu flew to the Gulf kingdom on Sunday to secretly meet Crown Prince Mohammed bin Salman.
"No such meeting occurred," Prince Faisal bin Farhan Al Saud tweeted.
Mr Netanyahu has declined to comment on the Israeli reports that he was on board a private jet that travelled from Tel Aviv to the Red Sea city of Neom.
[...] Also on Monday, a delegation of senior Israeli officials travelled to Sudan on what would also be the first such visit to a formerly hostile country, an unnamed Israeli official confirmed. The countries are expected to map out areas of co-operation.
Citing unnamed Israeli sources, Israeli public broadcaster Kan and other media earlier reported that Mr Netanyahu and the head of the Mossad intelligence service, Yossi Cohen, attended talks in Saudi Arabia on Sunday evening with Crown Prince Mohammed and US Secretary of State Mike Pompeo.
Israel's Benjamin Netanyahu visits Saudi Arabia, official says
It's called a career because I career from disaster to disaster. I'm seriously thinking about changing jobs again. I've been in the current one for several years.
What's happened? Well, I'm quite lucky in that I've had plenty of opportunities to play with cool toys, to meet very clever people and to write code.
However, I have skills, learned previously, that are under-utilised, if used at all, and I'm frightened of losing them.
I'm also suffering from PHB. I know the term is pejorative, and the guy who's the PHB is actually a nice guy. I get on with him. There's not a bad bone in his body, but he's acting as my project leader.
The thing is, I'm training him, but he doesn't know what he doesn't know, and I'm doing a bad job of explaining it to him. So he keeps asking the same questions and making the same suggestions.
I have been asked to write a device driver. It's not the first one I've worked on, but it is the most complex, and I need to make sure it works properly. So, in order to understand the behaviour that needs to be implemented I have identified some FOSS that I can build on - to stand on the shoulders of giants - and I have implemented a test harness with unit tests and some model code in user-land to understand the behaviour before putting it in the kernel. I also have a nice automated build in a VM to mitigate against crashes.
There are a number of problems. First, this is a waterfall company. That means weeks and months of writing documents and drawing diagrams and leaving coding to the last minute, finding out it doesn't work and frantically rewriting it several times, doubling the schedule and blowing the budget. The fact that I have already found and fixed several flaws in various plans (including a buggy API, provided by good friends, that didn't even compile) doesn't register.
Second, drawing the pictures commits you to implementing the picture in code, even if you find out later that something was misunderstood or missing. Then you have to argue with a team and go round in circles to get it changed.
Third, and this takes the biscuit, "So is a driver a kernel module?"
In this context, yes. Yes it certainly is. This is not FUSE where there is a bare minimum kernel module with call-backs into user space.
Fourth, "I want you to use ${C++LIBRARY} in it."
It's kernel code. It's in C. And the functionality that your ${C++LIBRARY} provides in user space is something that the kernel already provides itself. We should be using what the kernel provides. "But..."
And we're supposed to be doing these new projects faster, cheaper and better than before.
20 years to go until I can retire. Argh!