That seems particularly short-sighted. Your criterion for ARM needing to worry is apparently production silicon with vector capabilities, and according to this article that will be available next year.
It is a valid point right now to argue that RISC-V isn't a threat while production parts are not available to purchase. At present, even if I wanted to buy a competing product, I couldn't, because none is on sale. The situation may change next year, but we won't know until we see what both ARM and RISC-V look like at that point. The dynamics may change on both sides of that coin after time has passed.
Strawman. He didn't say they are not a threat. He said ARM has nothing to worry about. The distinction is slim, I admit, and also just semantic, but the sentiment I read from The_Assimilator was that this company has not produced a product that ARM would need to worry about in half a decade, and saying that ARM has nothing to worry about implies that they will not be producing any product that would compete with ARM for a long time. But the article states that they will be releasing the product that The_Assimilator says would compete with ARM next year, and any company would naturally worry about a competing product that close on the horizon.
In terms of being threatened, ARM cannot be threatened by something that cannot exist. But they may want to worry, i.e. take seriously the notion that a competitor may be threatening them soon.
There's never anything wrong with competitive designs. Either way, we are likely to enjoy improvements that result from various interests attempting to get ahead of one another so it's good all the same no matter what the outcome.
ARM was able to take over because of its inherent advantage over x86: Efficiency. As R&D advanced, ARM cores like the Apple A* line and Qualcomm Snapdragon have slowly worked towards Intel-level performance. MIPS has no large inherent benefit over ARM, but it has some of the same disadvantages vs. x86 such as larger executable code size. MIPS is an elegant architecture and lacks some of the strange baggage other architectures have (read about THUMB some time) but the story of semiconductors does not have elegance as a main protagonist. See Intel's history of success as an example.
Indeed, back in the early 8 bit days, the RCA CDP1802 was absolutely the most elegant architecture, but most people probably say "the what?". Outside of some niche applications, where the completely static nature allowed it to clock down to 0 and draw picoamps, and it being all CMOS so low power even when running, and being available in rad hardened form, it was all but invisible in the pre-IBM PC 8 bit days. Elegance doesn't equal design wins.
Interesting, I hadn't heard of this. I wonder if the same principles could be applied to idle processors today. Now if only Android would let the CPU go idle for more than a few ms.
The registers and such aren't static though, I don't think, on most modern processors, so they have to run at some minimum clock rate to be refreshed. Unused elements can be turned off today, and with turbo you have varying frequency, but the old 1802 could run from 0-4MHz (or 6, or 8, or as high as 12MHz in the last variants). It was almost RISC-like as well, with just 91 instructions, and those very well organized (for hex; Intel just loved octal, so the 8080 instruction set is highly aligned for octal - it makes no sense in hex). Most 1802 contemporaries had a very small register count - the 1802 has 16x16 general purpose registers plus accumulator and a couple of others. Of the 16, any one could be program counter, which could be switched on the fly (that's how they did subroutine calls with no CALL instruction), and any could be the index pointer for memory directed operations, also switchable on the fly, which means the 1802 has the best mnemonic ever - SEX, for SEt indeX. It was my first computer, and later when I had to add an assembly routine to an Apple 2 BASIC program to get proper performance, I was hugely frustrated by the lack of registers in the 6502. So many people worship the 6502; I frankly hated the thing. 1802 remains my favorite 8-bitter, followed by the Z80. Best part is, that computer I built from a kit more than 40 years ago still works perfectly.
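A toy sketch of that program-counter trick, for anyone who hasn't seen it. This is not an 1802 emulator: the tuple instructions, the addresses and the `run` helper are all made up for illustration. Only the idea comes from the description above: SEP picks which of the 16 registers acts as program counter, so a "call" is just a switch to a register that already points at the subroutine.

```python
# Toy model of "any register can be the program counter".
# Instruction tuples and addresses are hypothetical, not real 1802 encodings.

PROGRAM = {
    0: ("LOAD", 3, 10),                # R3 <- address of the subroutine
    1: ("OUT", "main: before call"),
    2: ("SEP", 3),                     # "call": R3 becomes the program counter
    3: ("OUT", "main: after call"),
    4: ("HALT",),
    10: ("OUT", "sub: working"),
    11: ("SEP", 0),                    # "return": R0 becomes the program counter again
}


def run(program, entry_reg=0, max_steps=100):
    regs = [0] * 16    # sixteen general registers
    p = entry_reg      # P selects which register is the program counter
    trace = []
    for _ in range(max_steps):
        instr = program[regs[p]]
        regs[p] += 1   # the active PC register advances past the fetched instruction
        op = instr[0]
        if op == "LOAD":
            regs[instr[1]] = instr[2]
        elif op == "SEP":
            p = instr[1]   # the old PC register keeps its value: that is the return address
        elif op == "OUT":
            trace.append(instr[1])
        elif op == "HALT":
            break
    return trace


print(run(PROGRAM))
```

In this toy, R3 is left pointing past the subroutine after the return, so it would have to be reloaded before a second call.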
A bit of research shows that Intel, AMD, and all the major ARM SOC vendors are using SRAM for all cache. Intel does use some DRAM in its Iris Pro graphics, which can be used by the CPU cache.
I'd put it differently. Pragmatism equals design wins, and I see ARM as the most pragmatic company out there.
Intel insisted on x86 uber alles, even where it made no sense (Larrabee, mobile) and paid the price.
RISC-V has insisted on a certain kind of intellectual purity that makes no sense in terms of commerce, or the future properties of CPU manufacturing (plentiful transistors).
ARM on the other hand, has always done a really masterful job of adding enough new functionality to get what they need at not too much cost, of changing the ISA when appropriate (but not too often), of accepting that some markets (like ARM-M) need different types of vector/DSP extensions from what's appropriate for ARM-A.
It feels very much like MIPS. (Since it comes from much the same people, substantially [too much so, IMHO] unchanged in those beliefs since the 1980s.)
It's a MIPS variant indeed. This is why it's so funny when people try to claim it's a modern ISA - it's literally based on 80's RISCs. Same people, same minimalistic approach to reducing instructions at the cost of larger codesize and lower performance. No lessons learnt from MIPS...
My point is it's the same people repeating the exact same mistakes. It has the same issues as MIPS like no register offset addressing or base with update. Some things are worse, for example branch ranges and immediate ranges are smaller than MIPS. That's what you get when you're stuck in the 80's dogma of making decode as simple as possible...
Arm never did things like the other RISCs. Is it possible to learn and do better today? Sure, look at AArch64 for example.
That's an exceptionally silly riposte. Are you unaware that ARM has constantly evolved their instruction set, not just tweaks but experimenting with substantial changes (like Thumb and Thumb2)?
There is a HUGE amount of learning that informed ARMv8, from the dropping of predication and shifting everywhere, to the way constants are encoded, to high-impact ideas like load/store pair and their particular version of conditional selection, to the codification of the memory ordering rules. Look at SVE as the newest version of something very different from what they were doing earlier.
On the other hand you can find demonstrations on how targeting RISC-V ISA can produce smaller end-products compared to targeting ARM or specifically MIPS.
Modularity of the ISA is another thing and the most appealing factor still is the open nature of the ISA. This is what likely drives the adoption outside of US academia in companies like WD and in academic-industrial projects in Europe (the exascale accelerator) and India (national ISA). The aim for some schools is to produce graduates directly familiar with an ISA and architectures utilized in the industry without additional training.
I do wonder what effect the variable length instruction encoding has on security if the system software is lacking on those demanding edge use-cases in the future, though.
Smaller products in what way? Saving a fraction of a mm^2 due to simplified decode is a great marketing story without doubt. However if you look at a modern SoC, typically less than 5% is devoted to the actual CPU cores. If the resulting larger codesize means you need to add more cache/flash/DRAM, increase clock frequency to deal with the extra instructions or makes it harder for a compiler to produce efficient code, is it really an optimal system-wide decision?
RISC-V is very similar to MIPS - MIPS never was great at codesize. When optimizing for size, compilers call special library functions to emulate instructions which are available on Arm. So you pay for saving a few transistors with lower performance and higher power consumption.
It's not a MIPS variant. MIPS is based on work at Stanford. RISC-V is the latest incarnation of the Berkeley RISC project. You are probably thinking of SPARC which is a derivative of earlier RISC project work. MIPS is only related in that it comes from similar ideas but the two projects, Stanford and Berkeley were different.
That's like making a big deal about the difference between Spanish and Portuguese. Sure, if you're Spanish this is a big deal. But to the rest of the world they're basically the same thing; created by people in constant contact and with the same world view.
Are they as different as Portuguese and Arabic? Spanish and Chinese? Are you really so ignorant that you don't know the family resemblance of Romance languages?
RISC-V has practically nothing in common with Berkeley RISC-I/SPARC (no condition codes, no register windows etc). Basically Berkeley adopted Stanford's approach to RISC and created a MIPS variant.
Stop calling it a MIPS variant. Just because they reached similar conclusions doesn't mean they are related. By your logic Ryzen is a variant of Core.
Furthermore I'd argue that your criticisms of RISC-V and MIPS lacking instructions miss the entire point of RISC. Storage is cheap. Who cares if the code is bigger? Mobile devices are packing hundreds of gigs of storage and PCs have terabytes today. Save the silicon, every bit counts there when it's making heat, drawing power and complicating clock propagation.
Would you prefer it being called a MIPS clone instead? I haven't seen two ISAs with such a great similarity as MIPS and RISC-V.
You're applying 80's RISC dogma which are no longer relevant. Transistors are cheap and efficient today, so we don't need to minimize them. We no longer optimize just the core or decoder but optimize the system as a whole. Who cares if you saved a few mW in the decoder when moving the extra instructions between DRAM and caches costs 10-100 times as much?
The RISC-V focus on simple instructions and decode is as crazy as a cult. They even want to add instruction fusion for eg. indexed accesses. So first simplify decode by leaving out useful instructions, then make it more complex again to try to make up for the missing instructions...
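For readers unfamiliar with the fusion idea being criticized here, a toy peephole pass shows the shape of it. Everything in it is an assumption made for illustration: the tuple encoding, the `ld.indexed` name and the `fuse_indexed_loads` helper are hypothetical, and this is not how any real RISC-V front end is built. It only recognizes one pattern (shift, add, load, all through the same destination register) and rewrites it as a single internal operation.

```python
# Toy macro-op fusion: three simple instructions -> one internal indexed load.
# Encoding (hypothetical): ("slli", rd, rs1, shamt), ("add", rd, rs1, rs2),
#                          ("ld", rd, offset, base)

def fuse_indexed_loads(instrs):
    out = []
    i = 0
    while i < len(instrs):
        window = instrs[i:i + 3]
        if len(window) == 3:
            a, b, c = window
            tmp = a[1]
            if (a[0] == "slli" and b[0] == "add" and c[0] == "ld"
                    and b[1] == tmp and b[2] == tmp and b[3] != tmp   # tmp = tmp + base
                    and c[1] == tmp and c[2] == 0 and c[3] == tmp):   # tmp = mem[tmp]
                # The temporary is overwritten by the load, so no
                # intermediate value is visible after the sequence.
                out.append(("ld.indexed", tmp, b[3], a[2], a[3]))
                i += 3
                continue
        out.append(instrs[i])
        i += 1
    return out


code = [
    ("slli", "t0", "a1", 3),     # t0 = a1 << 3
    ("add", "t0", "t0", "a0"),   # t0 = t0 + a0
    ("ld", "t0", 0, "t0"),       # t0 = mem[t0]
    ("addi", "a1", "a1", 1),     # unrelated, left alone
]
fused = fuse_indexed_loads(code)
print(fused)
```

The pass only fires when the load overwrites the temporary, which is the restriction that makes the fused form equivalent to the original three instructions.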
I've no problem with making comparisons to aspects of MIPS but saying it's a clone or derivative of it is reductionist.
"You're applying 80's RISC dogma which are no longer relevant"
You do realize that RISC won right? Since Pentium Pro all x86 cores have been internally RISC with a big decoder slapped on the front so nobody had to rehash years of work in their compilers or break legacy code.
The RISC-V focus on simple instructions and decode is as crazy as a cult. They even want to add instruction fusion for eg. indexed accesses. So first simplify decode by leaving out useful instructions, then make it more complex again to try to make up for the missing instructions...
You accuse everyone else of holding on to an 80's dogma yet you are the one who sounds like they are from the 80's. Like some die hard greybeard who won't give up their VAX.
Neither ARM nor any other instruction set has any inherent advantage over any other. Anybody making a statement like that is just plain ignorant of how modern CPU's are designed. This is besides the fact that if ARM was inherently better than x86 as you claim it would have already displaced x86 on the desktop and server. In fact, every single desktop and server ARM architecture developed so far has fallen on its face in competition against the x86 processors.
x86 CPU's haven't used x86 instructions internally since the Pentium Pro in the mid 90's. The shift to out of order execution required that an x86 instruction decoder be added and abstraction from the instruction set became the norm. Since the x86 instruction set was abstracted with a hardware abstraction layer I dare say every single Intel CPU since the Pentium pro has used a different internal RISC architecture than every other generation with no two being exactly identical. This has allowed Intel massive flexibility to pursue whatever internal architecture works best with their FAB process while maintaining x86 compatibility through the decoder which occupies almost no space anymore. On modern processors that decoder occupies something like 0.001% of the die and simply translates all those x86 instructions into whatever internal architecture the CPU actually uses.
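To make the "decoder translates into an internal architecture" point concrete, here is a deliberately simplified sketch. The micro-op names, the `tmp0` register and the `decode` helper are invented for illustration; real internal formats are not public and differ between generations, so treat this as a picture of the idea only.

```python
# Toy decoder: one architectural instruction -> a list of simpler internal ops.
# Instruction form (hypothetical): (op, dst, src), where "[reg]" means memory at reg.

def decode(instr):
    op, dst, src = instr
    if dst.startswith("["):
        # Memory destination: read-modify-write becomes load / op / store.
        addr = dst.strip("[]")
        return [("load", "tmp0", addr), (op, "tmp0", src), ("store", addr, "tmp0")]
    if src.startswith("["):
        # Memory source: load into a temporary, then operate register-to-register.
        addr = src.strip("[]")
        return [("load", "tmp0", addr), (op, dst, "tmp0")]
    # Register-to-register: already simple, passes through as one op.
    return [(op, dst, src)]


print(decode(("add", "[rbx]", "rax")))
print(decode(("add", "rax", "rcx")))
```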
If I'm not mistaken, ARM moved to an instruction decoder with the shift to out of order execution as well, and their designs since then no longer use pure ARM instructions within the core. Although the simplicity of the ARM RISC architecture means they don't need as much abstraction as x86, there is no point in being anchored to the design parameters of the instruction set when hardware decoders are so cheap.
The only reason ARM dominates the markets it does without Intel competition is that Intel is unwilling to compete in those markets at those prices. If Intel was to produce and sell smartphone chips that were competitive in both performance and price with the ARM chips, they'd cannibalize their higher margin products when OEMs grabbed those chips and started making higher end products by stapling 10 inexpensive cell phone processors together, ending up with a product that's competitive with the chips they sell for $1000. That's why on a lot of the cheaper products Intel sells they put restrictions on their use.
You might not remember, but Intel went on a design spree in 2008 when there were market indications and predictions that the tablet and smartphone were going to destroy the PC marketplace. They had almost a dozen design teams producing low power and high performance CPU's. The products that came out of that ranged from Edison on the low end to server Atoms like Avoton that were 25-watt, 8-core CPU's. Intel's executives canceled most of these products or put major restrictions (such as amount of RAM, wattage, etc) on their use to try to avoid cannibalizing higher margin products (for example Avoton had some ridiculous restrictions such as no more than two memory slots). In this time period they produced a mostly competitive product for smartphones (it was about 5% slower than the highest end Qualcomm chip at the time) but they didn't sell any because they set the price higher than what Qualcomm wanted for their ARM chip. You can find articles on those chips on Google, and you will note the reviewers lamented the price and restrictions Intel put on the chip because they destroyed its competitiveness. But that's the thing, Intel's executives and board didn't want to compete in this market.
Intel has always struggled with competing in these lower margin products because they know that if they produce a performant low power chip and sell it ARM cheap (ARM chips typically sell with single digit margins) there will be a dozen OEM's like Dell, HP or Lenovo that start stapling a dozen together and selling them as replacements for very high margin x86 products (Intel has 60% margins on their higher end products and can push margins as high as 75% on their server chips).
ARM doesn't have any inherent advantage over Intel or AMD because of their instruction set. They do have a slight advantage because their business structure allows them to avoid the production side and focus on design, and they have a lot of partners to help advance the ecosystem, while ARM the company isn't affected by Qualcomm or Broadcom selling ARM chips with 5% margins. But make no mistake, IMO if Intel wanted to slash their margins to the level that the ARM chip makers get (and watch their stock price crater) they could easily put an x86 chip into every market ARM dominates right now and become the number one seller. They choose not to because of the damage it would do to their stock price and the high end market.
That's quite a long-winded way of saying "I don't believe ISA matters"...
But the fact is, it does. Intel spent over $10 Billion to get into the phone/tablet market. They didn't just lower their margins, they slashed them - they literally paid $100 for each chip they "sold"! And despite having a process advantage at the time, the mobile Atoms still weren't competitive on power or performance. Given how hard they tried and how much money they spent, it's safe to say the x86 ISA complexity prevented them making competitive chips.
The same is true at the high end. Mobile phones already have the same single-threaded performance as the fastest x86 CPU you can buy today. Do you think (or hope) it will end there? Arm consistently improves performance by 20-30% per year. In the next few years both Intel and AMD are in for some serious competition from much faster Arm cores in laptops and servers.
Classic SIMD (SSE/AVX or Neon) is not nearly as helpful as Dynamic Scheduling (or Out of order execution). Yes, you can have hand-coded loops with good performance, but that's it. And they only work for very regular code.
In the 80s, instruction sets made a significant difference.
But in the 90s, superscalar out-of-order came out and it beat everything else, by a large margin. These days, that's how you get performance, pretty much (high IPC from dynamic scheduling).
With that "classic SIMD", the instruction set and register width sometimes increased a lot with each generational jump, and developers have been limited to producing code for an ISA a couple of generations back: for the lowest-spec hardware that users were expected to own. There have also not been very good development tools and compilers, which has forced developers to hand-code or to use libraries that were geared towards only certain kinds of loops.
The first of these is about to change with new ISAs. RISC-V's leading SIMD proposal and the SVE extension to ARM processors use _scalable_ vectors, where the register width is not limited by the ISA but by the specific processor it runs on. These ISAs are therefore expected to remain more stable than classic SIMD ISAs have. Compilers are also now much better than before at auto-vectorising code to run on SIMD hardware. These two improvements together mean that more code could be SIMD instructions, and that more of a processor's potential could be taken advantage of.
High-performance computing has been largely taken over by GPUs, which are in essence super-wide SIMD machines, using predicate vectors for much of their flow control. (Predicates being only late additions to SSE and Neon.) The scalable vector proposal for RISC-V is by some considered so promising that there have even been talks about building GPUs based around the RISC-V SIMD ISA -- optimised for SIMD first and general-compute second.
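As a rough picture of what "predicate vectors for flow control" means, here is a minimal sketch in plain Python lists. It is not GPU or SVE code, and the `predicated_abs` helper is a made-up name; it only shows a per-element `if` turned into a mask that selects between two results computed for every lane.

```python
# Scalar code:  if x < 0: x = -x
# Predicated form: compute the predicate for every lane, compute the
# "taken" result for every lane, then select per lane using the mask.

def predicated_abs(values):
    mask = [v < 0 for v in values]      # predicate vector: one bit per lane
    negated = [-v for v in values]      # the branch body, run on all lanes
    return [n if m else v for v, n, m in zip(values, negated, mask)]


print(predicated_abs([3, -1, 0, -7]))
```

No lane ever branches; lanes where the predicate is false simply keep their old value.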
If their business was selling physical cores, you might have a point, but like ARM they're an IP company. But unlike ARM, adopters don't need an expensive architecture license to develop their own cores, and unlike ARM the architecture is designed for adopters to extend, with well-defined rules for operation encoding to do so. Early adopters are building their own cores, some with standard cores, but many with their own core designs or ISA extensions that would be impractical in ARM's ecosystem. One of the reasons that companies don't really extend ARM is that you'd need a new ARM architecture license if ARM changes (as they did with ARMv8, say) and now you want to bring your investment forward -- you've locked yourself in to ARM licensing cost, and you're in a hard spot if you don't like the way ARM moves next.
It's also worth noting that RISC-V has taken a lot of time to do their vector ISA right -- not only is the vector ISA homogeneous and complete (every suitable scalar op has a vector equivalent) but its structure is programmer-centric and forward-compatible -- that is, you write the vector code using the appropriate ALU width for the problem, and the CPU runs it across the full vector width it actually has. If you run your vector code on a machine 2 years from now and the vector unit is twice as wide, that same code runs twice as fast, and perhaps twice as fast again in two more years. Or 16 times faster next year on a specialized RISC-V vector accelerator. This is so much better than traditional SIMD ISAs like AVX/SSE/MMX, Altivec, or NEON -- if Intel had done this with their vector ISA, original SSE code would run 8-16 times faster today, instruction-for-instruction.
You scoff at where they are 5 years in, but where they are is competitive with ARM's own current IP. The industry momentum shown by that and the ecosystem buildup around RISC-V is incredible.
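The forward-compatibility argument can be sketched as a strip-mined loop. This is a simulation in plain Python, loosely modeled on the set-vector-length idea (`vsetvli` in the RISC-V vector extension); the `vla_add` helper and the `hw_lanes` parameter are assumptions for illustration, and the step count is only a stand-in for "vector instructions issued", not a performance measurement.

```python
# Vector-length-agnostic add: the same loop body runs on any hardware width.
# hw_lanes stands in for however many elements the machine handles at once.

def vla_add(a, b, hw_lanes):
    n = len(a)
    out = [0] * n
    i = 0
    steps = 0
    while i < n:
        vl = min(hw_lanes, n - i)   # ask for "as many lanes as you can give me"
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        i += vl                     # advance by what the hardware actually did
        steps += 1
    return out, steps


a = list(range(16))
b = [10] * 16
result_narrow, steps_narrow = vla_add(a, b, hw_lanes=4)
result_wide, steps_wide = vla_add(a, b, hw_lanes=8)
print(steps_narrow, steps_wide)
```

The code never hard-codes a width, which is the contrast with fixed-width SIMD where a wider unit needs new instructions and a recompile. The tail case (`n - i` smaller than `hw_lanes`) falls out of the same line instead of needing a separate scalar loop.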
"where they are is competitive with ARM's own current IP"
It might match performance of a 4.5 year old Cortex-A72 next year, maybe (*). But that's nowhere near being competitive with Arm's current IP... Arm sells much faster and more efficient cores like Cortex-A77.
(*) It's easy to make bold claims in marketing, let's see how it performs in the real world.
Most cellphones sold today are still using 4 or 8 A53 cores. A core that gets better performance in less die area is sure to attract some notice.
More to the point, my Raspberry Pi 4 with 4x A72@1.5GHz along with a crappy SD card and 4GB of slow, single-lane RAM is almost fast enough for daily use doing normal consumer things and light software development. 4 of these cores at almost twice the speed paired with slightly better IO and RAM is probably all the computing most people need.
It would be hard to find a niche that Arm hasn't already covered. Remember that both Cortex-A53 and A72 have fast dual floating point units as well as SIMD, but the U8 doesn't include SIMD, so any area comparisons are going to look great for the U8.
But that seems to have got entangled with a DIFFERENT research concept (namely run the vector engine asynchronously from the rest of the CPU), which certainly can't help with getting the ideas in commercial production on time.
I've no idea how this will play out:
- full Hwacha (SVE + decoupled execution)
- Hwacha as a "normal" sort of instruction set, like SVE, or
- commercial partners settle on a smaller NEON-like instruction set to get basic SIMD up and running.
This is obviously an application specific product and not meant to be as universal as ARM or off the shelf RISC SoC's.
It seems their edge is in having efficient execution units to reduce power consumption so this will be good for ultra low power devices that still need decent performance.
How is SVE even remotely comparable to this? An extension to ARM still has the inherent 'flaws' of ARM. That it's RISC. Adding x64, SSE, SVX, VX, etc to x86 didn't change the fact that it's still x86.
You clearly lack the foresight to see this company has a (niche) product that fills a gap in the market.
Ideal RISC does not support SIMD because one of the requirements of RISC is a single instruction does a single operation. SIMD "Single Instruction Multiple Data" is antithetical to that. Even the RISC-V designers view SIMD/VectorProcessing as a necessary evil and purposefully keep it limited.
There is no such requirement in RISC. SIMD is big and complicated of course but so is floating point and it naturally fits with the floating point pipeline.
I mean, look at Intel! They've been stuck at 14nm for the past half-decade. And this company can fix their issues quite quickly. They're much more nimble.
Yes because Arduino, ESP8x and RaspPi, all need SIMD and Vect ops. (Bit of sarcasm there) These devices sell in the millions mostly as IOT Edge or embedded control devices.
One of the Intel CxOs, back around the release of the 8086, allowed as how he'd rather have the chip in every Ford than in every PC. Not likely anyone would say so today, but the use of embedded cpu is where this all started, not PC cpu.
What matters, if anyone can do it, is an analysis of dissimilar ISA, ARM v. RISC-V for example, without regard to implementation, e.g. cache size and other 'stretchable' components that depend on engineering of silicon (area, mostly), not abstract architecture. As many have said over the years, RISC machines (real world) have incrementally included CISC instructions.
The 2.3x IPC part is ideal, the processor isn't magically going to never stall etc. If they can actually get as close as 3.1/3.22 that's very good. And yes the wording makes you want to add one but they clearly didn't mean that.
So this is an out-of-order architecture, but does it also involve speculative execution, and if so have they put in some protection against Spectre attacks? I see a branch prediction block in there...
How do people use microprocessors? Do they write programs for them in assembly language? Or do they purchase or download programs that other people have already written? Since it's mostly the latter, what matters isn't the elegance of the architecture, but how much is already written for it. That's why we're going to be stuck with x86 for a while.
No one writes programs in assembly language. They write them in a portable language (C, C++, etc) which can be cross-compiled to various architectures. Or in an interpreted language (JavaScript, Python, etc), which does not care what architecture it's running on.
So it really doesn't matter anymore what the underlying architecture is. It only did (ironically) when people did write programs in assembly, which is architecture dependent.
I would LOVE for this to take off to high performance desktop. Open source, anyone can develop a HP core and computing would take off like never before, rather than relying on the two incumbent CPU makers of x86. Bleh.
Probably find myself in no-man's-land with this, but, as far as I'm concerned, the more choice there is amongst CPU architectures, the better. I don't believe in the one size or type fits all. So, good news that RISC-V is growing; if nothing else, it keeps ARM on their game. And what's wrong with that?
"and ever since SiFive has been in an upward trend of success and hypergrowth."
Is it a paid promotion? If not, can you please avoid using their marketing BS?
SiFive is funded by Intel to try and take some steam from Arm. There is nothing wrong with that, but the ISA has zero innovation and is even behind the modern Arm ISA.
68 Comments
The_Assimilator - Wednesday, October 30, 2019 - link
Half a decade and they don't even have a production part with SIMD or vector capabilities? Arm has nothing to worry about.
bji - Wednesday, October 30, 2019 - link
That seems particularly short-sighted. Your criterion for ARM needing to worry is apparently production silicon with vector capabilities, and according to this article that will be available next year.
PeachNCream - Wednesday, October 30, 2019 - link
It is a valid point right now to argue that RISC-V isn't a threat while production parts are not available to purchase. At present, even if I wanted to buy a competing product, I couldn't, because none is on sale. The situation may change next year, but we won't know until we see what both ARM and RISC-V look like at that point. The dynamics may change on both sides of that coin after time has passed.
bji - Wednesday, October 30, 2019 - link
Strawman. He didn't say they are not a threat. He said ARM has nothing to worry about. The distinction is slim, I admit, and also just semantic, but the sentiment I read from The_Assimilator was that this company has not produced a product that ARM would need to worry about in half a decade, and saying that ARM has nothing to worry about implies that they will not be producing any product that would compete with ARM for a long time. But the article states that they will be releasing the product that The_Assimilator says would compete with ARM next year, and any company would naturally worry about a competing product that close on the horizon.
In terms of being threatened, ARM cannot be threatened by something that cannot exist. But they may want to worry, i.e. take seriously the notion that a competitor may be threatening them soon.
bji - Wednesday, October 30, 2019 - link
"by something that cannot exist" should have been "by something that does not exist". Post editing, AnandTech-style!
PeachNCream - Wednesday, October 30, 2019 - link
Okay, that's cool.
There's never anything wrong with competitive designs. Either way, we are likely to enjoy improvements that result from various interests attempting to get ahead of one another so it's good all the same no matter what the outcome.
Sivar - Wednesday, October 30, 2019 - link
ARM was able to take over because of its inherent advantage over x86: Efficiency. As R&D advanced, ARM cores like the Apple A* line and Qualcomm Snapdragon have slowly worked towards Intel-level performance.
MIPS has no large inherent benefit over ARM, but it has some of the same disadvantages vs. x86 such as larger executable code size. MIPS is an elegant architecture and lacks some of the strange baggage other architectures have (read about THUMB some time) but the story of semiconductors does not have elegance as a main protagonist. See Intel's history of success as an example.
rrinker - Wednesday, October 30, 2019 - link
Indeed, back in the early 8 bit days, the RCA CDP1802 was absolutely the most elegant architecture, but most people probably say "the what?". Outside of some niche applications, where the completely static nature allowed it to clock down to 0 and draw picoamps, and it being all CMOS so low power even when running, and being available in rad hardened form, it was all but invisible in the pre-IBM PC 8 bit days. Elegance doesn't equal design wins.
Sivar - Wednesday, October 30, 2019 - link
Interesting, I hadn't heard of this. I wonder if the same principles could be applied to idle processors today. Now if only Android would let the CPU go idle for more than a few ms.
rrinker - Wednesday, October 30, 2019 - link
The registers and such aren't static though, I don't think, on most modern processors, so they have to run at some minimum clock rate to be refreshed. Unused elements can be turned off today, and with turbo you have varying frequency, but the old 1802 could run from 0-4MHz (or 6, or 8, or as high as 12MHz in the last variants). It was almost RISC-like as well, with just 91 instructions, and those very well organized (for hex; Intel just loved octal, so the 8080 instruction set is highly aligned for octal - it makes no sense in hex). Most 1802 contemporaries had a very small register count - the 1802 has 16x16 general purpose registers plus accumulator and a couple of others. Of the 16, any one could be program counter, which could be switched on the fly (that's how they did subroutine calls with no CALL instruction), and any could be the index pointer for memory directed operations, also switchable on the fly, which means the 1802 has the best mnemonic ever - SEX, for SEt indeX. It was my first computer, and later when I had to add an assembly routine to an Apple 2 BASIC program to get proper performance, I was hugely frustrated by the lack of registers in the 6502. So many people worship the 6502; I frankly hated the thing. 1802 remains my favorite 8-bitter, followed by the Z80. Best part is, that computer I built from a kit more than 40 years ago still works perfectly.
AshlayW - Saturday, November 2, 2019 - link
Thanks for the read :) very informative
hecksagon - Sunday, November 3, 2019 - link
A bit of research shows that Intel, AMD, and all the major ARM SOC vendors are using SRAM for all cache. Intel does use some DRAM in its Iris Pro graphics, which can be used by the CPU cache.
peevee - Tuesday, November 5, 2019 - link
"The registers and such aren't static though"
They absolutely are.
name99 - Wednesday, October 30, 2019 - link
I'd put it differently.
Pragmatism equals design wins, and I see ARM as the most pragmatic company out there.
Intel insisted on x86 uber alles, even where it made no sense (Larrabee, mobile) and paid the price.
RISC-V has insisted on a certain kind of intellectual purity that makes no sense in terms of commerce, or the future properties of CPU manufacturing (plentiful transistors).
ARM on the other hand, has always done a really masterful job of adding enough new functionality to get what they need at not too much cost, of changing the ISA when appropriate (but not too often), of accepting that some markets (like ARM-M) need different types of vector/DSP extensions from what's appropriate for ARM-A.
bji - Wednesday, October 30, 2019 - link
But this is not the MIPS ISA. It's a new ISA. Why are you mentioning MIPS?
name99 - Wednesday, October 30, 2019 - link
It feels very much like MIPS. (Since it comes from much the same people, substantially [too much so, IMHO] unchanged in those beliefs since the 1980s.)
Wilco1 - Wednesday, October 30, 2019 - link
It's a MIPS variant indeed. This is why it's so funny when people try to claim it's a modern ISA - it's literally based on 80's RISCs. Same people, same minimalistic approach to reducing instructions at the cost of larger codesize and lower performance. No lessons learnt from MIPS...
bji - Wednesday, October 30, 2019 - link
ARM is literally 80's RISC too. What is your point? That nobody who has designed an ISA in the past can make a better ISA in the future?
Wilco1 - Wednesday, October 30, 2019 - link
My point is it's the same people repeating the exact same mistakes. It has the same issues as MIPS like no register offset addressing or base with update. Some things are worse, for example branch ranges and immediate ranges are smaller than MIPS. That's what you get when you're stuck in the 80's dogma of making decode as simple as possible...
Arm never did things like the other RISCs. Is it possible to learn and do better today? Sure, look at AArch64 for example.
name99 - Thursday, October 31, 2019 - link
That's an exceptionally silly riposte. Are you unaware that ARM has constantly evolved their instruction set, not just tweaks but experimenting with substantial changes (like Thumb and Thumb2)?
There is a HUGE amount of learning that informed ARMv8, from the dropping of predication and shifting everywhere, to the way constants are encoded, to high-impact ideas like load/store pair and their particular version of conditional selection, to the codification of the memory ordering rules.
Look at SVE as the newest version of something very different from what they were doing earlier.
peevee - Tuesday, November 5, 2019 - link
"ARM is literally 80's RISC too"
Armv8? No it is not. It has very many complex instructions, more than any CISC CPU from the 80s.
TeXWiller - Wednesday, October 30, 2019 - link
On the other hand you can find demonstrations on how targeting the RISC-V ISA can produce smaller end-products compared to targeting ARM or specifically MIPS.
Modularity of the ISA is another thing and the most appealing factor still is the open nature of the ISA. This is what likely drives the adoption outside of US academia in companies like WD and in academic-industrial projects in Europe (the exascale accelerator) and India (national ISA). The aim for some schools is to produce graduates directly familiar with an ISA and architectures utilized in the industry without additional training.
I do wonder what effect the variable length instruction encoding has on security if the system software is lacking on those demanding edge use-cases in the future, though.
Wilco1 - Wednesday, October 30, 2019 - link
Smaller products in what way? Saving a fraction of a mm^2 due to simplified decode is a great marketing story without doubt. However if you look at a modern SoC, typically less than 5% is devoted to the actual CPU cores. If the resulting larger codesize means you need to add more cache/flash/DRAM, increase clock frequency to deal with the extra instructions, or it makes it harder for a compiler to produce efficient code, is it really an optimal system-wide decision?
TeXWiller - Wednesday, October 30, 2019 - link
I meant in terms of codesize as that was one of the bases of the MIPS comparison. Sorry for the confusion.
Wilco1 - Thursday, October 31, 2019 - link
RISC-V is very similar to MIPS - MIPS never was great at codesize. When optimizing for size, compilers call special library functions to emulate instructions which are available on Arm. So you pay for saving a few transistors with lower performance and higher power consumption.
zmatt - Thursday, October 31, 2019 - link
It's not a MIPS variant. MIPS is based on work at Stanford. RISC-V is the latest incarnation of the Berkeley RISC project. You are probably thinking of SPARC which is a derivative of earlier RISC project work. MIPS is only related in that it comes from similar ideas but the two projects, Stanford and Berkeley, were different.
name99 - Thursday, October 31, 2019 - link
That's like making a big deal about the difference between Spanish and Portuguese.
Sure, if you're Spanish this is a big deal. But to the rest of the world they're basically the same thing; created by people in constant contact and with the same world view.
zmatt - Friday, November 1, 2019 - link
Well, Spanish and Portuguese are different. And claiming they are the same gets you labeled as either an idiot or a bigot.
name99 - Friday, November 1, 2019 - link
Are they as different as Portuguese and Arabic? Spanish and Chinese?
Are you really so ignorant that you don't know the family resemblance of Romance languages?
Wilco1 - Thursday, October 31, 2019 - link
RISC-V has practically nothing in common with Berkeley RISC-I/SPARC (no condition codes, no register windows etc). Basically Berkeley adopted Stanford's approach to RISC and created a MIPS variant.
zmatt - Friday, November 1, 2019 - link
Stop calling it a MIPS variant. Just because they reached similar conclusions doesn't mean they are related. By your logic Ryzen is a variant of Core.
Furthermore I'd argue that your criticism of RISC-V and MIPS lacking instructions misses the entire point of RISC. Storage is cheap. Who cares if the code is bigger? Mobile devices are packing hundreds of gigs of storage and PCs have terabytes today. Save the silicon, every bit counts there when it's making heat, drawing power and complicating clock propagation.
Wilco1 - Friday, November 1, 2019 - link
Would you prefer it being called a MIPS clone instead? I haven't seen two ISAs with such a great similarity as MIPS and RISC-V.
You're applying 80's RISC dogma which is no longer relevant. Transistors are cheap and efficient today, so we don't need to minimize them. We no longer optimize just the core or decoder but optimize the system as a whole. Who cares if you saved a few mW in the decoder when moving the extra instructions between DRAM and caches costs 10-100 times as much?
The RISC-V focus on simple instructions and decode is as crazy as a cult. They even want to add instruction fusion for eg. indexed accesses. So first simplify decode by leaving out useful instructions, then make it more complex again to try to make up for the missing instructions...
zmatt - Monday, November 4, 2019 - link
I've no problem with making comparisons to aspects of MIPS but saying it's a clone or derivative of it is reductionist.
Threska - Wednesday, November 6, 2019 - link
Storage is cheap. Bandwidth isn't. Moving more around to get the same effect isn't always better.
rahvin - Friday, November 1, 2019 - link
Neither ARM nor any other instruction set has an inherent advantage over any other. Anybody making a statement like that is just plain ignorant of how modern CPUs are designed. This is besides the fact that if ARM was inherently better than x86 as you claim it would have already displaced x86 on the desktop and server. In fact, every single desktop and server ARM architecture developed so far has fallen on its face in competition against the x86 processors.
x86 CPUs haven't used x86 instructions internally since the Pentium Pro in the mid 90's. The shift to out of order execution required that an x86 instruction decoder be added and abstraction from the instruction set became the norm. Since the x86 instruction set was abstracted with a hardware abstraction layer I dare say every single Intel CPU since the Pentium Pro has used a different internal RISC architecture than every other generation with no two being exactly identical. This has allowed Intel massive flexibility to pursue whatever internal architecture works best with their FAB process while maintaining x86 compatibility through the decoder which occupies almost no space anymore. On modern processors that decoder occupies something like 0.001% of the die and simply translates all those x86 instructions into whatever internal architecture the CPU actually uses.
If I'm not mistaken ARM moved to an instruction decoder with the shift to out of order execution as well, and their designs since no longer use pure ARM instructions within the core. Although the simplicity of the ARM RISC architecture means they don't need as much abstraction as x86, there is no point in being anchored to the design parameters of the instruction set when hardware decoders are so cheap.
The only reason ARM dominates the markets it does without Intel competition is that Intel is unwilling to compete in those markets at those prices. If Intel was to produce and sell smartphone chips that were competitive in both performance and price with the ARM chips they'd cannibalize their higher margin products when OEMs grabbed those chips and started making higher end products by stapling 10 inexpensive cell phone processors together and ending up with a product that's competitive with the chips they sell for $1000. That's why on a lot of the cheaper products Intel sells they put restrictions on their use.
You might not remember but Intel went on a design spree in 2008 when there were market indications and predictions that the tablet and smartphone were going to destroy the PC marketplace. They had almost a dozen design teams producing low power and high performance CPUs. The products that came out of that ranged from Edison on the low end to the server Atoms like Avoton that were 25-watt 8-core CPUs. Intel's executives canceled most of these products or put major restrictions (such as amount of RAM, wattage, etc) on their use to try to avoid cannibalizing higher margin products (for example Avoton had some ridiculous restrictions such as no more than two memory slots). In this time period they produced a mostly competitive product for smartphones (it was about 5% slower than the highest end Qualcomm chip at the time) but they didn't sell any because they set the price higher than what Qualcomm wanted for their ARM chip. You can find articles on those chips on Google and you will note the reviewers that lamented the price and restrictions Intel put on the chip because they destroyed its competitiveness. But that's the thing, Intel's executives and board didn't want to compete in this market.
Intel has always struggled with competing in these lower margin products because they know that if they produce a performant low power chip and sell it ARM cheap (ARM chips typically sell with single digit margins) there will be a dozen OEMs like Dell, HP or Lenovo that start stapling a dozen together and selling them as replacements for very high margin x86 products (Intel has 60% margins on their higher end products and can push margins as high as 75% on their server chips).
ARM doesn't have any inherent advantage over Intel or AMD because of their instruction set. They do have a slight advantage because their business structure allows them to avoid the production side and focus on design, and they have a lot of partners to help advance the ecosystem while ARM the company isn't affected by Qualcomm or Broadcom selling ARM chips with 5% margins. But make no mistake, IMO if Intel wanted to slash their margins to the level that the ARM chip makers get (and watch their stock price crater) they could easily put an x86 chip into every market ARM dominates right now and become the number one seller. They choose not to because of the damage it would do to their stock price and the high end market.
Wilco1 - Saturday, November 2, 2019 - link
That's quite a long-winded way of saying "I don't believe ISA matters"...
But the fact is, it does. Intel spent over $10 Billion to get into the phone/tablet market. They didn't just lower their margins, they slashed them - they literally paid $100 for each chip they "sold"! And despite having a process advantage at the time, the mobile Atoms still weren't competitive on power or performance. Given how hard they tried and how much money they spent, it's safe to say the x86 ISA complexity prevented them making competitive chips.
The same is true at the high end. Mobile phones already have the same single-threaded performance as the fastest x86 CPU you can buy today. Do you think (or hope) it will end there? Arm consistently improves performance by 20-30% per year. In the next few years both Intel and AMD are in for some serious competition from much faster Arm cores in laptops and servers.
vladpetric - Wednesday, October 30, 2019 - link
Classic SIMD (SSE/AVX or Neon) is not nearly as helpful as Dynamic Scheduling (or Out of order execution). Yes, you can have hand-coded loops with good performance, but that's it. And they only work for very regular code.
In the 80s, instruction sets made a significant difference.
But in the 90s, superscalar out-of-order came out and it beat everything else, by a large margin. These days, that's how you get performance, pretty much (high IPC from dynamic scheduling).
Threska - Friday, November 1, 2019 - link
"But in the 90s, superscalar out-of-order came out and it beat everything else, by a large margin."
And now we're paying the security price.
vladpetric - Thursday, November 7, 2019 - link
At this time, turn off hyper-threading and you'll be fine.
Findecanor - Sunday, November 3, 2019 - link
With that "classic SIMD", the instruction set and register width sometimes increased a lot with each generational jump, and developers have been limited to producing code for an ISA a couple of generations back: for the lowest-spec hardware that users were expected to own.
There have also not been very good development tools and compilers, which has forced developers to hand-code or to use libraries that were geared towards only certain kinds of loops.
The first of these is about to change with new ISA. RISC-V's leading SIMD proposal and the SVE extension to ARM processors use _scalable_ vectors, where the register width is not limited by the ISA but by the specific processor it runs on. These ISAs are therefore expected to remain more stable than classic SIMD ISAs have.
Compilers are also now much better than before at auto-vectorising code to run on SIMD hardware.
These two improvements together mean that more code could be SIMD instructions, and that more of a processor's potential could be taken advantage of.
High-performance computing has been largely taken over by GPUs, which are in essence super-wide SIMD machines, using predicate vectors for much of their flow control. (Predicates being only late additions to SSE and Neon)
The scalable vector proposal for RISC-V is by some considered so promising that there have even been talks about building GPUs based around the RISC-V SIMD ISA -- optimised for SIMD first and general-compute second.
vladpetric - Thursday, November 7, 2019 - link
You're right. RISC-V SIMD, as opposed to classic SIMD, is really something to be excited about.
I really disagree about auto-vectorising though, unless we're talking about FORTRAN code.
The parent post implied that not having classic SIMD in RISC-V is something of a showstopper.
ravyne - Wednesday, October 30, 2019 - link
If their business was selling physical cores, you might have a point, but like ARM they're an IP company. But unlike ARM, adopters don't need an expensive architecture license to develop their own cores, and unlike ARM the architecture is designed for adopters to extend, with well-defined rules for operation encoding to do so. Early adopters are building their own cores, some with standard cores, but many with their own core designs or ISA extensions that would be impractical in ARM's ecosystem. One of the reasons that companies don't really extend ARM is that you'd need a new ARM architecture license if ARM changes (as they did with ARMv8, say) and now you want to bring your investment forward -- you've locked yourself in to ARM licensing cost, and you're in a hard spot if you don't like the way ARM moves next.
It's also worth noting that RISC-V has taken a lot of time to do their vector ISA right -- not only is the vector ISA homogenous and complete (every suitable scalar op has a vector equivalent) but its structure is programmer-centric and forward-compatible -- that is, you write the vector code using the appropriate ALU width for the problem, and the CPU runs it across the full vector width it actually has. If you run your vector code on a machine 2 years from now and the vector unit is twice as wide, that same code runs twice as fast, and perhaps twice as fast again in two more years. Or 16 times faster next year on a specialized RISC-V vector accelerator. This is so much better than traditional SIMD ISAs like AVX/SSE/MMX, Altivec, or NEON -- if Intel had done this with their vector ISA, original SSE code would run 8-16 times faster today, instruction-for-instruction.
You scoff at where they are 5 years in, but where they are is competitive with ARM's own current IP. The industry momentum shown by that and the ecosystem buildup around RISC-V is incredible.
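To make the forward-compatibility point above a bit more concrete, here is a rough Python sketch of a vector-length-agnostic (strip-mined) loop. It is only an illustration under stated assumptions: `hardware_vlen`, `set_vl` and `vla_add` are made-up names, and `set_vl` merely imitates the role a "set vector length" step plays in scalable vector ISAs. It is not real RISC-V code and says nothing about actual speedups.

```python
def set_vl(remaining, hardware_vlen):
    """Hypothetical stand-in for a 'set vector length' step: the hardware
    grants as many elements as it can process at once, capped by what is left."""
    return min(remaining, hardware_vlen)


def vla_add(a, b, hardware_vlen):
    """Add two equal-length lists in chunks sized by the (assumed) hardware.

    Returns the result list and the number of vector iterations used."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if hardware_vlen < 1:
        raise ValueError("hardware_vlen must be at least 1")
    out = []
    iterations = 0
    i = 0
    while i < len(a):
        vl = set_vl(len(a) - i, hardware_vlen)
        # One "vector instruction": operate on vl elements in a single step.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        iterations += 1
    return out, iterations


a = list(range(32))
b = list(range(32, 64))
# The same "program" runs unchanged on a narrow and a wide (imaginary) machine.
narrow_result, narrow_iters = vla_add(a, b, hardware_vlen=4)
wide_result, wide_iters = vla_add(a, b, hardware_vlen=8)
print(narrow_iters, wide_iters)
```

The loop never hard-codes a width, so the wider imaginary machine simply needs fewer iterations for the same answer. With fixed-width SIMD the chunk size is baked into the instructions, which is why older code cannot use wider registers without being rewritten or recompiled.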
Wilco1 - Wednesday, October 30, 2019 - link
"where they are is competitive with ARM's own current IP"
It might match performance of a 4.5 year old Cortex-A72 next year, maybe (*). But that's nowhere near being competitive with Arm's current IP... Arm sells much faster and more efficient cores like Cortex-A77.
(*) It's easy to make bold claims in marketing, let's see how it performs in the real world.
quadrivial - Wednesday, October 30, 2019 - link
Most cellphones sold today are still using 4 or 8 A53 cores. A core that gets better performance in less die area is sure to attract some notice.
More to the point, my Raspberry Pi 4 with 4x A72@1.5GHz along with a crappy SD card and 4GB of slow, single-lane RAM is almost fast enough for daily use doing normal consumer things and light software development. 4 of these cores at almost twice the speed paired with slightly better IO and RAM is probably all the computing most people need.
Wilco1 - Thursday, October 31, 2019 - link
It would be hard to find a niche that Arm hasn't already covered. Remember that both Cortex-A53 and A72 have fast dual floating point units as well as SIMD, but the U8 doesn't include SIMD, so any area comparisons are going to look great for the U8.
name99 - Wednesday, October 30, 2019 - link
Parts of the RISC-V community have been investigating an instruction set that looks like SVE
https://content.riscv.org/wp-content/uploads/2018/...
But that seems to have got entangled with a DIFFERENT research concept (namely run the vector engine asynchronously from the rest of the CPU), which certainly can't help with getting the ideas in commercial production on time.
I've no idea how this will play out
- full Hwacha (SVE + decoupled execution)
- Hwacha as a "normal" sort of instruction set, like SVE, or
- commercial partners settle on a smaller NEON-like instruction set to get basic SIMD up and running.
Samus - Wednesday, October 30, 2019 - link
This is obviously an application-specific product and not meant to be as universal as ARM or off-the-shelf RISC SoCs.
It seems their edge is in having efficient execution units to reduce power consumption so this will be good for ultra low power devices that still need decent performance.
cpuaddicted - Wednesday, October 30, 2019 - link
You mean not unlike the SVE capable ARM chips available today? lol
Samus - Thursday, October 31, 2019 - link
How is SVE even remotely comparable to this? An extension to ARM still has the inherent 'flaws' of ARM: that it's RISC. Adding x64, SSE, SVX, VX, etc. to x86 didn't change the fact that it's still x86.
You clearly lack the foresight to see this company has a (niche) product that fills a gap in the market.
bcronce - Wednesday, October 30, 2019 - link
Ideal RISC does not support SIMD because one of the requirements of RISC is that a single instruction does a single operation. SIMD "Single Instruction Multiple Data" is antithetical to that. Even the RISC-V designers view SIMD/vector processing as a necessary evil and purposefully keep it limited.
Wilco1 - Thursday, October 31, 2019 - link
There is no such requirement in RISC. SIMD is big and complicated of course but so is floating point and it naturally fits with the floating point pipeline.
Lbibass - Tuesday, November 5, 2019 - link
I mean, look at Intel! They've been stuck at 14nm for the past half-decade. And this company can fix their issues quite quickly. They're much more nimble.
digitalgriffin - Wednesday, November 6, 2019 - link
Yes because Arduino, ESP8x and RaspPi all need SIMD and vector ops. (Bit of sarcasm there.) These devices sell in the millions, mostly as IoT edge or embedded control devices.
FunBunny2 - Wednesday, October 30, 2019 - link
One of the Intel CxOs, back around the release of the 8086, allowed as how he'd rather have the chip in every Ford than in every PC. Not likely anyone would say so today, but the use of embedded CPUs is where this all started, not PC CPUs.
What matters, if anyone can do it, is an analysis of dissimilar ISAs, ARM v. RISC-V for example, without regard to implementation, e.g. cache size and other 'stretchable' components that depend on engineering of silicon (area, mostly), not abstract architecture. As many have said over the years, RISC machines (real world) have incrementally included CISC instructions.
name99 - Wednesday, October 30, 2019 - link
It's SiFive's first OoO core, not the first RISC-V OoO core.
BOOM (Berkeley Out of Order Machine) is from around 2016
https://github.com/riscv-boom/riscv-boom
levizx - Wednesday, October 30, 2019 - link
2.3X IPC * 1.4X F = 3.22X PERF
and since 2.3, 1.4 are "higher" while 3.1 is "total", it actually should be
3.3*2.4=7.92X performance >> 3.1X
Something isn't right.
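The two readings of the numbers can be checked with a few lines of Python. This is just the arithmetic above written out; the function and variable names are made up for illustration, and it assumes performance is simply IPC times frequency.

```python
def combined_speedup(ipc_factor, freq_factor):
    """Performance scales with IPC times clock frequency (simplifying assumption)."""
    return ipc_factor * freq_factor


claimed_ipc = 2.3    # quoted IPC figure
claimed_freq = 1.4   # quoted frequency figure
claimed_total = 3.1  # quoted overall performance figure

# Reading 1: the figures are multipliers ("2.3x the IPC").
as_multipliers = combined_speedup(claimed_ipc, claimed_freq)

# Reading 2: the figures are increases ("2.3x higher", i.e. 3.3x the IPC).
as_increases = combined_speedup(1 + claimed_ipc, 1 + claimed_freq)

print(f"as multipliers: {as_multipliers:.2f}x")
print(f"as increases:   {as_increases:.2f}x")
print(f"claimed total:  {claimed_total:.2f}x")
```

Neither product matches the quoted total exactly, but the multiplier reading lands much closer to it than the "higher" reading does.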
The_Assimilator - Wednesday, October 30, 2019 - link
It's marketing to secure more funding because the company doesn't actually have any real silicon to show, what do you expect?
surt - Saturday, November 2, 2019 - link
The 2.3x IPC part is ideal, the processor isn't magically going to never stall etc. If they can actually get as close as 3.1/3.22 that's very good. And yes the wording makes you want to add one but they clearly didn't mean that.
EugeneBelford - Wednesday, October 30, 2019 - link
Kate Libby: RISC is good
Samus - Friday, November 1, 2019 - link
Hackers predicted so much yet nothing at all :)
GreenReaper - Thursday, October 31, 2019 - link
So this is an out-of-order architecture, but does it also involve speculative execution, and if so have they put in some protection against Spectre attacks? I see a branch prediction block in there...
notashill - Friday, November 1, 2019 - link
Of course it has speculative execution, just like almost every other CPU made in the past 25 or so years.
quadibloc - Thursday, October 31, 2019 - link
How do people use microprocessors? Do they write programs for them in assembly language? Or do they purchase or download programs that other people have already written? Since it's mostly the latter, what matters isn't the elegance of the architecture, but how much is already written for it. That's why we're going to be stuck with x86 for a while.
andychow - Thursday, October 31, 2019 - link
No one writes programs in assembly language. They write them in a portable language (C, C++, etc) which can be cross-compiled to various architectures. Or in an interpreted language (JavaScript, Python, etc), which does not care what architecture it's running on.
So it really doesn't matter anymore what the underlying architecture is. It only did (ironically) when people did write programs in assembly, which is architecture dependent.
AshlayW - Saturday, November 2, 2019 - link
I would LOVE for this to take off to high performance desktop. Open source, anyone can develop a HP core and computing would take off like never before, rather than relying on the two incumbent CPU makers of x86. Bleh.
eastcoast_pete - Sunday, November 3, 2019 - link
I probably find myself in no-man's-land with this, but, as far as I'm concerned, the more choice there is amongst CPU architectures, the better. I don't believe in one size or type fits all. So, good news that RISC-V is growing; if nothing else, it keeps ARM on their game. And what's wrong with that?
peevee - Tuesday, November 5, 2019 - link
"and ever since SiFive has been in an upward trend of success and hypergrowth."
Is it a paid promotion? If not, can you please avoid using their marketing BS?
SiFive is funded by Intel to try and take some steam from Arm. There is nothing wrong with that, but the ISA has zero innovation and is even behind the modern Arm ISA.
GreenReaper - Thursday, November 7, 2019 - link
It's called a "content-led campaign": https://www.futureplc.com/services/advertising/