The Pursuit of Clock Speed

Thus far I have pointed out that a number of resources in Bulldozer have been cut back compared to AMD's Phenom II architecture. Many of these tradeoffs were made in order to keep die size in check while adding new features (e.g. wider front end, larger queues/data structures, new instruction support). Everywhere from the Bulldozer front end through the execution clusters, AMD's opportunity to increase performance depends on both efficiency and clock speed. Bulldozer has to make better use of its resources than Phenom II as well as run at higher frequencies to outperform its predecessor. As a result, a major target for Bulldozer was to be able to scale to higher clock speeds.

AMD's architects called this pursuit a low gate count per pipeline stage design. By reducing the number of gates per pipeline stage, you reduce the time spent in each stage and can increase the overall frequency of the processor. If this sounds familiar, it's because Intel used similar logic in the creation of the Pentium 4.
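To make that relationship concrete, here is a minimal sketch of how the amount of logic in a stage bounds clock speed. The per-gate delay and latch overhead are illustrative assumptions of mine, not figures from AMD, and the function name is hypothetical.

```python
def max_frequency_ghz(gates_per_stage, gate_delay_ps=15.0, latch_overhead_ps=45.0):
    """Estimate a pipeline's peak clock from its per-stage logic depth.

    gate_delay_ps and latch_overhead_ps are assumed, illustrative values;
    they do not describe Bulldozer or any real manufacturing process.
    """
    # A stage cannot complete faster than its chain of gates plus the
    # fixed cost of latching the result for the next stage.
    cycle_time_ps = gates_per_stage * gate_delay_ps + latch_overhead_ps
    # A cycle time in picoseconds converts to GHz as 1000 / ps.
    return 1000.0 / cycle_time_ps


if __name__ == "__main__":
    for gates in (21, 17, 13):
        print(f"{gates} gates/stage -> {max_frequency_ghz(gates):.2f} GHz")
```

Under these assumed delays, trimming a stage from 17 gates to 13 lifts the frequency ceiling from about 3.3GHz to about 4.2GHz. The fixed latch overhead also means each further cut buys a little less than the last.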

Where Bulldozer differs, AMD insists, is that the design didn't aggressively pursue frequency like the P4, but rather aggressively pursued gate count reduction per stage. According to AMD, the former results in power problems while the latter is more manageable.

AMD's target for Bulldozer was a 30% higher frequency than the previous generation architecture. Unfortunately that's a fairly vague statement and I couldn't get AMD to commit to anything more specific, but if we look at the top-end Phenom II X6 at 3.3GHz, a 30% increase in frequency would put Bulldozer at 4.3GHz.

Unfortunately 4.3GHz isn't what the top-end AMD FX CPU ships at. The best we'll get at launch is 3.6GHz, a meager 9% increase over the outgoing architecture. Turbo Core does get AMD close to those initial frequency targets; however, the turbo frequencies are typically only seen for very short periods of time.
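The arithmetic behind those two figures is simple enough to check. This sketch uses only the clock speeds quoted above; the variable names are my own.

```python
# Clock speeds quoted in the text, in GHz.
PHENOM_II_X6_GHZ = 3.3   # top-end Phenom II X6
FX_LAUNCH_GHZ = 3.6      # top-end AMD FX base clock at launch
TARGET_UPLIFT = 0.30     # AMD's stated 30% frequency goal

# Clock speed implied by the 30% goal.
target_ghz = PHENOM_II_X6_GHZ * (1.0 + TARGET_UPLIFT)

# Percentage uplift the launch part actually delivers at base clock.
actual_uplift_pct = (FX_LAUNCH_GHZ / PHENOM_II_X6_GHZ - 1.0) * 100.0

# Fraction of the promised uplift that made it into the shipping base clock.
fraction_of_goal = (actual_uplift_pct / 100.0) / TARGET_UPLIFT

if __name__ == "__main__":
    print(f"Implied target: {target_ghz:.2f} GHz")
    print(f"Actual uplift:  {actual_uplift_pct:.1f}%")
    print(f"Share of goal:  {fraction_of_goal:.0%}")
```

Put differently, the base clock delivers less than a third of the frequency increase AMD was aiming for.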

As you may remember from the Pentium 4 days, a significantly deeper pipeline can bring with it significant penalties. We have two prior examples of architectures that increased pipeline length over their predecessors: Willamette and Prescott.

Willamette doubled the pipeline length of the P6 and was supposed to make up for it with a corresponding increase in clock frequency. If you do less per clock cycle, you need to throw more clock cycles at the problem to have a neutral impact on performance. Although Willamette ran at higher clock speeds than the outgoing P6 architecture, the increase in frequency was gated by process technology. It wasn't until Northwood arrived that Intel could hit the clock speeds required to truly put distance between its newest and older architectures.

Prescott lengthened the pipeline once more, this time quite significantly. Much to our surprise however, thanks to a lot of clever work on the architecture side Intel was able to keep average instructions executed per clock constant while increasing the length of the pipe. This enabled Prescott to hit higher frequencies and deliver more performance at the same time, without starting at an inherent disadvantage. Where Prescott did fall short however was in the power consumption department. Running at extremely high frequencies required very high voltages and as a result, power consumption skyrocketed.
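The reason voltage hurts so much is that dynamic power scales roughly with capacitance times voltage squared times frequency. Here is a rough sketch of that relationship; it assumes switched capacitance stays constant and ignores leakage, and the 15% voltage bump in the example is a hypothetical number, not a measured Prescott figure.

```python
def relative_dynamic_power(freq_scale, voltage_scale):
    """Return dynamic power relative to a baseline of 1.0.

    Uses the common approximation P ~ C * V^2 * f, with capacitance C
    assumed unchanged. Leakage power is deliberately left out.
    """
    return freq_scale * voltage_scale ** 2


if __name__ == "__main__":
    # Hypothetical case: 30% more frequency that needs 15% more voltage.
    scale = relative_dynamic_power(freq_scale=1.30, voltage_scale=1.15)
    print(f"Dynamic power vs. baseline: {scale:.2f}x")
```

In this hypothetical case a 30% clock increase costs roughly 72% more dynamic power, which is why chasing frequency through voltage gets expensive so quickly.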

AMD's goal with Bulldozer was to have IPC remain constant compared to its predecessor, while increasing frequency, similar to Prescott. If IPC can remain constant, any frequency increases will translate into performance advantages. AMD attempted to do this through a wider front end, larger data structures within the chip and a wider execution path through each core. In many senses it succeeded, however single threaded performance still took a hit compared to Phenom II:

[Chart: Cinebench 11.5 - Single Threaded]

At the same clock speed, Phenom II is almost 7% faster per core than Bulldozer according to our Cinebench results. This takes into account all of the aforementioned IPC improvements. Despite AMD's efforts, IPC went down.

A slight reduction in IPC however is easily made up for by an increase in operating frequency. Unfortunately, it doesn't appear that AMD was able to hit the clock targets it needed for Bulldozer this time around.
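To put rough numbers on that tradeoff, single threaded performance can be treated as IPC multiplied by clock speed. The sketch below assumes the roughly 7% per-clock gap from Cinebench applies across the board and that performance scales linearly with frequency, neither of which holds exactly. The function names are hypothetical.

```python
def relative_performance(clock_ghz, ref_clock_ghz, ref_ipc_advantage):
    """Performance of the new chip relative to the reference chip (1.0 = equal).

    ref_ipc_advantage is how much faster the reference chip is per clock,
    e.g. 1.07 for a 7% per-clock lead. Assumes performance = IPC * clock.
    """
    return (clock_ghz / ref_clock_ghz) / ref_ipc_advantage


def break_even_clock_ghz(ref_clock_ghz, ref_ipc_advantage):
    """Clock speed the new chip needs just to match the reference chip."""
    return ref_clock_ghz * ref_ipc_advantage


if __name__ == "__main__":
    phenom_ghz, ipc_gap = 3.3, 1.07  # figures quoted in the article
    print(f"Break-even clock: {break_even_clock_ghz(phenom_ghz, ipc_gap):.2f} GHz")
    for clock in (3.6, 4.3):
        perf = relative_performance(clock, phenom_ghz, ipc_gap)
        print(f"{clock} GHz -> {perf:.2f}x Phenom II at {phenom_ghz} GHz")
```

On these assumptions Bulldozer needs about 3.53GHz just to match a 3.3GHz Phenom II in single threaded work. The 3.6GHz launch clock works out to only about a 2% lead, whereas the 4.3GHz target would have delivered roughly 22%.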

We've recently reported on GlobalFoundries' issues with 32nm yields. I can't help but wonder if the same type of issues that are impacting Llano today are also holding Bulldozer back.


430 Comments


  • dingetje - Wednesday, October 12, 2011 - link

    i agree with some that bulldozer is more like faildozer, but...

    let's keep supporting amd so the one getting piledrive'd in the naughty place will not be you when intel has zero competition left because you did not want to spend a little more for a little less....and let's be honest, it IS just a little.

    if enough ppl drop amd, in the end WE will be the one paying for amd's lack of support.

    at least amd is trying.....the question is, what are YOU going to do to stop intel becoming your bunghole-piledriving overlord?
  • wolfman3k5 - Wednesday, October 12, 2011 - link

    Supporting incompetence is like socialism (or even communism). Eventually those that are supported will sit around like dogs all day and do nothing but lick their hairy balls...
  • dingetje - Wednesday, October 12, 2011 - link

    ah...someone has been brainwashed by watching too much fox news.
    communism baaaad boogabooga!! ....duhhhhh lol roflmao

    sure, capitalism works...however, it only works when there actually IS competition.
    i wish your (most likely already loose) rectum good luck.
  • wolfman3k5 - Wednesday, October 12, 2011 - link

    Apparently money won't motivate the Monkey Engineers at AMD, so maybe making fun of them will. I mean, where is their pride, right?

    By the way, I've seen real socialism, so I have a clue what it is. And it is what I just described. I don't like Intel because they are not healthy for our economy, yet their only competition just pulled a gigantic fuck-up.
  • dingetje - Wednesday, October 12, 2011 - link

    oooooo oooga boooga socialism is bad....it take away aaalll you money...it verrry baddd.....oooooogabooogaboooo!! LOL

    have fun getting eaten alive by china after your capitalistic model became cancerous and will die from the inside out.

    your country is bought and paid for and will be eaten alive by the "communistic" chinese who are in fact just the same as what the usa has become: a corporate dictatorship (not communism and certainly not socialism).

    sorry, i didnt mean to scare you more than you obviously already are.
    i would send you some lube to ease the pain, but i'm all out ;)
  • UberApfel - Wednesday, October 12, 2011 - link

    My god you're all so retarded...

    Dingetje; China has serious issues when it comes to the welfare of their people. China only owns 10% of our debt, and that is thanks to China becoming capitalistic as a nation.

    Wolfman; Bulldozer is a server processor. The server market is where the money is especially with the cloud and enthusiast-class desktops becoming rare. Intel has 30X AMD's market capital... they can afford to target multiple markets. AMD can't.

    Bulldozer is superior with integer processing in both performance-per-core and performance-per-watt. Of course; I do wonder why desktop applications even need floating point... (numbers < -2^63 or > 2^63)
  • hasu - Wednesday, October 12, 2011 - link

    Likewise... killing or trying to control competition is also communism.
  • radium69 - Wednesday, October 12, 2011 - link

    Jeebus, that power consumption is going through the roof!
    Also there were some rumors that it would go up to 8GHz, I wonder if it would use a kW by then...

    I want to see how they compare to each other when overclocked to 4,5 or more or less.
    Also Anand, can you do an efficiency test? Various overclocking speeds and bench these while monitoring the power consumption. Might make an interesting article :)
  • ypsylon - Wednesday, October 12, 2011 - link

    Not really - even including AMD fanboys. AMD can't understand that to move forward you must abolish old stuff for good. Brand new and spanking Bulldozer has its roots in ancient K6. Do something new for crying out loud or get lost and stop wasting time. Don't release CPUs just for the sake of offering something. That is not the point of CPU market. Even Intel can shoot themselves in the foot with X79. Looks like it will be similar failure to FailDozer. Nobody will invest in entirely new platform for 10 maybe 15% performance boost over X58 which is the new 775 socket. Long live the S1366! Plenty of life and fuel left in Nehalems, plenty... If you wanted to buy Bulldozer then go and buy X58 platform. After nearly 4 years on the market it is [somewhat ;)] dirt cheap.

    Anand one thing: I find it puzzling that you reckon that Bulldozer will do well in server environments. With that kind of performance/Watt and inefficient power management? No chance in hell. i7/Xeons will eat FailDozers for breakfast.
  • wolfman3k5 - Wednesday, October 12, 2011 - link

    I'm not. I completely agree with everything that you've said.

    And, if I might add: Dear AMD, and dear AMD engineers (and lazy fucks that you are), throwing more cache at an already inefficient architecture is not going to solve your problem. Add to that that you people (yes, you AMD people) are calling a 4 Core CPU an 8 Core because you've added another Integer Unit to each core. WTF?! That's almost like calling a quad core Intel 2600K an 8 Core CPU because it has Hyper Threading.

    I have been an avid AMD supporter since 1996. I have spent many thousands of dollars on your CPUs and other hardware that you people make. I'm done. Not another penny! Ever!
