CPU Performance

Readers of our motherboard review section will have noted the trend in modern motherboards to implement a form of MultiCore Enhancement / Acceleration / Turbo (read our report here) on their motherboards. This does several things, including better benchmark results at stock settings (not entirely needed if overclocking is an end-user goal) at the expense of heat and temperature. It also gives in essence an automatic overclock which may be against what the user wants. Our testing methodology is ‘out-of-the-box’, with the latest public BIOS installed and XMP enabled, and thus subject to the whims of this feature. It is ultimately up to the motherboard manufacturer to take this risk – and manufacturers taking risks in the setup is something they do on every product (think C-state settings, USB priority, DPC Latency / monitoring priority, memory subtimings at JEDEC). Processor speed change is part of that risk, and ultimately if no overclocking is planned, some motherboards will affect how fast that shiny new processor goes and can be an important factor in the system build.

For reference, the ASRock X99 WS-E/10G has MultiCore Turbo enabled by default on BIOS 1.01A.

Point Calculations – 3D Movement Algorithm Test: link

3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian Motion simulations and testing them for speed. High floating point performance, MHz and IPC wins in the single thread version, whereas the multithread version has to handle the threads and loves more cores.

3D Particle Movement: Single Threaded

3D Particle Movement: MultiThreaded

Compression – WinRAR 5.0.1: link

Our WinRAR test from 2013 is updated to the latest version of WinRAR at the start of 2014. We compress a set of 2867 files across 320 folders totaling 1.52 GB in size – 95% of these files are small typical website files, and the rest (90% of the size) are small 30 second 720p videos.

WinRAR 5.01, 2867 files, 1.52 GB

Image Manipulation – FastStone Image Viewer 4.9: link

Similarly to WinRAR, the FastStone test us updated for 2014 to the latest version. FastStone is the program I use to perform quick or bulk actions on images, such as resizing, adjusting for color and cropping. In our test we take a series of 170 images in various sizes and formats and convert them all into 640x480 .gif files, maintaining the aspect ratio. FastStone does not use multithreading for this test, and thus single threaded performance is often the winner.

FastStone Image Viewer 4.9

Video Conversion – Handbrake v0.9.9: link

Handbrake is a media conversion tool that was initially designed to help DVD ISOs and Video CDs into more common video formats. The principle today is still the same, primarily as an output for H.264 + AAC/MP3 audio within an MKV container. In our test we use the same videos as in the Xilisoft test, and results are given in frames per second.

HandBrake v0.9.9 LQ Film

HandBrake v0.9.9 2x4K

Rendering – PovRay 3.7: link

The Persistence of Vision RayTracer, or PovRay, is a freeware package for as the name suggests, ray tracing. It is a pure renderer, rather than modeling software, but the latest beta version contains a handy benchmark for stressing all processing threads on a platform. We have been using this test in motherboard reviews to test memory stability at various CPU speeds to good effect – if it passes the test, the IMC in the CPU is stable for a given CPU speed. As a CPU test, it runs for approximately 2-3 minutes on high end platforms.

POV-Ray 3.7 Beta RC4

Synthetic – 7-Zip 9.2: link

As an open source compression tool, 7-Zip is a popular tool for making sets of files easier to handle and transfer. The software offers up its own benchmark, to which we report the result.

7-zip Benchmark

System Performance Gaming Performance
Comments Locked

45 Comments

View All Comments

  • gsvelto - Tuesday, December 16, 2014 - link

    Where I worked we had extensive 10G SFP+ deployments with ping latency measured in single-digit µs. The latency numbers you gave are for pure-throughput oriented, low CPU overhead transfers and are obviously unacceptable if your applications are latency sensitive. Obtaining those numbers usually requires tweaking your power-scaling/idle governors as well as kernel offloads. The benefits you get are very significant on a number of loads (e.g. lots of small file over NFS for example) and 10GBase-T can be a lot slower on those workloads. But as I mentioned in my previous post 10GBase-T is not only slower, it's also more expensive, more power hungry and has a minimum physical transfer size of 400 bytes. So if you're load is composed of small packets and you don't have the luxury of aggregating them (because latency matters) then your maximum achievable bandwidth is greatly diminished.
  • shodanshok - Wednesday, December 17, 2014 - link

    Sure, packet size play a far bigger role for 10GBase-T then optical (or even copper) SFP+ links.

    Anyway, the pings tried before were for relatively small IP packets (physical size = 84 bytes), which are way lower then typical packet size.

    For message-passing workloads SFP+ is surely a better fit, but for MPI it is generally better to use more latency-oriented protocol stacks (if I don't go wrong, Infiniband use a lightweight protocol stack for this very reason).

    Regards.
  • T2k - Monday, December 15, 2014 - link

    Nonsense. CAT6a or even CAT6 would work just fine.
  • Daniel Egger - Monday, December 15, 2014 - link

    You're missing the point. Sure Cat.6a would be sufficient (it's hard to find Cat.7 sockets anyway but the cabling used nowadays is mostly Cat.7 specced, not Cat.6a) but the problem is to end up with a properly balanced wiring that is capable of properly establishing such a link. Also copper cabling deteriorates over time so the measurement protocol might not be worth snitch by the time you try to establish a 10GBase-T connection...

    Cat.6 is only usable with special qualification (TIA-155-A) over short distances.
  • DCide - Tuesday, December 16, 2014 - link

    I don't think T2k's missing the point at all. Those cables will work fine - especially for the target market for this board.

    You also had a number of other objections a few weeks ago, when this board was announced. Thankfully most of those have already been answered in the excellent posts here. It's indeed quite possible (and practical) to use the full 10GBase-T bandwidth right now, whether making a single transfer between two machines or serving multiple clients. At the time you said this was *very* difficult, implying no one will be able to take advantage of it. Fortunately, ASRock engineers understood the (very attainable) potential better than this. Hopefully now the market will embrace it, and we'll see more boards like this. Then we'll once again see network speeds that can keep up with everyday storage media (at least for a while).
  • shodanshok - Tuesday, December 16, 2014 - link

    You are right, but the familiar RJ45 & cables can be a strong motivation to go with 10GBase-T in some cases. For a quick example: one of our customer bought two Dell 720xd to use as virtualization boxes. The first R720xd is the active one, while the second 720xd is used as hot-standby being constantly synchronized using DRBD. The two boxes are directly connected with a simple Cat 6e cable.

    As the final customer was in charge to do both the physical installation and the normal hardware maintenance, a familiar networking equipment as RJ45 port and cables were strongly favored by him.

    Moreover, it is expected that within 2 die shrinks 10GBase-T controller become cheap/low power enough that they can be integrated pervasively, similar to how 1GBase-T replaced the old 100 Mb standard.

    Regards.
  • DigitalFreak - Monday, December 15, 2014 - link

    Don't know why the went with 8 PCI-E lanes for the 10Gig controller. 4 would have been plenty.

    1 PCI-E 3.0 lane is 1GB per second (x4 = 4GB). 10Gig max is 1.25 GB per second, dual port = 2.5 GB per second. Even with overhead you'd still never saturate an x4 link. Could have used the extra x4 for something else.
  • The Melon - Monday, December 15, 2014 - link

    I personally think it would be a perfect board if they replaced the Intel X540 controller with a Mellanox ConnectX-3 dual QSFP solution so we could choose between FDR IB and 40/10/1Gb Ethernet per port.

    Either that or simply a version with the same slot layout and drop the Intel X540 chip.

    Bottom line though is no matter how they lay it out we will find something to complain about.
  • Ian Cutress - Tuesday, November 1, 2016 - link

    The controller is PCIe 2.0, not PCIe 3.0. You need to use a PCIe 3.0 controller to get PCIe 3.0 speeds.
  • eanazag - Monday, December 15, 2014 - link

    I am assuming we are talking about the free ESXi Hypervisor in the test setup.

    SR-IOV (IOMMU) is not an enabled feature on ESXi with the free license. What this means is that networking is going to tax the CPU more heavily. Citrix Xenserver does support SR-IOV on the free product, which it is all free now - you just pay for support. This is a consideration to base the results of the testing methodology used here.

    Another good way to test 10GbE is using iSCSI where the server side is a NAS and the single client is where the disk is attached. The iSCSI LUN (hard drive) needs have something going on with an SSD. It can just be 3 spindle HDDs in RAID 5. You can use disk test software to drive the benchmarking. If you opt to use Xenserver with Windows as the iSCSI client. Have the VM directly connect to the NAS instead of using Xenserver to the iSCSI LUN because you will hit a performance cap from VM to host in the typical add disk within Xen. This is in older 6.2 version. Creedance is not fully out of beta yet. I have done no testing on Creedance and the contained changes are significant to performance.

    About two years ago I was working on coming up with the best iSCSI setup for VMs using HDDs in RAID and SSDs as caches. I was using Intel X540-T2's without a switch. I was working with Nexenta Stor and Sun/Oracle Solaris as iSCSI target servers run on physical hardware, Xen, and VMware. I encountered some interesting behavior in all cases. VMware's sub-storage yielded better hard drive performance. I kept running into an artifical performance limit because of the Windows client and how Xen handles the disks it provides. The recommendation was to add the iSCSI disk directly to the VM as the limit wouldn't show up there. VMware still imposed a performance ding on (Hit>10%) my setup. Physical hardware had the best performance for the NAS side.

Log in

Don't have an account? Sign up now