Intel Z68 Chipset & Smart Response Technology (SSD Caching) Review
by Anand Lal Shimpi on May 11, 2011 2:34 AM EST
The problem with Sandy Bridge was simple: if you wanted to use Intel's integrated graphics, you had to buy a motherboard based on an H-series chipset. Unfortunately, Intel's H-series chipsets don't let you overclock the CPU or memory—only the integrated GPU. If you want to overclock the CPU and/or memory, you need a P-series chipset—which doesn't support Sandy Bridge's on-die GPU. Intel effectively forced overclockers to buy discrete GPUs from AMD or NVIDIA, even if they didn't need the added GPU power.
The situation got more complicated from there. Sandy Bridge's Quick Sync was one of the best features of the platform, but it was only available when you used the CPU's on-die GPU, which once again meant you needed an H-series chipset with no support for overclocking. You could either have Quick Sync or overclocking, but not both (at least initially).
Finally, Intel did very little to actually move chipsets forward with its 6-series Sandy Bridge platform. Native USB 3.0 support was out and won't be included until Ivy Bridge; we got a pair of 6Gbps SATA ports and PCIe 2.0 slots, but not much else. I can't help but feel like Intel was purposefully very conservative with its chipset design. Despite all of that, the seemingly conservative chipset design was plagued by the single largest bug Intel has ever faced publicly.
As strong as the Sandy Bridge launch was, the 6-series chipset did little to help it.
Addressing the Problems: Z68
In our Sandy Bridge review I mentioned a chipset that would come out in Q2 that would solve most of Sandy Bridge's platform issues. A quick look at the calendar reveals that it's indeed the second quarter of the year, and a quick look at the photo below reveals the first motherboard to hit our labs based on Intel's new Z68 chipset:
Architecturally, Intel's Z68 chipset is no different from the H67. It supports video output from any Sandy Bridge CPU and has the same number of USB ports, SATA ports and PCIe lanes. What the Z68 chipset adds, however, is full overclocking support for CPU, memory and integrated graphics, giving you the choice to do pretty much anything you'd want.
Pricing should be similar to P67, with motherboards selling for a $5 to $10 premium. Not all Z68 motherboards will come with video out; those that do may carry an additional $5 premium on top of that to cover the licensing fees for Lucid's Virtu software, which will likely be bundled with most if not all Z68 motherboards that have iGPU out. Lucid's software excluded, any price premium is a little ridiculous here given that the functionality offered by Z68 should've been there from the start. I'm hoping over time Intel will come to its senses, but for now Z68 will still be sold at a slight premium over P67.
Overclocking: It Works
Ian will have more on overclocking in his article on ASUS' first Z68 motherboard, but in short it works as expected. You can use Sandy Bridge's integrated graphics and still overclock your CPU. Of course the Sandy Bridge overclocking limits still apply—if you don't have a CPU that supports Turbo (e.g. Core i3 2100), your chip is entirely clock locked.
Ian found that overclocking behavior on Z68 was pretty similar to P67. You can obviously also overclock the on-die GPU on Z68 boards with video out.
The Quick Sync Problem
Back in February we previewed Lucid's Virtu software, which allows you to have a discrete GPU but still use Sandy Bridge's on-die GPU for Quick Sync, video decoding and basic 2D/3D acceleration.
Virtu works by intercepting the command stream directed at your GPU. Depending on the source of the commands, they are directed at either your discrete GPU (dGPU) or on-die GPU (iGPU).
There are two physical approaches to setting up Virtu. You can connect your display to either the iGPU or the dGPU. If you do the former (i-mode), the iGPU handles all display duties and any rendering done on the dGPU has to be copied over to the iGPU's frame buffer before being output to your display. Note that you can run an application in a window that requires the dGPU while running another that uses the iGPU (e.g. Quick Sync).
As you can guess, there is some amount of overhead in the process, which we've measured to varying degrees. When it works well the overhead is typically limited to around 10%; however, we've seen situations where a native dGPU setup is over 40% faster.
Lucid Virtu i-mode Performance Comparison (1920 x 1200, Highest Quality Settings)
| Configuration | Metro 2033 | Mafia II | World of Warcraft | Starcraft 2 | DiRT 2 |
|---|---|---|---|---|---|
| AMD Radeon HD 6970 | 35.2 fps | 61.5 fps | 81.3 fps | 115.6 fps | 137.7 fps |
| AMD Radeon HD 6970 (Virtu) | 24.3 fps | 58.7 fps | 74.8 fps | 116.6 fps | 117.9 fps |
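To make the overhead concrete, the short Python sketch below works out how much faster the native dGPU setup is than Virtu i-mode in each game. The frame rates are copied from the table above; the function and variable names are illustrative only.

```python
# Frame rates (fps) copied from the i-mode comparison table.
native = {
    "Metro 2033": 35.2,
    "Mafia II": 61.5,
    "World of Warcraft": 81.3,
    "Starcraft 2": 115.6,
    "DiRT 2": 137.7,
}
virtu = {
    "Metro 2033": 24.3,
    "Mafia II": 58.7,
    "World of Warcraft": 74.8,
    "Starcraft 2": 116.6,
    "DiRT 2": 117.9,
}


def native_advantage_pct(native_fps, virtu_fps):
    """Percent by which the native dGPU result exceeds the Virtu i-mode result."""
    return (native_fps / virtu_fps - 1.0) * 100.0


# One rounded percentage per game; negative means Virtu was (marginally) ahead.
advantage = {
    game: round(native_advantage_pct(native[game], virtu[game]), 1)
    for game in native
}

for game, pct in advantage.items():
    print(f"{game}: native dGPU {pct:+.1f}% vs. Virtu i-mode")
```

By this measure Metro 2033 is the outlier at roughly 45% in favor of the native setup, while Starcraft 2 lands within a percent either way.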
The dGPU doesn't completely turn off when it's not in use in this situation, however it will be in its lowest possible idle state.
The second approach (d-mode) requires that you connect your display directly to the dGPU. This is the preferred route for the absolute best 3D performance since there's no copying of frame buffers. The downside here is that you will likely have higher idle power as Sandy Bridge's on-die GPU is probably more power efficient under non-3D gaming loads than any high end discrete GPU.
With a display connected to the dGPU and with Virtu running you can still access Quick Sync. CrossFire and SLI are both supported in d-mode only.
As I mentioned before, Lucid determines where to send commands based on the source of the commands. In i-mode all commands go to the iGPU by default, and in d-mode everything goes to the dGPU. The only exceptions are if there are particular application profiles defined within the Virtu software that list exceptions. In i-mode that means a list of games/apps that should run on the dGPU, and in d-mode that is a smaller list of apps that use Quick Sync (as everything else should run on the dGPU).
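The routing rule described above can be sketched in a few lines of Python. This is only an illustration of the default-plus-exceptions logic, not Lucid's implementation: the executable names in the profile lists are hypothetical placeholders, and the real software works on the GPU command stream rather than on strings.

```python
from enum import Enum


class Mode(Enum):
    I_MODE = "i-mode"  # display connected to the iGPU
    D_MODE = "d-mode"  # display connected to the dGPU


# Hypothetical profile lists, standing in for Virtu's application profiles.
I_MODE_EXCEPTIONS = {"game.exe"}        # games/apps that should run on the dGPU
D_MODE_EXCEPTIONS = {"transcoder.exe"}  # apps that use Quick Sync on the iGPU


def route(mode, app):
    """Return which GPU receives commands issued by `app` in the given mode."""
    if mode is Mode.I_MODE:
        # Default is the iGPU; profiled apps are redirected to the dGPU.
        return "dGPU" if app in I_MODE_EXCEPTIONS else "iGPU"
    # d-mode: default is the dGPU; profiled Quick Sync apps go to the iGPU.
    return "iGPU" if app in D_MODE_EXCEPTIONS else "dGPU"


print(route(Mode.I_MODE, "browser.exe"))     # unprofiled app in i-mode
print(route(Mode.D_MODE, "transcoder.exe"))  # Quick Sync app in d-mode
```

Note the asymmetry in list sizes that falls out of this design: in d-mode only the handful of Quick Sync apps need a profile, whereas i-mode needs an entry for every game that should reach the dGPU.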
Virtu works, although there are still some random issues when running in i-mode. Your best bet to keep Quick Sync functionality and maintain the best overall 3D performance is to hook your display up to your dGPU and only use Sandy Bridge's GPU for transcoding. Ultimately I'd like to see Intel enable this functionality without the use of third-party software utilities.
MrCromulent - Wednesday, May 11, 2011
Thanks for the review! Good to see that Intel's SSD caching actually works quite well.
I'm looking forward to the next generation of SB notebooks with a ~20GB mSATA SSD drive combined with a 1TB 2,5" hard drive.
dac7nco - Wednesday, May 11, 2011
Indeed. I'd be interested in seeing how a Crucial M4 64GB mated to a pair of short-stroked single-platter Samsung drives in RAID-0 would perform in a dedicated gaming system.
JarredWalton - Wednesday, May 11, 2011
Really? Man, I thought short-stroking drives was all but dead these days. That's the whole point of SSDs: if you're so concerned about storage performance that you're willing to short-stroke an HDD, just move to a full SSD and be done with it. Plus, storage is only a minor bottleneck in a "dedicated gaming system"; your GPU is the biggest concern, at least if you have any reasonable CPU and enough RAM.
My biggest concern with SRT is the reliability stuff Anand mentions. I would *love* to be able to put in a 128GB SSD with a large 2TB HDD and completely forget about doing any sort of optimization. That seems like something that would need to be done at the hardware level, though, and you always run the risk of data loss if the SSD cache somehow fails (though that should be relatively unlikely). Heck, all HDDs already have a 16-64MB cache on them, and I'd like the SSD to be a slower but much larger supplement to that.
Anyway, what concerns me is that we're not talking about caching at the level of, say, your CPU's L1 or L2 or even L3 cache. There's no reason the caching algorithm couldn't look at a much longer history of use so that things like your core OS files never get evicted (i.e. they are loaded every time you boot and accessed frequently, so even if you install a big application all of the OS files still have far higher hit frequency). Maybe that does happen and it's only in the constraints of initial testing that the performance degrades quickly (e.g. Anand installed the OS and apps, but he hasn't been using/rebooting the system for weeks on end).
The "least recently used" algorithm most caching schemes use is fine, but I wonder if the SSD cache could track something else. Without knowing exactly how they're implementing the caching algorithm, it's hard to say what could be improved, and I understand the idea of a newly installed app getting cached early on ("Hey, the user is putting on a new application, so he's probably going to run that soon!"). Still, if installing 30GB of apps and data evicts pretty much everything from the 20GB cache, that doesn't seem like the most effective way of doing things--especially when some games are pushing into the 20+ GB range.
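The eviction scenario described in the comment above is easy to reproduce with a toy least-recently-used cache. The Python sketch below is a generic LRU illustration, not Intel's SRT algorithm (which is not documented here); the item names and sizes are made up, and a real cache would track disk blocks rather than named items.

```python
from collections import OrderedDict


class LRUCache:
    """Toy size-aware LRU cache; sizes are whole gigabytes for simplicity."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.used_gb = 0
        self.items = OrderedDict()  # name -> size, least recently used first

    def access(self, name, size_gb):
        """Touch an item; return True on a cache hit, False on a miss."""
        if name in self.items:
            self.items.move_to_end(name)  # mark as most recently used
            return True
        if size_gb > self.capacity_gb:
            return False  # too large to cache at all
        # Evict least recently used items until the newcomer fits.
        while self.used_gb + size_gb > self.capacity_gb:
            _, evicted_size = self.items.popitem(last=False)
            self.used_gb -= evicted_size
        self.items[name] = size_gb
        self.used_gb += size_gb
        return False


# Hypothetical workload: a 20GB cache holding OS files and everyday apps,
# followed by a 30GB install streamed through in 10GB pieces.
cache = LRUCache(capacity_gb=20)
cache.access("os_files", 8)
cache.access("apps", 10)
for i in range(3):
    cache.access(f"install_{i}", 10)

print(list(cache.items))  # only the tail of the install remains cached
```

Because plain LRU looks only at recency, one large sequential write is enough to push out data that is read on every boot, which is the weakness the comment points at.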
bji - Wednesday, May 11, 2011
It seems like a good way to do it would be for the software to recognize periods of high disk activity and weigh caching of all LBAs during that period much higher.
So for example, system boot, where lots and lots of files are read off of the drive, would be a situation where the software would recognize that there is a high rate of disk I/O going on and to weigh all of the files loaded during this time very highly in caching.
The more intense the disk I/O, the higher the weight. This would essentially mean that the periods that you most want to speed up - those with heavy disk I/O - are most likely to benefit from the caching, and disk activity that is typically less intense (say, starting a small application that you use frequently but that is relatively quick to load because of the small number of disk hits) would only be cached if it didn't interfere with the caching of more performance-critical data.
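One way to read this proposal is as a cache that scores each LBA by the disk I/O rate observed when it was accessed, and refuses to displace a higher-scoring block for a lower-scoring one. The Python sketch below is a hypothetical interpretation under that assumption; the class name, the `io_rate` parameter and the scoring rule are all invented for illustration.

```python
class IntensityWeightedCache:
    """Toy cache that favors blocks read during periods of heavy disk I/O."""

    def __init__(self, capacity_blocks):
        if capacity_blocks < 1:
            raise ValueError("capacity_blocks must be at least 1")
        self.capacity_blocks = capacity_blocks
        self.scores = {}  # LBA -> accumulated I/O-intensity weight

    def access(self, lba, io_rate):
        """Record an access to `lba` made while the disk was serving
        `io_rate` requests per second; return True on a cache hit."""
        hit = lba in self.scores
        if not hit and len(self.scores) >= self.capacity_blocks:
            # Candidate victim is the block with the lowest accumulated weight.
            victim = min(self.scores, key=self.scores.get)
            if self.scores[victim] >= io_rate:
                return False  # newcomer matters less than everything cached
            del self.scores[victim]
        self.scores[lba] = self.scores.get(lba, 0.0) + io_rate
        return hit


cache = IntensityWeightedCache(capacity_blocks=2)
cache.access(lba=1, io_rate=500)  # read during a boot-time I/O burst
cache.access(lba=2, io_rate=500)  # read during the same burst
cache.access(lba=3, io_rate=20)   # light read later on: not admitted

print(sorted(cache.scores))  # the boot-time blocks are still cached
```

The lightly loaded read is simply not admitted, which matches the idea that less intense activity is cached only when it does not interfere with more performance-critical data.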
All that being said, I am not a fan of complex caching mechanisms like this to try to improve performance. The big drawback, as pointed out in this well-presented article, is that there is a lack of consistency; sometimes you will get good performance and sometimes not, depending on tons of intangible factors affecting what is and what isn't in the cache. Furthermore, you are always introducing extra overhead in the complexity of the caching schemes, and in this case because it's being driven by a piece of software on the CPU, and because data is being shuffled around and written/read multiple times more than it would have with no caching involved.
Then again, it is highly unlikely to *hurt* performance so if you don't mind sometimes waiting more than other times for the same thing to happen (this in particular drives me crazy though; if I am used to a program loading in 5 seconds, the time it takes 10 seconds really stands out like a sore thumb), and can absorb the extra cost involved, then it's not a totally unreasonable way to try to get a little bit of performance.
Zoomer - Wednesday, May 11, 2011
Or the filesystem can manage the cache. That would be a much more intelligent and foolproof way to do this.
vol7ron - Wednesday, May 11, 2011
Can you point a RAM Disk to this caching drive?
bji - Wednesday, May 11, 2011
What is the algorithm that the filesystem would use to decide what data to cache in preference to other cacheable data? That is the question at hand, and it doesn't matter at what level of the software stack it's done, the problem is effectively the same.
Mr Perfect - Wednesday, May 11, 2011
<quote>I would *love* to be able to put in a 128GB SSD with a large 2TB HDD and completely forget about doing any sort of optimization.</quote>
I heartily agree with that. Everyone is so gung ho about having a SSD for OS and applications, a HD for data and then <b>manually managing the data!</b> Isn't technology supposed to be doing this for us? Isn't that the point? Enthusiast computers should be doing things the consumer level stuff can't even dream about.
Intel, please, for the love of all that is holy, remove the 64GB limit.
Mr Perfect - Wednesday, May 11, 2011
On a completely unrelated note, why is the AT commenting software unable to do things the DailyTech site can? Quotes, bolding, italics and useful formatting features like that would really be welcome. :)
JarredWalton - Wednesday, May 11, 2011
I'm not sure when they got removed, but standard BBS markup still works, if you know the codes. So...
[ B ]/[ /B ] = Bolded text
[ I ]/[ /I ] = Bolded text
[ U ]/[ /U ] = Bolded text
There used to be an option to do links, but that got nuked at some point. I think the "highlight" option is also gone... but let's test:
[ H ]/[ /H ] = [h]Bolded text[/h]
So why don't we have the same setup as DT? Well, we *are* separate sites, even though DT started as a branch off of AT. They have their own site designer/web programmer, and some of the stuff they have (i.e. voting) is sort of cool. However, we would like to think most commenting on AT is of the quality type so we don't need to worry about ratings. Most people end up just saying "show all posts" regardless, so other than seeing that "wow, a lot of people didn't like that post" there's not much point to it. And limiting posts to plain text with no WYSIWYG editor does reduce page complexity a bit I suppose.