NVIDIA Updates Tegra Roadmap Details at GTC - Logan and Parker Detailed
by Brian Klug & Ryan Smith on March 19, 2013 1:50 PM EST
We're at NVIDIA's GTC 2013 event, where team green just updated its official roadmap and shared some more details about its Tegra portfolio, specifically additional information about Logan and Parker, the codenames for the SoCs beyond Tegra 4. First up is Logan, which will be NVIDIA's first SoC with CUDA inside, courtesy of a Kepler architecture GPU capable of CUDA 5.0 and OpenGL 4.3. There are no details on the CPU side of things, but we're told to expect Logan demos (and samples) in 2013 and production inside devices in early 2014.
It’s interesting to note that with the move to a Kepler architecture GPU, Logan will be taking on a vastly increased graphics feature set relative to Tegra 4. With Kepler comes OpenGL 4.3 capabilities, meaning that NVIDIA is not just catching up to OpenGL ES 3.0, but shooting right past it. Tessellation, compute shaders, and geometry shaders, among other things, are all present in OpenGL 4.3, far exceeding the much more limited and specialized OpenGL ES 3.0 feature set. Given the promotion that NVIDIA is putting into this - they've been making it quite clear to everyone that Logan will be OpenGL 4.3 capable - this may mean that NVIDIA intends to use OpenGL 4.3 as a competitive advantage with Logan, attracting developers and users looking for a more feature-filled SoC than what current OpenGL ES 3.0 SoCs are slated to provide.
On a final note about Logan, Kepler has a fairly strict shader block granularity of 1 SMX, i.e. 192 CUDA cores. While NVIDIA can always redefine Kepler to mean what they say it means, if they do stick to that granularity then it should give us a very narrow range of possible GPU configurations for Logan.
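To make the granularity point concrete, here is a minimal sketch that lists the core counts available when a GPU can only grow in whole SMX blocks. The 192-cores-per-SMX figure comes from the paragraph above; the function name `kepler_configs` and the ceiling of 3 SMX blocks are assumptions chosen purely for illustration, not figures from NVIDIA.

```python
# Kepler groups its shaders into SMX blocks of 192 CUDA cores (figure from the article).
CORES_PER_SMX = 192


def kepler_configs(max_smx):
    """Return (smx_count, cuda_cores) pairs for 1..max_smx whole SMX blocks.

    Hypothetical helper: it only illustrates that core counts move in
    steps of 192 if the SMX granularity is kept.
    """
    return [(n, n * CORES_PER_SMX) for n in range(1, max_smx + 1)]


# max_smx=3 is an arbitrary illustrative ceiling, not an NVIDIA figure.
for smx, cores in kepler_configs(3):
    print(f"{smx} SMX -> {cores} CUDA cores")
```

With whole-SMX steps there is nothing between 192 and 384 cores, which is why a power-constrained SoC would have only a handful of plausible configurations.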
After Logan is Parker, which NVIDIA shared will contain the Denver CPU that NVIDIA is working on, with 64-bit capabilities, along with a GPU codenamed Maxwell. Parker will also be built using 3D FinFET transistors, likely from TSMC.
Like Logan, it's clear that Parker will be benefitting from being based on a recent NVIDIA dGPU. While we don't know a great deal about Maxwell since it doesn't launch for roughly another year, NVIDIA has told us that Maxwell will support unified virtual memory. With Logan NVIDIA gains CUDA capabilities due to Kepler, but with Parker they are laying down the groundwork for full-on heterogeneous computing in a vein similar to what AMD and the ARM partners are doing with HSA. NVIDIA has so far not talked about heterogeneous computing in great detail since they only provide GPUs and limited-functionality SoCs, but with Denver giving them an in-house CPU to pair with their in-house GPUs, products like Parker will be giving them new options to explore and, perhaps more meaningfully, the means to counter HSA-enabled ARM SoCs from rival firms.
In addition, NVIDIA showed off a new product named Kayla, a small, mITX-like board running a Tegra 3 SoC and an unnamed new low-power Kepler family GPU.
ahmadamaj - Tuesday, March 19, 2013
10 times increase in performance in 2 years!!
phoenix_rizzen - Tuesday, March 19, 2013
Not too hard, considering how crappy their performance has been with Tegra1-3.
Gopi - Tuesday, March 19, 2013
Heheh.. I agree. Their performance has always been abysmal in the realm of power and performance grid. Trying to integrate two road maps is a good idea only if the multi-dimensional issues are resolved. I wonder how CUDA can fit into Android based platform? Aren't CUDA libs themselves add up to lot of over weight?
ahmadamaj - Tuesday, March 19, 2013
getting 10 times the performance of Tegra 1 shouldn't have been a big issue. But now A15's are already pushing the power limits for small form factor. Imagine in 2 years today's A15 will be the LITTLE in the big.LITTLE!!
DanNeely - Tuesday, March 19, 2013
Something with A15 performance levels might be the LITTLE core; but it won't be the A15 itself any more than the A8 or A9 (or ARM11 cores) are being used in current implementations; instead the new ultra power efficient A7 core is being used. Even more than the A8/9 the A15 is optimized for high performance/high power consumption (relative to other ARM designs); while the LITTLE core needs to be tuned for minimum idle power above all else.
ARM's next generation after the A15 is the A53/57; both using a newer instruction set and paired for big.LITTLE operation.
Performance per core, per MHz is only 20% higher for A53/57 than with A7/15 cores. Clock speeds might go up a bit more; but that's expensive in power consumption and per core performance is already a bit past where Moore's law broke for higher powered CPUs because all the easy ways to speed them up are done. Since ARM already went multi-core I suspect most of the transistor growth in future SoCs will end up being in the GPU and/or integration of more of the handful of still off chip components while CPU performance levels off.
ahmadamaj - Tuesday, March 19, 2013
Yeah, I guess it doesn't have to be 10 times increase in CPU performance. Although it would be interesting to see what the 3D transistors will do for ARM.
mabellon - Tuesday, March 19, 2013
In the past with the Tegra3, and I believe with the Exynos 5, you are correct in that the little core is a previous more efficient architecture. However with Tegra 4 the 5th core is an A15. I suspect the core is more efficient due to a lower clock speed, voltage and other tunings.
cmikeh2 - Tuesday, March 19, 2013
Rumor has it that Nvidia is using a different type of transistor for different parts of the die. The quad core A15's are using high performance high leakage transistors while the +1 A15 is using a low leakage low performance transistor design.
extide - Wednesday, March 20, 2013
Tegra 4 is NOT big.LITTLE btw.
MrSpadge - Wednesday, March 20, 2013
BIG.little is different from nVidias 4+1: the former uses different cores compatible to the same instruction set, whereas the latter uses similar cores with transistors of one of them tuned for low power consumption.