Tuesday, March 02, 2010

GPU Supercomputing Rundown

/. linked to this opinion article on AMD, Intel and NVidia in the next decade (full disclosure: I own shares in NVidia). Some of its opinions about OpenCL and CUDA are the same ones I expressed in my post about Intel purchasing RapidMind and Cilk. Since I made that post, Intel has decided to abandon plans to release Larrabee as a consumer GPU. This was not a big surprise, since Larrabee could not compete with AMD's HD 5870 or 5970 products. The consumer GPU space is currently dominated, in terms of teraflops per dollar and per watt, by these AMD/ATI cards. NVidia still dominates the GPU computing market because of their investment in CUDA. AMD was the first to support OpenCL, but OpenCL seems more and more like an also-ran API. AMD was also the first to support DirectX 11, but NVidia downplays this.

Adobe is going to support CUDA in their tools. I don't know how AMD is going to turn the GPU supercomputing market in their favor: hardware leads are not enough except perhaps in the highest of high-end supercomputing applications.

NVidia's move to build their new 40nm Fermi architecture with a behemoth 3B transistors has proven troublesome, with yields reported in the single-digit percentages. Reports indicate that NVidia will disable dysfunctional stream processors, similar to the way IBM boosted yields of the Cell processor for the PS3 by enabling only 7 of the 8 SPEs.
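To get a feel for why shipping a chip with one unit disabled helps so much, here is a rough back-of-the-envelope sketch. It assumes each unit is defect-free independently with the same probability, which real fabs will not satisfy exactly, and the 0.9 figure is a made-up illustration, not a reported number for Cell or Fermi. The function name `yield_with_spares` is my own.

```python
from math import comb


def yield_with_spares(n_units: int, n_required: int, p_good: float) -> float:
    """Probability that at least n_required of n_units are defect-free.

    Assumes every unit is good independently with probability p_good
    (a simple binomial model, not a real fab yield model).
    """
    return sum(
        comb(n_units, k) * p_good**k * (1 - p_good) ** (n_units - k)
        for k in range(n_required, n_units + 1)
    )


# Hypothetical per-unit probability of being defect-free.
p = 0.9

all_eight = yield_with_spares(8, 8, p)       # every unit must work
seven_of_eight = yield_with_spares(8, 7, p)  # one bad unit is tolerated

print(f"need 8 of 8: {all_eight:.3f}")
print(f"need 7 of 8: {seven_of_eight:.3f}")
```

Under these assumptions, tolerating a single bad unit roughly doubles the fraction of usable dies, which is the whole appeal of the 7-of-8 trick.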

It seems like there will be an increasing trend towards defect-tolerant designs and design-for-yield EDA tools; once upon a time, I had an offer to intern at IBM developing such a tool. A number of issues arise at smaller geometries, such as variance in wire dimensions and gate lengths affecting switching speed. Also, with more components on a chip there is simply more that can be broken. Possibilities for yield enhancement include disabling cache regions, lowering individual core speeds, and disabling dysfunctional cores entirely.
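The "more that can be broken" point can be illustrated with the classic Poisson yield model, in which the chance of a die having zero defects falls off exponentially with its area. This is only a sketch: the model ignores defect clustering and parametric failures such as slow gates, and the defect density and die areas below are invented for illustration, not figures for any real process or chip.

```python
from math import exp


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a Poisson defect model."""
    return exp(-area_cm2 * defects_per_cm2)


# Hypothetical defect density and die sizes, for illustration only.
DEFECT_DENSITY = 0.5  # defects per square centimeter (assumed)

for area in (1.0, 2.0, 5.0):
    y = poisson_yield(area, DEFECT_DENSITY)
    print(f"die area {area:.1f} cm^2 -> defect-free fraction {y:.3f}")
```

The exponential drop is why very large dies push designers toward redundancy: past a certain size, almost no die comes out perfect, so the design has to tolerate some broken pieces.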