HPCWire reports on AMD's latest GPUs clocking in with 200 GFlops of double-precision performance at under 150 Watts, or about 1.33 GFlops/Watt. That translates to a double-precision petaflop for 0.75 MWatts, compared to the Roadrunner, which consumes 3 MWatts. The AMD GPU delivers about 2-3 times the peak GFlops/Watt of FPGA floating-point implementations, though I speculate the new Altera Stratix IV may be competitive. ClearSpeed apparently wins the double-precision efficiency competition with chips at 2 GFlops/Watt and 2.5 GFlops/Watt. Performance for specific functions can vary substantially, though, and there is still no standard language that would make such comparisons practical to spec.
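The arithmetic above is easy to check. Here is a minimal Python sketch that uses only the figures quoted in this post (200 GFlops at 150 Watts, the 2 and 2.5 GFlops/Watt ClearSpeed numbers, and 3 MWatts for the Roadrunner). The function names are my own, and it assumes peak efficiency scales linearly all the way up to a petaflop, which ignores interconnect, memory, host and cooling power.

```python
GFLOPS_PER_PETAFLOP = 1e6  # 1 petaflop = 10^6 GFlops


def gflops_per_watt(gflops, watts):
    """Peak efficiency of a single chip."""
    return gflops / watts


def megawatts_per_petaflop(efficiency):
    """Power for one petaflop, assuming efficiency scales linearly."""
    watts = GFLOPS_PER_PETAFLOP / efficiency
    return watts / 1e6


# Figures quoted in the post; 150 W is an upper bound on the AMD power draw.
amd = gflops_per_watt(200, 150)
amd_mw = megawatts_per_petaflop(amd)
roadrunner_mw = 3.0

print(f"AMD: {amd:.2f} GFlops/Watt, {amd_mw:.2f} MW per petaflop")
print(f"Roadrunner draws {roadrunner_mw / amd_mw:.1f}x the power")
for eff in (2.0, 2.5):
    print(f"ClearSpeed at {eff} GFlops/Watt: "
          f"{megawatts_per_petaflop(eff):.2f} MW per petaflop")
```

By the same estimate the ClearSpeed chips would need 0.5 and 0.4 MWatts per petaflop, and the Roadrunner draws about four times the power of the hypothetical AMD-based machine.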
AMD claims that it will support the OpenCL ("Open Computing Language") specification. OpenCL is still nonexistent as far as I can tell. From the HPCWire article:
"In an attempt to unify the programmer interface AMD has announced its support for OpenCL."
Steve Jobs mentioned OpenCL support in Snow Leopard, and now it looks like the Khronos Group is trying to organize the effort to actually make the standard. Intel should join the fun and say that Larrabee will support the OpenCL standard.