The State of OpenCL & the First End-User OpenCL Drivers
by Ryan Smith on October 6, 2009 12:00 AM EST, posted in Ryan's Ramblings
Last week NVIDIA released their first set of end-user OpenCL drivers. Previously NVIDIA’s OpenCL drivers had only been available to developers, which continues to be the case for AMD. With NVIDIA’s driver release, the launch of AMD’s 5800 series, and some recent developments with OpenCL, this is a good time to recap the current state of OpenCL, and what has changed since our OpenCL introductory article from last year.
A CPU & GPU Framework
Although we commonly talk about OpenCL alongside GPUs, it’s technically a hardware-agnostic parallel programming framework. Any device implementing OpenCL should be capable of running any OpenCL kernel, so long as developers query the device ahead of time so as not to spawn too many threads at once. And while GPUs (being the parallel beasts that they are) are the primary focus, OpenCL is also intended for use on CPUs and more exotic processors such as the Cell BE and DSPs.
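To make the "query first" step concrete, here is a minimal sketch in Python rather than real OpenCL host code. The function `plan_launch` and its parameter names are hypothetical; `device_max_local` stands in for the limit a real host program would read from the device (in the OpenCL C API, `clGetDeviceInfo` with `CL_DEVICE_MAX_WORK_GROUP_SIZE`). It clamps the requested work-group size to that limit and pads the global size to a whole number of work-groups, since OpenCL 1.0 expects the global size to be a multiple of the local size.

```python
def plan_launch(total_items, preferred_local, device_max_local):
    """Return (global_size, local_size) that respects the device limit.

    device_max_local is assumed to come from a device query; all names
    here are illustrative and not part of any OpenCL binding.
    """
    if total_items <= 0 or preferred_local <= 0 or device_max_local <= 0:
        raise ValueError("all sizes must be positive")

    # Never ask for a bigger work-group than the device reports it can run.
    local_size = min(preferred_local, device_max_local)

    # Round the number of work-groups up so every item is covered.
    groups = -(-total_items // local_size)  # ceiling division

    # Pad the global size to a whole number of work-groups.
    return groups * local_size, local_size
```

Because the global size is padded, a kernel launched this way would need to ignore work-item indices at or beyond `total_items`.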
What this means is that when it comes to discussing the use of OpenCL on computers, we have two things to focus on. Not only is there the use of OpenCL on the GPU, but there’s the use of OpenCL on CPUs. If Khronos has their way, then OpenCL will be a commonly used framework for CPUs both to take better advantage of multi-core CPUs (8 threaded i7 anyone?) and as a fallback mechanism for when OpenCL isn’t available on a GPU.
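The fallback idea can be sketched in a few lines. This is an illustrative Python model, not a real OpenCL binding: `pick_device`, the `"type"` key, and the device dictionaries are all assumptions made for the example. A real host program would build the list by enumerating devices (in the OpenCL C API, `clGetDeviceIDs` with `CL_DEVICE_TYPE_GPU` or `CL_DEVICE_TYPE_CPU`).

```python
def pick_device(available, preference=("gpu", "cpu")):
    """Return the first available device matching the preference order.

    `available` is a list of dicts such as {"name": ..., "type": "gpu"};
    this layout is hypothetical and only serves the illustration.
    """
    for wanted in preference:
        for device in available:
            if device["type"] == wanted:
                return device
    raise RuntimeError("no OpenCL device of a preferred type is available")
```

With both a GPU and a CPU present the GPU is chosen; if no GPU device is reported, the same call quietly lands on the CPU.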
This also makes things tricky when it comes to who is responsible for what. AMD for example, in making both GPUs and CPUs, is writing drivers for both. They are currently sampling their CPU driver as part of their latest Stream SDK (even if it is a GPU programming SDK), and their entire CPU+GPU driver set has been submitted to the Khronos group for certification.
NVIDIA on the other hand is not a CPU manufacturer (Tegra aside), so they are only responsible for having a GPU OpenCL driver, which is what they have been giving to developers for months. They have submitted it to Khronos and it has been certified, and as we mentioned they have released it to the public as of last week. NVIDIA is not responsible for a CPU driver, and as such they are reliant on AMD and Intel for OpenCL CPU drivers. AMD likes to pick at NVIDIA for this, but ultimately it’s not going to matter once everyone finally gets up to speed.
Intel thus far is the laggard; they do not have an OpenCL implementation in any kind of public testing, for either CPUs or GPUs. For AMD GPU users this won’t be an issue, since AMD’s CPU driver will work on Intel CPUs as well. NVIDIA GPU users with Intel CPUs will be waiting on Intel for a CPU driver. Do note however that a CPU driver isn't required to use OpenCL on a GPU, and indeed we expect the first significant OpenCL applications to be intended to run solely on GPUs anyhow. So it's not a bad situation for NVIDIA, it's just one that needs to be solved sooner rather than later.
OpenCL ICD: Coming Soon
Unfortunately matters are made particularly complex by the fact that on Windows and Linux, writing an OpenCL program right now requires linking against a vendor-specific OpenCL driver. The code itself is still cross-platform/cross-device, but in terms of compiling and linking, OpenCL has not been fully abstracted. It’s not yet at the point where it’s possible to write and run a single Windows/Linux program that will work with any OpenCL device. It would be the equivalent of requiring an OpenGL game (e.g. Quake) to have a different binary for each GPU vendor’s drivers.
The solution to this problem is that OpenCL needs an Installable Client Driver (ICD), just like OpenGL does. With an ICD developers can link against that, and it will handle the duties of passing things off to vendor-specific drivers. However an ICD isn’t ready yet, and in fact we don’t know when it will be ready. NVIDIA - who chairs the OpenCL working group - tells us that the WG is “driving to get an ICD implementation released as quickly as possible”, but with no timetable attached to that. The effort right now appears to be on getting more OpenCL 1.0 implementations certified (NV is certified, AMD is in progress), with an ICD to follow.
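Conceptually, the ICD is a thin dispatch layer. The sketch below is a toy Python model of that idea only; it says nothing about how the working group's eventual ICD will be implemented, and every name in it (`IcdLoader`, `FakeDriver`, `register`, `list_devices`) is hypothetical. The point is that the application holds a reference to the loader alone, and the loader forwards each request to whichever vendor driver is installed.

```python
class FakeDriver:
    """Hypothetical stand-in for one vendor's OpenCL driver."""

    def __init__(self, devices):
        self._devices = list(devices)

    def list_devices(self):
        return list(self._devices)


class IcdLoader:
    """Toy ICD: the only object the application links against."""

    def __init__(self):
        self._drivers = {}

    def register(self, vendor, driver):
        # A real ICD would discover installed drivers on its own;
        # manual registration keeps this sketch self-contained.
        self._drivers[vendor] = driver

    def platforms(self):
        # Enumerate every vendor driver known to the loader.
        return sorted(self._drivers)

    def devices(self, vendor):
        # Forward the request to the vendor-specific driver.
        return self._drivers[vendor].list_devices()
```

An application written against `IcdLoader` never names a vendor at build time, which is the property the single-binary goal described above depends on.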
Meanwhile Apple, in the traditional Apple manner, has simply done a runaround on the whole issue. When it comes to drivers they shipped Snow Leopard with their own OpenCL CPU driver, and they have GPU drivers for both AMD and NVIDIA cards. Their OpenCL framework doesn’t have an ICD per se, but it has features that allow developers to query for devices and use any they like. It effectively accomplishes the same thing, but it’s only of use when writing programs against Apple’s framework. But to Apple’s credit, as of this moment they have the only complete OpenCL platform, offering CPU+GPU development and execution with a full degree of abstraction.
What GPUs Will Support OpenCL
One final matter is what GPUs will support OpenCL. While OpenCL is based around the hardware aspects of DirectX10-class hardware, being DX10 compliant isn’t enough. Even among NVIDIA and AMD, there will be some DX10 hardware that won’t support OpenCL.
NVIDIA: Anything that runs CUDA will run OpenCL. In practice, this means anything in the 8-series or later that has 256MB or more of VRAM. NVIDIA has a full list here.
AMD: AMD will only be supporting OpenCL on the 4000 series and later. Presumably there was some feature in the OpenCL 1.0 specification that AMD didn’t implement until the 4000 series, which NVIDIA had since the launch of the 8-series. Given that AMD is giving Brook+ the heave-ho in favor of OpenCL, this means that cards older than the 4000 series will continue to have a more limited selection of GPGPU applications than the 4000 series and later.
End-User Drivers
Finally to wrap this up, we have the catalyst of this story: drivers. As we previously mentioned, NVIDIA released their OpenCL-enabled 190.89 drivers to the public last week, which we’re happy to see even if the applications themselves aren’t quite ready. However, this was a special release outside of NVIDIA’s mainline driver releases, and as such it is already out of date. NVIDIA released their 191.07 WHQL-certified driver set yesterday, and these drivers don’t include OpenCL support. So while NVIDIA is shipping an OpenCL driver for both developers and end-users, it’s going to be a bit longer until it shows up in a regular release.
AMD meanwhile is still in a developer-only beta, which makes sense given that they’re still waiting on certification. The estimate we’ve heard is that the process takes a month, so with AMD having submitted their drivers early last month, they should be certified soon if everything went well.
67 Comments
drmo - Wednesday, October 7, 2009 - link
"In fact, AMD even claimed that they were the first to support DirectCompute with the launch of the HD5870 (if they added CS5.0 to the statement they would be right, but they didn't)."
Most of the releases and news reports I read specifically said the first WHQL certified DirectX 11 and DirectCompute 11 driver, not that they were the first with DirectCompute.
http://www.amd.com/us/press-releases/Pages/amd-pre...
Scali - Wednesday, October 7, 2009 - link
I'm talking about statements like here:
http://www.hpcwire.com/topic/developertools/AMD-Su...
"•AMD's upcoming next generation ATI Radeon family of DirectX 11 enabled graphics processors are expected to be the first to support accelerated processing on the GPU through DirectCompute."
This statement was made in late September, while nVidia already released WHQL drivers with DirectCompute in July:
http://www.nvidia.com/object/win7_winvista_32bit_1...
"Supports Microsoft’s new DirectX Compute API on Windows 7."
drmo - Wednesday, October 14, 2009 - link
I read that to mean that they are the first with DirectX 11 GPUs that support DirectCompute. It seems that later press releases have made it more clear that they meant DirectCompute 11.
Scali - Wednesday, October 14, 2009 - link
I don't see how you can read it to mean that. You *know* that's what it's supposed to mean if you are up to speed with the subject... but if you don't, there's no way you could read it like that, because a key piece of information was simply omitted from that statement.
drmo - Wednesday, October 14, 2009 - link
Not really, after "first", you are assuming it should have GPU, whereas the subject was "DirectX 11 enabled graphics processors". However, I agree it is not absolutely clear, but I don't think it was purposeful, since the subsequent press releases clarified the statement.
Scali - Wednesday, October 14, 2009 - link
"AMD's upcoming next generation ATI Radeon family of DirectX 11 enabled graphics processors are expected to be the first to support accelerated processing on the GPU through DirectCompute."
"The first" refers back to "AMD's upcoming next generation ATI Radeon family of DirectX 11 enabled graphics processors".
That part is very clear.
The problem is with this: "accelerated processing on the GPU through DirectCompute."
Your suggestion doesn't make sense...
You would get:
"AMD's upcoming next generation ATI Radeon family of DirectX 11 enabled graphics processors are expected to be the first family of DirectX 11 enabled graphics processors to support accelerated processing on the GPU through DirectCompute."
(Note that 'first' now takes on a slightly different meaning, the function of the word in the sentence changes).
What you have now is a kind of pleonasm. Since AMD's GPUs are the first DX11 GPUs, they are obviously the first DX11 GPUs to support whatever feature.
I'm sure that's not what they meant to say. It's just too far-fetched.
Ryan Smith - Wednesday, October 7, 2009 - link
Jarred already said most of what I want to say, but I will add something.
It may be a confusing thing to write, but I consider it a critical point none the less. From what we're seeing out of the Apple developer camp, OpenCL is going to be big on the CPU. It won't be as big as it is on the GPU (no one is going to write something in OpenCL that they only intend to run on x86 processors), but big none the less.
In the mean time we have this crazy situation where you need drivers from multiple sources in many cases to get a complete driver stack. And even with drivers, it's all a mess without the ICD.
My fundamental point right now is that in spite of having a complete spec and certification, the OpenCL situation is very, very screwed up on Windows and Linux. When most of my time talking to contacts is composed of them trying to answer "who is responsible for what", there's a problem.
For OpenCL to succeed there needs to be full GPU and CPU drivers for all platforms, and an ICD to tie them together. We're not there yet.
Scali - Wednesday, October 7, 2009 - link
Why is that a crazy situation?
The same goes with OpenGL or Direct3D. You can have multiple devices, even from multiple vendors, and just enumerate through all of them.
You will ALWAYS have to have drivers from multiple sources, the ICD won't solve that. Even though AMD and Intel might package both their CPU and GPU drivers into a single downloadable package, they will STILL be two independent implementations, and two independent drivers. So from a technical point-of-view, it doesn't really matter whether CPU and GPU drivers come from the same manufacturer or not.
I suppose the best solution would be for Microsoft to offer CPU drivers through Windows Update. They already deliver GPU drivers through Windows Update, so eventually those will be updated to drivers with OpenCL support. If they also solve the CPU-part of the equation, the end-user doesn't even have to know about OpenCL.
Thing is that you make it sound like it's somehow nVidia's fault or responsibility to supply CPU drivers, and that's very confusing (and AMD has never said anything of the sort either).
tweakoz - Wednesday, October 7, 2009 - link
>>You will ALWAYS have to have drivers from multiple sources, the ICD won't solve that.
Yeah, but right now I have to link my OpenCL/Windows program with either the ATI Stream SDK, OR the NVIDIA OpenCL SDK. I cannot do both (unless I do a plugin OpenCL driver layer for my program, and that's a PITA I don't want to deal with).
The ICD, with enumeration of installed drivers built in, is CRITICAL for developers' ease of use.
mtm
Scali - Wednesday, October 7, 2009 - link
Technically you don't.
As long as you link to 'a' OpenCL.dll, it should be fine.
The problem here is that AMD uses a nonstandard calling convention in their OpenCL.dll. That's why linking to their stuff doesn't work for nVidia and vice versa.
nVidia uses the same standard as Microsoft uses, and also OpenGL and OpenAL use, so I think AMD is the one who made a mistake here.
If AMD had used the same standard calling convention, we wouldn't have this problem. Then all functions could just be automatically imported by name.
Besides, the ICD won't solve this problem. The calling convention that AMD uses also has caller stack cleanup, rather than callee stack cleanup. You'd get stack corruption. They just need to fix and recompile their code.