GPU computing Stay up to date in OpenCL, DirectCompute, CUDA, CAL and OpenGL information


Thursday, 3 December 2009

AMD OpenCL news! (almost all..)

Posted on 12:19 by Unknown
1. In a press release, AMD says it has been working with SiSoftware on OpenCL benchmarks and has optimized them. They report more than 2x the performance for the 5870 vs the GTX 295. Note that the 8.68 Catalyst driver (9.12) was used with the OpenCL beta 4 DLLs, so perhaps this includes an improved CAL compiler (1.4.492 in the leaked 9.12 XP drivers). I can't currently test on 9.11, but DirectCompute 5.0 gets 2.5 Gpixels (5850 overclocked to roughly 5870 performance), so the OpenCL backend seems optimized (though not completely: it seems to reach only 72% of the DirectCompute performance...).
See it!

Working with SiSoftware, AMD has optimized the performance of the OpenCL benchmarks for its GPU implementations, and for some problems has demonstrated significant performance advantages using AMD’s ATI Stream Software Development Kit (SDK) for OpenCL. When compared to NVIDIA’s CUDA running on its GeForce GTX 295 featuring two GPUs, the ATI Radeon™ HD 5870 graphics card with one GPU delivers up to 2.7 times faster performance on certain benchmark tests. For the "native float shader" results, the ATI Radeon 5870 posted a score of 1820 megapixels per second, compared to the GTX 295 at 680 megapixels per second!

Based on native float shader results of 1820 megapixels per second for AMD compared to 680 megapixels for NVIDIA on AMD Phenom™ II X4 940 processor-based system, 3 GHz, ASUSTek M3A79-T DELUXE, 4GB DDR2-1066, Windows® 7 64-bit Enterprise operating system
Driver: 8.680.0.0, OpenCL base build, SDK 2.0 Beta 4
NVIDIA driver: 8.15.11.9038

I don't like the use of Nvidia 190.38, as I think the 195 drivers improve numerically intensive OpenCL tests by 30%. (SiSoft Sandra 2010 uses a Mandelbrot calculation as its test, which is compute bound, not memory bound, if programmed right.)
I also don't like the use of multi-GPU, as multi-GPU scaling seemed to be slow on Vista/Win7 with the 190 OpenCL drivers. Nvidia lists this as a known issue in their GPU Computing SDK.
Still, assuming perfect multi-GPU scaling in OpenCL (I get 445 Mpixels with a GTX 275), you should get around 850-900 Mpixels, which is still 2x slower than ATI.
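
As a quick sanity check of the ratios above, here is a small Python sketch. It only uses figures quoted in this post (1820 and 680 Mpixels/s from the press release, 445 Mpixels/s from my GTX 275 run). The perfect 2x multi-GPU scaling is an assumption, not a measurement, and the variable names are just for illustration.

```python
# Figures quoted in this post, in megapixels per second.
HD5870_MPIX = 1820.0   # AMD press release, "native float shader"
GTX295_MPIX = 680.0    # AMD press release, GTX 295 (two GPUs)
GTX275_MPIX = 445.0    # single GTX 275 result mentioned in the post


def speedup(fast, slow):
    """How many times faster `fast` is than `slow`."""
    return fast / slow


# Ratio as reported in the press release.
press_ratio = speedup(HD5870_MPIX, GTX295_MPIX)

# Hypothetical dual-GPU score, ASSUMING perfect 2x scaling of one GTX 275.
ideal_dual_mpix = 2 * GTX275_MPIX
ideal_ratio = speedup(HD5870_MPIX, ideal_dual_mpix)

print(f"press release ratio:    {press_ratio:.2f}x")
print(f"ideal dual-GPU score:   {ideal_dual_mpix:.0f} Mpixels/s")
print(f"ratio vs ideal scaling: {ideal_ratio:.2f}x")
```

So even with ideal scaling on the Nvidia side, the quoted 5870 number stays at roughly twice the estimate.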

2. Some time ago I mentioned this excellent tutorial about multiprecision integer arithmetic with OpenCL.
It's now better, with 5870 results and results for both the AMD and Nvidia OpenCL implementations on Windows and Linux. You can see that the Windows backend is always slower on Nvidia, and it possibly shows perf issues for AMD GPUs too.
I have also contacted the author about the source code, and the answer is that it is coming soon.
Remember that the available memory bandwidth is 158 GB/s on the GTX285 and 153 GB/s on the HD5870.
Things learned:
Memory copy
===========
Note the difference in driver efficiency between Linux and Windows for the GTX285 board: the Linux curve rises earlier, meaning the latency of a call to clEnqueueCopyBuffer is much lower on Linux. At the end of the curve, the "asymptotic speed" (pure copy speed) is the same, at 66 GB/s as seen earlier.

The last thing to note on the diagram is the lack of proper support for clEnqueueCopyBuffer in this version of the ATI driver. The Linux version reaches 8.1 GB/s while the Windows version remains under a pathetic 3 GB/s. Hopefully, the next versions of the drivers will fix this, and match the GTX285 results as they should. The host-device copy speeds for the ATI board follow the same tendency.
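
One rough way to picture why a lower call latency makes the curve rise earlier is a simple latency plus bandwidth model: time = latency + size / peak speed. The sketch below is only an illustration. The 66 GB/s peak is the figure quoted above, while the two latency values are hypothetical placeholders, not measurements from the tutorial.

```python
def effective_bandwidth(size_bytes, latency_s, peak_bytes_per_s):
    """Throughput seen by the caller for one copy of size_bytes.

    Simple model: total time = fixed call latency + size / peak speed.
    """
    total_time_s = latency_s + size_bytes / peak_bytes_per_s
    return size_bytes / total_time_s


PEAK = 66e9           # asymptotic copy speed quoted above, in bytes/s
LOW_LATENCY = 10e-6   # HYPOTHETICAL per-call overhead in seconds
HIGH_LATENCY = 50e-6  # HYPOTHETICAL per-call overhead in seconds

# Small copies are dominated by latency, large copies approach the peak.
for size in (64 * 1024, 1024 * 1024, 16 * 1024 * 1024, 256 * 1024 * 1024):
    low = effective_bandwidth(size, LOW_LATENCY, PEAK) / 1e9
    high = effective_bandwidth(size, HIGH_LATENCY, PEAK) / 1e9
    print(f"{size:>10d} bytes: {low:6.1f} GB/s vs {high:6.1f} GB/s")
```

In this model both curves converge to the same asymptotic speed for large buffers, and the one with the lower latency gets there with smaller buffers, which matches the shape described above.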

Zero mem set
============
One big difference is that the behaviour of the ATI board differs significantly between Linux and Windows. Under Windows, it reaches a catastrophic 585 MB/s (does it actually compute something on the CPU? Maybe I installed mixed components of the driver...) while the Linux implementation shows some signs of activity and reaches 53 GB/s.
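
To put the quoted speeds next to the peak memory bandwidth figures, a small fraction-of-peak calculation helps. This is a hedged sketch: the speeds and peaks are the numbers quoted in this post, but counting a device-to-device copy as two bytes of memory traffic (one read plus one write) per byte copied is my assumption about how the copy speed is reported, and the function name is made up for illustration.

```python
def fraction_of_peak(reported_gb_s, peak_gb_s, traffic_factor=1.0):
    """Share of the peak memory bandwidth used by a reported speed.

    traffic_factor is the ASSUMED bytes of memory traffic per reported byte:
    2.0 for a copy (read + write), 1.0 for a pure write such as a zero set.
    """
    return reported_gb_s * traffic_factor / peak_gb_s


GTX285_PEAK = 158.0  # GB/s, quoted in the post
HD5870_PEAK = 153.0  # GB/s, quoted in the post

gtx285_copy = fraction_of_peak(66.0, GTX285_PEAK, traffic_factor=2.0)
hd5870_copy_linux = fraction_of_peak(8.1, HD5870_PEAK, traffic_factor=2.0)
hd5870_zero_linux = fraction_of_peak(53.0, HD5870_PEAK)
hd5870_zero_windows = fraction_of_peak(0.585, HD5870_PEAK)

print(f"GTX285 copy:          {gtx285_copy:.1%} of peak")
print(f"HD5870 copy, Linux:   {hd5870_copy_linux:.1%} of peak")
print(f"HD5870 zero, Linux:   {hd5870_zero_linux:.1%} of peak")
print(f"HD5870 zero, Windows: {hd5870_zero_windows:.2%} of peak")
```

Under that assumption the GTX285 copy is already fairly close to its peak, while the ATI numbers leave a lot of headroom for driver fixes.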


3. It seems Pyrit's OpenCL support is not working on the ATI OpenCL backend.
AMD engineers have reproduced the problem and are working on it!
See: http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=123060&enterthread=y

4. Confirmed that the next OpenCL Stream SDK release will have documentation of the R8xx architecture and IL instructions (low-level CAL stuff).
Q: Any plans to document uav_raw_load_id, uav_raw_store_id? To publish R800 ISA? To answer some questions on this forum?
A: Our next release will have updated documentation that should cover all the newer hardware.

5. AMD newsletters:
Check the ATI Stream Team Quarterly, a regular newsletter that keeps you up to date about ATI Stream.
http://amd-member.com/Newsletters/ATIStream/09Q3.html
Also check AMD Developer Central Newsletters
latest: http://amd-member.com/newsletters/DevCentral/0911.html

6. Geeks3D is providing a lot of GPU computing and OpenCL news:
latest: http://www.geeks3d.com/20091203/opencl-and-gpu-computing-industry-news/

7. Although GPU acceleration in the Flash 10.1 beta seems not to work on AMD with 9.11 (AnandTech), it seems ATI has it working. See:
Adobe Flash Player 10.1 Accelerated by ATI Stream Technology
http://www.youtube.com/watch?v=BTOOr2fQ4KA