GPU computing Stay up to date in OpenCL, DirectCompute, CUDA, CAL and OpenGL information

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Friday, 19 February 2010

Parallel algorithms avaiable on CUDA,OCL,DC,CAL: status update

Posted on 08:37 by Unknown
lin alg status update:
Matmul:
CUDA: CUBLAS (no code) Volkov (code) and yesterday post (assembly fastest to date 480 gflops)
CAL: beyond3d cal 1tflop matmul post
OCL: hazeman post above uses port of cal code to propietary but similar to CL code..
DC: bernaclejunior testing with doubles doesn't worked (XNA forums)
Matvec:
CUDA: CUBLAS (closed) and some papers use custom code (magma, paper mid 2008) (as 20-50% faster)
OCL: Bealto post above (high efficient on AMD and ATI) should be easy to port DC
Sparse matvec:
CUDA: CNC,CUSP,etc..
OCL,DC: BernacleJunior post on AMD and XNA forums (working on it)..

FFT:
CUDA: CUFFT 2 papers at SC08 having higher perf 3d ftts and 2d paper ->d3dCx
DC: has lib
OCL:
Apple code is 2x-3x slower than CUFFT seems (on Nvidia Linux )(also 10.6.2 is slow go see 10.6.3..)
on AMD doesn't work for size >512^2 in 2.0 or 2.01 fixed internally seems..
AMD 2.01 sample is hard coded 1024 perf?

Sort:
CUDA: CUDPP, CUDA sample (code)
OCL,DC: BernacleJunior post on AMD and XNA forums.He claims near 400Mkeys/s on vs state of the art Nvidia sorting less 200mkeys on GTX285.
Also reportedly Lee Hows has fast code working!

also CUDPP has triangular solvers and soon graph algos and hashes..
Email ThisBlogThis!Share to XShare to FacebookShare to Pinterest
Posted in | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • Porting CUDA to OpenCL!
    Well so you want to port CUDA code to OpenCL: you are in AMD GPU competition of porting Cuda codes to opencl (see previous post) or you are ...
  • Megapost!
    Today fools{ *GTX 485 is 512 cores 3gbytes gddr5 and 850/1750 shaders.. *ati 5990 has 4 gpus in board.. *bulldozer benchmarks }end fools.. A...
  • About ATI and Nvidia drivers (OCL included)!
    Hi I have been investigating AMD and Nvidia drivers.. for 10.3 there are 3d hooks support for 120hz monitors but is d3d9 d3d10 or d3d11 enab...
  • things found in CUDA forums
    Also some CUDA news: Mandelbulb stereo angalyph -> have to port to 3D Vision http://forums.nvidia.com/index.php?showtopic=150985&st=2...
  • opencl/opengl linux interop! seen in opencl cuda 3.0 sdk samples
    Following my OpenCL/OpenGL Window interop work: now has come to Linux  for Nvidia GPU computing registered developers via 195.17 driver! Als...
  • State of the blog..
    Sorry for the delay guys of posting code of Apple OpenCL demos port.. the blog has been with no updated for more than 2 weeks in this rapid ...
  • Optix and OpenCL SDKs with Visual Studio 2010
    Optix 1.0 ========= install cg download Cmake 2.80 cmake says error dumpbin not found and it is cuda doesn't work with vc2010 so copy pt...
  • CUDA 3.0 forums stuff!
    1.Getting CUBIN instead of ELF If you need the older text format, you can disable ELF cubins in nvcc.profile by changing "CUBINS_ARE_EL...
  • News from the web!
    Some things learned in AMD forums: 1.Why 3xxx no OpenCL: Compute shader mode is a hardware feature that did not exist in the HD38XX line of ...
  • Shaders: measuring perf, source translation and parsing different languages!
    Hi, I hope to be pretty exhaustive of options for parsing and translating between graphics and compute shaders ( some open source) For DX sh...

Blog Archive

  • ►  2013 (5)
    • ►  September (1)
    • ►  March (3)
    • ►  February (1)
  • ►  2012 (1)
    • ►  December (1)
  • ▼  2010 (46)
    • ►  July (4)
    • ►  May (1)
    • ►  April (3)
    • ►  March (9)
    • ▼  February (15)
      • Reading Fermi CUDA stuff!
      • Questions about OpenCL AMD d3d9 interop!
      • News 25/2!
      • 3 new tools!
      • Ideas for porting algos to GPU:AVX SSE and MMX ports!
      • About ATI and Nvidia drivers (OCL included)!
      • Shaders: measuring perf, source translation and pa...
      • Enabling OpenCL Image support on AMD GPUs!
      • Running QT everywhere!
      • Parallel algorithms avaiable on CUDA,OCL,DC,CAL: s...
      • More news!
      • Learned from voxel rendering demo code: CUDA 3.0 h...
      • A month of news!
      • About Tesla computing driver!
      • A long report of the silence before the storm: AKA...
    • ►  January (14)
  • ►  2009 (125)
    • ►  December (51)
    • ►  November (53)
    • ►  October (21)
Powered by Blogger.

About Me

Unknown
View my complete profile