GPU computing Stay up to date in OpenCL, DirectCompute, CUDA, CAL and OpenGL information

Thursday, 28 February 2013

What I'm expecting from GTC..

Posted on 17:37 by Unknown
Well, I think I'm really expecting too much, although mostly in the form of a lot of minor improvements to NVIDIA's software products (so I'm not expecting info on new architectures such as Maxwell), but anyway I have compiled a list of things so I can check later whether NV is doing its work or not :-)
Of course, it would still be good if all these pieces fall into place over, say, H1 2013.

*nvfx: the new effects system (open source, cross-vendor support, etc.) was announced at Siggraph and still has an empty GitHub site. There is also a talk about it at GTC, so there is no better place and moment to upload it to GitHub. The system also uses a more efficient OGL state management ext called NV_state_object, which is better aligned with DX10/11-style state management via objects, so it also seems closer to DSA-style management.
*The only consumer HW info may be about the GK114/6 archs, which may bring some new things. Note that even Titan has no DX11.1 profile support, so I'm hoping that support arrives before Maxwell, say in GK114 and similar 680 replacements, which would mean some minor arch enhancement on the graphics side.
They could also bring one more thing (see the next point). To put it briefly: dynamic parallelism everywhere, from anywhere to anywhere (ANYWHERE={CUDA,OGL}).
*New OGL exts: NV_state_object (DX11-like state objects) and some kind of dynamic parallelism for graphics APIs.
The latter is interesting, as there is a patent on it: it is about exposing dynamic parallelism in the graphics world, which implies OGL in the near future, i.e. graphics shaders could create new draw calls and put them on the dispatch manager queue.
Also, for completeness: what is holding NV back from exposing graphics launches from compute kernels and compute kernel dispatches from graphics shaders? Note that NV_state_object seems much needed in two cases (the CUDA->OGL and OGL->OGL draw call dispatch cases), as some state environment is needed there: CPU APIs are of no use since this is GPU-side work, and the default OGL state may not be useful either.

Also, please upload documentation for NV_GPU_shader5_memory_extended, which is shipping in the 313 drivers, although I suspect it is for exposing the cache modifiers on load/store operations that CUDA already supports (load non-cached, load cached, etc.).
Note my previous post asking NV to expose all compute functionality (ISA richness in this case) to OGL compute shaders, at least through the PTX ISA instructions that are currently missing and also through some asm() function. asm is already reserved in GLSL and is usable in OCL kernels on NV and even on AMD! (This is new to me; I found it last month. You can use AMDIL, although I haven't been able to get the clock cycle counter working yet.)
*Grid SDK: well, I'm interested in the frame capture APIs, not the cloud stuff. Relatedly, I see OGL support for NVENC is being implemented, so some update to NVENC would be good.
*OpenGL SDK: well, one seems overdue (showing advanced use cases of OGL 4.x features), and a tess sample was released early this year. A deferred+ sample would be good.
*Cg 3.2: I want GLSL 4.3 support integrated into Cg for some things I'm working on, and Cg 3.1 is almost one year old. I also think that whether Cg compute shader support gets implemented or not (as mentioned in the Cg language/runtime) will say a lot about whether Cg is dead or not. Also, what about bindless texes in Cg?
*CUDA 5.1: in late October I suggested to the NV team bringing CUDA up to par with OGL compute shaders, i.e. support for compressed texes, depth textures and MSAA textures (even depth ones). Note that some of these are in the OCL 1.2 exts released at SA 2012. Also, expose functionality similar to all the remaining new OCL 1.2 exts where support is available in HW or easy to do in the runtime, like terminate kernel, out-of-bounds stuff, memory initialization, etc.
One thing that I forgot at the time:
Expose atomic counters (which now ship in the OGL compute world) in CUDA and OCL (like AMD does in OCL). These are equivalent to atomadd(ptr,1) but an order of magnitude faster than global atomics, at least on Fermi (I don't know about Kepler), and they are the foundation of "hardware accelerated queues", aren't they? I remember that when NV readied the OGL 4.2 beta drivers atomic counters were slow, and then after a month or so they got a tremendous speedup and earned a special instruction exposed in the NV OGL assembly language.
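To make the queue idea concrete, here is a minimal CPU-side sketch in C++ of the pattern I have in mind: every producer reserves a slot by atomically incrementing one shared counter and then writes into the slot it got. This is only an analogy using std::atomic and std::thread, not GPU code and not any NVIDIA or OpenGL API; the names AppendQueue, fill_queue and all_slots_filled_once are made up for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical append queue (not an NVIDIA or OpenGL API): one shared
// counter hands out unique slots, the role an atomic counter would play.
struct AppendQueue {
    std::vector<int> items;
    std::atomic<std::size_t> count{0};

    explicit AppendQueue(std::size_t capacity) : items(capacity, -1) {}

    // Reserve a slot with fetch_add(1), the CPU analogue of atomadd(ptr,1).
    // Returns false when the queue is already full.
    bool push(int value) {
        const std::size_t slot = count.fetch_add(1, std::memory_order_relaxed);
        if (slot >= items.size()) return false;
        items[slot] = value;
        return true;
    }

    // Number of stored items; the raw counter can overshoot the capacity.
    std::size_t size() const {
        return std::min(count.load(), items.size());
    }
};

// Launch num_threads producers that each push per_thread distinct,
// non-negative values, then return how many items were stored.
std::size_t fill_queue(AppendQueue& q, int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread > 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&q, t, per_thread] {
            for (int i = 0; i < per_thread; ++i) q.push(t * per_thread + i);
        });
    }
    for (auto& w : workers) w.join();
    return q.size();
}

// True when every used slot was written (no -1 left) and no value
// appears twice, i.e. each reserved slot went to exactly one producer.
bool all_slots_filled_once(const AppendQueue& q) {
    const auto used = static_cast<std::ptrdiff_t>(q.size());
    std::vector<int> seen(q.items.begin(), q.items.begin() + used);
    if (seen.empty()) return true;
    std::sort(seen.begin(), seen.end());
    return seen.front() >= 0 &&
           std::adjacent_find(seen.begin(), seen.end()) == seen.end();
}
```

For example, fill_queue(q, 4, 250) on a 1000-slot queue should fill every slot exactly once. On the GPU the same reservation step is where a fast counter would pay off, since every appending thread hits that one increment.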
*CUDA compiler SDK: it seems to be going final, and I think it will be brought up to date with CUDA 5.1 or 6.0, whatever they end up naming the new CUDA release. (I hope it also gets up-to-date LLVM/Clang integration, so 3.2 and/or 3.3.)
*cuda.lang: well, I have wanted to play with this for a long time. The motivation is to bring a more offline compilation model to CUDA, like OpenCL's, and basically to avoid needing VS installed on Windows for realtime compilation of CUDA kernels. It could be useful for dynamic compilation of Optix shaders (like OpenRL) and also for NVIDIA research software like CUDAraster, VoxelPipe, etc.
*Shipping all the PhysX stuff from the last GTC and GDC into production:
->APEX 1.3 (brings realtime fracture support done entirely on the GPU to the existing RGB support)
->PhysX 3.3 (rigid bodies on the GPU and perhaps even fracture, like APEX)
I hope that at least by GDC, which comes later, we will get all of these in beta form.
One annoying thing, for me at least, is that PhysX GPU interop with graphics APIs isn't available (although it is in APEX). What annoys me is that APEX is PhysX under the hood, so please also expose the GPU buffers holding the simulation results of GPU modules like cloth, fluid and, soon, rigid bodies.
*Optix 3.1 preview: bring some GK110 perf improvements. The current Optix doesn't seem to exercise its full potential, judging from perf numbers on the NV forums (barely better than a GTX 680?).
*CUDA roadmap NDA discussion: well, it was a surprise anyway to see that NV invited me to an NDA discussion of the future roadmap at GTC (I hope saying so isn't itself under NDA :-)). I can't attend, but I hope they will be talking about how to expose unified CPU/GPU in CUDA and potentially a new ISA, sm_40?
*volume render solution
*OCL 1.2 in drivers: well, with the OCL 2.0 spec perhaps coming at Siggraph, isn't it time to implement OCL 1.2 in the NV drivers? In time with the new CUDA support?
*Nsight 3.0 final and 3.1 preview: after native GLSL debugging, what I want (really more than I need right now, but I will need it soon anyway) is VS2012 support, OGL 4.3 support with compute shaders and, my biggest desire, a unified host and device debugging experience like the one that ships in Nsight Eclipse Edition.
For the Eclipse edition I hope they add single-GPU debugging with software preemption, much like its older brother, and also OGL debugging. With that, GPU debugging would basically be perfect for me on Windows and Linux, and all that would remain is true GPU software preemption.