GPU computing Stay up to date in OpenCL, DirectCompute, CUDA, CAL and OpenGL information

Thursday, 18 February 2010

A month of news!

Posted on 10:49 by Unknown
So here goes: all the random news I found interesting over the past month:
* AMD CAL libs coming to Mac? In PGI 10.2, pgaccelinfo accepts -ati and -amd to report info on ATI accelerators. The options are in the Mac release too, where it says libamdcalcl.dylib was not found, so it seems it is not working yet?
This would close the gap: standard OpenCL on three OSes, and both CUDA and CAL on three OSes as well.
Related news to remember: PGI was interested in using the Nouveau stack as the base for a GPU computing stack on OpenSolaris and FreeBSD, after Nvidia spoke about Solaris and demonstrated(?) it at GTC08, but that now looks dead.
On Windows, pgaccelinfo works if you copy aticalcl.dll to libamdcalcl.dll in its directory (and the same for calrt). So it really seems PGI has AMD CAL for Mac, since it links against the dylib, no?
I hope they don't spend too much energy on it, since OpenCL is a better target for the PGI accelerator model.
Perhaps it is worth mailing streamdeveloper at amd dot com to ask.
* Par4All allows automatic parallelization for CUDA, etc.
* After AMD's assembly matmul kernel achieved 1 TFLOP on 48xx hardware (58xx should reach 2 TFLOPS), we now have an assembly-optimized matmul for Nvidia with 10-20% better performance.
Search for "Hand-Tuned SGEMM on GT200 GPU, 10% ~ 20% improvement of SGEMM".
It reaches 512 GFLOPS on a GTX 285 -> 1 TFLOP matmul for Fermi? (Like Larrabee? See the ACM video.)
It comes with code and a report.
It also has a trick: using asm("") in a .cu kernel to include PTX works via nvcc thanks to Open64 features.
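As a rough aid for reading GFLOPS figures like these: multiplying an M x K matrix by a K x N matrix takes about 2*M*N*K floating-point operations (one multiply and one add per inner-product term), so a throughput number follows from the matrix sizes and a measured time. A minimal Python sketch; the function name and the sample timing are made up for illustration and are not taken from the SGEMM report:

```python
def sgemm_gflops(m, n, k, seconds):
    """Approximate GFLOPS of an (m x k) by (k x n) matrix multiply.

    Counts 2*m*n*k floating-point operations (multiply + add per term),
    the usual convention for SGEMM throughput figures.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    flops = 2.0 * m * n * k
    return flops / seconds / 1e9


# Hypothetical timing, not a measurement: a 4096^3 SGEMM finishing in 0.25 s.
print(round(sgemm_gflops(4096, 4096, 4096, 0.25), 1))
```

Note this counts useful arithmetic only; it says nothing about the hardware's peak, so comparing against a vendor's peak figure still needs that figure from the vendor.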
*Nvidia has released updated demo videos on YouTube: the "fluid demo" for the Fermi launch, Parallel Nsight (Nexus), and one on the Sled demo with an engineer talking about it.
* I still don't know whether some opencl.dll from Khronos works for Nvidia and AMD cards simultaneously.
Someone says the 2.01 opencl.dll works for both at once.
I don't know, but it seems AMD works with the Nvidia opencl.dll if I have the Tesla Computing driver.
Related: the Khronos ICD has been released. Things to remember: you can now write a compatible OpenCL ICD using the doc, and through the ICD some functions that cannot be resolved to a concrete platform, such as unload compiler, are "no operation".
Another thing: the 2.01 dll seems to have d3d10 and d3d9 interop functions(?), or these are obtained via the ICD, which supports functions not exported through it. I must check.
Also, Nvidia has d3d11 interop; what about AMD?
Also, the spec has some CVS links from Khronos for getting some code (ICD loader code?), so someone could mail Khronos (Jon Leech, for example) for the Khronos CVS ICD password.
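The "no operation" point can be pictured with a toy model. The sketch below is not the Khronos loader and calls no OpenCL API; ToyVendor, ToyIcdLoader and their methods are invented names. It only shows why a call that carries no platform (or object derived from one) gives a loader nothing to dispatch on:

```python
class ToyVendor:
    """Stand-in for one vendor implementation behind the loader (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.compiler_unloaded = False

    def unload_compiler(self):
        self.compiler_unloaded = True


class ToyIcdLoader:
    """Toy dispatcher: forwards calls to the vendor named in the call."""

    SUCCESS = 0  # models a success status code

    def __init__(self, vendors):
        self.vendors = list(vendors)

    def platform_names(self):
        # Enumerating platforms is the one place every vendor is visible.
        return [v.name for v in self.vendors]

    def unload_compiler_for(self, vendor):
        # A call that names its vendor can be forwarded.
        vendor.unload_compiler()
        return self.SUCCESS

    def unload_compiler(self):
        # No platform argument: there is no way to choose a vendor,
        # so the toy loader does nothing and reports success.
        return self.SUCCESS


loader = ToyIcdLoader([ToyVendor("vendor A"), ToyVendor("vendor B")])
loader.unload_compiler()
print([v.compiler_unloaded for v in loader.vendors])
```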
*cudpp now has triangular solvers from a 2010 paper.
Still waiting for the hash functions from the SA2009 paper to be added.
Also, a survey has been released asking where to devote more effort: double support, graph functions, etc.
*Bad article by Demerjian about Fermi:
http://www.semiaccurate.com/2010/02/17/nvidias-fermigtx480-broken-and-unfixable/
But Nvidia seems confident and set the clocks for Fermi this week, and it also seems the mid-range and other cards taped out some time ago.
*cusp is progressing towards dense math(?): it seems to have dense matmul and an LU solve.
* clGetGLContextInfoKHR is still not usable, although it is present in the 2.01 header (was it there before in 2.0?).
There is also some string for it in the Khronos ICD dll, but not really in the lib and dlls.
*Linux news:
Catalyst 10.2 has direct2d based acceleration; search Phoronix.
Also, Nouveau now has Gallium3D support in Fedora 3 (working OpenGL ES 2.0 and OpenVG state trackers?)
Heaven benchmark for Linux coming in March for GDC? A new version for sure (with Fermi support, it seems), as Catalyst 10.2 now exposes all the big SM5.0 features through EXT extensions: double support (ext_fp64), shader model 5.0 (ext_shader5), tessellation (ext_tessellation_shader).
Still no standard extensions for the new HDR texture compression: it is shipping but undocumented, and the same goes for random access targets.
*OpenGL and OpenVG demos:
Some nice code and tutorials found on the web:
->OpenGL geometry shader one-pass texture cubemap render (3 ways)
->OpenGL GEO culling: from 2.1 billion to 2 million; works on ATI and Nvidia; it is 3.2 code.
->Complex OpenVG demo from the SA 2009 Khronos presentation (animation)
->OpenGL uniforms vs texture objects.
->Hardware Tessellation on Radeon in OpenGL (geeks3d):
says there are two tessellators in the 5xxx extensions
->Mali SDK UI 2.3, Tegra Khronos SDK.
->Code from the Stanford iPhone GL ES course
http://www.khronos.org/news/multimedia/optimizing-opengl-for-iphone-stanford-university
-> OpenGL 3.2 samples:
http://nopper.tv/opengl_3_2.html
-> g-truc: OpenGL 3 Samples Pack 1.2.1 released
*Also, it seems WebGL released its spec at GDC09, along with some talks from Khronos. Firefox 3.7 will have it too; the roadmap plans it for mid-year, and it is now at alpha 2.
*I found an ACM video where Rattner at SC09 shows a Larrabee demo with matmul and sparse math.
More videos are from the AMD OpenCL PhD guy.
*It would be nice if OptiX got upgraded with:
->Breadth-first and packet ray compression via sorting, from the EG2010 paper. Does it improve the Timo Aila kernels used in OptiX?
It improves ray tracing shadow kernels 2x-4x.
->Sparse voxel raycasting from the I3D 2010 paper
->OpenRL compatibility; the differences look small.
Regarding I3D 2010, for me it only remains to look at stochastic transparency by Enderton.
Also, OpenRL is going to Khronos, as OpenCL from Apple did. I must check the similarities to OpenRT (the previous standard).
* It seems AMD drivers for Windows 7 have a bug in GDI mode:
The same article has some info on GDI being accelerated on XP and 7 but not on Vista.
Also, on 7 it is accelerated in Aero only.
GDI bug on the 5xxx series:
http://www.tomshardware.com/reviews/2d-windows-gdi,2547-15.html
AMD has supplied a hotfix, and it seems 10.2 WHQL doesn't contain it, so perhaps 10.3 or 10.4?
Good theory about GDI on Windows (disabled in Vista).
Download 2dbench from tomshardware to check performance.
* OpenNL 3.0 released, with CUDA numerical libraries (CNC; similar to CUSP?)
* The sparse voxel octree I3D 2010 paper is available, along with an extended NV tech report ('10 #1) with more photos and GTX 285 performance. Video and code are also available in the Google Code CUDA voxel raycasting project.
See the realtimerendering blog post.
* Tegra 2 full SDK:
it now has Android 2.1 images and the full Khronos SDK (Tegra Khronos SDK).
It also seems video compression via OpenMAX already works on Linux and Android?
* Current Catalyst is 10.2 (8.70.2) WHQL, and 8.70.3 is available (it only changes the OpenGL version; no CAL or D3D changes).
The 10.3 beta given to the press is ATI 8.71.3.
Now, about it:
info on the 3D hooks is needed, and it would be good if they enabled OpenGL quad-buffer stereo on Radeon.
Better still, an SDK with a sample of the D3D driver hooks, similar to 3D Vision as used in Avatar.
*There is a GPU-Z build with ATI OpenCL detection enabled; I don't know if it checks correctly or just reports OK.
*There is now GDC 2010 info from Nvidia at developer.nvidia.com, and from Intel for GDC 2010.
From Intel expect:
->GPA 3.0
You'll see in-depth, real-time demos of GPA 3.0, including the much anticipated advanced
thread/task timeline that helps optimize task-based threading. New features such as automated
summarization of your game engine’s performance on multi-core CPUs, the DirectX API, and the
GPU will have you breathing a sigh of relief. Platform performance analysis has finally
arrived.
->Intel C++ Compiler version 12 info
This session includes a review of the new automatic vectorization features in the upcoming
Intel C++ Compiler version 12.
->Tickertape
Shows a highly-threaded particle system with orientable quads — like paper in a parade. Particles are affected not only by gravity, but also by air resistance and wind.
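To make the Tickertape description concrete, here is a minimal Python sketch of one time step for a single particle under gravity, air resistance and wind. It is not Intel's code: the linear drag model, the drag constant and the function name are assumptions chosen only to illustrate the idea, and the orientable quads are left out.

```python
def step_particle(pos, vel, wind, dt, gravity=(0.0, -9.81, 0.0), drag=0.5):
    """Advance one particle by dt seconds (semi-implicit Euler).

    Assumed model: acceleration = gravity + drag * (wind - velocity),
    i.e. air resistance pulls the particle's velocity toward the wind.
    """
    acc = tuple(g + drag * (w - v) for g, w, v in zip(gravity, wind, vel))
    new_vel = tuple(v + a * dt for v, a in zip(vel, acc))
    new_pos = tuple(p + nv * dt for p, nv in zip(pos, new_vel))
    return new_pos, new_vel


# One piece of paper dropped from 10 m into a sideways wind, 60 steps per second.
pos, vel = (0.0, 10.0, 0.0), (0.0, 0.0, 0.0)
for _ in range(60):
    pos, vel = step_particle(pos, vel, wind=(2.0, 0.0, 0.0), dt=1.0 / 60.0)
print(pos, vel)
```

Each particle is updated independently of the others, which is what makes this kind of system easy to spread across many threads.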
*The book "Programming ..." by Kirk has been released; it is a CUDA book.
Materials are here:
http://www.elsevierdirect.com/companion.jsp?ISBN=9780123814722
There is also a 3-chapter sample.
*On the Khronos site I found an OpenCL NVIDIA build dated 2010-02-03.
Released soon?
Also an ARM Cortex-A9 one:
Samsung Electronics 2010-02-03 OpenCL_1_0
Embedded Linux System with SAMSUNG OpenCL Library with OpenCL running on a ARM Cortex-A9
MPCore CPU.
* Realistic Demo Crymod: Widet2_Benchmark_alpha.7z
*From Caustic Graphics:
"due to be released in March"

OpenRL™ SDK Public BETA Registration

Caustic Graphics is about to achieve our next major milestone in bringing cinema quality graphics to every display. We are introducing our OpenRL SDK V1.0 restricted BETA release this week, which is the first implementation of our Open Ray Tracing Language (OpenRL) specification. The OpenRL SDK also includes our new OpenRL shading language (RLSL), which is based on GLSL and provides run-time compiled programmable shaders for ray tracing.

Similar to OpenGL for rasterization, the OpenRL specification is a framework for writing ray tracing applications that execute across heterogeneous compute platforms. Today there is no open standard, cross-platform API for ray tracing. Consequently developers must program their ray tracing applications "to the metal" or accept “vendor lock-in” by using a proprietary closed standard that is limited to a specific subset of hardware.

Later this year, we will be proposing the OpenRL specification as an open standard to the non-profit technology consortium, the Khronos Group. Moreover, we will actively solicit and support the introduction of third-party implementations of OpenRL. In the meantime, we are pleased to introduce the first implementation of the OpenRL specification, which we are calling the OpenRL SDK.

Some quick facts and features slated for the OpenRL SDK:
OS support for Windows, Mac OS X, and Linux;
Uses all OpenCL-based GPUs (e.g., AMD, nVidia, S3) and x86 CPUs (AMD, Intel) simultaneously;
Adding more compute delivers an immediate and nearly linear performance boost;
Plugging in one or more CausticOne or CausticTwo cards delivers the ultimate in ray tracing acceleration.
Target markets include but are not limited to, Film, Video, Games, Transportation, Education, Consumer Products, Architecture, Engineering, and Construction.

We would like to invite you to participate in our OpenRL BETA public program, slated for release this quarter. The OpenRL SDK Public BETA program will include free access to our developer forum where you can post your questions and answers to the OpenRL SDK, RLSL, CausticOne and CausticTwo.

Fill out the form below. Upon release we will send you an email with instructions to download the OpenRL SDK.

P.S. - For those of you who signed up for the CausticRT Emulator, well don't fret. The OpenRL SDK name supersedes CausticRT and CausticGL, whose names will be retired upon release of the production version of the OpenRL SDK.
So it is OpenCL-based and being submitted to Khronos.
S3 support intrigues me, as no driver supports it?

*gDEBugger 5.5 is out, with new AMD support for performance counters (Catalyst 9.12 and up).
Also, gDEBugger CL goes into beta soon.
*ATI OpenCL 2.01 released.
At least it fixes the pcchen 8 - knights demo.
The bugs hit by the Apple FFT code are still not fixed, though reportedly fixed internally by AMD.
I still don't know whether OpenMM on OpenCL is fixed, nor about early pyrit builds, which now have countermeasures.

*10.6.3: check OpenGL 3.2, Nvidia doubles, CL images on ATI, and ATI CAL.

RAW:
Catalyst 10.2 and 10.3 news (8.71.3): 3D quad-buffer for D3D (can quad-buffer 3D for OpenGL be enabled via OCL/DX/OGL interop?)
58xx xbvau does not work, but a patch similar to the one for the earlier 4xxx card bug will fix it
fglrx 10.4 Ubuntu driver fixed by then.
PGI 10.2 pgaccelinfo has CAL info and libamdcal.dylib not found (AMD has CAL for Mac?)
gdc eyefinity sdk?

Catalyst 10.2 has 181 GL extensions!

3 new, 1 EXT, 2 ARB:
GL_ARB_blend_func_extended - more enhancements to blending? What's left in DX10/11 that OGL doesn't have?

GL_ARB_fragment_coord_conventions - DX9 compatibility (wasn't this in OpenGL 3.2?!? still missing transform_feedback2). It was not in the 9.12 hotfix.

GL_EXT_texture_buffer_object_rgb32 - this one is interesting as GL_ARB_texture_buffer_object already lists all the RGBA32 F, I, and UI.
Note: I saw it in the Fermi 195 drivers.
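Spotting new extensions between two driver releases is easy to automate if you have saved the space-separated extension string from each one. A minimal Python sketch, assuming that input format; the function name is made up and the two strings are tiny hand-written excerpts, not real driver dumps:

```python
def new_extensions(old_ext_string, new_ext_string):
    """Return extension names present in the new string but not in the old one."""
    old = set(old_ext_string.split())
    new = set(new_ext_string.split())
    return sorted(new - old)


# Hand-written excerpts for illustration only.
older = "GL_ARB_texture_buffer_object GL_AMD_shader_stencil_export"
newer = (
    "GL_ARB_texture_buffer_object GL_AMD_shader_stencil_export "
    "GL_ARB_blend_func_extended GL_ARB_fragment_coord_conventions "
    "GL_EXT_texture_buffer_object_rgb32"
)
for name in new_extensions(older, newer):
    print(name)
```

Swapping the arguments lists extensions that were dropped instead.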

Also I note that 2 amd extensions have been documented:
http://www.opengl.org/registry/specs/AMD/seamless_cubemap_per_texture.txt - when did this get added?
http://www.opengl.org/registry/specs/AMD/shader_stencil_export.txt - from 10.1

Wonder how far away we are from GL 3.3. Still haven't seen DX11 stuff yet, but they must be working on it!

Can't see any sign of the rumored (or under NDA) per-game application profile support yet in CCC. Supposed to be in 10.2...

Tesla computing driver released, 19.628, 64-bit Windows 2008 R2: OpenCL support? Nexus with ATI? Compute exclusive timeout
*Still no compiler and no doubles in the February 2010 DirectX SDK
*Fermi: 4x slowdown for doubles