Sorry, this is a raw dump of my ideas:
Although we are a month away from the full storm, if we listen carefully we can already hear some thunder from that storm, known as Fermi, and from new software updates:
First, the basics: read about the graphics arch (Nvidia GF100) and the compute arch (Fermi)..
Also see the Deep Dive presentation, which has more perf charts than the PDF, at noticias3d.com or ..
Also, although it is not well known, there were two more Deep Dive sessions that were not much talked about, from the developer relations program, showing slide info about demos and Nexus graphics debugging (the first video I have seen of HLSL debugging, as a CUDA debugging video had already been posted).
Search on the cz page..
Tesla computing driver
GFX cards:
4x slower doubles?
As you will know, the graphics arch reveal showed revamped geometry power via parallel rasterizers (4 of them, so 4x perf) and 16x geometry power via replicating the geometry hardware 16 times..
Also, the geometry buffer and stream out buffers now use the L1/L2 caches (and atomics?), so they are much faster
and more general: at least some fixed-function hardware has been removed,
and the rasterizer has been generalized to work in parallel..
This makes a geometry-heavy game such as Crysis about 60% faster, which is not bad, and I also expect shader power to be close to a 2x increase..
I think of GF100 as 4 GPUs in one chip, each one a GPC that has all it needs..
Right now GTX 480 and 470 have H.264 MVC support (Blu-ray 3D; by the way, the HDMI 1.4 3D spec is open). Will it be exposed in DXVA or what? Also in CUVID, VDPAU and/or CUVEND?..
As you know, on the Mac GPU video encoding is supported by Elemental, and video decoding by a shit API (QTKit) which does not expose decoded frames as OpenGL textures or OpenCL image objects..
Elemental ships 2.2 with its own GPU decoding, so we have to see whether it is a CUVID using Snow Leopard APIs or whether it uses shaders..
Also, I have seen HDMI 1.4 outputs on Fermi, and this would be marvelous for interoperating the output
of 3D Vision with Sony 3D monitors (but which glasses would I use?)
Lastly, 3D Vision now has tri-SLI or quad-SLI support, and all the new 24-inch monitors are supported (3 or 4 right now). I have seen a 27-inch monitor from ASUS for early June, and panels with 3D Vision and touch support are being sampled, I think.. but remember that
YouTube 3D Vision support, windows support and browser integration are promised soon..
There are reports that claim
SA 2009 courses things learned:
SC 2009 courses things learned:
I3D 2010 things learned:
It would be perfect for a Fraps-style grabber for 3D Vision
One thing I'm sad it will not be is this will be of use for not halting the OS and also in
I hope Nvidia is working on it right now, at least for the near future this year..
I can't understand why that would not be the case..
1. Although this is not strictly Fermi related, the much-needed updates of OpenCL on Mac OS X and DirectCompute on Windows are coming within a month, I expect..
Direct3D SDK updates are much needed, some 5 months after the last update (which came 1.5 months before the Windows 7 launch); that is something like prehistory in this rapidly changing world :-)
I hope for a GDC 2010 release (so 6 months later), at least with important fixes for all known issues: double support, the CS library (FFT, scan), and other fixes reported on the XNA forums..
It would also be good if some of the samples shown at the Fermi Deep Dive session at CES were released, as they seem to be DirectX samples, such as the hair demo or the tessellated water demo.. AMD did the same with 5xxx code (search for "contributed by AMD" in the Direct3D SDK)..
Good ocean demos have also been shown: by Nvidia, an OpenCL port of the DirectCompute code, and by AMD in the SA 2009 OpenCL session.. it would be good to have these..
I also hope Nvidia ships more DirectCompute demos in GPU Computing SDK 3.0 final or beta2, which I hope will be released by Fermi time..
I also hope cuprintf, released two months ago, is integrated into the CUDA Toolkit or SDK and hopefully
ported to OpenCL for GPU printf debugging support (as said, AMD supports it on Linux on the CPU and it is coming to MSVC).. Anyway, I expect OCL support to be somewhat restricted due to no template support, etc..
I would port it to OCL, but anyway it is confidential stuff right now..
See more debugging later..
I also want to talk a lot more about CUDA SDK 3.0: ELF, cuda-memcheck, CUDA driver/runtime interop, etc.. but I will wait until the final PTX 2.0, 1.5 (OCL) and docs are updated..
As a checkpoint, it would be good to know how ECC and the L1/shared cache are configured and enabled..
I remember seeing something about ECC in the Control Panel of some Quadro 195 driver release..
but I don't know how the L1/shared mem cache is going to be configured (a parameter to nvcc? a CUDA API function? etc..)
10.6.3 is coming this month and has OpenGL 3.x support (well, 3.0 it seems). netkas claims it is not complete, as OpenGL Extensions Viewer doesn't report the required GLSL 1.50 support. I think this is because no info on GL 3.x context creation has been published, so it's not creating an advanced context, but the extensions are there.. Also, compared to 10.6.2 I see two more 3.2 extensions are supported, not bad.. I only hope they are two interesting ones and not DirectX helper extensions.. give me those plus uniform_buffer_object and TBO from 3.1 and I would be more than happy..
So I hope these are at least supported as extensions in the Nvidia driver or the AMD 5xxx driver..
netkas seems to be reporting the software renderer's extensions..
Oh boy, if only Apple cared less about a stable platform and delivered GPU extensions as fast as they come on Windows and Linux, it would be perfect. I don't care about OpenGL 3.x being implemented in software; that seems as mad a situation as if Microsoft cared about the DirectX reference rasterizer for running actual games (ahem, it has WARP..)
If not, at least expect 3.1 complete by summer (= 10.6.4 or 10.6.5) and perhaps 3.2 by the end of this year.. so it seems 3.2 will be complete this year..
I hope by that time to also have the optional 3.2 extensions:
GL_ARB_draw_buffers_blend
GL_ARB_sample_shading
GL_ARB_texture_cube_map_array
GL_ARB_texture_gather
GL_ARB_texture_query_lod
or at least
GL_ARB_sample_shading
GL_ARB_texture_cube_map_array
GL_ARB_texture_gather
which would be good enough for me.
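A minimal sketch of how one could check a driver's extension string against the wish list above. The sample string is made up for illustration (it is not real 10.6.3 output) and `missing_extensions` is a hypothetical helper name; on a real system the names would come from `glGetString(GL_EXTENSIONS)`, or from `glGetStringi` in a 3.x core context, or from a tool like OpenGL Extensions Viewer.

```python
# Wish list taken from the post: optional extensions on top of OpenGL 3.2.
WANTED = [
    "GL_ARB_draw_buffers_blend",
    "GL_ARB_sample_shading",
    "GL_ARB_texture_cube_map_array",
    "GL_ARB_texture_gather",
    "GL_ARB_texture_query_lod",
]


def missing_extensions(extension_string, wanted=WANTED):
    """Return the wanted extensions absent from a space-separated list."""
    # Split into whole tokens: a plain substring test would wrongly accept
    # a longer name that merely starts with the one we are looking for.
    exposed = set(extension_string.split())
    return [name for name in wanted if name not in exposed]


# Hypothetical sample, NOT real driver output.
sample = "GL_ARB_texture_gather GL_ARB_sample_shading GL_EXT_texture_buffer_object"
print(missing_extensions(sample))
```

Comparing whole tokens rather than substrings matters here because extension names often share prefixes.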
The news is that WWDC will show 10.7.0, and if you remember, the 2008 seed had GT200 support, so perhaps 3.2 complete and Fermi support will at least be in the 10.7.0 WWDC seed..
Also, although it is a bit premature, it would be good if the initial 5xxx support, and the Fermi support hopefully coming this year, also added the new shader 5.0 extensions (more later);
for me it would be perfect, similar to Leopard getting in 10.5.2 a lot of the new G80 extensions on Nvidia (geo shaders, transform feedback, etc..) ..
OpenCL for MacOS: FFT library perf fixes; I also expect some improvements such as double support for Nvidia on GT2xx cards and ATI image support. At least this is where I would put my effort if I were Apple.. Still, the bad thing is that Apple has no 5xxx support, and AMD 4xxx cards don't have true local mem, but this could change fast if the rumors are true of a Mac Pro shipping this month or next with 24 hardware threads (2x 6-core 32nm Westmere) and hopefully a 5xxx card as an option, so perhaps good..
Before leaving MacOS, I also expect CUDA updates for 3.x:
Talking about CUDA on MacOS:
you have cuda-memcheck
cuda-gdb coming soon.. will it add OpenCL at that time too?
CUDA 64-bit support (for 3.x)
efficient CUDA/OpenGL interop (not expected, but it could happen)
It would also be good if hackintosh users could use Fermi with CUDA 3.0 on MacOS,
i.e. if cuda.kext exposes access to it..
Also remember that Fermi support will not be complete in the 3.0 release, well, at least unless a beta2 is released in March and 3.0 is delayed until June/summer..
so expect a lot more in 3.1 and perhaps some minor things in 3.2
If you did not follow the GT200 intro: 2.0 had double support and shared mem atomics, but until 2.2 we didn't have mapped (zero-copy) host pinned mem, a GT200 feature..
Among the things said not to be present at first are support for recursion and, I think, also virtual function calls and function pointers, but I could be proven wrong..
Of course these hardware features are supported on their own or since the beta: 8x faster doubles, 10x faster context switching, atomics and caches on their own, and concurrent kernels and dual DMA since the beta.. these last two using
Talking about OpenCL:
I expect Nvidia 200.x drivers to add support for the DirectX sharing extensions (see GDC 2010).
cl_nv_d3d9_sharing
cl_nv_d3d10_sharing
cl_nv_d3d11_sharing
are published in the Khronos OpenCL registry
Also, in 196.21 I see some d3d10 functions..
but that seems crazy, as AMD has its own DirectX extensions..
a common khr_dx extension would be good..
Also 3D image writes for Fermi, and perhaps the half extension for all cards..
Talking about OpenGL:
By the way, judging by Nvidia's GDC 2010 plans, it seems WebGL is launching (final spec) at GDC, and I also expect some updates to OpenGL: a bunch of EXT extensions and NV/AMD extensions supporting the new D3D 11 hardware..
Well, for now we have two undocumented extensions shipping in the 196 driver: nvx_meminfo and wgl_dx_interop..
ATI has also added GL_AMD_shader_stencil_export and
GL_AMD_seamless_cubemap_per_texture support in 10.1, but these are documented in Khronos
(GLX_INTEL_swap_event was also added).
This last one is interesting for asynchronous glutSwapBuffers, with events for querying when the swap completes instead of waiting for vsync or similar..
I also hope that, similarly, VDPAU will come with efficient GLX interop, since right now it has some overhead, and perhaps these last extensions can help..
First see GPU Computing tools:
Regarding hardware debugging: lots of news.
See:
With all these references you know:
For Windows you have Parallel Nsight (codename Nexus), which supports GFX and compute debugging, profiling and API tracing, all integrated in Visual Studio 2008(?).. (at least for now it seems to support CUDA C and HLSL for DirectX 9/10)..
The problem is that there is no Windows XP support, so this platform..
Of course Direct3D 11 is upcoming, and we hope OpenCL and GLSL too, but that may come some time later..
Also, the release (beta?) is targeted for Q1 2010..
Nvidia names Pro version with Direct
On other OSes you have the Visual Profilers for CUDA/OCL on Linux/MacOS..
With that, you know that cuda-gdb already has Fermi hardware support and is soon getting support for MacOS and also for OpenCL.. Use it with DDD or Emacs and you have for other
Recapitulating earlier posts:
Solaris and FreeBSD support for CUDA is working; PGI using the Nouveau stack..
GPU Computing book and programming gems
Raytracing:
Well, you have OptiX 2.0 beta1, which now supports GeForce, and Fermi optimizations are promised soon..
CES videos show the frame rate going from 0.23 to 0.67 for a complex demo..
Now don't pred
It's also curious how they now claim that the cache helps a lot: Nvidia claims a 3x improvement over GT200. Well, the architectural perf increase has to be discounted by the core count (240 vs 512) and by clock speed differences, if any, so it seems to me no more than a 30% increase in perf per core per clock due to caches, which in turn agrees with
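The per-core, per-clock estimate can be sketched as a small calculation. This is only back-of-the-envelope arithmetic using the figures quoted in the post (3x claimed, 240 vs 512 cores); `per_core_per_clock_gain` and `clock_ratio` are names introduced here for illustration, and the real clock ratio is unknown. With equal clocks the leftover gain comes out at roughly 1.4x, so a figure near 30% would correspond to Fermi clocks around 8% higher than GT200.

```python
def per_core_per_clock_gain(speedup, cores_old, cores_new, clock_ratio=1.0):
    """Speedup left over once core count and clock differences are divided out.

    clock_ratio is new_clock / old_clock; 1.0 means equal clocks (an assumption).
    """
    return speedup / ((cores_new / cores_old) * clock_ratio)


# Figures from the post: claimed 3x over GT200, 240 vs 512 cores.
equal_clocks = per_core_per_clock_gain(3.0, 240, 512)
print(f"gain per core per clock at equal clocks: {equal_clocks:.2f}x")

# Clock ratio at which the per-core, per-clock gain would be exactly 1.30x.
ratio_for_30_percent = 3.0 / ((512 / 240) * 1.30)
print(f"clock ratio that gives 1.30x: {ratio_for_30_percent:.2f}")
```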
CUDA multicore:
Well, that hurts me, as this is one of the true strengths of OpenCL right now, and Nvidia seems to have abandoned it, as the initial work was not very good (MCUDA): download it and you will see a lot of restrictions (no texture support)
And how is AMD doing:
Well, see the OpenGL extensions
well with this you can at least check the diff between what I claim and
Catalyst 10.2 RC2 reveals that AMD is going the route of exposing extensions as EXT ones, so Fermi and AMD will interoperate, and I hope the Heaven OpenGL demo with tessellation for Linux (Windows support too?) is Fermi capable because of that. But there also seems to be no support until March/April 2010 (10.3 or 10.4), as 10.2 has not exposed it..
GL_AMD_gpu_shader5
GL_AMD_conservative_depth
GL_EXT_texture_buffer_object_rgb32
GL_EXT_gpu_shader_fp64
GL_EXT_tessellation_shader
GL_EXT_shader_subroutine
GL_EXT_gpu_shader5
These are found in that driver, and as you see
GL_EXT_gpu_shader5 and GL_AMD_gpu_shader5 seem similar, so
there are no interesting AMD-only extensions except
stencil shader write, GL_AMD_conservative_depth
and AMD random access target..
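The EXT vs AMD overlap can be made explicit with a small sketch that groups the names listed above by feature. The grouping is purely by name (`group_by_feature` is a hypothetical helper), so a shared name only suggests, and does not prove, that the two extensions expose the same functionality.

```python
from collections import defaultdict

# Extension names as listed in the post for Catalyst 10.2 RC2.
EXTENSIONS = [
    "GL_AMD_gpu_shader5",
    "GL_AMD_conservative_depth",
    "GL_EXT_texture_buffer_object_rgb32",
    "GL_EXT_gpu_shader_fp64",
    "GL_EXT_tessellation_shader",
    "GL_EXT_shader_subroutine",
    "GL_EXT_gpu_shader5",
]


def group_by_feature(names):
    """Map feature name -> sorted list of vendor prefixes exposing it."""
    groups = defaultdict(list)
    for name in names:
        # "GL_AMD_gpu_shader5" -> api "GL", vendor "AMD", feature "gpu_shader5"
        _api, vendor, feature = name.split("_", 2)
        groups[feature].append(vendor)
    return {feature: sorted(vendors) for feature, vendors in groups.items()}


groups = group_by_feature(EXTENSIONS)
duplicated = {f: v for f, v in groups.items() if len(v) > 1}
amd_only = sorted(f for f, v in groups.items() if v == ["AMD"])
print("exposed under more than one prefix:", duplicated)
print("AMD-only:", amd_only)
```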
CAL is at build 55x now
and in March, at build 6xx, comes the final OpenCL SDK
You can find these in 10.2:
Hull shader(s) were not successfully compiled before glLinkProgram() was called. Link failed.
Domain shader(s) were not successfully compiled before glLinkProgram() was called. Link failed.
gl_FragStencilRefAMD
subroutineEXT uniform
I have found some of these in the Nvidia driver too, so it seems cross-vendor D3D 11 OGL extensions are
coming soon (Nvidia on launch day and ATI at GDC, or April or May, I hope)
Hopefully the Ubuntu 10.04 AMD driver (fglrx 10.4 beta), shipping in mid March, also adds OpenCL support in the driver, so no more SDK with OpenCL.so. It would be perfect if they could also ship image support, production OGL interop and byte_addressable_store.. assuming local and global atomics are production quality, I don't know.. I also hope that, as VGA arbitration is supported, I can have AMD and Nvidia GPUs working simultaneously and OpenCL detecting two platforms.. A dream come true :-)
Also from GDC 10:
Nexus:NVIDIA's New Game Development Environment: NVIDIA Parallel Nsight
http://developer.nvidia.com/nsight
It seems APEX tools are coming (details announced at GDC 09)
and for Tegra profiling, PerfHUD ES is coming..
Lastly, not related, but talking about the iPad and MacOS in general..
first, a Mac Pro is said to be coming with 5xxx, and also a 10.7 seed in June and touch iMacs coming..
Friday, 5 February 2010
A long report of the silence before the storm: AKA a month before Fermi..
Posted on 07:29 by Unknown