GPU computing Stay up to date in OpenCL, DirectCompute, CUDA, CAL and OpenGL information


Thursday, 18 February 2010

Learned from the voxel rendering demo code: found the CUDA 3.0 function for changing the cache configuration (for Fermi)!

Posted on 12:21 by Unknown
It's in the voxel code, in this file:
\efficient-sparse-voxel-octrees\src\framework\base\dllimport.inl
The function is cuFuncSetCacheConfig, imported as:
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuFuncSetCacheConfig, (CUfunction hfunc, CUfunc_cache config), (hfunc, config))
The same file imports other functions I didn't know about:
cuGraphicsSubResourceGetMappedArray
cuGetExportTable
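
For context, here is a minimal sketch of what the cache preference is expected to mean on Fermi, where each multiprocessor has 64 KB of on-chip memory shared between shared memory and L1 cache. CachePref, OnChipSplit and fermiSplit are hypothetical stand-ins written so the sketch compiles without the CUDA toolkit. The enum is assumed to mirror the driver's CU_FUNC_CACHE_PREFER_NONE, CU_FUNC_CACHE_PREFER_SHARED and CU_FUNC_CACHE_PREFER_L1, and the real call appears only in a comment.

```cpp
#include <cassert>

// Hypothetical mirror of the driver API's CUfunc_cache enum, so this
// compiles without cuda.h. Values are assumed to match the real enum.
enum class CachePref { None = 0, Shared = 1, L1 = 2 };

// How the 64 KB of on-chip memory per multiprocessor is divided.
struct OnChipSplit {
    int sharedKB;
    int l1KB;
};

// Expected Fermi split for a given preference. The driver treats the
// setting as a preference, not a guarantee; None is assumed to leave
// the default of 48 KB shared / 16 KB L1 in place.
constexpr OnChipSplit fermiSplit(CachePref pref) {
    return pref == CachePref::L1 ? OnChipSplit{16, 48}
                                 : OnChipSplit{48, 16};
}

// Self-check: every split must account for the full 64 KB.
inline void checkSplits() {
    assert(fermiSplit(CachePref::None).sharedKB + fermiSplit(CachePref::None).l1KB == 64);
    assert(fermiSplit(CachePref::L1).sharedKB + fermiSplit(CachePref::L1).l1KB == 64);
}

// With the real driver API the request would look like:
//   cuFuncSetCacheConfig(hfunc, CU_FUNC_CACHE_PREFER_L1);
// where hfunc is a CUfunction obtained from cuModuleGetFunction.
```

A kernel that leans on global memory loads would probably ask for the L1-heavy split, while one that stages data in shared memory would keep the default.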

They also don't use GLEW and do the initialization themselves.
Other tricks:
CPU trick:
// Force the main thread to run on a single core.
SetThreadAffinityMask(GetCurrentThread(), 1);
GPU trick (context creation flags):
flags |= CU_CTX_SCHED_SPIN; // use sync() if you want to yield
#if (CUDA_VERSION >= 2030)
flags |= CU_CTX_LMEM_RESIZE_TO_MAX; // reduce launch overhead with large localmem
#endif
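What does CU_CTX_LMEM_RESIZE_TO_MAX do? As far as I can tell, it tells the driver to keep local memory at its largest size instead of shrinking it after a kernel, which matches the comment about launch overhead.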
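
A small sketch of the version gate used above, written as run-time code so it can be tried without the toolkit. The two constants are stand-ins assumed to match CU_CTX_SCHED_SPIN and CU_CTX_LMEM_RESIZE_TO_MAX in cuda.h, and decodeCudaVersion and contextFlags are hypothetical helper names, not part of the demo or the driver API.

```cpp
#include <cassert>

// Stand-ins for CU_CTX_SCHED_SPIN and CU_CTX_LMEM_RESIZE_TO_MAX so this
// compiles without cuda.h; values are assumed to match the real header.
constexpr unsigned kCtxSchedSpin = 0x01;
constexpr unsigned kCtxLmemResizeToMax = 0x10;

struct CudaVersion {
    int majorVer;
    int minorVer;
};

// CUDA_VERSION is major * 1000 + minor * 10, e.g. 2030 means 2.3.
constexpr CudaVersion decodeCudaVersion(int cudaVersion) {
    return CudaVersion{cudaVersion / 1000, (cudaVersion % 1000) / 10};
}

// Same decision as the #if in the demo, made at run time: always spin,
// and add the local memory flag from CUDA 2.3 on.
constexpr unsigned contextFlags(int cudaVersion) {
    return kCtxSchedSpin |
           (cudaVersion >= 2030 ? kCtxLmemResizeToMax : 0u);
}

// Quick self-check of both helpers.
inline void checkContextFlags() {
    assert(decodeCudaVersion(3000).majorVer == 3);
    assert(contextFlags(2020) == kCtxSchedSpin);
}
```

The resulting value is what would be passed as the flags argument when creating the context.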

The voxel raycasting demo also has good code for stereo OpenGL rendering (for Quadros!) and GUI controls, and good multisampling code.


You can also see which functions were added in each CUDA version from 2.2 on (CUDA_VERSION 2020 is 2.2, 2030 is 2.3, 3000 is 3.0):
#if (CUDA_VERSION >= 2020)
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuDriverGetVersion, (int *driverVersion), (driverVersion))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuMemHostAlloc, (void **pp, size_t bytesize, unsigned int Flags), (pp, bytesize, Flags))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuMemHostGetDevicePointer, (CUdeviceptr *pdptr, void *p, unsigned int Flags), (pdptr, p, Flags))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuFuncGetAttribute, (int *pi, CUfunction_attribute attrib, CUfunction hfunc), (pi, attrib, hfunc))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuTexRefSetAddress2D, (CUtexref hTexRef, const CUDA_ARRAY_DESCRIPTOR *desc, CUdeviceptr dptr, unsigned int Pitch), (hTexRef, desc, dptr, Pitch))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuWGLGetDevice, (CUdevice *pDevice, HGPUNV hGpu), (pDevice, hGpu))
#endif

#if (CUDA_VERSION >= 2030)
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuMemHostGetFlags, (unsigned int *pFlags, void *p), (pFlags, p))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGLSetBufferObjectMapFlags, (GLuint buffer, unsigned int Flags), (buffer, Flags))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGLMapBufferObjectAsync, (CUdeviceptr *dptr, unsigned int *size, GLuint buffer, CUstream hStream), (dptr, size, buffer, hStream))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGLUnmapBufferObjectAsync, (GLuint buffer, CUstream hStream), (buffer, hStream))
#endif

#if (CUDA_VERSION >= 3000)
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuMemcpyDtoDAsync, (CUdeviceptr dstDevice, CUdeviceptr srcDevice, unsigned int ByteCount, CUstream hStream), (dstDevice, srcDevice, ByteCount, hStream))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuFuncSetCacheConfig, (CUfunction hfunc, CUfunc_cache config), (hfunc, config))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsUnregisterResource, (CUgraphicsResource resource), (resource))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsSubResourceGetMappedArray, (CUarray *pArray, CUgraphicsResource resource, unsigned int arrayIndex, unsigned int mipLevel), (pArray, resource, arrayIndex, mipLevel))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsResourceGetMappedPointer, (CUdeviceptr *pDevPtr, unsigned int *pSize, CUgraphicsResource resource), (pDevPtr, pSize, resource))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsResourceSetMapFlags, (CUgraphicsResource resource, unsigned int flags), (resource, flags))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsMapResources, (CUgraphicsResource *resources, CUstream hStream), (resources, hStream))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsUnmapResources, (unsigned int count, CUgraphicsResource *resources, CUstream hStream), (count, resources, hStream))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGetExportTable, (const void **ppExportTable, const CUuuid *pExportTableId), (ppExportTable, pExportTableId))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsGLRegisterBuffer, (CUgraphicsResource *pCudaResource, GLuint buffer, unsigned int Flags), (pCudaResource, buffer, Flags))
FW_DLL_IMPORT_RETV( CUresult, CUDAAPI, cuGraphicsGLRegisterImage, (CUgraphicsResource *pCudaResource, GLuint image, GLenum target, unsigned int Flags), (pCudaResource, image, target, Flags))
#endif
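
To show roughly how an import macro of this shape can work, here is a minimal sketch. It is not the demo's actual FW_DLL_IMPORT_RETV: the macro SKETCH_DLL_IMPORT_RETV, the symbol table standing in for GetProcAddress, the error code kSymbolMissing and the fake driver function are all hypothetical, and the calling-convention argument is left out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Generic function-pointer type used to store any imported symbol.
using AnyFn = void (*)();

// Hypothetical stand-in for GetProcAddress: a name -> pointer table.
inline std::map<std::string, AnyFn>& symbolTable() {
    static std::map<std::string, AnyFn> table;
    return table;
}

// Returns the registered pointer, or nullptr if the symbol is unknown.
inline AnyFn resolveSymbol(const char* name) {
    assert(name != nullptr);
    auto it = symbolTable().find(name);
    return it == symbolTable().end() ? nullptr : it->second;
}

// Hypothetical error code returned when a symbol cannot be resolved.
constexpr int kSymbolMissing = -1;

// Defines NAME as a wrapper that looks the symbol up on each call and
// forwards its arguments; PARAMS is the parameter list, PASS the call list.
#define SKETCH_DLL_IMPORT_RETV(RET, NAME, PARAMS, PASS)             \
    inline RET NAME PARAMS {                                        \
        using Fn = RET (*) PARAMS;                                  \
        Fn fn = reinterpret_cast<Fn>(resolveSymbol(#NAME));         \
        return fn ? fn PASS : static_cast<RET>(kSymbolMissing);     \
    }

// One import line, in the same style as the listing above.
SKETCH_DLL_IMPORT_RETV(int, sketchDriverGetVersion, (int* driverVersion), (driverVersion))

// Fake "driver" implementation that reports version 3000 and succeeds.
inline int fakeDriverGetVersionImpl(int* driverVersion) {
    *driverVersion = 3000;
    return 0;
}

// Simulates the DLL being loaded by publishing the fake symbol.
inline void registerFakeDriver() {
    symbolTable()["sketchDriverGetVersion"] =
        reinterpret_cast<AnyFn>(&fakeDriverGetVersionImpl);
}
```

Before registerFakeDriver() runs, sketchDriverGetVersion returns kSymbolMissing. Afterwards it forwards to the fake implementation. A pattern like this would let one binary start against an older driver and fail only on the calls that driver lacks.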

The demo currently fails with the Tesla Compute Cluster (TCC) driver.
In CudaModule::staticInit(void), replace this:
checkError("cuGLCtxCreate", cuGLCtxCreate(&s_context, flags, s_device));
with:
if (tcc) // tcc: your own flag, true when running on the TCC driver
{
    // No OpenGL under TCC, so create a plain compute context.
    checkError("cuCtxCreate", cuCtxCreate(&s_context, flags, s_device));
    //res = cuGLInit();
}
else
{
    checkError("cuGLCtxCreate", cuGLCtxCreate(&s_context, flags, s_device));
}
cuGLInit is perhaps needed, but it is deprecated anyway.
Was it folded into cuInit(0), or should it be called after cuCtxCreate?

If TCC were smarter, this would just work by falling back
to interop through host memory, as CUDA already does. Instead, I think
all CUDA GL functions simply return an error under TCC.
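
The fallback idea can be sketched as try-the-GL-path-first logic, which is an alternative to the explicit tcc flag rather than what the demo does. The two callables are hypothetical wrappers around cuGLCtxCreate and cuCtxCreate that return true on success, and ContextPath and createContextWithFallback are made-up names.

```cpp
#include <cassert>
#include <functional>

// Which context-creation path ended up being used.
enum class ContextPath { GLInterop, PlainCompute, Failed };

// Tries the GL interop context first and falls back to a plain compute
// context. createGL and createPlain are hypothetical wrappers around
// cuGLCtxCreate and cuCtxCreate that return true on success.
inline ContextPath createContextWithFallback(
        const std::function<bool()>& createGL,
        const std::function<bool()>& createPlain) {
    assert(createGL && createPlain);  // both callables must be set
    if (createGL()) {
        return ContextPath::GLInterop;
    }
    if (createPlain()) {
        return ContextPath::PlainCompute;
    }
    return ContextPath::Failed;
}
```

When the result is PlainCompute, the renderer would also have to stop asking for GL-backed buffers.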

Anyway, thanks to the good code the rest of the change is small:
in CudaRenderer::CudaRenderer(void), change Buffer::Hint_CudaGL to Buffer::Hint_None,
so:
: m_frameBuffer (NULL, 0, Buffer::Hint_None),//Buffer::Hint_CudaGL),