GPU computing Stay up to date in OpenCL, DirectCompute, CUDA, CAL and OpenGL information

Saturday, 10 July 2010

Some news!

Posted on 11:16 by Unknown
News:
*Source code for GPU Computing Gems 1 (or GPU Gems 4) is already available at gpucomputing.net.
The book is due in November.
Available right now:

Title (new, date)

A Programmable Graphics Pipeline in CUDA for Order Independent Transparency (1 new, 07-10-2010)
High Performance Iterated Function Systems (0 new, 07-02-2010)
CUDA Implementation of the Tree-based Barnes Hut n-Body Algorithm (0 new, 07-01-2010)
Connected Component Labeling in CUDA - demo+code (0 new, 06-30-2010)
A Practical Guide to Massively Parallel Monte Carlo Simulations: The Ising Model (0 new, 06-30-2010)
Parallel LDPC Decoding using CUDA (0 new, 06-30-2010)
Path Regeneration for Random Walks (0 new, 06-30-2010)
GPU Gems 4: Deformable Volumetric Registration using B-splines Source Code (0 new, 06-30-2010)
Monte Carlo Photon Transport on the GPU (0 new, 06-30-2010)
Lattice-Boltzmann Lighting Models - Source Code (0 new, 06-30-2010)
RNA folding GPU (0 new, 06-30-2010)
Haar Classifiers for Object Detection with CUDA: Pixel-parallel processing kernel (0 new, 06-29-2010)
Multiclass Support Vector Machine (0 new, 06-29-2010)
Parallelization of the x264 encoder using OpenCL (0 new, 06-21-2010)
Cone-Beam CT image reconstruction using the Katsevich Algorithm (0 new, 06-21-2010)
Line forward projection on CUDA (0 new, 06-11-2010)

It seems MareNostrum is getting a rack of Fermis, perhaps with IBM Power7.

So would Nvidia now have to publish a CUDA driver for the PowerPC architecture?

Or they could use PathScale with its fully open-source computing stack,
available here as a branch of nouveau:

http://github.com/pathscale/pscnv/commits/master
It seems an Nvidia driver with TCC support for Fermi, version 197.81, is on the IBM web site.

A Catalyst 10.8 beta seems to be available; 10.7 is coming on 21/7.


PhysX 3.0 is coming with CPU improvements:
*auto threading
*SSE enabled by default
Mafia has a new NVIDIA PhysX runtime driver: 10.04.02_9.10.0522.
Mueller has posted the paper on the Fermi launch demo, which uses water height fields plus particles.
Two other interesting papers from Nvidia Research are:

HLBVH: Hierarchical LBVH Construction for Real-Time Ray Tracing
PantaRay: Fast Ray-traced Occlusion Caching of Massive Scenes

A Hwu-based course from Stanford:
http://code.google.com/p/stanford-cs193g-sp2010/wiki/ClassSchedule

Two interesting conference programs are available:

PACT
has an Intel GPU paper, demystifying ..
and also Revisiting Sorting for GPGPU Stream Architectures,
which achieves nearly 500 Mkeys/s on GT200.



There is also a workshop on GPUs:
http://informatik.technikum-wien.at/gpusca/
but the web site doesn't work.

The Nineteenth International Conference on
Parallel Architectures and Compilation Techniques (PACT)
Vienna, Austria, September 11-15, 2010
Interesting papers:
Scalable Thread Scheduling and Global Power Management for Heterogeneous Many-Core Architectures
Dynamically Managed Multithreaded Reconfigurable Architectures for Chip Multiprocessors
WAYPOINT: Scaling Coherence to Thousand-core Architectures
Scalable Hardware Support for Conditional Parallelization
Less is More: Trading off Work-Efficiency for Scalability in Irregular Programs
Revisiting Sorting for GPGPU Stream Architectures
D. Merrill, A. Grimshaw
An Integer Programming Framework for Optimizing Shared Memory Use on GPUs
W. Ma, G. Agrawal
DMATiler: Revisiting Loop Tiling for Direct Memory Access
A Software-SVM-based Transactional Memory for Multicore Accelerator Architectures with Local Memory
Automatic Vector Instruction Selection for Dynamic Compilation
An OpenCL Framework for Heterogeneous Multicores with Local Memory

SC10

I would like to review these papers:
Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems
Parallel Fast Gauss Transform
Overlapping Methods of All-to-All Communication and FFT Algorithms for Torus-Connected Massively Parallel Supercomputers
The Multi-Scale Heart Simulation on Massively Parallel Computers
Using 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs
An 80-Fold Speedup, 15.0 TFlops, Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code
Exploiting 162-Nanosecond End-to-End Communication Latency on Anton
Strider: Runtime Support for Optimizing Strided Data Accesses on Multi-Cores with Explicitly Managed Memories
Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory
OpenMPC: Extended OpenMP Programming and Tuning for GPUs
Scalable Graph Exploration on Multicore Processors
The 48-core SCC processor: the programmer’s view
Exploring a Novel Gathering Method for Finite Element Codes on the Cell/B.E. Architecture
Reducing Multicore Bandwidth Requirements for Combinatorial Multigrid
Diagnosis, Tuning and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method
Scaling Hierarchical N-Body Simulations on GPU Clusters
Size Matters: Space/Time Tradeoffs to Improve GPGPU Applications Performance
The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches