L3DT users' community
Large 3D terrain generator


It doesn't hurt to ask...


Postby metalliandy » Wed Jan 12, 2011 1:18 pm

Hi Aaron,

I was reading the other thread that mentioned this, but thought it might be a good idea to start my own so it could be more focused :)

aaron wrote:Hi Monks,

Sorry, nothing to report yet, and I probably won't have anything for quite a while. In order of optimisations, I'll do yet more multithreading first, then SSE, then CUDA (maybe). The more reading I do on the subject, the more I've realised that CUDA is only applicable to a very narrow band of problems. Somewhat perversely, of all the calculations in L3DT, only the relatively 'fast' light-map generation algorithm might fit into the CUDA niche of massively parallel repetitive non-forking operations on long vectors of floating point data. The 'slow' algorithms (erosion, water table flooding, AM generation, TX anti-aliasing) are not readily suitable to CUDA, although I should be able to make some headway here with more multithreading and/or SSE.


I was wondering if there was any u-turn on Cuda implementation?

xNormal implemented the OptiX renderer for normal and ambient-occlusion (light) maps with some stunning results and a massive increase in speed (even when compared to an i7 920).
Another thing that might be worth a try for faster AA on all maps is to render the maps at double the selected resolution with 0xAA and then resize them by half (which gives you 4xAA).
This is a much-used technique in game development and is often much faster than just rendering 4xAA at native resolution.
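The render-at-2x-then-halve trick can be sketched in a few lines of NumPy (a minimal illustration of the idea, not L3DT code; `downsample_2x` is a made-up helper name). Averaging each 2x2 block of a double-resolution render gives four samples per output pixel, i.e. ordered-grid 4x supersampling with a box filter:

```python
import numpy as np

def downsample_2x(img):
    """Average each 2x2 block of a (H, W, C) image rendered at double
    resolution, yielding a native-resolution image with 4 samples per
    output pixel (4x ordered-grid supersampling, box-filter resolve)."""
    h, w, c = img.shape
    assert h % 2 == 0 and w % 2 == 0
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))
```

The resolve is a single uniform pass with no data-dependent branching, which is why it maps so well to GPUs.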

Thanks for listening :)

Postby Aaron » Thu Jan 13, 2011 12:51 am

Hi Metalliandy,

I tend to agree with my earlier self. L3DT's most time-consuming calculations aren't readily suitable for CUDA. Taking the example of texture anti-aliasing: in L3DT the degree of anti-aliasing may be different for each land type in a map; rocks typically use low anti-aliasing to give sharp edges, whereas grass and sand have high anti-aliasing to give smooth transitions. The tricks used for super-fast GPU anti-aliasing in games optimise away this sort of computationally expensive flexibility.

Regarding further optimisations; once I've multithreaded everything that can be multithreaded, and applied SSE to the amenable calculations, then I might look at CUDA.

Thanks for the discussion and the reminder to think about this again.

Best regards,

Re: Cuda?

Postby metalliandy » Fri Jan 14, 2011 12:35 am

It was worth a try eh? :P
Thanks again for looking :)


Postby jack.dingler » Sat Mar 17, 2012 6:15 pm

With CUDA, you should still get a big increase in performance by using multiple cores.

It's true that you won't always get a massive performance increase, but that doesn't mean you won't get a significant one.

The way it breaks down is that each core can run a separate stack with branching operations that are distinct for that core.
If the threads on that core all run non-branching operations, they'll run faster, in step.
If a thread deviates with a branch (an if statement), then that thread will run to completion without a context switch to another thread, unless it surrenders its execution.

So you still get a performance boost per core, even with branching code. It's just that the speed boost is even better if the threads on a core don't branch, or branch together.

A slightly trickier approach might be to get the branching operations out of the way early, perhaps with temporary maps, and then have the threads process them in a non-branching second-stage operation.
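That two-stage idea can be sketched in NumPy as a data-parallel analogy (an invented example, not L3DT code; the threshold and gain values are arbitrary). The per-element `if` is materialised once into a temporary parameter map, and the second stage is uniform, branch-free arithmetic — the kind of straight-line work that keeps GPU threads in lockstep:

```python
import numpy as np

def two_stage(height):
    # Stage 1: the data-dependent decision, done once and stored as a
    # temporary "decision map" (a per-cell parameter array).
    steep = height > 0.5                      # hypothetical threshold
    gain = np.where(steep, 2.0, 0.5)          # per-cell parameter map
    # Stage 2: branch-free arithmetic applied uniformly to every cell.
    return height * gain

h = np.array([0.2, 0.8, 0.4, 0.9])
print(two_stage(h))  # [0.1 1.6 0.2 1.8]
```

Here `np.where` plays the role of the temporary maps: the branch happens once, up front, instead of inside every thread's inner loop.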
