Development weblog
Large 3D terrain generator

October 2

It begins...

The launch of the v2.6 update to L3DT has begun, starting with the Standard edition (see release ann.) The Professional and L3DT for Torque editions will follow in a week or so, once I’ve finished writing their much lengthier release announcements, and once I’m slightly more satisfied that I haven’t cocked-up anything at the last minute.

Stay tuned...

September 18

The worst thing about developing L3DT...

...is - without a doubt - tweaking the design/inflate heightfield algorithm. Every time I touch the code for design/inflate, or even think about touching the code, I instantly lose a week of my life.

Here’s why:

OK = zCalcHF_InflateMosaic(zHF, SwapMap1, Name1, NULL, TileSize, hFormat, zDM, 2); // to 1/32 res
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK) OK = zCalcHF_ChannelPass(SwapMap1, zDM, 10, 0, 1, 20, 0.005f, 0.2f, true);
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK) OK = zCalcHF_ThermalPass(SwapMap1, zDM, 5, 0, 3, 0.05f, ThermalMaxGrad);
zCalcMan_AdvanceCalcStage();
 
if(OK) OK = zCalcHF_InflateMosaic(SwapMap1, SwapMap2, Name2, NULL, TileSize, hFormat, zDM, 2); // to 1/8 res
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK) OK = zCalcHF_ChannelPass(SwapMap2, zDM, 10, 0, 1, 10, 0.01f, 0.2f, true);
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK) OK = zCalcHF_ThermalPass(SwapMap2, zDM, 5, 0, 3, 0.05f, ThermalMaxGrad);
zCalcMan_AdvanceCalcStage();
if(OK) OK = zCalcHF_PeakPass(SwapMap2, zDM);
zCalcMan_AdvanceCalcStage();
 
if(OK) OK = zCalcHF_InflateMosaic(SwapMap2, SwapMap1, Name1, NULL, TileSize, hFormat, zDM, 1); // to 1/4 res
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK) OK = zCalcHF_ChannelPass(SwapMap1, zDM, 5, -1, 1, 10, 0.02f, 0.2f, true);
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK)  OK = zCalcHF_ThermalPass(SwapMap1, zDM, 1, -1, 10, 0.05f, ThermalMaxGrad);
zCalcMan_AdvanceCalcStage();
 
if(OK) OK = zCalcHF_InflateMosaic(SwapMap1, SwapMap2, Name2, NULL, TileSize, hFormat, zDM, 1, 0.75); // to 1/2 res
zCalcMan_AdvanceCalcStage();
if(OK) OK = zCalcHF_VolcanoPass(SwapMap2, zDM);
if(OK) OK = zCalcHF_MountainPass(SwapMap2, zDM);
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK)  OK = zCalcHF_ChannelPass(SwapMap2, zDM, 5, -1, 1, 10, 0.02f, 0.2f, true);
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK)  OK = zCalcHF_ThermalPass(SwapMap2, zDM, 1, -2, 10, 0.02f, ThermalMaxGrad);
zCalcMan_AdvanceCalcStage();
 
if(OK) OK = zCalcHF_InflateMosaic(SwapMap2, zHF, HFname, "HF", TileSize, hFormat, zDM, 1, 0.5); // to final res
zCalcMan_AdvanceCalcStage();
if(OK) OK = zCalcHF_PlateauPass(zHF, zDM); 
zCalcMan_AdvanceCalcStage();
if(OK) OK = zCalcHF_TerracePass(zHF, zDM);
zCalcMan_AdvanceCalcStage();
if(DoErosion && OK) OK = zCalcHF_ChannelPass(zHF, zDM, 1, -2, 1, 10, 0.1f, 0.2f, false);
zCalcMan_AdvanceCalcStage();
if(OK)	OK = zCalcHF_FileOverlayPass(zHF, zDM);
zCalcMan_AdvanceCalcStage();

The above snippet of code is a small segment of the DesignInflate128M algorithm. It shows the chain of alternating calls to fractal inflation, channelling erosion, thermal erosion, peak overlays, mountain overlays, volcano overlays, terrace/cliff overlays, plateau overlays, and file overlays. Each version of the design/inflate algorithm (i.e. 16x, 32x, 64x and 128x) has a similar, but slightly different, chain of subroutine calls like this one. For each one, I have to painstakingly tune and test the parameters to the subroutines, particularly those for the channelling and thermal erosion, and fractal inflation function calls. It is a bear of a job, and normally something best avoided.
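As a rough cross-check of the resolution comments in the snippet, here is a small sketch of the arithmetic. It assumes that the last integer argument of each zCalcHF_InflateMosaic call is the number of doublings applied in that step (2 meaning 4x, 1 meaning 2x). That reading is inferred from the comments, not from the function's declaration, and the helper name CumulativeInflation is made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Assumption: each entry is the number of resolution doublings performed by
// one inflation step (2 -> 4x, 1 -> 2x). Returns the cumulative inflation
// factor reached after each step.
std::vector<int> CumulativeInflation(const std::vector<int>& doublingsPerStep)
{
    std::vector<int> factors;
    int total = 1;
    for (int doublings : doublingsPerStep) {
        assert(doublings >= 0 && doublings < 8); // keep the shift small and well-defined
        total <<= doublings;
        factors.push_back(total);
    }
    return factors;
}
```

Feeding in the five steps from the snippet, {2, 2, 1, 1, 1}, gives cumulative factors of 4, 16, 32, 64 and 128. Dividing 128 by each factor gives 32, 8, 4, 2 and 1, which lines up with the "to 1/32 res", "to 1/8 res", "to 1/4 res", "to 1/2 res" and "to final res" comments.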

Last week - against all reason, logic and experience - I decided that design/inflate needed a little update. Specifically, I thought that the fractal inflation was introducing too much noisy randomness in the 128x and 256x algorithms. A small tweak, I thought. A final spit-and-polish of the parameters before releasing L3DT version 2.6, I thought. What could possibly go wrong? I thought.

On closer inspection, I found that the fractal inflation subroutine (represented above by ‘zCalcHF_InflateMosaic’) had a dubious noise amplitude calculation that resulted in disproportionately large noise levels for long inflation chains (i.e. 128x, 256x.) Conversely, it produced too little noise for short inflation chains (i.e. 8x, 16x). The solution I took was to change the noise amplitude calculation from a crazy bounded inverse exponential function of the horizontal scale to a simple linear function of the horizontal scale. Easy. After a few short days of parameter tuning, I had all of the design/inflate algorithms working nicely - maybe even better than ever. They all looked great.

...and then I changed the horizontal scale. Disaster. Whilst all the design/inflate algorithms now worked perfectly for 10m/vertex, they looked horrible with 1m/vertex, and worse at 100m/vertex. That’s not good.
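To illustrate why a straight-line amplitude is so sensitive to the horizontal scale, here is a minimal sketch. Both functions, their names, and their constants are hypothetical stand-ins; the real calculations in L3DT are not shown in this post, and the saturating form below is only one guess at what a "bounded inverse exponential" might look like.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical linear model: noise amplitude is directly proportional to the
// horizontal scale (metres per vertex). 'gain' is an illustrative constant.
double LinearNoiseAmp(double horizScale, double gain)
{
    assert(horizScale > 0.0);
    return gain * horizScale;
}

// Hypothetical bounded model: amplitude rises with the horizontal scale but
// saturates at 'maxAmp'. 'refScale' sets how quickly it saturates.
double BoundedNoiseAmp(double horizScale, double maxAmp, double refScale)
{
    assert(horizScale > 0.0 && refScale > 0.0);
    return maxAmp * (1.0 - std::exp(-horizScale / refScale));
}
```

With the linear form, parameters tuned at 10m/vertex produce a tenth of the noise at 1m/vertex and ten times the noise at 100m/vertex, so the amplitude spans a factor of 100 across that range. A bounded form flattens out at large scales, so the same range of scales covers a much narrower span of amplitudes.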

So, I’m back to the drawing board, with little to show for the last four days or so. Perhaps that crazy bounded inverse exponential function wasn’t so bad after all...

Cheerio, Aaron.

Bootnotes:

  1. For the final release of L3DT v2.6, I will remove the 8x and 256x design/inflate algorithms. They’re not worth the trouble.
  2. If all goes well, we should see L3DT version 2.6 released in late September. If not, then early October.

August 29

Light map calculation benchmarks

Hi All,

Yesterday I posted some benchmark results comparing the speed of texture generation in the current release (v2.5c) against the forthcoming release (v2.6). Today, I’m going to compare the speed of light map generation.

Benchmark conditions

The conditions for this test were the same as those used yesterday, except that instead of generating the texture map, I was generating the light map.

Please note that these benchmarks are for the lighting calculation only, and do not include the shadow-casting calculation. One calculation at a time.

Part 1: L3DT v2.5c mosaic vs. non-mosaic

The graph below shows the performance of the previous release of L3DT Professional, version 2.5c, for both mosaic and non-mosaic light map generation.

:l3dt:2008:aug:benchmark_lm_v25c.png

As appalling as the benchmarks were yesterday for texture generation with L3DT 2.5c, these results for light-mapping are worse. As before, you can see that the mosaic map calculation in v2.5c is much slower than the non-mosaic calculation. What makes these results worse than yesterday’s is that the mosaic light mapping algorithm in v2.5c threw errors and aborted when run with three or four active cores, and thus the three and four core mosaic results are missing. These errors in the mosaic light map system were related to the mosaic map cache manager, which, as I described yesterday, was very poor at handling concurrent requests from multiple threads. This has been fixed in v2.6.

Part 2: L3DT v2.6 mosaic vs. non-mosaic

Now for the results from the forthcoming release, version 2.6. The graph below shows the performance of v2.6 for both mosaic and non-mosaic light map generation:

:l3dt:2008:aug:benchmark_lm_v26.png

Here you can see that the light map performance scales up linearly with the number of cores (as it should), and that the mosaic and non-mosaic light map calculations are now about the same speed (as they should be). So, whilst version 2.5c couldn’t handle multi-core generation of mosaic light maps, version 2.6 has no such problems.

Part 3: L3DT v2.5c vs. L3DT 2.6

I apologise in advance if this is a little gratuitous, but as I did yesterday, I’m now going to directly compare the results for the old release (v2.5c) with the forthcoming release (v2.6).

First up is the comparison of non-mosaic light map generation:

:l3dt:2008:aug:benchmark_lm_ram.png

There wasn’t much wrong with the non-mosaic calculations in v2.5c, so the improvements here are fairly modest. These results show that version 2.6 (light green) is about 10% faster than v2.5c (dark green) for non-mosaic light map generation. This improvement is partly due to a compiler upgrade (now using MSVC 2008), and partly due to some minor optimisations in the light mapping algorithm.

As with yesterday, the big news remains the improvements in mosaic light map generation:

:l3dt:2008:aug:benchmark_lm_mosaic.png

The results show that for single-core mosaic light maps, version 2.6 is ~700% faster than v2.5c, and for dual-core, it’s 1500% faster.

No direct comparison can be made for triple and quad-core calculations, because release 2.5c simply didn’t work with more than two cores. However, if you take the fastest speed achieved with version 2.6 (which was for quad core) and compare it with the fastest working speed from v2.5c (which was for single core), you get a speed increase of two thousand four hundred percent.
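For anyone who prefers speed ratios to percentages, the conversion can be sketched as below. This assumes "N% faster" means the increase in throughput over the old version, so that 100% faster is twice the speed. The function names are made up for illustration.

```cpp
#include <cassert>

// Converts "N percent faster" into a plain speed ratio (new / old),
// reading the percentage as an increase over the old throughput.
double SpeedRatioFromPercentFaster(double percentFaster)
{
    return 1.0 + percentFaster / 100.0;
}

// The reverse: percentage improvement of a new throughput over an old one.
double PercentFaster(double oldThroughput, double newThroughput)
{
    assert(oldThroughput > 0.0);
    return (newThroughput / oldThroughput - 1.0) * 100.0;
}
```

Under that reading, 700% faster is 8 times the speed, 1500% faster is 16 times, and 2400% faster is 25 times.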

Conclusion

If you like to make big maps, upgrading to L3DT v2.6 will save you a lot of time. If your computer happens to be multi-core, the time savings could be tremendous*.

The release date for L3DT 2.6 is mid-September.


* Your mileage may vary. The actual time savings depend on how you’re using L3DT, your choice of settings, and how your computer is configured.

August 28

Texture calculation benchmarks

Hi All,

With L3DT release 2.6 only a few weeks away, I thought this might be a good time to run some head-to-head comparisons of the new release (v2.6) against the previous release (v2.5c). In particular, I’m going to show you just how far we have come in optimising the multi-core, mosaic-mapped calculations in L3DT Professional Edition.

Benchmark conditions

Before we get to the numbers, I should describe how the tests were conducted. Basically, I generated a complete map (heightmap, lightmap, etc.), and then I ran the texture generation algorithm repeatedly with 1, 2, 3, and 4 cores enabled, using mosaic and non-mosaic texture generation.

The results will be presented in terms of the texture pixel throughput, measured in pixels per millisecond. Larger values are better, and imply faster texture generation.
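The throughput figure itself is simple arithmetic. A minimal sketch, assuming the pixel count is just the output map's width times its height (the function name is hypothetical, and this is not the timing code used for the benchmarks):

```cpp
#include <cassert>
#include <cstdint>

// Texture throughput in pixels per millisecond: the number of output pixels
// divided by the wall-clock time taken to generate them.
double PixelsPerMillisecond(std::uint32_t width, std::uint32_t height, double elapsedMs)
{
    assert(elapsedMs > 0.0);
    const double pixels = static_cast<double>(width) * static_cast<double>(height);
    return pixels / elapsedMs;
}
```

For example, a 4096×4096 map finished in a made-up time of 16384 ms would score 1024 pixels per millisecond. That time is purely illustrative and is not one of the measured results.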

All of the benchmark tests were conducted on the same map, with the following settings:

Heightfield settings: 1024×1024, 10m/pixel, Design/Inflate 64x algorithm.
Attributes map settings: 2048×2048 (2x res), temperate climate.
Normal map settings: 4096×4096 (4x res), bump-mapping enabled.
Light map settings: 4096×4096 (4x res), shadows & water effects enabled.
Texture map settings: 4096×4096 (4x res), 2x anti-aliasing, lighting & strata effects enabled.

For the mosaic tests, the attributes, normal, light map and texture map were all mosaics, with a tile size of 512×512 pixels. For the non-mosaic tests, all these maps were in-RAM, with mosaic mapping disabled.
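To give a feel for what a mosaic map involves, here is a small sketch of how a pixel coordinate might be mapped onto a tile. The struct and function names are hypothetical, and the row/column layout is an assumption; L3DT's actual mosaic format may differ.

```cpp
#include <cassert>

// Where a pixel lives in a tiled (mosaic) map: which tile, and where in it.
struct TileAddress {
    int tileX, tileY;   // tile column and row
    int localX, localY; // pixel position inside that tile
};

// Splits a map coordinate into a tile index and an offset within the tile.
TileAddress LocateInMosaic(int x, int y, int tileSize)
{
    assert(x >= 0 && y >= 0 && tileSize > 0);
    return TileAddress{ x / tileSize, y / tileSize, x % tileSize, y % tileSize };
}

// Number of tiles along one side of the map, rounding up for partial tiles.
int TilesPerSide(int mapSize, int tileSize)
{
    assert(mapSize > 0 && tileSize > 0);
    return (mapSize + tileSize - 1) / tileSize;
}
```

With the settings above, a 4096×4096 map and 512×512 tiles give 8 tiles per side, or 64 tiles in total. Every pixel read or write has to go through a lookup of this kind before it reaches the data.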

The benchmarking system was set up as follows:

CPU: Intel Core2 Quad Q6600 @ 2.4GHz
RAM: 4GB Kingston DDR2
HDD: 500 GB WD SATAII
OS: Microsoft Windows Vista Business SP1 (64-bit)

Part 1: L3DT v2.5c mosaic vs. non-mosaic

The graph below shows the performance of the previous release of L3DT Professional, version 2.5c, for both mosaic and non-mosaic texture generation.

:l3dt:2008:aug:benchmark_v25c.png

These results show two very bad things about the mosaic system in v2.5c:

  1. The first problem was that the speed of the mosaic map calculation (blue bars) simply did not increase when more cores were added. This indicates that the mosaic map cache manager was very inefficient at handling concurrent requests from multiple threads, and was making threads wait for one another far too often.
  2. The second problem was that even for a single-core calculation, the mosaic map calculation was only ~35% as fast as the non-mosaic (RAM only) texture calculation. This implies that there was a huge overhead in reading/writing pixels in a mosaic map, even when the tiles were already loaded into RAM.

If you described these results for the mosaic map system as “appalling”, I would be inclined to agree. Fortunately, I’m very pleased to say that these two problems have now been solved in version 2.6, as shown by the next section.

Part 2: L3DT v2.6 mosaic vs. non-mosaic

The graph below shows the performance of the forthcoming release of L3DT Professional, version 2.6, for both mosaic and non-mosaic texture generation:

:l3dt:2008:aug:benchmark_v26.png

Here you can see that the mosaic map results (blue bars) scale up with the number of cores. In fact, the mosaic map calculation is now faster than the non-mosaic calculation (red bars), by about 10% for any number of cores (up to four, anyway). This huge improvement in the mosaic system was brought about by a smarter mosaic cache manager with far fewer thread locks, and by optimising the texture calculation to bypass some of the overheads in the mosaic system.
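One general way of cutting down lock contention in a tiled map is to give each tile its own lock, so that threads working on different tiles never wait for one another. The sketch below illustrates that idea only. It is not the cache manager in L3DT, the class and member names are invented, and it leaves out the disk paging that a real mosaic cache has to do.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Illustrative only: an in-RAM grid of tiles with one mutex per tile,
// instead of a single lock guarding the whole map.
class TileGrid {
    struct Tile {
        std::mutex lock;
        std::vector<float> pixels;
    };

    int m_tilesPerSide;
    int m_tileSize;
    std::vector<Tile> m_tiles;

    // Finds the tile that owns map coordinate (x, y).
    Tile& TileAt(int x, int y)
    {
        assert(x >= 0 && y >= 0);
        assert(x < m_tilesPerSide * m_tileSize && y < m_tilesPerSide * m_tileSize);
        const std::size_t row = static_cast<std::size_t>(y / m_tileSize);
        const std::size_t col = static_cast<std::size_t>(x / m_tileSize);
        return m_tiles[row * m_tilesPerSide + col];
    }

    // Index of (x, y) within its tile's pixel buffer.
    std::size_t LocalIndex(int x, int y) const
    {
        return static_cast<std::size_t>(y % m_tileSize) * m_tileSize + (x % m_tileSize);
    }

public:
    TileGrid(int tilesPerSide, int tileSize)
        : m_tilesPerSide(tilesPerSide), m_tileSize(tileSize),
          m_tiles(static_cast<std::size_t>(tilesPerSide) * tilesPerSide)
    {
        assert(tilesPerSide > 0 && tileSize > 0);
        for (Tile& t : m_tiles)
            t.pixels.assign(static_cast<std::size_t>(tileSize) * tileSize, 0.0f);
    }

    // Only the owning tile is locked, so writes to other tiles can proceed.
    void SetPixel(int x, int y, float value)
    {
        Tile& t = TileAt(x, y);
        std::lock_guard<std::mutex> guard(t.lock);
        t.pixels[LocalIndex(x, y)] = value;
    }

    float GetPixel(int x, int y)
    {
        Tile& t = TileAt(x, y);
        std::lock_guard<std::mutex> guard(t.lock);
        return t.pixels[LocalIndex(x, y)];
    }
};
```

Locking on every pixel access is still an overhead, which is presumably why bypassing parts of the mosaic system inside the texture calculation also helped. A calculation that takes a tile's lock once and then works through the whole tile would pay that cost far less often.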

Part 3: L3DT v2.5c vs. L3DT 2.6

Now I’m going to directly compare the results for the old release (v2.5c) with the forthcoming release (v2.6).

First up is the comparison of non-mosaic texture generation:

:l3dt:2008:aug:benchmark_ram.png

Across the board, v2.6 (light green) is about 25% faster than v2.5c (dark green) for non-mosaic texture generation. Not bad.

However, the big news is the improvements in mosaic map texture generation:

:l3dt:2008:aug:benchmark_mosaic.png

Version 2.6 (light green) is monumentally faster than v2.5c (dark green) for mosaic texture generation, ranging from an impressive 275% improvement for single-core calculations, to an absurd 1100% improvement for quad core calculations (yes, one thousand and one hundred percent faster).

Conclusion

The mosaic map system in L3DT 2.6 is a colossal improvement over that of v2.5c, providing a speed-up of better than a thousand percent for texture generation on quad-core processors. Speaks for itself, really.

The release date for L3DT 2.6 is mid-September.


Disclaimer: These benchmark results apply only to the texture generation algorithm. You should not expect the heightfield, water-mapping or attributes map algorithms to improve by comparable amounts. Similar improvements may occur with the normal map and light map calculations, as these algorithms are structured identically to the texture algorithm. However, this is not guaranteed, as the benchmark tests have not yet been conducted for those algorithms.


 
start.txt · Last modified: 2008/10/02 07:12 by aaron
 