Hi Dreamora,
Dreamora wrote:Didn't think of cases where blocks actually get that large that without paging nothing would be possible anymore.
Whoa, hang on a minute. Are you running these calculations
without mosaic tiling enabled? If mosaic tiling is disabled, and you still see a big difference in tile processing speed, then I have a more serious problem on my hands. If all the maps are in RAM, the threads should all run at approximately the same speed. Oops…wait a minute. The light map calculation uses a temporary mosaic map to hold the shadow state, so that calculation will always have the tiling problem I described earlier.
Anyway, regarding the tile cache problem, which certainly won’t help the matter even if you’re using non-mosaic maps, I’ve updated the tile management code to allow the cache size to grow as required. This means you could theoretically throw hundreds of threads at the map and process all the tiles at once without overloading the cache (though you’d fry your RAM). However, I still need to modify each of the tile calculations to ‘lock’ the required mosaic tiles. The next release may not have these changes applied since we’re so late in the dev cycle (RC1 right now), but this change will be one of the first cabs off the rank for v2.5a.
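In case it helps to picture it, here’s a minimal sketch of the idea: a cache that grows on demand, plus a pin count so a calculation can ‘lock’ the mosaic tiles it’s using and nothing evicts them mid-calculation. All the names (`TileCache`, `acquire`, `release`) are illustrative, not the actual code.

```python
import threading

class TileCache:
    """Sketch only: a tile cache that grows as required, where worker
    threads pin ('lock') the mosaic tiles they are using so a tile in
    use can never be evicted out from under a calculation."""

    def __init__(self):
        self._tiles = {}              # tile_id -> tile data
        self._pins = {}               # tile_id -> count of threads using it
        self._guard = threading.Lock()

    def acquire(self, tile_id, load):
        """Fetch a tile, loading it (e.g. from disk) if absent, and pin it."""
        with self._guard:
            if tile_id not in self._tiles:
                self._tiles[tile_id] = load(tile_id)  # cache grows as needed
            self._pins[tile_id] = self._pins.get(tile_id, 0) + 1
            return self._tiles[tile_id]

    def release(self, tile_id):
        """Unpin a tile; only unpinned tiles are candidates for eviction."""
        with self._guard:
            self._pins[tile_id] -= 1
            if self._pins[tile_id] == 0:
                del self._pins[tile_id]
```

The point of the pin count is that eviction (not shown) would only ever consider tiles with no pins, so hundreds of threads can each hold their working set safely.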
Just FYI: When changing the tile manager, I was horrified to find that cache size was still hard-coded at 9 tiles per map, just as it was when I first wrote the mosaic algorithm some time at the start of the millennium. This means that if you have enough threads running that they need more than nine tiles at any one time, the least frequently used tiles will be loaded/saved to disk very often, which is probably why your fourth thread is running really slowly. Anyway, this sort of problem will be fixed when I update the tile calculations.
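To see why nine tiles falls over so badly, here’s a toy model of a fixed-capacity cache (I’ve used simple least-recently-used eviction here for clarity, not the actual eviction policy). With ten tiles in steady rotation against a nine-slot cache, every single access misses and hits the disk:

```python
from collections import OrderedDict

class FixedTileCache:
    """Toy model of a hard-coded nine-tile cache. Tile contents are
    faked; we only count how often a simulated 'disk load' happens."""

    def __init__(self, capacity=9):
        self.capacity = capacity
        self.tiles = OrderedDict()
        self.disk_loads = 0

    def get(self, tile_id):
        if tile_id in self.tiles:
            self.tiles.move_to_end(tile_id)    # mark as recently used
            return self.tiles[tile_id]
        if len(self.tiles) >= self.capacity:
            self.tiles.popitem(last=False)     # evict the stalest tile
        self.disk_loads += 1                   # simulated load from disk
        self.tiles[tile_id] = f"tile-{tile_id}"
        return self.tiles[tile_id]

cache = FixedTileCache()
# Ten tiles in rotation -- one more than the cache holds -- so the
# tile you need next is always the one that was just evicted.
for _ in range(100):
    for tile_id in range(10):
        cache.get(tile_id)
print(cache.disk_loads)  # 1000: every access went to disk
```

That pathological flip from 100% hits (nine tiles) to 100% misses (ten tiles) is exactly the cliff a fourth thread can push you over.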
Dreamora wrote:Any chance that in a case as described above a thread that has finished his block could go to the next "block in work" and starts to work on it from back to front at least on textures where this is possible?...
I assume that's why each thread has its own block.
Yep, each thread gets its own tile so that I don’t have to check whether the threads are running over one another (this requires thread synchronisation, and it’s slow). However, what I can do is assign the threads to sub-tiles within each tile (say, split a tile into 2x2 sub-tiles), which would mean each core is essentially working from the same tile cache. That would certainly be more memory-efficient, and I’ll put it on the to-do list as well. Thanks for the suggestion!
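To make the sub-tile idea concrete, here’s a little sketch (names and tile size are made up for illustration) that splits one tile into 2x2 pixel rectangles, so four cores can each take a rectangle while sharing the one cached tile:

```python
def subtiles(tile_x, tile_y, tile_size, splits=2):
    """Split one tile into splits x splits sub-tiles (2x2 by default),
    yielding (x0, y0, x1, y1) pixel rectangles that separate threads
    can work on while sharing the same cached tile."""
    step = tile_size // splits
    for sy in range(splits):
        for sx in range(splits):
            x0 = tile_x * tile_size + sx * step
            y0 = tile_y * tile_size + sy * step
            yield (x0, y0, x0 + step, y0 + step)

# Four workers, one sub-tile each, all touching the same 64x64 tile:
rects = list(subtiles(0, 0, 64))
print(rects)  # [(0, 0, 32, 32), (32, 0, 64, 32), (0, 32, 32, 64), (32, 32, 64, 64)]
```

Since the rectangles never overlap, each thread writes to its own region and no per-pixel synchronisation is needed; only the hand-out of rectangles has to be coordinated.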
Cheers,
Aaron.