Two Million Particles at 25 Frames Per Second on an iPad

Following on from my last post, where I managed to calculate and render over 1,000,000 particles in real time, I've done some pretty effective tweaking of the code to create an app that calculates and renders (with blur and trails) over 2,000,000 particles at around 25 frames per second on my iPad Air 2.

The main change is to reuse the compute shader not only to do the calculation and first render but also to do the post-processing.

In Swift, I set the thread groups and thread group count based on particleCount, which is 2²¹, or 2,097,152:

    // 32 threads per thread group...
    particle_threadGroupCount = MTLSize(width: 32, height: 1, depth: 1)
    // ...and enough thread groups to cover all 2,097,152 particles.
    particle_threadGroups = MTLSize(width: (particleCount + 31) / 32, height: 1, depth: 1)
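
With those sizes, a single dispatch covers every particle, and, as described below, every pixel of the output image. As a rough sketch of how that dispatch might be encoded (commandQueue and pipelineState are placeholder names I'm assuming, not taken from the project):

    let commandBuffer = commandQueue.commandBuffer()
    let commandEncoder = commandBuffer.computeCommandEncoder()

    commandEncoder.setComputePipelineState(pipelineState)
    // ...bind the particle buffer and the in/out textures here...

    // One dispatch: 2,097,152 threads, one per particle, reused for per-pixel work.
    commandEncoder.dispatchThreadgroups(particle_threadGroups, threadsPerThreadgroup: particle_threadGroupCount)

    commandEncoder.endEncoding()
    commandBuffer.commit()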

Because my image is 1,024 x 1,024, which is 1,048,576 pixels, I can reuse the kernel function to execute code on each pixel by converting the one-dimensional thread_position_in_grid to a two-dimensional coordinate named textureCoordinate. Thread 1,025, for example, maps to the coordinate (1, 1):

    // Map the one-dimensional thread index to a 2D pixel coordinate.
    const float imageWidth = 1024;
    uint2 textureCoordinate(fast::floor(id / imageWidth), id % int(imageWidth));

    if (textureCoordinate.x < imageWidth && textureCoordinate.y < imageWidth)
    {
        float4 outColor = inTexture.read(textureCoordinate);
        
        // do some work...
        
        outTexture.write(outColor, textureCoordinate);
    }

Having a single shader gave a significant speed improvement. Furthermore, because I'm now passing in a read-access texture, I can composite the particles over each other, which makes for a better-looking render:

    const Particle inParticle = inParticles[id];
    const uint2 particlePosition(inParticle.positionX, inParticle.positionY);
    
    // Each particle adds to one of the three colour channels.
    const int type = id % 3;
    
    // Read the colour already at this position so particles composite over each other.
    const float3 thisColor = inTexture.read(particlePosition).rgb;

    const float4 outColor(thisColor.r + (type == 0 ? 0.15 : 0.0),
                          thisColor.g + (type == 1 ? 0.15 : 0.0),
                          thisColor.b + (type == 2 ? 0.15 : 0.0),
                          1.0);

One downside was that I was getting some artefacts when reading from and writing to the same texture. I've overcome this by using a ping-pong technique with two textures in the Swift code that toggle between being the input and output textures with each frame.

I use a Boolean flag to decide which texture to use:

    if flag
    {
        commandEncoder.setTexture(particlesTexture_1, atIndex: 0)
        commandEncoder.setTexture(particlesTexture_2, atIndex: 1)
    }
    else
    {
        commandEncoder.setTexture(particlesTexture_2, atIndex: 0)
        commandEncoder.setTexture(particlesTexture_1, atIndex: 1)
    }

    [...]

    if flag
    {
        particlesTexture_1.getBytes(&imageBytes, bytesPerRow: bytesPerRowInt, fromRegion: region, mipmapLevel: 0)
    }
    else
    {
        particlesTexture_2.getBytes(&imageBytes, bytesPerRow: bytesPerRowInt, fromRegion: region, mipmapLevel: 0)
    }

    flag = !flag
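
The two textures themselves only need to be created once, at start-up. A minimal sketch of how they might be set up (the pixel format and the textureDescriptor name are my assumptions, not the project's actual code):

    // Create the two ping-pong textures once; they swap roles every frame.
    let textureDescriptor = MTLTextureDescriptor.texture2DDescriptorWithPixelFormat(.RGBA8Unorm, width: 1024, height: 1024, mipmapped: false)

    let particlesTexture_1 = device.newTextureWithDescriptor(textureDescriptor)
    let particlesTexture_2 = device.newTextureWithDescriptor(textureDescriptor)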

My last version of the code didn't write the image from Metal directly to the UIImageView component; rather, it used an intermediate UIImage instance. I found that removing this variable could squeeze out an extra few frames per second.
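
As a rough sketch of the difference (imageRef and imageView are assumed names; the exact route from texture bytes to screen isn't shown here):

    // Before: an intermediate UIImage variable.
    // let image = UIImage(CGImage: imageRef)
    // imageView.image = image

    // After: assign directly, with no intermediate instance.
    imageView.image = UIImage(CGImage: imageRef)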

I've set the Metal optimisations to the maximum in the compiler settings and also prefixed my call to distance() with the fast namespace:

        const float dist = fast::distance(float2(inParticle.positionX, inParticle.positionY),
                                          float2(inGravityWell.positionX, inGravityWell.positionY));

For this demonstration, I've removed the touch handlers. There's one gravity well which orbits around the centre of the screen. It gives some nice effects while I plan how to productize my particle system.
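
Driving that orbit is just a per-frame update along a circle. A minimal sketch of the idea (gravityWellAngle and gravityWellRadius are names I'm assuming for illustration):

    // Advance the gravity well along a circle centred in the 1,024 x 1,024 image.
    gravityWellAngle += 0.05

    gravityWell.positionX = 512 + gravityWellRadius * cos(gravityWellAngle)
    gravityWell.positionY = 512 + gravityWellRadius * sin(gravityWellAngle)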

All the source code for this project is available in my GitHub repository here.



Published at DZone with permission of Simon Gladman, DZone MVB. See the original article here.
