Tuesday, July 18, 2017

Adaptive Mixed-Resolution Particles


In the previous post I introduced the first version of Adaptive Offscreen Particles (AOP) and mentioned a number of its drawbacks. In the new version, described below, I have tried to get rid of them.

Motivation


The off-screen particles article in GPU Gems 3 discusses mixed-resolution rendering. That approach is well suited to effects with low-resolution textures, such as smoke. But what if we want to keep fine details, such as sparks, fire, and stones, and still benefit from mixed-resolution rendering? The answer is a color contrast filter combined with overdraw prediction.

Prediction


The same approach is used as described in the post below. The result can be used to disable the optimization when its overhead is greater than the gain.

I used green color for buildings to hide content details.

Lower resolution particles rendering


Offscreen particles can be rendered at 1/4, 1/16, or an even smaller fraction of the screen size.
The scene color buffer and the offscreen buffers should have the same bit depth to ensure a similar result after alpha blending. I would recommend the A16B16G16R16F format.

Figure 1. Offscreen particles accumulated in a separate render target.

Depth preparing

For correct offscreen particle rendering, the original depth buffer has to be downscaled, keeping the maximum depth of the source pixels covered by each low-resolution pixel.
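
The downscale step can be illustrated on the CPU. This is only a sketch, not the shader used in the post: it assumes a 2x2 reduction per stage and even buffer dimensions, and the function name `downsample_max_depth` is hypothetical.

```python
def downsample_max_depth(depth):
    """Halve a 2D depth buffer (list of rows), keeping the max of each 2x2 block.

    Assumption for this sketch: width and height are even.
    """
    height, width = len(depth), len(depth[0])
    return [
        [
            max(depth[y][x], depth[y][x + 1],
                depth[y + 1][x], depth[y + 1][x + 1])
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


full = [
    [0.1, 0.2, 0.9, 0.9],
    [0.1, 0.3, 0.9, 0.8],
    [0.5, 0.5, 0.4, 0.4],
    [0.5, 0.6, 0.4, 0.4],
]
half = downsample_max_depth(full)      # first stage: 4x4 -> 2x2
quarter = downsample_max_depth(half)   # second stage: 2x2 -> 1x1
```

Each stage reads the previous one, so a 1/16-size buffer is produced by running the same reduction twice.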

Figure 2. Depth downsampling stages.


Full resolution detail particles rendering and edges fixing


Of course, the level of detail of offscreen particles is not good enough in some places. These places should be found with the color contrast filter and the depth-based edge detection filter and then replaced by full-resolution particles.


Building the stencil mask

For performance reasons, a stencil mask should be used to separate the final apply of the offscreen particles from the full-resolution particles pass.

Color contrast filter

As its name suggests, this filter searches for contrast in color. If the contrast is high enough, detail is required. The nice property of this filter is that it depends on what is actually on the screen: when the camera is close to the particles, fewer details are found, which leads to better performance.


Figure 3. Color contrast filter working result.

A - max component value of center color
B - max component value of neighbors colors
C - contrast, 15 by default
should_be_detailed = abs( B - A ) > A / C;
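
The test above can be tried out on a few colors. In this sketch B is taken as the largest component over all neighbor colors, which is my reading of the definition and an assumption; the helper names are hypothetical.

```python
def max_component(color):
    """Largest channel of an (r, g, b) color."""
    return max(color)


def should_be_detailed(center, neighbors, contrast=15.0):
    """Return True when the pixel needs full-resolution particles.

    center    - (r, g, b) of the low-resolution particle pixel
    neighbors - list of (r, g, b) colors around it
    contrast  - C from the formula, 15 by default
    """
    a = max_component(center)
    # Assumption: B is the maximum over every neighbor color.
    b = max(max_component(n) for n in neighbors)
    return abs(b - a) > a / contrast


smooth = should_be_detailed((0.30, 0.20, 0.10), [(0.31, 0.20, 0.10)])
sharp = should_be_detailed((0.30, 0.20, 0.10), [(0.60, 0.50, 0.10)])
```

Because the threshold is A / C, it scales with the brightness of the center pixel: the same absolute difference counts as detail in a dark area and is ignored in a bright one.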


Edge detection filter

In addition, this filter should be used to cover edges. My filter is based on depth discontinuities.

Figure 4. Edge detection filter working result.

Figure 5. Stencil mask for combining.


Combining


Based on the values in the stencil mask, the offscreen particles are blended and the detail particles are rendered into the scene buffer. How to apply offscreen particles to the scene color buffer is described in the previous article. (I used green color for buildings to hide content details.)

Figure 6. Applying low resolution offscreen particles into scene color buffer according to stencil mask.

Figure 7. Rendering detail particles into scene color buffer according to stencil mask.

 Figure 8. Final result.

As a result, we kept the details and got a significant performance boost. In this particular example, particles rendered 2-3 times faster with the optimization than without it.

(article is not finished and will be updated)

Tuesday, April 11, 2017

Adaptive Offscreen Particles

Particle systems are known for their sometimes huge overdraw and fill-rate cost, which lead to performance issues. It can become critical when the camera is inside an explosion: lots of particles cause huge overdraw, a dramatic drop in fps, and a less enjoyable game. The problem can be partially solved by using a lower, fixed-resolution offscreen render target, which is described very well in GPU Gems 3.

Conventional offscreen particle solutions still cannot cope with a continuously growing number of overdrawn pixels, and their quality is constant because of the fixed dimensions of the render target. Therefore that solution didn't suit me.

I wanted a particle system that could predict the expected amount of overdraw within the current frame and scale its own quality down when necessary, according to a budget set for the system. I found an elegant solution to the problems mentioned above, which I called Adaptive Offscreen Particles. The technique works on the GPU only; no readback to the CPU is required. The key to this technique is overdraw prediction.

Overdraw prediction

How do we know the particle overdraw for the current frame? Just render them. This requires a special render stage in which all particles are rendered to a small N*N render target (N is 64, for example) with additive blending, writing a weight that represents the relative execution complexity of the shader. By default the weight can be 1.0 for all shaders. The depth buffer can also be used at this stage to avoid counting invisible pixels. This gives a rough but acceptable result. At the end we calculate the sum of all written weights, which can be treated as the number of overdrawn pixels. The sum can be computed in several ways. One is a compute shader that simply adds the weights into a global shared variable. Another is to downsample several times, summing four neighbors each time, until a 1x1 result is obtained.

Figure 1. Overdraw accumulated by particles in a 64x64 render target. Intensity was reduced for clarity.


Figure 2. Leftmost - source 64x64. Next - downsampled with summation. Rightmost - the final sum in a 1x1 render target.

For convenience we are going to operate with the number of redrawn screens, which is a very close approximation of the actual number of full screens redrawn during the color pass. For instance, a typical explosion in a game can produce up to hundreds of redrawn screens.
number of redrawn screens = overdrawn pixels / ( N * N )
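
The downsample-with-summation variant can be sketched on the CPU as follows. This is an illustration under assumptions, not the GPU implementation: N is assumed to be a power of two, the buffer is a plain list of rows, and the function names are hypothetical. The final sum is divided by the N*N pixels of the prediction target.

```python
def downsample_sum(buf):
    """One reduction pass: each output texel is the sum of a 2x2 block."""
    n = len(buf)
    return [
        [
            buf[y][x] + buf[y][x + 1] + buf[y + 1][x] + buf[y + 1][x + 1]
            for x in range(0, n, 2)
        ]
        for y in range(0, n, 2)
    ]


def redrawn_screens(weights):
    """Reduce an NxN weight buffer to 1x1 and convert the sum to screens.

    Assumption for this sketch: N is a power of two.
    """
    n = len(weights)
    buf = weights
    while len(buf) > 1:
        buf = downsample_sum(buf)
    overdrawn_pixels = buf[0][0]
    return overdrawn_pixels / (n * n)


# Every texel of a 4x4 target was covered three times with weight 1.0.
screens = redrawn_screens([[3.0] * 4 for _ in range(4)])
```

With a 64x64 target the loop runs six passes (64, 32, 16, 8, 4, 2, 1).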

Particles rendering

The prediction is now finished, and the particles should be rendered into the color render target with alpha blending. The color render target and the scene depth buffer keep their fixed full-screen size. Instead of changing their dimensions, as is done in other approaches, only the virtual rect changes. This is done both for the depth buffer (Figure 4), so that particles get correct depth testing, and for the particles themselves, which are moved into the top-left corner of the screen (Figure 5). To do this kind of shifting, we need a scale factor that keeps particle performance within the set budget.

Calculating rect scaling factor

// budget - Budget of the particle system. Screens allowed to be overdrawn without scaling.
// redrawn - Runtime calculated number of redrawn screens.
// min_scale - 0.25 usually. 16 times smaller rect
// floor helps to avoid frequent resolution changing
scale = clamp( 1.0 / sqrt( floor( redrawn ) / budget ), min_scale, 1.0 );
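
To see how the formula behaves, here is a small CPU sketch of it. The function name `rect_scale` is hypothetical, and the explicit zero check is an addition for Python: in a shader, 1.0 / sqrt(0) gives +infinity, which the clamp turns into 1.0, while Python would raise an exception.

```python
import math


def rect_scale(redrawn, budget, min_scale=0.25):
    """CPU version of: clamp(1 / sqrt(floor(redrawn) / budget), min_scale, 1)."""
    whole = math.floor(redrawn)
    if whole <= 0:
        # Less than one redrawn screen: no scaling needed.
        return 1.0
    scale = 1.0 / math.sqrt(whole / budget)
    return min(max(scale, min_scale), 1.0)


# Budget of four redrawn screens, as in the graph of the scale function.
within_budget = rect_scale(3.7, 4.0)    # under budget, stays at full size
four_times_over = rect_scale(16.0, 4.0)
far_over = rect_scale(100.0, 4.0)       # clamped to min_scale
```

The square root comes from the fact that the rect area, and so the number of shaded pixels, grows with the square of the scale: at 16 redrawn screens and a budget of 4, a scale of 0.5 brings the cost back to roughly 4 full screens.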


Figure 3. Graph of scale factor function for the four redrawn screens budget.

Depth buffer runtime scaling pixel shader code

// read N depths around, make the max of them
if ( any( screen_uv > scale ) ) discard;
float depth0 = read_depth( screen_uv / scale + offset0 );
...
float depthN = read_depth( screen_uv / scale + offsetN );
return max( depth0...depthN );


Figure 4. Scaling the scene depth buffer into particles depth buffer according to scale factor. Left - original scene depth buffer. Right - particles depth buffer with copied depth information into actual rect.


The positioning of the particles


// p - final output projection coordinates of particle’s vertex
// scale = 1.0 is full screen resolution
// scale = 0.5 is half screen resolution
// scale = 0.25 is quarter screen resolution

p.x = p.x * scale - ( 1.0 - scale ) * p.w;
p.y = p.y * scale + ( 1.0 - scale ) * p.w;
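
The two lines above can be checked numerically. The sketch below applies them to clip-space corners and performs the perspective divide; it assumes normalized device coordinates where (-1, 1) is the top-left corner of the screen, and the function names are hypothetical.

```python
def shift_to_top_left(p, scale):
    """Apply the vertex shader scaling to a clip-space position (x, y, z, w)."""
    x, y, z, w = p
    return (
        x * scale - (1.0 - scale) * w,
        y * scale + (1.0 - scale) * w,
        z,
        w,
    )


def to_ndc(p):
    """Perspective divide; returns normalized device (x, y)."""
    x, y, _, w = p
    return (x / w, y / w)


# Screen corners in clip space with w = 2.
top_left = (-2.0, 2.0, 0.5, 2.0)
bottom_right = (2.0, -2.0, 0.5, 2.0)

moved_top_left = to_ndc(shift_to_top_left(top_left, 0.5))
moved_bottom_right = to_ndc(shift_to_top_left(bottom_right, 0.5))
```

At scale 0.5 the top-left corner stays where it is and the bottom-right corner lands in the center of the screen, so the whole view fits into the top-left quarter. Multiplying the offset by p.w is what keeps the shift correct after the perspective divide.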


Figure 5. Scaling a particle’s quad into left top corner by the vertex shader.


Blending state for particles shader

Blend Operation = Additive
Source Blend Color = One
Destination Blend Color = Inverted Source Alpha
Blend Operation Alpha = Additive
Source Blend Alpha = Inverted Destination Alpha
Destination Blend Alpha = One
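
The effect of this blend state can be followed step by step on the CPU. The sketch assumes that the particle shader outputs color already multiplied by alpha, which is what a source color factor of One suggests but is not stated in the post; the function name is hypothetical.

```python
def blend_particle(dst, src):
    """Blend one particle into the render target, both as (r, g, b, a).

    Color: src * One + dst * (1 - src.a)
    Alpha: src.a * (1 - dst.a) + dst.a * One
    Assumption: src color is premultiplied by src alpha.
    """
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    inv_sa = 1.0 - sa
    return (
        sr + dr * inv_sa,
        sg + dg * inv_sa,
        sb + db * inv_sa,
        sa * (1.0 - da) + da,
    )


target = (0.0, 0.0, 0.0, 0.0)                            # cleared target
target = blend_particle(target, (0.5, 0.0, 0.0, 0.5))    # red, 50% opaque
target = blend_particle(target, (0.0, 0.25, 0.0, 0.5))   # green, 50% opaque
```

After two half-transparent particles the alpha channel holds 0.75, that is 1 - (1 - 0.5) * (1 - 0.5): the accumulated opacity of the layer, which the later upsampling pass can use to attenuate the scene color behind the particles.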


Bilateral Upsampling


The final stage is bilateral upsampling of the particles color buffer with special blending.

Figure 6. Left - buffer of accumulated particles in scope of the actual rect. Right - upsampled result blended with the scene color.


Figure 7. Comparison of upsampling without and with bilateral filter. Left - particles rendered with scale factor 0.25. Right - bilateral upsampling result.


Bilateral filter shader

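// Depth-aware (bilateral) upsampling of the particles color buffer.
float fine_scene_depth = read_scene_depth( screen_uv );
float4 result = 0.0f;
float weights = 0.0f; // sum of sample weights, used for normalization

// Full resolution: nothing to upsample.
if ( scale == 1.0f )
 return read_particles_color( screen_uv );

// Use a wider kernel for smaller rects.
int radius = scale >= 0.5f ? 1 : 2;

for ( int x = -radius; x <= radius; ++x )
{
 for ( int y = -radius; y <= radius; ++y )
 {
  // Sample position inside the scaled rect.
  float2 uv = screen_uv * scale + float2( x, y ) * pixel_size;

  float coarse_depth = read_particles_depth( uv );
  float4 particles_color = read_particles_color( uv );

  // The closer the coarse depth is to the full-resolution depth, the larger the weight.
  float weight = 1.0f / ( depth_epsilon + abs( fine_scene_depth - coarse_depth ) );

  result += particles_color * weight;

  weights += weight;
 }
}

return result / weights;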
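
To get a feeling for the weighting, the same formula can be evaluated on the CPU for two samples. This is only a sketch: the sample list stands in for the texture reads, and the function name and the epsilon value are assumptions.

```python
def bilateral_blend(fine_depth, samples, depth_epsilon=1e-4):
    """Depth-weighted average of (coarse_depth, (r, g, b, a)) samples."""
    result = [0.0, 0.0, 0.0, 0.0]
    weights = 0.0
    for coarse_depth, color in samples:
        # Same weight as in the shader: large when depths match.
        weight = 1.0 / (depth_epsilon + abs(fine_depth - coarse_depth))
        for i in range(4):
            result[i] += color[i] * weight
        weights += weight
    return [c / weights for c in result]


samples = [
    (0.5, (1.0, 0.0, 0.0, 1.0)),   # same surface as the full-resolution pixel
    (0.9, (0.0, 0.0, 1.0, 1.0)),   # sample from behind a depth edge
]
color = bilateral_blend(0.5, samples)
```

The sample whose depth matches the full-resolution pixel dominates the result, so the color from the other side of the depth edge barely leaks through.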

Blending state for the upscaling shader

Blend Operation = Additive
Source Blend Color = One
Destination Blend Color = Inverted Source Alpha
Blend Operation Alpha = Reverse Subtract
Source Blend Alpha = One
Destination Blend Alpha = Additive


Performance

Here is a performance comparison for a typical explosion in a game. The shaders support lighting and shadowing. 300 particles drawn, 50-60 screens redrawn. 1920x1080, NVIDIA GTX 680.




Pros
  • Budget for the particle system
  • Scalable quality
Cons
  • Overhead for the scaling, but most of the time it is covered by the gain


Wednesday, February 10, 2016

Screen-Space Shadows

Dynamic shadows are one of the slowest passes in any renderer, especially cascaded shadows that cover the huge visible region and even more. The problem is the large number of draw calls and the overdraw into the depth buffer, which is especially critical for flora, and grass in particular. I was thinking about ways to solve this problem effectively. As a result I developed a technique called Screen-space shadows, which gives soft, stable, and fast shadows from grass. Screen-space shadows can also be used for shadows from other small objects as well as from far mountains. The technique has already been used in one great game.





Description

For performance reasons the calculations can be done at half screen size.
Jitter the P to E marching distance using screen-space random or a random texture. 16-32 samples.

Result = 1
Foreach S in (E-P)
  If ShouldReceive(P) and ShouldCast(S)
    VtoS = DistanceToEye(S)
    VtoT = DistanceToEye(T)
    If VtoS < VtoT and Abs(VtoS - VtoT) < Tolerance(S) and not OutOfScreen(T)
      Result = ShadowAttenuation(S) or zero
    End If
  End If
End Foreach

Tolerance: function of distance from V to S
ShadowAttenuation: function of distance from P to T 

Next stages:
  • Temporal stabilization, which is the key to stability
  • Bilateral upscale, depends on quality settings
  • Directional blur, depends on quality settings


Results


Figure 1. Typical shadows


Figure 2. Screen-space shadows

Figure 3. World of Tanks. No shadows on the grass

Figure 4. World of Tanks. Screen-space shadows on the grass



Friday, April 16, 2010

Editor overview: SoundMix Editor

The SoundMixEditor is a visual editor for mixing sounds and effects.

+ Using Audiere.
+ Many nodes for sound tree modification.
+ Nodes can be controlled from script and database.
+ Supports 3D sound.
+ Many more.

Screenshot:

Friday, April 9, 2010

Thursday, March 4, 2010

Editor overview: PostProcessEditor

Supports the following post-effects:

+ Bloom
+ DepthOfField
+ MotionBlur
+ ColorCorrection
+ SSAO
+ Material
+ Haze
+ And more.

Screen shots:

Wednesday, March 3, 2010

Editor overview: SkeletalMeshEditor

+ Animations.
+ Setting animation segments directly by hand.
+ Sockets with previews.
+ Feedbacks (play sound, call script function, attach mesh, attach particle system, and more) on a set frame or time.
+ LODs. The number of bones and triangles and the animation update frequency change depending on ScreenFactor.
+ And more.

Screen shots:


Sunday, February 28, 2010

Editor overview: MaterialEditor updates

Post about MaterialEditor: http://whatdreamswaycome.blogspot.com/2010/01/editor-overview-nodes-based.html

New:
1. Built-in CurvesEditor for flexible control of shader inputs (Float and Float4 variables).
2. Added Diffuseness, Ambient, LightingMask material inputs.
3. Added new properties to material.
4. The original appearance has changed a bit.
5. Material expressions can be resized.
6. Added new material expressions: If, For, While, Arrays, etc.
7. Arrow heads on the connection lines.
8. New menu items.
9. Faster compilation.
10. The generated code can be viewed and extensively changed by hand.
11. And more.

Screenshot:

Sunday, February 14, 2010

Static Lighting with GI

I'm currently working on implementing static lighting with global illumination. Here are some results: baked high-resolution shadows from multiple light sources of different types, plus Radiosity Normal Mapping. GI coming soon.

Sunday, February 7, 2010

Editor overview: LensFlaresEditor

+ Any materials can be used in one flare.
+ Flexible options that depend on flare offset/distance (color/alpha/rotation/size/etc.)
+ Unlimited number of flares in one effect.
+ And more.

Screenshots:


Tuesday, February 2, 2010

Editor overview: ParticlesEditor

Features:

+ Built-in CurvesEditor lets you modify any value however you want.
+ You can create as many emitters as you want, with any materials and meshes.
+ The editor supports PhysX fluids and dynamic particle collision.
+ And more.

Screenshots:


Wednesday, January 27, 2010

Editor overview: nodes-based MaterialEditor

Can be set:
+ Phong, Cook-Torrance, Anisotropic lighting models.
+ Different blending modes.
+ Back face material.
+ Motion blur amount.
+ Radiosity color.
+ and more.

Unlike UE3, it can also generate vertex and geometry (DX10) shaders.

Screenshots: