Author: eilemann@gmail.com
State:
- Implemented in 0.6
- view_equalizer in 0.9 (not described here)
- Boundaries: implemented in 0.9.1
Overview
Compound load balancing tries to optimally use all child resources assigned to a compound. Higher-level load balancing will dynamically change the resource assigned to a compound, based on the overall system load.
Compound load balancing either adapts the viewport or the range of the children to make sure all resource are busy all the time so that optimal scalability can be achieved.
The first implementation will be fully transparent and uses the timing values from the last finished frame. Later versions should be able to get load information for the current frame from the application, to allow a more accurate approach.
Implementation
The statistics interface does provide us with the timing values from the last finished frame. This timing values, together with the 2D viewport or DB range give us a load distribution. The ROI readback interface will be used to refine the load distribution.
The goal of the load balancer is that all channels finish at the same time, since channels might be ahead due to the compound latency. Note that this differs from the typical approach of giving all channels the same work for the next frame, but will achieve the same result for a latency of 0.
Assume frame will take the same time as the last finished frame, i.e. sum( channel end time - channel start time ).
Generate 2D load grid. Data per cell: viewport, load per square unit. Split existing cells as necessary when adding new, overlapping data. Cells are never overlapping. Default cell has full(-fixed) viewport with load 0.
Each channel has a time budget for the next frame. Time budget is derived from the frame time and the 'early finished time' of this channel:
- T_c = T_c_early + T_remaining/n_channels
- T_remaining = max( 0, T_frame - sum( T_c_early ))
- T_c_early = T_last - T_c_finish
Load-balancing approach is:
- Compute total work = sum( cell vp * load ). If total work is 0 set all cell's load to 1 and recompute (initial state).
- Normalize cell load so total work = frame time, i.e., each cell's load is now estimated time needed to render the cell.
- Generate/get two-dimensional kd-split-tree of all children. Tree node content is T_node = T_node_left + T_node_right for non-leafs or T_node = T_c for leafs.
- Find split position of node by iterating over cell list, where sum( load cell ) == T_node.
- Generate two new cell lists for left and right subtree while finding the position. Repeat until leaf reached.
- Compute VP/range for each leaf from cell list of leaf node, based on the split position.
RFE: Tile and Range Boundaries
See also RFE 2894535: Restrict 2D loadbalancer to tile boundaries
Implementation
Implement file format as below
Add:
void LoadEqualizer::setBoundary( const Vector2i& boundary );
void LoadEqualizer::setBoundary( const float boundary );
const Vector2i& LoadEqualizer::getBoundary2i() const;
float LoadEqualizer::getBoundaryf() const;
Vector2i _boundary2i; // default: 1 1
float _boundaryf // default: numeric_limits<float>::epsilon
Add 'Vector2i boundary' to LoadEqualizer::Node
In _assignTargetTimes
set boundary on leaf nodes
compute boundaries on non-leaf on up traversal
Note: add values for split axis, max values for non-split axis
In _computeSplit
replace current epsilon/MIN_PIXELS code with boundary computation
File Format
compound
{
load_equalizer
{
mode [ 2D | VERTICAL | HORIZONTAL | DB ]
boundary [ x y ] | float # x,y: tile boundary for 2D, float: range
}
}
API
Open Issues
Channels within the same thread.
