loadReducedWGSL
Overview
Loads multiple values per thread (usually from main memory) and reduces them into a single value per thread. Supports multiple data orders, both in how the data is accessed and in how it is stored.
grainSize controls how many items are loaded per thread.
For each thread, it essentially loads the first value and then combines it, in turn, with each subsequently loaded value using binaryOp.
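The per-thread behavior described above can be modeled on the CPU. The following is a minimal sketch, not alpenglow code: the function name `reduceThread` and its parameters are hypothetical, and it assumes a blocked layout in which thread `t` reads items `t * grainSize` through `t * grainSize + grainSize - 1`.

```typescript
// Hypothetical CPU-side model of one thread's work (not part of alpenglow).
// load( i ) returns the i-th item this thread reads; combine plays the role of binaryOp.
function reduceThread<T>(
  load: ( i: number ) => T,
  combine: ( a: T, b: T ) => T,
  grainSize: number
): T {
  // Load the first value...
  let value = load( 0 );

  // ...then fold each subsequently loaded value into it.
  for ( let i = 1; i < grainSize; i++ ) {
    value = combine( value, load( i ) );
  }
  return value;
}

const data = [ 3, 1, 4, 1, 5, 9, 2, 6 ];
const grainSize = 4;

// Assumed blocked layout: thread 1 reads items 4..7.
const threadOneMax = reduceThread( i => data[ 1 * grainSize + i ], Math.max, grainSize );
console.log( threadOneMax ); // 9
```

With `grainSize` set to 4, each thread performs three combine calls and ends up holding one value, which is what later workgroup-level reduction steps would consume.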
@author Jonathan Olson <jonathan.olson@colorado.edu>
Type loadReducedWGSLOptions
CASE: if commutative reduce, we want to load coalesced, keep striped, so we can skip extra workgroupBarriers and rearranging. We'll use convergent reduce anyway.
CASE: if non-commutative reduce, we want to ... load blocked (?), reverseBits into convergent, and convergent-reduce?
CASE: if non-commutative reduce on striped data, we want to load striped, morph into convergent, and convergent-reduce.
CASE: scan: load how the data is stored (blocked/striped), NO storeOrder, then scan.
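To see why the cases above distinguish commutative from non-commutative operations, here is a small CPU-side sketch. It is illustrative only: all names are hypothetical, and it assumes the usual convention that a blocked order gives each thread a contiguous run of items, while a striped order interleaves items across threads with a stride of the workgroup size.

```typescript
// Hypothetical model (not alpenglow code): 2 threads, 2 items per thread.
const workgroupSize = 2;
const itemsPerThread = 2;
const items = [ 'a', 'b', 'c', 'd' ];

// String concatenation is associative but NOT commutative.
const concat = ( x: string, y: string ): string => x + y;

// Per-thread sequential reduce over whichever items the thread accesses.
const perThread = ( indexOf: ( localId: number, i: number ) => number ): string[] =>
  Array.from( { length: workgroupSize }, ( _, localId ) => {
    let value = items[ indexOf( localId, 0 ) ];
    for ( let i = 1; i < itemsPerThread; i++ ) {
      value = concat( value, items[ indexOf( localId, i ) ] );
    }
    return value;
  } );

// Blocked access: thread t reads items t * itemsPerThread + i.
const blockedValues = perThread( ( id, i ) => id * itemsPerThread + i );
// Striped access: thread t reads items i * workgroupSize + t.
const stripedValues = perThread( ( id, i ) => i * workgroupSize + id );

// Combine the per-thread values in thread order.
const blockedTotal = blockedValues.reduce( concat );
const stripedTotal = stripedValues.reduce( concat );

console.log( blockedValues, blockedTotal ); // [ 'ab', 'cd' ] 'abcd'
console.log( stripedValues, stripedTotal ); // [ 'ac', 'bd' ] 'acbd'
```

Striped access reorders the operands, which is harmless for a commutative operation such as addition but changes the result here. This is consistent with the notes above preferring a striped (coalesced) load only for commutative reductions.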
- value: WGSLVariableName
  the "output" variable name
- binaryOp: BinaryOp<T>
- loadExpression?: ( ( index: WGSLExpressionU32 ) => WGSLExpressionT ) | null
  wrap with parentheses as needed. TODO: should we always do this to prevent errors?
- loadStatements?: ( ( varName: WGSLVariableName, index: WGSLExpressionU32 ) => WGSLStatements ) | null
  ( varName: string, index ) => statements setting varName: T
- inputOrder?: "blocked" | "striped"
  The actual order of the data in memory (needed for range checks, not required if range checks are disabled)
- inputAccessOrder?: "blocked" | "striped"
  The order of access to the input data (determines the "value" output order also)
- sequentialReduceStyle?: "factored" | "unfactored" | "nested"
  Whether local variables should be used to factor out subexpressions (potentially more register usage, but less computation), or also whether to nest the combine calls, e.g. combine( combine( combine( a, b ), c ), d )
- useSelectIfOptional?: boolean
- orderOverride?: boolean
  (WARNING: only use this if you know what you are doing) If true, we will not check that the binaryOp is commutative if the order does not match.
- & RakedSizable & GlobalIndexable & WorkgroupIndexable & LocalIndexable & OptionalLengthExpressionable
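The three sequentialReduceStyle values can be pictured by the shape of the code they might produce. The sketch below is an interpretation, not alpenglow's actual generator: `sketchReduce`, the `input` array, the `combine` function name, and the blocked index expression are all hypothetical. Only the nested shape, combine( combine( a, b ), c ), is taken directly from the option's description. The factored and unfactored shapes are one plausible reading of "local variables should be used to factor out subexpressions".

```typescript
// Illustrative only: hypothetical string generator, not alpenglow's real output.
type SequentialReduceStyle = 'factored' | 'unfactored' | 'nested';

function sketchReduce( style: SequentialReduceStyle, grainSize: number ): string {
  // Hypothetical blocked index expression for the i-th item of a thread.
  const fullIndex = ( i: number ): string => `( local_id.x * ${grainSize}u + ${i}u )`;
  const load = ( index: string ): string => `input[ ${index} ]`;

  if ( style === 'nested' ) {
    // A single expression with nested combine calls.
    let expression = load( fullIndex( 0 ) );
    for ( let i = 1; i < grainSize; i++ ) {
      expression = `combine( ${expression}, ${load( fullIndex( i ) )} )`;
    }
    return `var value = ${expression};`;
  }

  const lines: string[] = [];
  if ( style === 'factored' ) {
    // Assumed meaning: compute the shared subexpression once in a local variable.
    lines.push( `let base = local_id.x * ${grainSize}u;` );
  }
  const index = ( i: number ): string => style === 'factored' ? `base + ${i}u` : fullIndex( i );

  // One statement per loaded item.
  lines.push( `var value = ${load( index( 0 ) )};` );
  for ( let i = 1; i < grainSize; i++ ) {
    lines.push( `value = combine( value, ${load( index( i ) )} );` );
  }
  return lines.join( '\n' );
}

console.log( sketchReduce( 'nested', 3 ) );
console.log( sketchReduce( 'factored', 2 ) );
console.log( sketchReduce( 'unfactored', 2 ) );
```

Under this reading, the trade-off named in the option description is visible in the output: the factored form spends a local variable (potentially a register) to avoid recomputing the base index, while the unfactored and nested forms repeat the full index expression for every load.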
Source Code
See the source for loadReducedWGSL.ts in the alpenglow repository.