- Notifications
You must be signed in to change notification settings - Fork449
Port adjacent difference into CUB#331
Port adjacent difference into CUB#331
Conversation
10db20a
to891c10c
CompareThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is a great first pass 👍 Let me know when you're ready for a re-review.
3c13360
to53ffc37
Compare370bee7
to88a4ae8
CompareThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM -- just some comments on documentation, otherwise this is ready to start testing 👍
* `{ ...3], [4,2,1,1], [1,1,1,1], [2,3,3,3], [3,4,1,4] }`. | ||
* and that `valid_items` is `507`. The corresponding output `result` in | ||
* those threads will be | ||
* `{ ..., [-1,2,1,0], [0,0,0,-1], [-1,0,3,3], [3,4,1,4] }`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
(continuing from#331 (comment))
Gotcha, so the last 5 values are unchanged random values that are ignored by the algorithm.
There's another convention that uses-
to stand in for such values, e.g.
cub/cub/device/device_histogram.cuh
Lines 181 to 182 ind6ba3cf
* float* d_samples; // e.g., [2.2, 6.1, 7.1, 2.9, 3.5, -, -, | |
* // 0.3, 2.9, 2.1, 6.1, 999.5, -, -] |
I think it's a little easier to parse with the non-numeric characters, since it's immediately clear that theycan't be involved in the calculation. If you agree, let's change this to
* `{ ...3], [4,2,1,1], [1,1,1,1], [2,3,3,-], [-,-,-,-] }`.* `{ ..., [-1,2,1,0], [0,0,0,-1], [-1,0,3,-], [-,-,-,-] }`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Gotcha, so the last 5 values are unchanged random values that are ignored by the algorithm.
These values aren't ignored by the algorithm:
As the documentation states, the input value is copied without modifications if the neighbour value is out of bounds.
Ifinput
andoutput
parameters point to different memory locations, these values will be copied without modification. I'm afraid that non-numeric characters would mean that the output values are unchanged at all, for example:
input { 1, 2, 3 };output { 4, 5, 6 };valid_items = 0;adjacent_difference(input, output, valid_items);// output should be { 1, 2, 3 } and not { 4, 5, 6 }
The same withstd::adjacent_difference
documentation. The output for:
std::vector v {2, 4, 6, 8, 10, 12, 14, 16, 18, 20};
is
2 2 2 2 2 2 2 2 2 2
rather than:
- 2 2 2 2 2 2 2 2 2
Do you think it'll be clear from the-
notation that the values are copied?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
That's a good point -- I'm fine leaving this as-is.
* __global__ void ExampleKernel(...) | ||
* { | ||
* // Specialize BlockAdjacentDifference for a 1D block of | ||
* // 128 threads on type int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
s/on/of/
(this is repeated in a few other docstrings)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
There are quite a few places with this typo:
rg "threads on type" | wc -l33
I'll fix this in theblock_scan
,block_discontinuity
andblock_reduce
documentation as well.
28c79e4
toa64d24a
Comparef83d279
to6d052db
ComparegpuCI:NVIDIA/thrust/pull/1577 |
The PR contains:
thrust::adjacent_difference
algorithm;FlagHeads
andFlagTails
methods fromBlockAdjacentDifference
structure;BlockAdjacentDifference
along with documentation and tests.Note that the PR is based on some of the features introduced in
MergeSort
porting.