"Verifying stream compaction using the prescan specification"

This experiment demonstrates that GPUVerify can scale to very large
thread counts through modular verification: the Blelloch prescan
implementation in the stream compaction kernel is replaced with its
monotonic specification.
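
To illustrate why a monotonic specification is enough for stream compaction,
here is a minimal sequential sketch in Python. It is not the contents of
kernel.cl. The function names are hypothetical, and the exact form of the
monotonic property (for i < j, idx[i] + flags[i] <= idx[j]) is an assumption
made for illustration; the specification in kernel.cl may be stated
differently.

```python
def exclusive_prescan(flags):
    """Sequential reference prescan: out[i] is the sum of flags[0..i-1]."""
    out, running = [], 0
    for f in flags:
        out.append(running)
        running += f
    return out


def is_monotonic(flags, idx):
    """Assumed monotonic property: for all i < j, idx[i] + flags[i] <= idx[j]."""
    n = len(flags)
    return all(idx[i] + flags[i] <= idx[j]
               for i in range(n) for j in range(i + 1, n))


def compact(data, keep):
    """Stream compaction: each kept element is written to its prescan index."""
    flags = [1 if keep(x) else 0 for x in data]
    idx = exclusive_prescan(flags)
    out = [None] * sum(flags)
    for i, x in enumerate(data):
        if flags[i]:
            # Under the assumed property, two kept elements i < j satisfy
            # idx[i] < idx[j], so no two writes target the same slot.
            out[idx[i]] = x
    return out
```

Under this reading, any prescan result that satisfies is_monotonic gives
distinct output indices to distinct kept elements, which is the fact needed
to show the writes do not conflict; the actual sums are not needed for that
argument.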

---+ FILES
  - kernel.cl
  contains the stream compaction kernel, with the prescan algorithm replaced
  by its monotonic specification.

---+ REPRODUCING THE EXPERIMENTAL RESULTS

Verify this test for a given number of threads N as follows:

> gpuverify --num_groups=1 --local_size=N kernel.cl

For example, to verify with 1024 threads:

> gpuverify --num_groups=1 --local_size=1024 kernel.cl
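
To try several thread counts in one go, the commands can be generated with a
small script. The following Python sketch only builds the command lines shown
above. The helper name and the choice of power-of-two thread counts are
assumptions made for illustration, not part of the experiment.

```python
def gpuverify_commands(thread_counts, kernel="kernel.cl"):
    """Build one gpuverify argument list per thread count N."""
    return [
        ["gpuverify", "--num_groups=1", f"--local_size={n}", kernel]
        for n in thread_counts
    ]


if __name__ == "__main__":
    # Hypothetical sweep over powers of two from 2 to 1024.
    for cmd in gpuverify_commands([2 ** k for k in range(1, 11)]):
        # Print each command; to execute it instead, pass cmd to
        # subprocess.run (requires gpuverify on the PATH).
        print(" ".join(cmd))
```

Each printed line has the same form as the commands above and can be pasted
into a shell.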
