Add sublayer compute function and example project for the dense layer
This PR fixes the memory problem that occurs when unrolling large loops (issue #59, closed).
The idea is to break up the loop by partitioning the output array, so that each layer call computes only one slice of the outputs.
This PR only addresses the fully connected layer.
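
The partitioning idea can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the code added by this PR. The names `dense`, `dense_sublayer`, and `dense_partitioned`, and the `start`/`stop` arguments, are hypothetical and chosen only for this illustration. The point is that each sublayer call loops over a slice of the output array, so no single call has to unroll the full loop.

```python
def dense(x, weights, biases):
    """Reference dense layer: computes every output in a single call."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]


def dense_sublayer(x, weights, biases, start, stop):
    """Hypothetical sublayer: computes only outputs start..stop-1."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(start, stop)
    ]


def dense_partitioned(x, weights, biases, n_partitions):
    """Builds the full output from several smaller sublayer calls."""
    n_out = len(biases)
    # Ceiling division, clamped to 1 so the range step is never zero.
    chunk = max(1, -(-n_out // n_partitions))
    out = []
    for start in range(0, n_out, chunk):
        stop = min(start + chunk, n_out)
        out.extend(dense_sublayer(x, weights, biases, start, stop))
    return out
```

Under these assumptions, `dense_partitioned(x, weights, biases, n)` returns the same values as `dense(x, weights, biases)` for any positive `n`. Only the amount of work per call changes.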