12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182 |
- //========================================================================================================================================================================================================200
- // INFO
- //========================================================================================================================================================================================================200
- //======================================================================================================================================================150
- // UPDATE
- //======================================================================================================================================================150
- // 2009.12 Lukasz G. Szafaryn
- // -- converted from MATLAB to CUDA
- // 2010.01 Lukasz G. Szafaryn
- // -- arranged, commented
- // 2012.05 Lukasz G. Szafaryn
- // -- arranged, commented
- // 2012.05 Lukasz G. Szafaryn
- // -- converted from CUDA to OpenCL
- //======================================================================================================================================================150
- // DESCRIPTION
- //======================================================================================================================================================150
- // The Heart Wall application tracks the movement of a mouse heart over a sequence of 104 609x590 ultrasound images to record response to the stimulus.
- // In its initial stage, the program performs image processing operations on the first image to detect initial, partial shapes of inner and outer heart walls.
- // These operations include: edge detection, SRAD despeckling (also part of Rodinia suite), morphological transformation and dilation. In order to reconstruct
- // approximated full shapes of heart walls, the program generates ellipses that are superimposed over the image and sampled to mark points on the heart walls
- // (Hough Search). In its final stage (Heart Wall Tracking presented here), program tracks movement of surfaces by detecting the movement of image areas under
- // sample points as the shapes of the heart walls change throughout the sequence of images.
- // Tracking is the final stage of the Heart Wall application. It takes the positions of heart walls from the first ultrasound image in the sequence as determined by the
- // initial detection stage in the application. Tracking code is implemented in the form of multiple nested loops that process batches of 10 frames and 51 points in each
- // image. Displacement of heart walls is detected by comparing currently processed frame to the template frame which is updated after processing a batch of frames.
- // There is a sequential dependency between processed frames. The processing of each point consist of a large number of small serial steps with interleaved control
- // statements. Each of the steps involves a small amount of computation performed only on a subset of entire image. This stage of the application accounts for almost
- // all of the execution time (the exact ratio depends on the number of ultrasound images).
- //======================================================================================================================================================150
- // PAPERS
- //======================================================================================================================================================150
- // L. G. Szafaryn, K. Skadron, and J. J. Saucerman. "Experiences Accelerating MATLAB Systems Biology Applications." In Proceedings of the Workshop on Biomedicine
- // in Computing: Systems, Architectures, and Circuits (BiC) 2009, in conjunction with the 36th IEEE/ACM International Symposium on Computer Architecture (ISCA),
- // June 2009. <http://www.cs.virginia.edu/~skadron/Papers/BiC09.pdf>
- //======================================================================================================================================================150
- // DOWNLOAD
- //======================================================================================================================================================150
- // Rodinia Benchmark Suite page
- //======================================================================================================================================================150
- // IMPLEMENTATION-SPECIFIC DESCRIPTION (OpenCL)
- //======================================================================================================================================================150
- // This is the OpenCL version of Tracking code.
- // OpenCL implementation of this code is a classic example of the exploitation of braided parallelism. Processing of sample points is assigned to multiprocessors (TLP),
- // while processing of individual pixels in each sample image is assigned to processors inside each multiprocessor. However, each GPU multiprocessor is usually
- // underutilized because of the limited amount of computation at each computation step. Large size of processed images and lack temporal locality did not allow for
- // utilization of fast shared memory. Also the GPU overhead (data transfer and kernel launch) are significant. In order to provide better speedup, more drastic GPU
- // optimization techniques that sacrificed modularity (in order to include code in one kernel call) were used. These techniques also combined unrelated functions and
- // data transfers in single kernels.
- //======================================================================================================================================================150
- // RUNNING THIS CODE
- //======================================================================================================================================================150
- // The code takes the followint input files that need to be located in the same directory as the source files:
- // 1) video file (input.avi)
- // 2) text file with parameters (input.txt)
- // The following are the command parameters to the application:
- // 1) Number of frames to process. Needs to be integer <= to the number of frames in the input file.
- // Example:
- // ./a.out 104
- //======================================================================================================================================================150
- // End
- //======================================================================================================================================================150
- //========================================================================================================================================================================================================200
- // End
- //========================================================================================================================================================================================================200
|