//===================================================================================================================================================================================================200
// INFO
//===================================================================================================================================================================================================200
//=================================================================================================================================================150
// UPDATE
//=================================================================================================================================================150
// 2009.12 Lukasz G. Szafaryn
// -- converted from MATLAB to CUDA
// 2010.01 Lukasz G. Szafaryn
// -- arranged, commented
// 2012.05 Lukasz G. Szafaryn
// -- arranged, commented
// 2012.05 Lukasz G. Szafaryn
// -- converted from CUDA to OpenCL
//=================================================================================================================================================150
// DESCRIPTION
//=================================================================================================================================================150
// The Heart Wall application tracks the movement of a mouse heart over a sequence of 104 ultrasound images (609x590 each) to record its response to the stimulus.
// In its initial stage, the program performs image processing operations on the first image to detect initial, partial shapes of the inner and outer heart walls.
// These operations include edge detection, SRAD despeckling (also part of the Rodinia suite), morphological transformation, and dilation. To reconstruct
// approximate full shapes of the heart walls, the program generates ellipses that are superimposed over the image and sampled to mark points on the heart walls
// (Hough Search). In its final stage (Heart Wall Tracking, presented here), the program tracks the movement of surfaces by detecting the movement of image areas under
// the sample points as the shapes of the heart walls change throughout the sequence of images.
// Tracking is the final stage of the Heart Wall application. It takes the positions of the heart walls in the first ultrasound image of the sequence, as determined by the
// initial detection stage of the application. The tracking code is implemented as multiple nested loops that process batches of 10 frames and 51 points in each
// image. Displacement of the heart walls is detected by comparing the currently processed frame to the template frame, which is updated after a batch of frames is processed.
// There is a sequential dependency between processed frames. The processing of each point consists of a large number of small serial steps with interleaved control
// statements. Each step involves a small amount of computation performed on only a subset of the entire image. This stage of the application accounts for almost
// all of the execution time (the exact ratio depends on the number of ultrasound images).
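// The loop structure described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the benchmark's actual code:
// the names track_sequence, track_point_fn and count_call are hypothetical, and refreshing the template to the last frame of each batch is an assumption
// (the description only says that the template is updated after a batch).

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_FRAMES 10   /* frames per batch, as stated in the description */
#define NUM_POINTS   51   /* sample points per image, as stated in the description */

/* Hypothetical per-point callback: stands in for the many small serial steps
   that track one sample point in one frame against the current template. */
typedef void (*track_point_fn)(int frame, int point, int template_frame, void *ctx);

/* Example callback that only counts how often it is invoked (ctx is an int *). */
void count_call(int frame, int point, int template_frame, void *ctx)
{
    (void)frame; (void)point; (void)template_frame;
    ++*(int *)ctx;
}

/* Walks the frames in order and returns the number of template updates. */
int track_sequence(int num_frames, track_point_fn track, void *ctx)
{
    assert(num_frames >= 0 && track != NULL);
    int template_frame = 0;   /* tracking starts from the first image */
    int updates = 0;

    for (int start = 0; start < num_frames; start += BATCH_FRAMES) {
        int end = start + BATCH_FRAMES;
        if (end > num_frames)
            end = num_frames;                 /* last batch may be shorter */

        for (int frame = start; frame < end; frame++)          /* frames: sequential */
            for (int point = 0; point < NUM_POINTS; point++)   /* points: one task each */
                track(frame, point, template_frame, ctx);

        template_frame = end - 1;   /* ASSUMPTION: refresh to last frame of the batch */
        updates++;
    }
    return updates;
}
```

// With 104 frames the outer loop runs 11 times (ten full batches and one batch of 4 frames), and the callback is invoked 104 * 51 = 5304 times.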
//=================================================================================================================================================150
// PAPERS
//=================================================================================================================================================150
// L. G. Szafaryn, K. Skadron, and J. J. Saucerman. "Experiences Accelerating MATLAB Systems Biology Applications." In Proceedings of the Workshop on Biomedicine
// in Computing: Systems, Architectures, and Circuits (BiC) 2009, in conjunction with the 36th IEEE/ACM International Symposium on Computer Architecture (ISCA),
// June 2009. <http://www.cs.virginia.edu/~skadron/Papers/BiC09.pdf>
//=================================================================================================================================================150
// DOWNLOAD
//=================================================================================================================================================150
// Rodinia Benchmark Suite page
//=================================================================================================================================================150
// IMPLEMENTATION-SPECIFIC DESCRIPTION (OpenCL)
//=================================================================================================================================================150
// This is the OpenCL version of the Tracking code.
// The OpenCL implementation of this code is a classic example of exploiting braided parallelism. Processing of sample points is assigned to multiprocessors (TLP),
// while processing of individual pixels in each sample image is assigned to processors inside each multiprocessor. However, each GPU multiprocessor is usually
// underutilized because of the limited amount of computation at each computation step. The large size of the processed images and the lack of temporal locality did not
// allow for the use of fast shared memory. The GPU overheads (data transfer and kernel launch) are also significant. To provide better speedup, more drastic GPU
// optimization techniques that sacrificed modularity (in order to include the code in one kernel call) were used. These techniques also combined unrelated functions and
// data transfers in single kernels.
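// The two-level mapping described above can be emulated on the CPU as a rough sketch. It is not the OpenCL kernel itself: the names group_sum, sum_all_points and
// idle_work_items are hypothetical, and summing a window of pixels is an assumed stand-in for one of the small per-point computation steps. The outer loop plays
// the role of work-groups (one per sample point); the inner loops play the role of work-items striding over the pixels of that point's window.

```c
#include <assert.h>
#include <stddef.h>

/* Emulates one work-group reducing the pixels of one sample point's window.
   local_size plays the role of the OpenCL work-group size. */
float group_sum(const float *window, size_t window_pixels, size_t local_size)
{
    assert(window != NULL && local_size > 0);
    float total = 0.0f;
    for (size_t lid = 0; lid < local_size; lid++) {               /* work-items */
        float partial = 0.0f;
        for (size_t i = lid; i < window_pixels; i += local_size)  /* strided pixels */
            partial += window[i];
        total += partial;   /* stands in for the reduction inside the group */
    }
    return total;
}

/* Outer level: one work-group per sample point; windows are stored back to back. */
void sum_all_points(const float *windows, size_t num_points,
                    size_t window_pixels, size_t local_size, float *out)
{
    assert(windows != NULL && out != NULL);
    for (size_t gid = 0; gid < num_points; gid++)                 /* work-groups */
        out[gid] = group_sum(windows + gid * window_pixels, window_pixels, local_size);
}

/* Work-items with nothing to do when a step touches fewer elements than the
   group has work-items; this is the underutilization mentioned above. */
size_t idle_work_items(size_t step_elements, size_t local_size)
{
    return step_elements < local_size ? local_size - step_elements : 0;
}
```

// For example, a step that touches only 100 elements in a work-group of 256 work-items leaves idle_work_items(100, 256) = 156 of them without work.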
  50. //======================================================================================================================================================150
  51. // RUNNING THIS CODE
  52. //======================================================================================================================================================150
// The code takes the following input files, which need to be located in the same directory as the source files:
// 1) video file (input.avi)
// 2) text file with parameters (input.txt)
// The application takes the following command-line parameter:
// 1) Number of frames to process. Must be an integer less than or equal to the number of frames in the input file.
// Example:
// ./a.out 104
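// The constraint on the frame-count parameter could be checked along the following lines. This is a minimal sketch, not the benchmark's own argument handling:
// parse_frame_count is a hypothetical helper, and rejecting values below 1 is an assumption (the text above only states the upper bound).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Returns the requested number of frames, or -1 if arg is not a whole decimal
   integer in the range 1..frames_in_file. */
long parse_frame_count(const char *arg, long frames_in_file)
{
    assert(arg != NULL);
    char *end = NULL;
    errno = 0;
    long n = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;          /* overflow, empty string, or trailing characters */
    if (n < 1 || n > frames_in_file)   /* lower bound of 1 is an ASSUMPTION */
        return -1;
    return n;
}
```

// A caller would pass argv[1] and the frame count read from input.avi, for instance parse_frame_count(argv[1], 104), and exit with a usage message on -1.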
//=================================================================================================================================================150
// End
//=================================================================================================================================================150
//===================================================================================================================================================================================================200
// End
//===================================================================================================================================================================================================200