//========================================================================================================================================================================================================200 // INFO //========================================================================================================================================================================================================200 //======================================================================================================================================================150 // UPDATE //======================================================================================================================================================150 // 2009.12 Lukasz G. Szafaryn // -- converted from MATLAB to CUDA // 2010.01 Lukasz G. Szafaryn // -- arranged, commented // 2012.05 Lukasz G. Szafaryn // -- arranged, commented // 2012.05 Lukasz G. Szafaryn // -- converted from CUDA to OpenCL //======================================================================================================================================================150 // DESCRIPTION //======================================================================================================================================================150 // The Heart Wall application tracks the movement of a mouse heart over a sequence of 104 609x590 ultrasound images to record response to the stimulus. // In its initial stage, the program performs image processing operations on the first image to detect initial, partial shapes of inner and outer heart walls. // These operations include: edge detection, SRAD despeckling (also part of Rodinia suite), morphological transformation and dilation. In order to reconstruct // approximated full shapes of heart walls, the program generates ellipses that are superimposed over the image and sampled to mark points on the heart walls // (Hough Search). In its final stage (Heart Wall Tracking presented here), program tracks movement of surfaces by detecting the movement of image areas under // sample points as the shapes of the heart walls change throughout the sequence of images. // Tracking is the final stage of the Heart Wall application. It takes the positions of heart walls from the first ultrasound image in the sequence as determined by the // initial detection stage in the application. Tracking code is implemented in the form of multiple nested loops that process batches of 10 frames and 51 points in each // image. Displacement of heart walls is detected by comparing currently processed frame to the template frame which is updated after processing a batch of frames. // There is a sequential dependency between processed frames. The processing of each point consist of a large number of small serial steps with interleaved control // statements. Each of the steps involves a small amount of computation performed only on a subset of entire image. This stage of the application accounts for almost // all of the execution time (the exact ratio depends on the number of ultrasound images). //======================================================================================================================================================150 // PAPERS //======================================================================================================================================================150 // L. G. Szafaryn, K. Skadron, and J. J. Saucerman. "Experiences Accelerating MATLAB Systems Biology Applications." In Proceedings of the Workshop on Biomedicine // in Computing: Systems, Architectures, and Circuits (BiC) 2009, in conjunction with the 36th IEEE/ACM International Symposium on Computer Architecture (ISCA), // June 2009. //======================================================================================================================================================150 // DOWNLOAD //======================================================================================================================================================150 // Rodinia Benchmark Suite page //======================================================================================================================================================150 // IMPLEMENTATION-SPECIFIC DESCRIPTION (OpenCL) //======================================================================================================================================================150 // This is the OpenCL version of Tracking code. // OpenCL implementation of this code is a classic example of the exploitation of braided parallelism. Processing of sample points is assigned to multiprocessors (TLP), // while processing of individual pixels in each sample image is assigned to processors inside each multiprocessor. However, each GPU multiprocessor is usually // underutilized because of the limited amount of computation at each computation step. Large size of processed images and lack temporal locality did not allow for // utilization of fast shared memory. Also the GPU overhead (data transfer and kernel launch) are significant. In order to provide better speedup, more drastic GPU // optimization techniques that sacrificed modularity (in order to include code in one kernel call) were used. These techniques also combined unrelated functions and // data transfers in single kernels. //======================================================================================================================================================150 // RUNNING THIS CODE //======================================================================================================================================================150 // The code takes the followint input files that need to be located in the same directory as the source files: // 1) video file (input.avi) // 2) text file with parameters (input.txt) // The following are the command parameters to the application: // 1) Number of frames to process. Needs to be integer <= to the number of frames in the input file. // Example: // ./a.out 104 //======================================================================================================================================================150 // End //======================================================================================================================================================150 //========================================================================================================================================================================================================200 // End //========================================================================================================================================================================================================200