BookRags.com Literature Guides Literature
Guides
Criticism & Essays Criticism &
Essays
Questions & Answers Questions &
Answers
Lesson Plans Lesson
Plans
My Bibliography Periodic Table U.S. Presidents Shakespeare Sonnet Shake-Up
Research Anything:        
History | Encyclopedias | Films | News | Create a Bibliography | More... Login | Register | Help
Not What You Meant?  There are 3 definitions for Tiling.

Loop tiling

Print-Friendly
About 1 pages (368 words)

Bookmark and Share Questions on this topic? Just ask!

Loop tiling or loop blocking is a loop optimization used by compilers to make the execution of certain types of loops more efficient. Loop tiling partitions a loop's iteration space into smaller chunks or blocks, so as to help ensure data used in a loop stays in the cache until it is reused. The partitioning of loop iteration space leads to partitioning of large array into smaller blocks, thus fitting accessed array elements into cache size, enhancing cache reuse and eliminating cache size requirements. Original: matrix vector multiplication

for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        c[i] = c[i]+ a[i,j]*b[j];

After loop tiling 2*2:

for (i=0; i<N; i+=2)
    for (j=0; j<N; j+=2)
        for (ii=i; ii<min(i+2,N); ii++)
            for (jj=j; jj<min(j+2,N); jj++)
                c[ii] =c[ii]+ a[ii,jj]*b[jj];

The original loop iteration space is N by N. The accessed chunk of array a[i,j] is also N by N. When N is too large and the cache size of the machine is too small, the accessed array elements in one loop iteration (for example, i=1, j=1 to N) may cross cache lines, causing cache misses. Another example using an algorithm for matrix multiplication: Original matrix multiplication:

DO I = 1, M                               
 DO K = 1, M                             
   DO J = 1, M                              
      Z(J,I) = Z(J,I) + X(K,I) * Y(J,K)       

After loop tiling B*B:

DO K2 = 1, M, B
  DO J2 = 1, M, B                                     
    DO I = 1, M
      DO K1 = K2, MIN(K2+B-1,M)
         DO J1 = J2, MIN(J2+B-1,M)    
            Z(J1,I) = Z(J1,I) + X(K1,I) * Y(J1,K1)   

It is not always easy to decide what value of tiling size is optimal for one loop because it demands an accurate estimate of accessed array regions in the loop and the cache size of the target machine. The order of loop nests (loop interchange) also plays an important role in achieving better cache performance.

Further reading

  1. Wolfe, M. More Iteration Space Tiling. Supercomputing'89, pages 655 -- 664, 1989.
  2. Wolf, M. E. and Lam, M. A Data Locality Optimizing Algorithm. PLDI'91, pages 30 -- 44, 1991.
  3. Irigoin, F. and Triolet, R. Supernode Partitioning. POPL'88, pages 319 -- 329, 1988.
  4. Xue, J. Loop Tiling for Parallelism. Kluwer Academic Publishers. 2000.

View More Summaries on Loop tiling
 
Ask any question on Loop tiling and get it answered FAST!
Answer questions in BookRags Q&A and earn points toward
discounted or even FREE Study Guides and other BookRags products!
Learn more about BookRags Q&A
Copyrights
Loop tiling from Wíkipedia. ©2006 by Wíkipedia. Licensed under the GNU Free Documentation License. View a list of authors or edit this article.

Article Navigation
Join BookRagslearn moreJoin BookRags




About BookRags | Customer Service | Report an Error | Terms of Use | Privacy Policy