How is Perfect shuffle a better interconnect scheme for parallel processing? For example if we consider a problem of sum reduction, I want to understand how this scheme is useful when implementing sum reduction in parallel , for example on a GPU?
Asked By : gpuguy
Answered By : Massimo Cafaro
The perfect shuffle alone has never been used as an interconnection network; it always included exchange links, in order to allow for a worst-case $O(\lg n)$ routing algorithm. Algorithms for reduction, broadcast, parallel prefix, transpose etc are given in
Yosi Ben-Asher , David Egozi , Assaf Schuster, SIMD Algorithms for 2-D Arrays in Shuffle Networks.
Stone was the first researcher to provide algorithms for the perfect shuffle network, including algorithms for the FFT, matrix transpose etc.
This network has been used in several parallel machines, but mostly in multi-stage shuffle-exchange networks such as the Omega network (which uses the perfect shuffle between stages). An example is the IBM SP3 machine.
Note that these are message-passing algorithms, whilst you refer to a GPU, which is instead a shared-memory device.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/10572
0 comments:
Post a Comment
Let us know your responses and feedback