Pico, Impulse demo FPGA cluster acceleration of C-Language apps

CIOL Bureau

RENO, USA: Pico Computing and Impulse Accelerated Technologies announced that the two companies will be presenting joint demonstrations at the International Conference for High Performance Computing (SC07) being held November 10-16, 2007 in Reno, Nevada.

The demonstrations planned for SC07 include a 'Where’s Waldo' image-recognition algorithm and an N-body astrophysics simulation. Both of these demonstration applications have been implemented using the Impulse C tools and run on a Pico Computing SuperCluster 84-FPGA array. Pico Computing will also be demonstrating an FPGA-accelerated, brute-force WPA cracking algorithm.

"Impulse C allows algorithm developers to more quickly generate software/hardware applications for our SC3 SuperCluster," said Dr. Robert Trout, President and founder of Pico Computing. "For high-performance computing applications, access to tools like Impulse C is a critical enabler."

According to Dr. Trout, the power/performance of the SuperCluster is stunning. A single FPGA module (featuring a relatively modest Virtex-5 LX50 FPGA device) can generate random numbers 12 times faster, or better, than a standard dual-core processor.

In the Pico Computing SC3 SuperCluster, 84 of these modules demonstrate performance comparable to a cluster of 1,000 dual-core processors, while using a comparable amount of power. In fact, the entire 84-FPGA SuperCluster is capable of operating at-speed using a standard 600W PC power supply.

Finding Waldo

The 'Where’s Waldo' demonstration highlights the potential of FPGAs for acceleration of complex image processing tasks, using a cluster of FPGAs. This demonstration, using the popular children’s book as the target, is an excellent example of the challenges associated with identifying someone or something among a crowd of similar images. The SC3 SuperCluster with 84 FPGAs capitalizes on the parallel nature of the algorithm. Impulse C was used to develop the required image processing filters.

The demonstration algorithm extracts distinctive features of the target image using the Scale Invariant Feature Transform (SIFT) method. It then searches for corresponding features in a video stream while enforcing consistency across all feature matches.
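
The matching step can be illustrated with a short Python sketch. It assumes the feature descriptors have already been extracted and pairs each target descriptor with its nearest neighbor in a video frame, rejecting ambiguous pairs with Lowe's ratio test, a filter commonly used with SIFT. The article does not say whether the demonstration uses this test, so the function name `match_descriptors`, the `ratio` threshold, and the toy two-dimensional descriptors (real SIFT descriptors have 128 dimensions) are all assumptions made for illustration. The consistency check across matches and the certainty measure are not modeled.

```python
import math

def match_descriptors(target, frame, ratio=0.75):
    """Pair each target descriptor with its nearest frame descriptor.

    A pair is kept only when the nearest neighbor is clearly closer than
    the second nearest (Lowe's ratio test). This is an assumed filter,
    not necessarily the one used in the demonstration.
    """
    matches = []
    for t_idx, t_desc in enumerate(target):
        # Euclidean distance from this target descriptor to every frame descriptor.
        dists = sorted((math.dist(t_desc, f_desc), f_idx)
                       for f_idx, f_desc in enumerate(frame))
        if len(dists) < 2:
            continue  # the ratio test needs two neighbors
        (best, best_idx), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((t_idx, best_idx))
    return matches

# Hypothetical toy descriptors, for illustration only.
target_features = [(0.0, 0.0), (10.0, 10.0)]
frame_features = [(0.1, 0.0), (10.0, 9.9), (5.0, 5.0)]
print(match_descriptors(target_features, frame_features))
```

Each target descriptor is matched independently of the others, which is the kind of parallelism an FPGA array can exploit.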

To ensure precision, the algorithm provides a measure of certainty for each match, for example reporting an 85 percent chance that Waldo is at a specific location. The success of this demonstration has clear implications for low-power, real-time defense and security applications.

Accelerating astrophysical simulations

The goal of the N-body simulation is to model and calculate the gravitational forces between thousands of planets, stars, galaxies, and other objects. N-body simulations are computationally intensive but are regularly used by scientists to understand how galaxies and planets are formed and evolve over time.

The computation required grows as N², with N representing the number of bodies being modeled. The gravitational force on each body is calculated by summing the force between that body and every other body in the system. For example, in a simulation of the solar system, the movement of Earth is calculated by summing the gravitational pull of the Sun, the other planets, comets, the Earth’s Moon, and so on.
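
The direct summation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Impulse C implementation used in the demonstration; the function name `net_forces` and the sample masses and positions are hypothetical. The nested loop visits every ordered pair of bodies, which is where the N-squared cost comes from.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def net_forces(masses, positions):
    """Return the net gravitational force vector on each body.

    Uses direct summation over all ordered pairs: N * (N - 1)
    pairwise force evaluations for N bodies.
    """
    n = len(masses)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            # Vector pointing from body i to body j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            r = math.sqrt(r2)
            # Newton's law of gravitation: magnitude of the attraction.
            f = G * masses[i] * masses[j] / r2
            for k in range(3):
                forces[i][k] += f * d[k] / r
    return forces

# Hypothetical three-body example (arbitrary masses in kg, positions in m).
masses = [5.0e24, 7.0e22, 1.0e3]
positions = [(0.0, 0.0, 0.0), (3.8e8, 0.0, 0.0), (0.0, 4.0e7, 0.0)]
for force in net_forces(masses, positions):
    print(force)
```

The force on each body depends only on the current positions and masses, so each iteration of the outer loop can be computed independently of the others.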

This is a complex, floating-point problem that is highly parallelizable, and hence a perfect candidate for FPGA acceleration. Impulse C was also used to develop this algorithm, using the streaming, multiple-process features of the Impulse C programming tools.

Cracking WPA security

WPA is a common security algorithm used to secure wireless access points. WPA employs PBKDF2, which runs the SHA1 algorithm thousands of times to convert a password into a key. The key is then used to encrypt the wireless network.

To crack a password on a WPA network, an authentication session must first be captured. Candidate passwords can then be tried by brute force: each one is run through the PBKDF2 function, and the resulting key is checked against the captured data. This method requires an enormous number of iterations but is highly parallelizable.
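
The key derivation and search loop can be sketched in Python with the standard library. In WPA personal mode the key is derived with PBKDF2 using HMAC-SHA1, the network name (SSID) as the salt, 4096 iterations, and a 256-bit output. The sketch simplifies the verification step: a real attack checks each derived key against the captured handshake, whereas here the candidate key is compared directly with a known target key. The function names, the network name, and the candidate list are hypothetical, and this is not Pico Computing's implementation.

```python
import hashlib

def derive_wpa_key(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit WPA pre-shared key from a passphrase.

    PBKDF2 with HMAC-SHA1, the SSID as salt, 4096 iterations.
    """
    return hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("utf-8"), ssid.encode("utf-8"), 4096, 32
    )

def find_passphrase(candidates, ssid, target_key):
    """Try each candidate in turn; return the one that matches, or None.

    Simplification: target_key stands in for the check against the
    captured authentication session.
    """
    for candidate in candidates:
        if derive_wpa_key(candidate, ssid) == target_key:
            return candidate
    return None

# Hypothetical network name and word list, for illustration only.
ssid = "DemoNetwork"
target = derive_wpa_key("waldo1234", ssid)
word_list = ["password1", "letmein99", "waldo1234", "qwerty123"]
print(find_passphrase(word_list, ssid, target))
```

Every candidate is tested independently, so the word list can be split across as many workers as are available.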

The result is WPA cracking that is hundreds or thousands of times faster than would be achievable using software-only methods.
