Eight hours at a time=Eka

CIOL Bureau

N Seetha Rama Krishna, project manager at CRL, was one of the key captains of the Eka mission. He was also part of, and a witness to, C-DAC's core HPC team during the PARAM days.

He talks about many issues in supercomputing, the rough times through this journey and the challenges.

There are divergent views on capability computing and capacity computing when it comes to supercomputers. Between the former, i.e. maximum computing power to solve a large problem in the shortest time, and the latter, i.e. efficient, cost-effective computing power to solve large problems or many small problems, what is your take?

Whether to use one single supercomputer for high performance, or two or more to solve the same problem, is always a moot point. No single organization can afford to dedicate one supercomputer to solving only one specific problem, and that is where we combine smaller problems. I guess it is the use of both capacity and capability computing; the two co-exist at one site rather than being mutually exclusive. It's always a blend.

Eka has tried something new by going for a circular layout; can you share something on that?

This is fundamentally for cooling purposes, with regard to bottom-up cooling, in comparison to traditional rows and columns. The flow distribution becomes more effective, and it also falls in line with the plan for high-density computing.

Eka has also taken the bull by the horns on issues like scalability, heat requirements, etc., that have always accompanied supercomputers. Did you also address latency between components, which is another major supercomputing issue?

A lot was attempted, but I feel there was not much we could do here. However, we worked closely with our international partners and, to some extent, we did handle some latency. I guess we are amongst the first to look at the low-latency side with innovations in hardware and, further, with customized implementation. What we could achieve was 10 to 15 per cent better latency. This is an area that calls for continuous R&D. It's not just a hardware-level problem but also a network-level and application-level one.

Usability too has been a major thrust of Eka. Can you elaborate on how you ensured that the application user of the supercomputer remains completely untouched by the complexity that lies underneath?

The usability scenario will see automation of standardized processes, and the possibility that the user can pick up the phone and reach a scientist who acts as the user facilitator. We are planning a user portal for Eka too. The attempts are on, but my vision is of creating a single user interface where the user can just come to the site, register, run his applications on the supercomputer, modify his programs, take the data and do the problem solving, everything from one interface.

You have been associated with the legacy of supercomputing in India through PARAM. Are there any comparisons you can draw with the IBM and HP counterparts, or with the PARAM generation per se?

No, there are no such comparisons. Every supercomputer generation is made for its own class of problems. IBM, for instance, promotes MPP architectures based on many low-performance, low-power processors, which make networking highly complex and require rewriting your application. In the middle is cluster computing, where PARAM, Eka and 90 per cent of the world's HPC fall. There are standard building blocks over which architectural innovations can be done with a collection of management suites. This improves efficiency, but the user takes more time to use it. At the most customized level are vector computing architectures from Cray, NEC, etc. Every class of system has its own advantages and disadvantages in terms of price versus performance for a select class of applications.

Open source is, interestingly, making inroads in supercomputing too. With Eka too having some key constituents from the open source breed, what's your assessment of this trend?

I am a big fan of open source; all my knowledge is effectively from open source. It allows lots of things; you can learn, experiment, modify and improve. Moreover, it is enabling co-operative platforms where pains and gains can be shared openly.

From scalar processors (which handle one element at a time) to vector processors (which run mathematical operations on multiple data elements simultaneously) to massively parallel processing (which has many individual nodes, each a computer in itself), CPU design has definitely evolved and is still changing. Your comments?

Vector has been used in designing super-powerful processors, but innovation at that price is less. In contrast, there is more innovation in cluster computing. The same goes for massively parallel processing, as probably used in IBM BlueGene. Even the good old Cray is coming down to cluster computing. Clusters have taken over from specific high-end processor designs. In that light, Eka is a co-ordinated orchestra versus a single super performer like the vector-based Cray, which is a once-in-a-generation piece.

What is the equation, if any, between grid computing and supercomputing?

Grid computing, contrary to the general myth, is not computing in the first place; it's an access technology. It's more about virtualization and access than computing. Grid enables access, collaboration and the solution of geographically distributed problems. Grid is a multi-purpose infrastructure, which is powered by resources like Eka.

As you look back at the making of Eka, what were the tough times that stand out in your memory? Can you share something about the challenges and the lessons you took away?

My first lesson was that in a complex undertaking of Eka's stature, one should be ever ready for unexpected accidents, but not be disheartened. There can be reservations from people around you that pull you down, or your own doubts rising when obstacles start hitting, but just keeping at it gets you through. And then from somewhere, somehow, solutions come in. I had the bigger picture in mind, but I never thought beyond eight hours at a time. Even when my peers or seniors asked me about my plan for the next day, I honestly had none beyond the next eight hours. Problems like a water tank rising up due to heavy rains, or material delayed at customs, are ones that cannot be predicted at all. Situations were so dynamic. I remember that once we had a power problem, just when we were getting ready for the big run. For one day, I was clueless and helpless. Arranging 1 MW takes two to three months, and we could in no way afford that much time. The next day, somehow, we got up, dug into all the networks we had and put together six to eight generators overnight. Next was the UPS; we got a 550 kW UPS in two days. And then came the cooling: we got three precision air conditioners in four days and rewired the whole data center's power cabling in a record time of two to three days. All this work happened while the rest of the computer was on. Looking back, all that seems so impossible, but we pulled through that time. The passion for Eka overwhelmed everything.
