
Biz and technology challenges in the age of participation

CIOL Bureau

Anil Valluri, Country Director, Client Solutions Organisation, Sun Microsystems India Pvt. Ltd.


Businesses today are increasingly defined by their applications, and an organization's prospects for success are now tied more than ever to its ability to deploy technology in an agile and effective fashion. The stakes are high. In today's competitive and highly regulated business environment, the consequences of technology failure can be swift and severe, and even small lapses in IT competence can result in widespread damage and loss.

Business Requirements


Adding to the pressure, an endless variety of new networked devices and users demands ever-higher levels of performance, capacity, availability, and security from the applications and services behind them. Real estate constraints, along with very real and rising energy costs for both power and cooling, are now significant factors that discourage simply adding endless racks of traditional servers. The cost and complexity of managing very large numbers of systems is another pressing concern, especially when coupled with the very low utilization levels typically found in traditional infrastructure.






To respond to these myriad challenges, businesses must:

• Increase application throughput along with capacity and performance to address pressing business needs as well as capture new customers and opportunities


• Reduce power, cooling, and real estate costs both to save money and to enable necessary growth and scalability

• Maintain application compatibility and enhance security across the organization to preserve investments and limit risks to the firm and its clientele


Beyond mere packaging, these issues reach down to the very technology used to design processors, systems, and applications. Processor design in particular can have enormous ramifications for business-level issues and solutions. Unfortunately, traditional high-frequency, single-threaded processors are increasingly yielding diminishing returns.






Even with ever-higher clock rates, these processors produce only small improvements in real-world application performance. At the same time, they impose escalating costs in the form of higher power consumption and significantly greater heat loads that must be handled by multiple large and expensive HVAC systems. With economic and competitive pressures at an all-time high, most organizations understand that significant change is needed.

The Diminishing Returns of Complex Processor Design






While optimistic marketing statements constantly call attention to presumably impressive multiple-gigahertz frequencies and large caches in each new generation of processors, the correspondingly small gains in real-world system performance and productivity continue to frustrate IT professionals.






Throughput Computing, along with Sun's focus on optimizing real workload performance, is designed to help resolve these divergent trends. This approach provides higher levels of delivered performance and computational throughput while greatly simplifying the data center. Understanding the importance of Throughput Computing requires a look at how both processors and systems have been designed in the past, and at the trends that are defining better ways forward.









The oft-quoted tenet of Moore's Law states that the number of transistors that will fit in a square inch of integrated circuitry will approximately double every two years. For over three decades the pace of Moore's Law has held, driving processor performance to new heights. Processor manufacturers have long exploited these chip real estate gains to build increasingly complex processors, with instruction-level parallelism (ILP) as a goal.






Today these traditional processors employ very high frequencies along with a variety of sophisticated tactics to accelerate a single instruction pipeline, including:




• Large caches
• Superscalar designs


• Out-of-order execution
• Very high clock rates
• Deep pipelines
• Speculative pre-fetches

While these techniques have produced faster processors with impressive-sounding multiple-gigahertz frequencies, they have largely resulted in complex, hot, and power-hungry chips that serve neither many modern applications nor the constraints of today's data centers. In fact, many of today's data center workloads are simply unable to take advantage of the hard-won ILP provided by these processors. As shown in Table 1, applications with high shared-memory and data requirements typically depend more on processing a large number of simultaneous threads (thread-level parallelism) than on running a single thread as quickly as possible (ILP).
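As a rough, hypothetical illustration of this distinction (the request handler, thread count, and timings below are invented for the sketch, not taken from this article), a workload made up of many independent, stall-prone requests gains far more from running them concurrently than from making any single request marginally faster:

```python
# Hypothetical sketch: thread-level parallelism (TLP) vs. a faster single thread.
# Each "request" spends most of its time waiting (simulating a memory or I/O stall),
# so many concurrent threads finish the batch far sooner than one fast thread would.

import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(req_id: int) -> int:
    time.sleep(0.05)      # simulated stall: waiting on memory, disk, or network
    return req_id * 2     # trivial compute step

requests = range(64)

# Single-threaded: total time is ~64 x 0.05s, no matter how fast the core is.
start = time.time()
results_serial = [handle_request(r) for r in requests]
print(f"serial:   {time.time() - start:.2f}s")

# Thread-level parallelism: 32 threads overlap their stalls, so the same batch
# completes in roughly 2 x 0.05s on this toy workload.
start = time.time()
with ThreadPoolExecutor(max_workers=32) as pool:
    results_parallel = list(pool.map(handle_request, requests))
print(f"threaded: {time.time() - start:.2f}s")
```

On a toy workload like this, the threaded batch finishes roughly 30 times sooner, which mirrors the case for processors built around many hardware threads rather than one very fast pipeline.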



 













Figure 1. Increasing single-threaded processor performance by 100 percent (a 50-percent reduction in compute time) provides only a small relative gain in application performance due to memory latency.

Figure 1 illustrates how even doubling processor performance (frequency) often provides only a small relative increase in application performance. In this example, though the compute time is reduced by half, only a small overall improvement in execution time results, due to the constant and dominant influence of memory latency.
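The arithmetic behind Figure 1 can be sketched in a few lines of Python; the 25/75 split between compute time and memory stall time is an assumed, illustrative figure rather than a number from the article:

```python
# Hypothetical worked example of the Figure 1 effect (illustrative numbers).
compute, memory_stall = 0.25, 0.75        # assume 25% compute, 75% memory stall

old_time = compute + memory_stall         # 1.000 (normalized execution time)
new_time = compute / 2 + memory_stall     # doubling CPU speed halves only the compute share

print(f"new execution time: {new_time:.3f} (was 1.000)")
print(f"overall speedup:    {old_time / new_time:.2f}x despite a 2x faster processor")
```

Even with the compute portion cut in half, overall execution time drops only from 1.000 to 0.875, a speedup of about 1.14x, because the memory stall term is untouched.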







Complicating matters, the disparity between processor speeds and memory access speeds means that memory latency dominates application performance, erasing even very impressive gains in clock rates. While processor speeds continue to double every two years, memory speeds have typically doubled only every six years. This growing disconnect is the result of memory suppliers focusing on density and cost as their design center, rather than speed. Unfortunately, this relative gap between processor and memory speeds leaves ultra-fast processors idle as much as 85 percent of the time, waiting for memory to return required data.
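The widening gap follows directly from those two doubling rates; a quick sketch of the compounding (the time spans are chosen only for illustration):

```python
# Processor speed doubling every 2 years vs. memory speed doubling every 6 years
# means the relative gap between them also grows exponentially.
for years in (0, 6, 12, 18):
    cpu_gain = 2 ** (years / 2)   # processor speed relative to year 0
    mem_gain = 2 ** (years / 6)   # memory speed relative to year 0
    print(f"after {years:2d} years: processor x{cpu_gain:.0f}, "
          f"memory x{mem_gain:.0f}, gap x{cpu_gain / mem_gain:.0f}")
```

After 18 years at those rates, the processor is roughly 512 times faster while memory is only about 8 times faster, a 64-fold relative gap, which is why so many processor cycles end up spent waiting on memory.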






Ironically, as traditional processor execution pipelines become faster and more complex, the effect of memory latency only grows: fast, expensive processors spend more cycles doing nothing. Worse still, idle processors continue to draw power and generate heat. It is easy to see that frequency (gigahertz) is truly a misleading indicator of real performance.



While some vendors have seemingly awakened to the inherent limitations of traditional, frequency-based processor designs, they are now attempting to graft power-saving technologies and multiple cores onto old, once-discarded architectures. Unfortunately, these efforts represent stop-gap measures at best. Effective Throughput Computing can only be realized with fundamentally new processor designs that deliver truly compelling benefits to customers while leaving legacy approaches behind.