Redmond Magazine



Nflight
03-07-2006, 02:03 PM
http://redmondmag.com/columns/article.asp?editorialsid=1252

What was once old is new again. High-performance computing (HPC) has returned as one of the biggest trends in computing -- with a big difference. Back in the day (the early 1990s) you could drop $40 million on a Cray Y-MP supercomputer.

Now, thanks to cheap, off-the-shelf components (COTS), new Intel- and AMD-based HPC servers make sense from both a financial and technological perspective. For example, you can pick up a four-way, 2.2GHz AMD Athlon64 server with 4GB of RAM for about $4,000. As far as the technology goes, the point of HPC these days is to rely less on a single massive machine and more on compute clusters -- groups of interconnected machines that divide the workload among themselves.

In fact, universities and research institutions have been using Linux-based supercomputing clusters for years. The Beowulf Project can give you some guidance on building clusters of Linux-based servers.

It's little wonder that Microsoft is looking for a piece of the HPC action. I got a good look at Windows Compute Cluster Server 2003 (CCS2003) at a recent Microsoft briefing. Remember that the "C" in COTS stands for cheap. CCS2003 (which is based on Windows Server 2003, hence the name) will actually cost less per socket than other editions of Windows. This won't be a bargain-basement version of Windows, however. It's being put together specifically to address HPC concerns.

As a result, you won't be able to install this special version of Windows on any computer that isn't part of a dedicated computational cluster. It's also only available in an x64 edition -- the theory being that nobody would want to build a computational cluster out of legacy 32-bit hardware.

What Is a Compute Cluster?
A compute cluster consists of a single head node that accepts computing jobs and distributes the workload across at least two attached compute nodes. CCS2003 won't support high availability for the head node, so make sure it's already running on highly available hardware. The head node is the brains of your HPC operation, so it has to stay up.

You can have as many attached compute nodes as you can afford. As we've learned from distributed computing projects like SETI@home (which is an excellent real-world example of how you would use a compute cluster), the more compute nodes, the merrier.
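As a rough illustration of a head node dividing a job among its compute nodes, here is a minimal Python sketch. It is not CCS2003's scheduler: the function names (`split_work`, `compute_node`, `head_node`) are hypothetical, the "nodes" are simulated with local threads, and the sum-of-squares workload is just a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, node_count):
    """Divide items into node_count nearly equal contiguous chunks."""
    base, extra = divmod(len(items), node_count)
    chunks, start = [], 0
    for node in range(node_count):
        # The first `extra` nodes take one additional item each.
        size = base + (1 if node < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def compute_node(chunk):
    """Placeholder for the work one compute node does: a sum of squares."""
    return sum(x * x for x in chunk)


def head_node(items, node_count):
    """Hand one chunk to each simulated node and combine the partial results."""
    chunks = split_work(items, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node, chunks))
    return sum(partials)


print(head_node(list(range(10)), 3))  # prints 285
```

Adding nodes only changes `node_count`; the job itself stays the same, which is the property that lets a cluster grow with the budget.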

To avoid bottlenecks that can limit the number of nodes in your compute cluster, you'll want to use switched gigabit Ethernet as a minimum -- a 10 gigabit Ethernet or Myrinet network is even better. CCS2003 includes Windows Sockets Direct Interface, which is specifically designed to take advantage of these types of high-speed connections.
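To get a feel for why the interconnect matters, a back-of-the-envelope Python calculation can help. It assumes an ideal link with no protocol overhead or contention, and the 1 GB payload is a made-up figure, so real numbers will be worse.

```python
def transfer_seconds(payload_bytes, link_gbps):
    """Ideal time to move payload_bytes over a link of link_gbps gigabits per second."""
    bits = payload_bytes * 8
    return bits / (link_gbps * 1_000_000_000)


payload = 1_000_000_000  # 1 GB of data exchanged between nodes (hypothetical)
for name, gbps in [("gigabit Ethernet", 1), ("10 gigabit Ethernet", 10)]:
    print(f"{name}: {transfer_seconds(payload, gbps):.1f} s")
```

Under these assumptions the same exchange takes 8.0 s on gigabit Ethernet and 0.8 s on 10 gigabit Ethernet. Time spent waiting on the network is time the compute nodes are not crunching.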

You'll have to tune your applications to run on a cluster. To give you an idea of the old-school, hardcore nature of this type of computing, look at the programming languages that CCS2003's components support out of the box: Fortran77, Fortran90 and C. Yikes. Configure the system to submit applications to the cluster's scheduler on the head node, and to run completely unattended using only data files (and not keyboard commands or mouse clicks) for input.
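The "unattended, data files only" requirement can be sketched in Python as follows. This is an assumption-laden illustration rather than anything CCS2003 prescribes: `run_job` and the one-number-per-line file format are hypothetical, and the point is only that the job never prompts for keyboard or mouse input.

```python
def run_job(input_path, output_path):
    """Read one number per line from input_path and write a summary to output_path."""
    with open(input_path) as src:
        values = [float(line) for line in src if line.strip()]
    total = sum(values)
    mean = total / len(values) if values else 0.0
    # All results go to a file, so nobody needs to watch the job run.
    with open(output_path, "w") as dst:
        dst.write(f"count={len(values)}\n")
        dst.write(f"sum={total}\n")
        dst.write(f"mean={mean}\n")
    return len(values), total, mean
```

A scheduler could then launch such a job with nothing more than two file paths and collect the output file when it finishes.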

You'll also have to be fluent in several new acronyms if you're going to set up a compute cluster. MPI (Message Passing Interface) is an industry-standard application programming interface designed for rapid data exchange between compute nodes in HPC environments. Microsoft's MPI (MSMPI) is a version of the Argonne National Labs Open Source MPI2 implementation that supports more than 160 function calls. Applications submitted to CCS2003's job scheduler need to support this.
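For readers new to message passing, the toy Python sketch below mimics the send/receive pattern that MPI programs are built around. It is not MPI or MSMPI: ranks are simulated with threads in one process, and the `Mailboxes` class, `worker` and `run` are hypothetical names invented for this illustration.

```python
import queue
import threading


class Mailboxes:
    """One inbox per rank; send() and recv() loosely mimic point-to-point messaging."""

    def __init__(self, size):
        self.size = size
        self._inbox = [queue.Queue() for _ in range(size)]

    def send(self, dest, message):
        self._inbox[dest].put(message)

    def recv(self, rank):
        # Blocks until a message addressed to this rank arrives.
        return self._inbox[rank].get()


def worker(rank, comm):
    """Receive a chunk from rank 0, compute a partial sum, send it back."""
    chunk = comm.recv(rank)
    comm.send(0, (rank, sum(chunk)))


def run(data, size=4):
    """Rank 0 scatters the data to the other ranks and gathers their partial sums."""
    comm = Mailboxes(size)
    workers = size - 1
    threads = [threading.Thread(target=worker, args=(r, comm)) for r in range(1, size)]
    for t in threads:
        t.start()
    for r in range(1, size):
        comm.send(r, data[r - 1::workers])  # every workers-th item, offset by rank
    partials = dict(comm.recv(0) for _ in range(workers))
    for t in threads:
        t.join()
    return sum(partials.values())


print(run(list(range(100))))  # prints 4950
```

In a real MPI program each rank would be a separate process, possibly on a different machine, and the library would carry the messages over the cluster network.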

As you might expect, CCS2003 makes heavy use of Microsoft's infrastructure components. For example, all nodes have to belong to the same Active Directory domain so you can manage them as a unit and share security information.

What It Isn't
CCS2003 is not the same kind of clustering as Windows Cluster Service. While CCS2003 is designed to have several computers interconnected, those computers work together to solve computationally intensive problems, rather than provide failover or fault tolerance. You won't run Exchange Server on CCS2003. In fact, unless you have some heavy-duty number crunching to do, CCS2003 probably isn't for you.

The thought of deploying and managing a dozen or so compute nodes sends a chill down my spine, and not just because the data center housing them is going to need heavy-duty air conditioning to avoid a meltdown. In an era when everyone's downsizing the data center, CCS2003 heads in the opposite direction.

Microsoft feels your pain. CCS2003 includes a command-line interface to help you to create and submit jobs. You can use Remote Installation Services (RIS) to deploy compute nodes, so deployment to bare-metal machines is easier (CCS2003 includes RIS). Standard backup and restore techniques apply, so whatever you're already using should work fine. Of course, the usual MMC snap-ins will let you control the entire cluster. The setup process for Compute Cluster is also straightforward, using a standard Wizard-based interface.

CCS2003 loves networks and wants to connect to as many as possible: a private network for administrative traffic, the MSMPI network for exchanging cluster communications and data, and a public network like your corporate intranet. This last conduit also lets applications like Systems Management Server (SMS) and Microsoft Operations Manager (MOM) get into the compute cluster's head node for management purposes. So you could have each CCS2003 machine connected to as many as three networks at once.

Too Much Horsepower?
Unless you have to do some serious number crunching, such as simulating nuclear explosions, modeling fluid dynamics or assessing potential oil deposits, CCS2003 may not be for you. Still, CCS2003 makes HPC accessible to organizations that never would have considered it before.

Redmond - AMDusers.com Team effort needs all this techno wizardry to stay on top

Don Jones is the owner and operator of ScriptingAnswers.com, a speaker at national technical IT conferences, and the author of nearly twenty books on information technology. His latest book is "Windows Administrator's Automation Toolkit" (Microsoft Press). You can contact Don about "Beta Man: Windows Goes High Performance" at don@scriptinganswers.com.

NeoGen
03-07-2006, 02:21 PM
I wanna manage a cluster someday! :D

(But I need to learn about it first... ;))

Nflight
03-07-2006, 03:05 PM
http://daugerresearch.com/pooch/gettingstarted.shtml

Learning to Cluster; Finding the right place to start! I too am a First timer. :shock:

PcManiac
03-07-2006, 04:32 PM
This is pretty cool! One of the projects that I will have to do for school is to build a computer cluster :D
(I am taking a 2-year computer network technology course)

Nflight
03-07-2006, 08:36 PM
You know, I just finished a 2-year Computer Network Technology course. All I have to do now is finish the cert tests.
But they never offered a cluster course. :cry:

AMDave
03-08-2006, 09:51 AM
Brilliant skills by the Redmond marketing engine! Yet another set of revenue streams (product, training and support) for M$ doing what you can already do in Linux and Solaris (and various other *nix releases) FOR FREE!

My take is that it's like rolling out a device that can tell you the time on an uninhabited planet. "Congratulations. What else have you got?"

What they should be addressing is the problem the market has in understanding how certain commercial problems can be parallelised by different means to suit this kind of technology.

Although we have seen the emergence of "render walls" and a resurgence in ASP servers (no, not Active Server Pages - Application Service Providers) and the like, problem parallelism is a skill set that is incredibly lacking in business and mostly a domain of scientific and mathematical research.

Unless they can make inroads into helping business identify opportunities for parallelism in solving their business intelligence questions, or otherwise get the most out of the "cluster" (and re-threading their own software, maybe this millennium), then it will eventuate into something like Window$ME.

Otherwise, we might be looking at the future of the leased software desktop, served via a cluster that runs your "hired" apps and allows you to see them in a browser on your screen, while they charge for the service by the hour. The market is already there for that service and there is money in it, although heavily offset by massive investment in infrastructure, licensing and expertise.

I guess I am saying that this ain't for the masses.

Thanks for the post Nflight. I know I am coming down hard on their sales hype, but...I AM coming down hard on their sales hype. However, I will definitely be looking into this for my boss and our architect, but I expect that the tie-ins are going to amount to a Corporate Enterprise class Budget :D

An important distinction is that clusters generally require the problem solution to be re-coded for a cluster implementation to utilise the shared memory & disk space features, whereas grids lend themselves directly to problems which can be resolved by the Distributed Computing methodology.

In the meantime, if you are interested in "farming" some of your own machines, try your hand at LTSP for Windows and Linux and you can build your own grid, here, now and (in many cases) free.

LTSP was developed mainly for ASP-type arrangements, but you can configure it so that the OS and the apps are downloaded from the central server and run locally on the client machines/workstations. Effectively, through some foxy configuration, you can craft an environment that amounts to a powerful grid, yet which is all served from a single machine - which is a DREAM for development and administration.

If clusters still make you drool, Solaris10 is also now free and has a well developed platform that has been doing this for years.

As far as I know there's at least half a dozen or more clustering interfaces to choose from if you look at Linux.

Heck! Even Webmin built in a cluster management interface (although I get the impression this should be called a Grid Management Interface) as a standard feature. If I get a couple more machines I'll look into that some more and give that a spin too.

It's not so much that I don't like M$ (and I don't) it is just that there are so many cheaper working alternatives available already. They want more pie.

AMDave
03-08-2006, 10:21 AM
Goodness, I managed all that without once mentioning Apple :!:

Nflight
03-09-2006, 10:33 AM
I found this information about clustering quite easy on the eye when it comes to understanding clustering. Thanks to Blackheath for this.

http://manuals.fujitsu-siemens.com/softbooks/software/us/clus_us.htm

Strongbow
03-09-2006, 10:52 AM
Nflight,

That is about my company's PRIMECLUSTER suite, which does load-balanced and failover clustering. It competes with the likes of Veritas Cluster Server.

If you want info on compute clusters then have a look at our HPCline which is at http://www.hpcline.com/

They split away from my company last year but we still manufacture the kit for them.

Have a read of this whitepaper on SynfiniWay as well...
http://www.fujitsu.com/downloads/EU/uk/whitepapers/hpc-synfiniway.pdf

and a paper on using 10G Eth in HPC clusters...
http://www.fcpa.fujitsu.com/download/download/ethernet-switches/Intel_Fujitsu_WP_FNL.pdf


Or if money is no object then you could go for the ultimate, which is 128 crossbar-connected HPC2500s, each with 128x SPARC64 2.08GHz/4MB SLC and 512GB RAM. That's 16384 SPARC 2.08GHz processors running in parallel. I haven't sold many of these, mind you! :oops:

Nflight
03-09-2006, 11:00 AM
Blackheath, I wanted to remind you that I play the lottery, and if I win, my main purchase will not be a fancy car, a boat or even wild women. I want it all in computing power, so I can live in a world of crunching for the sake of progress: solving the numerous problems posed by big business and data mining for medical cures!

Your posting just adds to my desires for the future. Whether practical or not, I love computing and pushing my systems to their maximum potential.

Normal people would never find the life of us AMDuser.com citizens normal, when we discuss for hours our pursuit of faster and greater crunching capability.

Be warned: if I do win, you're getting a PM about your future sale as fast as I can access the net. :)

Strongbow
03-09-2006, 11:03 AM
LOL :lol:

I know exactly where you're coming from Nflight!

...although I would spare a bit for the wild women! ;)

AMDave
03-09-2006, 11:45 AM
...and Air Conditioners Nflight.
hey everyone, if you catch Nflight buying more lottery tickets let me know so I can plunder the share market :lol:

Excellent brain food on those links. Thanks lads. I have to admit that I have always associated Primergy with data centres rather than compute clusters because of the ads in the Oracle magazine. It seems there has been a lot more development in commercialising compute clusters than I have been aware of.

blackheath, do you ever lend those things out for a "free trial" ?
(worth asking I think - hehe :twisted: ).

Strongbow
03-09-2006, 12:27 PM
AMDave,

Firstly, if you have an account with us then sure we can loan you the equipment on eval terms, else buy a lottery ticket! ;)

In answer to your original statement then you should check out the Enterprise Grid Alliance consortium site at http://www.gridalliance.org/. I know quite a lot of the individual members and although it is moving quite slowly they seem to be going in the right direction for enterprise grids.

Here's quite a good whitepaper on reference model guidelines for GRIDs in the enterprise... http://www.gridalliance.org/en/documents/TWGDocs/05198r01EGA_RefMod-EGA%20Reference%20Model%20v1.0%20_English.pdf

Nflight
03-09-2006, 06:22 PM
I don't speak or read German, but this really is universal! The dimensions of my ultimate desired machine. Oh Yes I will need a lot of Air Conditioners (can you say 54 Degrees Fahrenheit year round).

Eigenschaft (property): Beschreibung (description)
Mainboard (motherboard): 1x Tyan 4881 (Dual Core ready); 1x M4881
Prozessor (processor): 8x Dual AMD Opteron 800
Max. Taktfrequenz (max. clock speed): 2.60 GHz / 2.20 GHz (Dual Core)
Front-Side Bus: Hypertransport
Max. Speicher (max. memory): 128 GB ECC, reg. DDR333/DDR400
Chipsatz (chipset): nVidia nForce K08-04 Pro / AMD 8131
Onboard Controller:
Seriell (serial): 1x 9-pin UART
Parallel: ---
USB: USB 2.0, 2x front, 2x rear
LAN: 1x Broadcom BCM5704 dual channel, 10/100/1000 Mbps
SATA: 2x dual-Port SATA II
Grafik (graphics): ATI RAGE XL 8MB PCI
Steckplätze (expansion slots): 2x PCI-X x16 slot one with x4 signal; 2 independent 64-bit PCI-X busses; 1x 133 MHz max. 3.3 V PCI-X slot from BrdgB; 2x 100 MHz max. 3.3V PCI-X slots from BrdgA
Laufwerkschächte (drive bays): up to 8x SATA; optional CD-ROM, DVD, Floppy
Abmessungen (HxBxT) (dimensions, HxWxD): 220 x 425 x 680 (mm)
Elektrische Anschlußwerte (power rating): 1300 W

Strongbow
03-10-2006, 05:59 AM
I know that machine, no fancy case just a metal chassis with vents built solely for compute performance in an HPC environment.

That's the one most European & F1 car manufacturers use as a physics engine for their crash modelling. Although all of them do have quite a few (well several actually) of them linked together in an HPC cluster.

For some reason enterprises like to buy servers which are aesthetically pleasing for their data centers whilst the real number crunchers would rather spend money on the raw performance. :roll:

NeoGen
03-10-2006, 12:29 PM
I don't care about the looks of the machines really...
If I were to buy a monster cruncher I would ask "What's the ugliest, meanest, monster cruncher you got there for sale??" :lol:

Strongbow
03-10-2006, 01:49 PM
Then can I interest you in one of these then sir...

http://www.marfir.rhcs.de/witze/liquid_cooled.jpg

:lol:

NeoGen
03-10-2006, 02:38 PM
That's similar to the kitchen oil cooled machine that was shown around the net sometime ago... but that's a dual! :lol:
I can't see the details on the monitor, but I bet it's running overclocked and would crunch awesome! :)

But nah... you wouldn't interest me on that particular one. Not because it looks bad, but because it takes waaay too much space! :roll: