Guest blog post by Bryan R. Cote, Sr. Product Manager, Terascala
If only IT resources were like the golf course. In golf, if the group ahead of you is playing too slowly, you can ask to “play through.” In other words: would you hold your play so we can get ahead of you? In the IT world, with contention for premium resources such as high-performance storage and compute at an all-time high, it would be nice to ask the co-workers running that big analysis job to take a break and let you play through.
Unfortunately, in the world of High Performance Computing, the time it takes to move data in and out of the cluster environment means that the cost of stopping and restarting jobs would only make the situation worse for the organization as a whole. The key to getting the most out of your premium computing resources is to manage the resources properly and ensure the necessary data and resources are available where and when you need them.
The concept of job schedulers and workload managers has been around since the earliest timesharing systems. In today’s large-scale compute environments, however, a simple tool that kicks off a job in the middle of the night is no longer enough; we need the ability to orchestrate not only job execution but also the transfer of data into and out of the cluster, efficiently and on time.
Much as computer operators of the past would wait for jobs to finish before unmounting one disk pack and mounting the next (yes kids, it was a physical chore in the old days), much of today’s data movement still happens manually or through complex scripts. With these methods, problems often arise that affect when the data is available, which in turn can delay a job’s start by minutes, hours, and sometimes even days. An idle cluster is an expensive waste that translates into lost productivity and budget. And even when things go right, you may not be taking full advantage of the bandwidth and capabilities of your network and storage. Translation: more lost productivity.
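The kind of manual staging described above often looks something like the sketch below: copy the input set to cluster-local scratch, run the job, copy the results back. Every path, file name, and the `wc`-based “analysis” step are hypothetical stand-ins, not any real site’s configuration; the point is that the compute resource waits on each copy, and a single failed step can stall the whole chain.

```shell
#!/bin/sh
# Hypothetical manual staging workflow. All paths are placeholders created
# under a temp directory so the sketch is self-contained and runnable.
set -e                      # abort on the first failure

WORK=$(mktemp -d)
SRC="$WORK/shared"          # stands in for shared project storage
SCRATCH="$WORK/scratch"     # stands in for fast cluster-local scratch

# Fake input data set (stand-in for the real files).
mkdir -p "$SRC" "$SCRATCH/in" "$SCRATCH/out"
printf 'sample input\n' > "$SRC/input.dat"

# 1. Stage in: the compute nodes sit idle while this copy runs.
cp "$SRC"/*.dat "$SCRATCH/in/"

# 2. Run the "job" against the staged copy (stand-in for the analysis).
wc -l < "$SCRATCH/in/input.dat" | tr -d ' ' > "$SCRATCH/out/results.txt"

# 3. Stage out: more wall-clock time before anyone sees results.
cp "$SCRATCH/out/results.txt" "$SRC/"

echo "staging complete"
```

If the copy in step 1 fails at 2 a.m., nothing notices until a person does, which is exactly the gap that the orchestration tools discussed below aim to close.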
Fortunately, new classes of tools are emerging to address this need. Sophisticated workload managers such as Grid Engine, Moab, Torque and others are tackling the problem on the compute side. On the storage side, products such as Terascala’s Intelligent Storage Bridge (ISB) help customers ensure that the necessary data sets are available to their jobs when and where they need them, as reliably as possible. By minimizing human intervention and ensuring high availability, the ISB keeps the slightest interruption from becoming a big delay. And best of all, tools like the ISB make sure you get the most out of advanced HPC platforms like NetApp’s HPS Rack, moving data quickly in and out and taking advantage of every drop of throughput.
So the next time you’re waiting to get access to your cluster, take a break. Go play a round of golf. Then call NetApp and tell them you’re tired of asking to “Play Through!”