Sponsored by:

Visit AMD Visit Supermicro

Performance Intensive Computing

Capture the full potential of IT

Supermicro H13 Servers Maximize Your High-Performance Data Center

Featured content

Supermicro H13 Servers Maximize Your High-Performance Data Center

Learn More about this topic
  • Applications:
  • Featured Technologies:
  • Featured Companies:
  • AMD

The modern data center must be both highly performant and energy efficient. Massive amounts of data are generated at the edge and then analyzed in the data center. New CPU technologies are constantly being developed that can analyze data, determine the best course of action, and speed up the time to understand the world around us and make better decisions.

With the digital transformation continuing, a wide range of data acquisition, storage and computing systems continue to evolve with each generation of  a CPU. The latest CPU generations continue to innovate within their core computational units and in the technology to communicate with memory, storage devices, networking and accelerators.

Servers and, by default, the CPUs within those servers, form a continuum of computing and I/O power. The combination of cores, clock rates, memory access, path width and performance contribute to specific servers for workloads. In addition, the server that houses the CPUs may take different form factors and be used when the environment where the server is placed has airflow or power restrictions. The key for a server manufacturer to be able to address a wide range of applications is to use a building block approach to designing new systems. In this way, a range of systems can be simultaneously released in many form factors, each tailored to the operating environment.

The new H13 Supermicro product line, based on 4th Generation AMD EPYC™ CPUs, supports a broad spectrum of workloads and excels at helping a business achieve its goals.

Get speeds, feeds and other specs on Supermicro’s latest line-up of servers

Featured videos


Follow


Related Content

Manage Your HPC Resources with Supermicro's SuperCloud Composer

Featured content

Manage Your HPC Resources with Supermicro's SuperCloud Composer

Learn More about this topic
  • Applications:
  • Featured Technologies:
  • Featured Companies:
  • GigaIO

Today’s data center has numerous challenges: provisioning hardware and cloud workloads, balancing the needs of performance-intensive applications across compute, storage and network resources, and having a consistent monitoring and analytics framework to feed intelligent systems management. Plus, you may have the need to deploy or re-deploy all these resources as needs shift, moment to moment.

Supermicro has created its own tool to assist with these decisions to monitor and manage this broad IT portfolio, called the SuperCloud Composer (SCC). It combines a standardized web-based interface using an Open Distributed Infrastructure Management interface with a unified dashboard based on the RedFish message bus and service agents.

SCC can track the various resources and assign them to different pools with its own predictive analytics and telemetry. It delivers a single intelligent management solution that covers both existing on-premises IT equipment as well as a more software-defined cloud collection. Additional details can be found in this SuperCloud Composer white paper.

SuperCloud Composer makes the use of a cluster-level PCIe network using the FabreX software from GigaIO Networks. It has the capability to flexibly scale up and out storage systems while using the lowest latency paths available.

It also supports Weka.IO cluster members, which can be deployed across multiple systems simultaneously. See our story The Perfect Combination: The Weka Next-Gen File System, Supermicro A+ Servers and AMD EPYC™ CPUs.

SCC can create automated installation playbooks in Ansible, including a software boot image repository that can quickly deploy new images across the server infrastructure. It has a fast-deploy feature that allows a new image to be deployed within seconds.

SuperCloud Composer offers a robust analytics engine that collects historical and up-to-date analytics stored in an indexed database within its framework. This data can produce a variety of charts, graphs and tables so that users can better visualize what is happening with their server resources. Each end-user is provided with analytic capable charting represented by IOPS, network, telemetry, thermal, power, composed node status, storage allocation and system status.

Last but not least, SCC also has both network provisioning and storage fabric provisioning features where build plans are pushed to data or fabric switches either as single-threaded or multithreaded operations, such that multiple switches can be updated simultaneously by shared or unique build plan templates.

For more information, watch this short SCC explainer video. Or schedule an online demo of SCC and request a free 90-day trial of the software.

Featured videos


Follow


Related Content

Supermicro Debuts New H13 Server Solutions Using AMD’s 4th-Gen EPYC™ CPUs

Featured content

Supermicro Debuts New H13 Server Solutions Using AMD’s 4th-Gen EPYC™ CPUs

Learn More about this topic
  • Applications:
  • Featured Technologies:

Last week, Supermicro announced its new H13 A+ server solutions, featuring the latest fourth-generation AMD EPYC™ processors. The new AMD “Genoa”-class Supermicro A+ configurations will be able to handle up to 96 Zen4 CPU cores running up to 6TB of 12-channel DDR5 memory, using a separate channel for each stick of memory.

The various systems are designed to support the highest performance-intensive computing workloads over a wide range of storage, networking and I/O configuration options. They also feature tool-less chassis and hot-swappable modules for easier access to internal parts as well as I/O drive trays on both front and rear panels. All the new equipment can handle a range of power conditions, including 120 to 480 AC volt operation and 48 DC power attachments.

The new H13 systems have been optimized for AI, machine learning and complex calculation tasks for data analytics and other kinds of HPC applications. Supermicro’s 4th-Gen AMD EPYC™ systems employ the latest PCIe 5.0 connectivity throughout their layouts to speed data flows and provide high network and cluster internetworking performance. At the heart of these systems is the AMD EPYC™ 9004 series CPUs, which were also announced last week.

The Supermicro H13 GrandTwin® systems can handle up to six SATA3 or NVMe drive bays, which are hot-pluggable. The H13 CloudDC systems come in 1U and 2U chassis that are designed for cloud-based workloads and data centers that can handle up to 12 hot-swappable drive bays and support the Open Compute Platform I/O modules. Supermicro has also announced its H13 Hyper configuration for dual-socketed systems. All of the twin-socket server configurations support 160 PCIe 5.0 data lanes.

There are several GPU-intensive configurations for another series of both 4U and 8U sized servers that can support up to 10 GPU PCIe accelerator cards, including the latest graphic processors from AMD and Nvidia. The 4U family of servers support both AMD Infinity Fabric Link and NVIDIA NVLink Bridge technologies so users can choose the right balance of computation, acceleration, I/O and local storage specifications.

To get a deep dive on H13 products, including speeds, feeds and specs, download this whitepaper from the Supermicro site: Supermicro H13 Servers Enable High-Performance Data Centers.

Featured videos


Follow


Related Content

AMD Announces Fourth-Generation EPYC™ CPUs with the 9004 Series Processors

Featured content

AMD Announces Fourth-Generation EPYC™ CPUs with the 9004 Series Processors

AMD announces its fourth-generation EPYC™ CPUs. The new EPYC 9004 Series processors demonstrate advances in hybrid, multi-die architecture by decoupling core and I/O processes. Part 1 of 4.

Learn More about this topic
  • Applications:
  • Featured Technologies:
AMD very recently announced its fourth-generation EPYC™ CPUs.This generation will provide innovative solutions that can satisfy the most demanding performance-intensive computing requirements for cloud computing, AI and highly parallelized data analytic applications. The design decisions AMD made on this processor generation strirke a good balance among specificaitons, including higher CPU power and I/O performance, latency reductions and improvements in overall data throughput. This lets a single CPU socket address an increasingly larger world of complex workloads. 
 
The new AMD EPYC™ 9004 Series processors demonstrate advances in hybrid, multi-die architecture by decoupling core and I/O processes. The new chip dies support 12 DDR5 memory channels, doubling the I/O throughput of previous generations. The new CPUs also increase core counts from 64 cores in the previous EPYC 7003 chips to 96 cores in the new chips using 5-nanometer processes. The new generation of chips also increases the maximum memory capacity from 4TB of DDR4-3200 to 6TB of DDR5-4800 memory.
 
 
 
There are three major innovations evident in the AMD EPYC™ 9004 processor series:
  1. A  new hybrid multi-die chip architecture coupled with multi-processor server innovations and a new and more advanced Zen 4 instruction set along with support for an increase in dedicated L2 and shared L3 cache storage
  2. Security enhancements to AMD’s Infinity Guard
  3. Advances to system-on-chip designs that extend and enhance AMD Infinity switching fabric technology,
Taken together, the new AMD EPYC™ 9004 series processors can offer plenty of innovation and performance advantage. The new processors offer better performance per watt of power consumed and better per core performance, too.
 

Featured videos


Follow


Related Content

Are Your App Workloads Running in Parallel?

Featured content

Are Your App Workloads Running in Parallel?

Learn More about this topic
  • Applications:

To be effective at delivering performance-intensive applications, it pays to split up your workloads and run them simultaneously, a.k.a., in parallel. In the past, we didn’t really think about the resources required to run workloads, because many business computers were all-purpose machines. There was also a tendency to run loads serially to avoid bogging down due to heavy CPU utilization, heavy I/0 and so on.

 

But computers have become much more capable of late. What were once thought of as “desktop” computers have approached the arena once occupied by minicomputers and mainframes. Like the larger systems, they serve multiple concurrent users and higher-demanding applications. As a result, we need to think more carefully about how their various components – processor, memory, storage and network connections – interact, find and eliminate the bottlenecks between these components to make them useful for higher-end workloads.
 

Straighten out Bottlenecks


One way to eliminate bottlenecks is to break your apps into smaller, more digestible pieces that can run concurrently. As the new processors employ more cores and more sophisticated components, this means that more of your code can be consumed by the entire CPU package. This is the inherent nature of parallel processing, and why the world’s fastest supercomputers now routinely span thousands (and some in the millions) of cores.


A company called Weka has developed a file system designed to provide higher-speed data ingestion and more appropriate for machine learning and advanced mathematical modeling applications. Understanding the particular type of data storage – whether it is a parallel file system such as Weka, more scratch space for computations or better backups – can make a big difference in overall performance.


But it is also important how your apps work across the network. Is there a lot of back-and-forth between clients and servers, or sending a small chunk of data and waiting for a reply? This introduces a lot of downtime for the app, and these “wait states” should be identified and potentially eliminated.
 

Offload Workloads


Does your application do a lot of calculation? As discussed in an earlier story appearing on Performance-Intensive Computing, complementary processors, such as co-processors and GPUs, can be a big performance boost so long the processor can move on to its next task, working in parallel, instead of waiting for data returned from the offloaded computation.

 

Working in parallel can be a challenge when your apps frequently pause to wait for data from another process or are highly monolithic designed to run in a serial fashion. Such apps may be challenging to rewrite to take advantage cloud native or parallel operations. At some point, you are going to have to make that break and put in the programming effort to modernize your apps, but only you or your company can decide when it’s right to do that.

 

But if you can modify your workloads for this parallel structure and your hardware was designed to support it, you will see big benefits.

Featured videos


Follow


Related Content

Perspective: Looking Back on the Rise of Supercomputing

Featured content

Perspective: Looking Back on the Rise of Supercomputing

Learn More about this topic
  • Applications:
  • Featured Technologies:

We’ve come a long way on the development of high performance computing. Back in 2004, I attended an event held in the gym at the University of San Francisco. The goal was to crowd-source computing power by connecting the PCs of volunteers who were participating in the first “Flash Mob Computing” cluster computing event. Several hundred PCs were networked together in the hope that they would create one of the largest supercomputers, albeit for a few hours.

 

I brought two laptops for the cause. The participation rules stated that the  data on our hard drives would remain intact. Each computer would run a specially crafted boot CD that ran a benchmark called Linpack, a software library for performing numerical linear algebra running on Linux. It was used to measure the collective computing power.

 

The event attracted people with water-cooled overclocked PCs, naked PCs (no cases, just the boards and other components) and custom-made rigs with fancy cases. After a few hours, we had roughly 650 PCs on the floor of the gym. Each PC was connected to a bunch of Foundry BigIron super-switches that were located around the room.

 

The 2004 experiment brought out several industry luminaries, such as Gordon Bell, who was the father of the Digital Equipment Corporation VAX minicomputer, and Jim Gray, who was one of the original designers behind the TPC benchmark while he was at Tandem. Both men at the time were Microsoft fellows. Bell was carrying his own laptop but had forgotten to bring his CD drive, so he couldn’t connect to the mob.

 

Network shortcomings

 

What was most interesting to me, and what gave rise to the mob’s eventual undoing, were the networking issues involved with assembling and running such a huge collection of gear. The mob used ordinary 100BaseT Ethernet, which was a double-edged sword. While easy to set up, it was difficult to debug when network problems arose. The Linpack benchmark requires all the component machines to be running concurrently during the test, and the organizers had trouble getting all 600-plus PCs to operate online flawlessly. The best benchmark accomplished was a peak rate of 180 gigaflops using 256 computers, but that wasn’t an official score as one node failed during the test.

 

To give you an idea of where this stood in terms of overall supercomputing prowess, it was better than the Cray supercomputers of the early 1990s, which delivered around 16 gigaflops.If you lo

 

At the website top500.org (which tracks the fastest supercomputers around the globe), you can see that all the current top 500 machines are measured in petaflops (1 million gigaflops). The Oak Ridge National Laboratory’s Frontier machine, which has occupied the number one spot this year, weighs in at more than 1,000 petaflops and uses 8 million cores. To make the fastest 500 list back in 2004, the mob would have had to achieve a benchmark of over 600 gigaflops. Because of the networking problems, we’ll never know for sure.Still, it was an impressive achievement, given the motley mix of machines. All of the world’s top 500 supercomputers are custom built and carefully curated and assembled to attain that level of computing performance.

 

Another historical note: back in 2004, one of the more interesting entries came in third on the top500.org list: a collection of several thousand Apple Macintoshes running at Virginia Polytechnic University. Back in the present, as you might imagine, almost all the fastest 500 supercomputers are based on a combination of CPU and GPU chip architectures.

 

Today, you can buy your own supercomputer on the retail market, such as the Supermicro SuperBlade® models. And of course, you can routinely run much faster networking protocols than 100-megabit Ethernet.

Featured videos


Follow


Related Content

Register to Watch Supermicro's Sweeping A+ Launch Event on Nov. 10

Featured content

Register to Watch Supermicro's Sweeping A+ Launch Event on Nov. 10

Join Supermicro online Nov. 10th to watch the unveiling of the company’s new A+ systems -- featuring next-generation AMD EPYC™ processors. They can't tell us any more right now. But you can register for a link to the event by scrolling down and signing-up on this page.
Learn More about this topic
  • Applications:
  • Featured Technologies:

Featured videos


Follow


Related Content

The Perfect Combination: The Weka Next-Gen File System, Supermicro A+ Servers and AMD EPYC™ CPUs

Featured content

The Perfect Combination: The Weka Next-Gen File System, Supermicro A+ Servers and AMD EPYC™ CPUs

Weka’s file system, WekaFS, unifies your entire data lake into a shared global namespace where you can more easily access and manage trillions of files stored in multiple locations from one directory.

Learn More about this topic
  • Applications:
  • Featured Technologies:
  • Featured Companies:
  • Weka.io

One of the challenges of building machine learning (ML) models is managing data. Your infrastructure must be able to process very large data sets rapidly as well as ingest both structured and unstructured data from a wide variety of sources.

 

That kind of data is typically generated in performance-intensive computing areas like GPU-accelerated applications, structural biology and digital simulations. Such applications typically have three problems: how to efficiently fill a data pipeline, how to easily integrate data across systems and how to manage rapid changes in data storage requirements. That’s where Weka.io comes into play, providing higher-speed data ingestion and avoiding unnecessary copies of your data while making it available across the entire ML modeling space.

 

Weka’s file system, WekaFS, has been developed just for this purpose. It unifies your entire data lake into a shared global namespace where you can more easily access and manage trillions of files stored in multiple locations from one directory. It works across both on-premises and cloud storage repositories and is optimized for cloud-intensive storage so that it will provide the lowest possible network latencies and highest performance.

 

This next-generation data storage file system has several other advantages: it is easy to deploy, entirely software-based, plus it is a storage solution that provides all-flash level performance, NAS simplicity and manageability, cloud scalability and breakthrough economics. It was designed to run on any standard x86-based server hardware and commodity SSDs or run natively in the public cloud, such as AWS.

 

Weka’s file system is designed to scale to hundreds of petabytes, thousands of compute instances and billions of files. Read and write latency for file operations against active data is as low as 200 microseconds in some instances.

 

Supermicro has produced its own NVMe Reference Architecture that supports WekaFS on some of its servers, including the Supermicro A+ AS-1114S-WN10RT and AS-2114S-WN24RT using the AMD EPYC™ 7402P processors with at least 2TB of memory, expandable to 4TB. Both servers support hot-swappable NVMe storage modules for ultimate performance. Also check out the Supermicro WekaFS A/I and HPC Solution Bundle.

 

 

Featured videos


Follow


Related Content

Eliovp Increases Blockchain-Based App Performance with Supermicro Servers

Featured content

Eliovp Increases Blockchain-Based App Performance with Supermicro Servers

Eliovp, which brings together computing and storage solutions for blockchain workloads, rewrote its code to take full advantage of AMD’s Instinct MI100 and MI250 GPUs. As a result, Eliovp’s blockchain calculations run up to 35% faster than what it saw on previous generations of its servers.

Learn More about this topic
  • Applications:
  • Featured Technologies:
  • Featured Companies:
  • Eliovp

When you’re building blockchain-based applications, you typically need a lot of computing and storage horsepower. This is the niche that Belgium-based Eliovp fills. They have developed a line of extremely fast cloud-based servers designed to run demanding blockchain workloads.

 

Eliovp has been recognized as the top Filecoin storage provider in Europe. This refers to a decentralized blockchain-based protocol that lets anyone rent spare local storage and is a key Web3 component.

 

To satisfy the compute  and storage needs, Eliiovp employs Supermicro’s A+ AS-1124US® and AS-4124GS® servers, running quad-core AMD EPYC 7543 and 7313 CPUs and as many as 8 AMD Instinct MI100 and MI250 GPUs to further boost performance.

 

What makes these servers especially potent is that Eliovp rewrote its code to run on this specific AMD Instinct GPU family. As a result, Eliovp’s blockchain calculations run up to 35% faster than what it saw on previous generations of its servers.

 

One of the attractions of the Supermicro servers is the capability to leverage the high-density core count and higher clock speeds as well as the 32 memory slots. And it comes packaged in a relatively small form factor.

 

“By working with Supermicro, we get new generations of servers with AMD technology earlier in our development cycle, enabling us to bring our products to market faster," said Elio Van Puyvelde, CEO of Eliovp. The company was able to take advantage of new CPU and GPU instructions and memory management to make its code more efficient and effective. Eliovp was also able to reduce overall server power consumption, which is always important in blockchain applications that span dozens of machines.

 

Featured videos


Follow


Related Content

Microsoft Azure’s More Capable Compute Instances Take Advantage of the Latest AMD EPYC™ Processors

Featured content

Microsoft Azure’s More Capable Compute Instances Take Advantage of the Latest AMD EPYC™ Processors

Azure HBv3 series virtual machines (VMs) are optimized for HPC applications, such as fluid dynamics, explicit and implicit finite element analysis, weather modeling, seismic processing, and various simulation tasks. HBv3 VMs feature up to 120 Third-Generation AMD EPYC™ 7v73X-series CPU cores with more than 450 GB of RAM.

Learn More about this topic
  • Applications:
  • Featured Technologies:
  • Featured Companies:
  • Azure

Increasing demands for higher-performance computing mean that the cloud-based computing needs to ratchet up its performance too. Microsoft Azure has introduced more capable compute virtual machines (VMs) that take advantage of the latest from AMD EPYC™ processors. This means that developers can easily spin up VMs that normally cost thousands of dollars if they were to purchase their physical equivalents.

 

This story's focus is on two of Azure's series: HBv3 and NVv4. In most cases, a single virtual machine is used to take advantage of all its resources. High-performance examples of Azure HBv3 series VMs are optimized for HPC applications, such as fluid dynamics, explicit and implicit finite element analysis, weather modeling, seismic processing, and various simulation tasks. HBv3 VMs feature up to 120 Third-Generation AMD EPYC™ 7v73X-series CPU cores with more than 450 GB of RAM. This series of VMs has processor clock frequencies up to 3.5GHz. All HBv3-series VMs feature 200Gb/sec HDR InfiniBand switches to enable supercomputer-scale HPC workloads. The VMs are connected and optimized to deliver the most consistent performance. Get more information about AMD EPYC and Microsoft Azure virtual machines.

 

A Dutch construction company, TBI, is using the Azure NVv4 to run computer-aided design and building modeling tasks on a series of virtual Windows desktops. The NVv4 VMs are only available running Windows powered by from four to 32 AMD EPYC™ vCPUs and offering a partial to full AMD Instinct™ M125 GPU with memory ranging from 2GB to 17GB. Previous generations of NV instances used Intel CPUs and NVIDIA GPUs that offer less performance.

 

TBI chose this solution because it was cheaper, easier to support and keep its software collection updated. Using virtual desktops meant that no client data was stored on any laptops, making things more secure. Also, these instances delivered equivalent performance, taking advantage of the SR-IOV technology.

 

Supermicro offers a wide range of servers that incorporate the AMD EPYC™ CPU and a number of servers optimized for applications that use GPUs. These servers range from 1U rackmount servers to high end 4U GPU optimized systems. Whether you’re using it on-prem or you’re building your own cloud, Supermicro’s Aplus servers are optimized for performance and technical computing applications and they run Azure and other systems well. Get more information about Supermicro servers with AMD’s EPYC™ CPUs.

Featured videos


Follow


Related Content

Pages