Sponsored by:

Visit AMD Visit Supermicro

Performance Intensive Computing

Capture the full potential of IT

Which media server should you use when you absolutely can’t lose data?

A new Linus Tech Tips video shows a real-world implementation of Supermicro storage servers powered by AMD EPYC processors to provide super-high reliability.

Are your customers looking for a top-performing media server? And are you looking for a surprisingly entertaining video review of the best one? Then look no further. You’ll find both in the latest Linus Tech Tips video.

This episode, sponsored by Supermicro, is entitled “This Server CANNOT Lose Data.” That gives you an idea of its primary focus: high reliability.

And that reliability is delivered courtesy of a sophisticated server/storage cluster featuring Supermicro GrandTwin A+ multinode servers.

Myriad redundancies

What makes the GrandTwin so reliable? Redundancy. As video host Linus Sebastian exclaims, “Inside this 2U are 4 independent computers!”

Each computer, or node, is powered by a 2.45GHz AMD EPYC processor with up to 128 cores and a 256MB L3 cache. Each node also has 4 front hot-swap 2.5-inch drive bays that can hold petabytes of either NVMe or SATA storage.

The GrandTwin’s nodes can handle up to 3TB of DDR5 ECC server memory. They also have dual M.2 slots for boot drives and 6 PCIe Gen 5 x16 slots for networking, graphics and other expansion cards.

GrandTwin’s high-availability design extends all the way down to its dual power supplies. To ensure the system always has a reliable flow of power to all its vital components, Supermicro added two redundant 2200-watt titanium-level PSUs.

Handling the heat generated by this monster machine is paramount. The GrandTwin takes care of all that hot air via 4 high-speed fans: one fan in each PSU, plus 2 dedicated heavy-duty 8-cm fans spinning at more than 17,000 RPM.

Prime processing

At the core of each of the GrandTwin’s 4 nodes is an AMD EPYC 9004 Series processor. Linus’ prized media server, known as “Whonnock 10,” sports an AMD EPYC 9534 CPU in each node.

The EPYC 9534’s cores—there are 64 of them—operate at 2.45GHz and can boost up to 3.7GHz. And because each EPYC processor boasts 12 memory channels, the GrandTwin can address up to 12TB of memory systemwide.
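
For readers who want to check the memory math, here is a rough sketch. The 12 channels and 4 nodes come from the article; the 256 GB DIMM size and the one-DIMM-per-channel layout are assumptions chosen only because they reproduce the 3TB-per-node figure quoted earlier.

```python
# Figures taken from the article.
CHANNELS_PER_CPU = 12
NODES_PER_CHASSIS = 4

# Assumption (not from the article): one 256 GB DIMM in each memory channel.
DIMM_GB = 256


def node_memory_tb(channels: int = CHANNELS_PER_CPU, dimm_gb: int = DIMM_GB) -> float:
    """Memory for one single-socket node, in TB (1 TB = 1024 GB)."""
    return channels * dimm_gb / 1024


def chassis_memory_tb(nodes: int = NODES_PER_CHASSIS) -> float:
    """Memory across all nodes in one chassis, in TB."""
    return nodes * node_memory_tb()


print(node_memory_tb())     # 3.0
print(chassis_memory_tb())  # 12.0
```

Under those assumptions, each node tops out at 3TB and the 4-node chassis at 12TB, which matches the numbers in the text.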

Don’t call it overkill

As Linus says with unbridled enthusiasm, when it comes to redundancy, the name of the game is avoiding “split brain.”

The dreaded split brain can occur when redundant servers have their own object storage. The failure of even a single system can lead to a situation in which each server believes it has the correct data.

If there are only 2 servers, proving which system is correct is impossible. On the other hand, operating 3 or more servers allows the system to resolve the argument and determine the correct data.

Linus and company installed 2 GrandTwin A+ servers. That gives them the 8 redundant systems recommended by their preferred NVMe file system, WEKA.
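
To see why 2 servers can deadlock while 3 or more can settle the argument, consider this minimal majority-vote sketch. It is a generic illustration of quorum logic, not a description of how WEKA resolves conflicts, and the function name and version labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def resolve_by_quorum(reports: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of nodes, or None if there is none."""
    if not reports:
        return None
    value, votes = Counter(reports).most_common(1)[0]
    # A strict majority needs more than half of all nodes to agree.
    return value if votes > len(reports) // 2 else None


# 2 nodes that disagree: neither side has a majority, so nothing can be decided.
print(resolve_by_quorum(["v1", "v2"]))         # None

# 3 nodes: two agree, so the majority value wins.
print(resolve_by_quorum(["v2", "v1", "v2"]))   # v2

# 8 nodes, one of which holds a stale copy.
print(resolve_by_quorum(["v2"] * 7 + ["v1"]))  # v2
```

With an even split, such as 2 nodes against 2, the function still returns None, which is one reason larger clusters tolerate failures more gracefully.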

A multitude of use cases

Your customers may have to contend with thousands of hours of high-resolution videos, like Linus and his cohorts. Or they may develop AI-enabled applications, provide cloud gaming, or host mission-critical web applications.

Whatever the use case, they can benefit from high-reliability servers designed with built-in redundancies. When failure is not an option, your customers need a server that, as the video puts it, “CANNOT lose data.”

That means helping your customers deploy Supermicro GrandTwin A+ servers powered by AMD EPYC processors. It’s the ultimate high-reliability system.

After all, as Linus says, “You only server once.”

How CSPs can accelerate the data center

A new webinar, now available on demand, offers cloud service providers an overview of new IDC research, outlines roadblocks, and offers guidelines for future success.

Are you a cloud services provider—or a CSP wannabe—wondering how to expand your data center in ways that will both keep your customers happy and help you turn a profit?

If so, a recent webinar sponsored by Supermicro and AMD can help. Entitled Accelerate Your Cloud: Best Practices for CSPs, it was moderated by Wendell Wenjen, director of storage market development at Supermicro. Best of all, you can now view this webinar on demand.

Here’s a taste of what you’ll see:

IDC research on CSP buying plans

The webinar’s first speaker is Ashish Nadkarni, group VP and GM of worldwide infrastructure research at IDC. He summarizes new IDC research on technology adoption trends and strategies among service providers.

Sales growth, IDC says, is coming mainly in 4 areas: Infrastructure as a Service (IaaS), hardware (both servers and storage), software and IT services. The good news, Nadkarni adds, is that all 4 can be offered by service providers.

Data centers remain important, Nadkarni says. Not everyone wants to use the public cloud, and not every workload belongs there.

IDC expects that 5 key technologies will be immune to budget cuts:

  • AI and automation
  • Security, risk and compliance
  • Optimization of IT infrastructure and IT operations
  • Back-office applications (HR, SCM and ERP)
  • Customer experience initiatives (for example, chatbots)

Generative AI dominates the conversation, Nadkarni said, and for good reason: IDC expects that this year, GenAI will double the productive use of unstructured data, helping workers discover new insights and knowledge.

Supply-chain issues remain a daunting challenge, IDC finds. Delays can hurt a CSP’s ability to deliver projects, increase the cost of delivering services, and even impair service quality. Owning the supply chain will remain vital.

Other tactics for change, Nadkarni said, include offering a transformation road map; working with a full-stack portfolio provider; and developing a long-term vision for why customers will want to do business with you.

10 steps to data-center scaling

Next up in the webinar is Sim Upadhyayula, VP of solutions enablement at Supermicro. He offered a list of 10 essential steps for scaling a CSP data center.

Topping his list: standardize and scale. There’s no way you can know exactly which workloads will dominate in the future. So be modular. That way, you can scale in smaller increments, keeping customers happy while controlling your costs.

Next on the list: optimize for applications. Unlike big enterprises, most CSPs cannot afford to build application silos. Instead, leading providers will develop an architecture that can cater to all. That means using standard hardware that can later be optimized for specific workloads.

Common challenges

Suresh Andani, AMD’s senior director of product management for server cloud, is up next. He discusses 3 key CSP challenges:

  • Market disruption: Caused by a changing ISV landscape, and by increasing power and cooling costs.
  • Aging infrastructure: Service providers with older systems find them costly to maintain, unable to keep pace with customers’ business demands, and vulnerable to increasingly dangerous security threats.
  • Expanding demands: Customers keep raising the bar on core workloads, including AI, cloud-native applications, digital transformation, the hybrid workforce and security enhancements.

During the webinar’s concluding roundtable discussion, Andani also emphasized the importance of marrying the right infrastructure with your workloads. That way, he said, CSPs can operate efficiently, making the most of their power and compute cycles.

“Work with your vendors to provide the best compute solutions,” Andani of AMD advised. “Later you can offer a targeted infrastructure for high performance compute, another for enterprise workloads, another for gaming, and another for rendering.”

Lean on your providers, he added, to provide the right solution, whether your target is performance or cost.

Tech Explainer: What’s the difference between AI training and AI inference?

AI training and inference may be two sides of the same coin, but their compute needs can be quite different. 

Artificial Intelligence (AI) training and inference are two sides of the same coin. Training is the process of teaching an AI model how to perform a given task. Inference is the AI model in action, drawing its own conclusions without human intervention.

Take a theoretical machine learning (ML) model designed to detect counterfeit one-dollar bills. During the training process, AI engineers would feed the model large data sets containing thousands, or even millions, of pictures, and tell the training application which are real and which are counterfeit.

Then inference could kick in. The AI model could be uploaded to retail locations, then run to detect bogus bills.

A deeper look at training

That’s the high level. Let’s dig in a bit deeper.

Continuing with our bogus-bill detecting workload, during training, the pictures fed to the AI model would include annotations telling the AI how to think about each piece of data.

For instance, the AI might see a picture of a dollar bill with an embedded annotation that essentially tells the model “this is legal tender.” The annotation could also identify characteristics of a genuine dollar, such as the minute details of the printed iconography and the correct number of characters in the bill’s serial number.

Engineers might also feed the AI model pictures of counterfeit bills. That way, the model could learn the tell-tale signs of a fake. These might include examples of incomplete printing, color discrepancies and missing watermarks.

On to inference

Once the training is complete, inference can take over.

Still with our example of counterfeit detection, the AI model could now be uploaded to the cloud, then connected with thousands of point-of-sale (POS) devices in retail locations worldwide.

Retail workers would scan any bill they suspect might be fake. The machine learning model, in turn, would then assess the bill’s legitimacy.

This process of AI inference is autonomous. In other words, once the AI enters inference, it’s no longer getting help from engineers and app developers.

Using our example, during inference the AI system has reached the point where it can reliably discern both legal and counterfeit bills. And it can do so with a high enough success percentage to satisfy its human controllers.
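
The split between the two phases can be sketched in a few lines of code. This toy example uses a nearest-centroid classifier, which is far simpler than anything used in production; the three feature scores (print completeness, color match, watermark present) and all the sample values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Features = Sequence[float]


def train(examples: List[Tuple[Features, str]]) -> Dict[str, List[float]]:
    """Training: learn one average feature vector (centroid) per label."""
    by_label: Dict[str, List[Features]] = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def infer(model: Dict[str, List[float]], features: Features) -> str:
    """Inference: return the label whose centroid is closest to the new sample."""
    def squared_distance(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical annotated training data: (features, label).
training_set = [
    ((0.98, 0.95, 1.0), "genuine"),
    ((0.96, 0.97, 1.0), "genuine"),
    ((0.99, 0.93, 1.0), "genuine"),
    ((0.70, 0.60, 0.0), "counterfeit"),
    ((0.85, 0.75, 0.0), "counterfeit"),
    ((0.60, 0.80, 1.0), "counterfeit"),
]

model = train(training_set)            # done once, by engineers
print(infer(model, (0.97, 0.94, 1.0)))  # genuine
print(infer(model, (0.65, 0.70, 0.0)))  # counterfeit
```

Note that `train` needs the labels while `infer` does not: once the model exists, it classifies new bills on its own.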

Different needs

AI training and inference also have different technology requirements. Basically, training is far more resource-intensive. The focus is on achieving low-latency operation and brute force.

Training a large language model (LLM) chatbot like the popular ChatGPT often forces its underlying technology to contend with more than a trillion parameters. (An AI parameter is a variable learned by the LLM during training. These parameters include configuration settings and components that define the LLM’s behavior.)

To meet these requirements, IT operations must deploy a system that can bring to bear raw computational power in a vast cluster.

By contrast, inference applications have different compute requirements. “Essentially, it’s, ‘I’ve trained my model, now I want to organize it,’” explained AMD executive VP and CTO Mark Papermaster in a recent virtual presentation.

AMD’s dual-processor solution

Inferencing workloads are both more concise and less demanding than those for training. Therefore, it makes sense to run them on more affordable GPU-CPU combination technology like the AMD Instinct MI300A.

The AMD Instinct MI300A is an accelerated processing unit (APU) that combines the facility of a standard AI accelerator with the efficiency of AMD EPYC processors. Both the CPU and GPU elements can share memory, dramatically enhancing efficiency, flexibility and programmability.

A single AMD MI300A APU packs 228 GPU compute units, 24 of AMD’s ‘Zen 4’ CPU cores, and 128GB of unified HBM3 memory. Compared with the previous-generation AMD MI250X accelerators, this translates to approximately 2.6x the workload performance per watt using FP32.

That’s a significant increase in performance. It’s likely to be repeated as AI infrastructure evolves along with the proliferation of AI applications that now power our world.

AMD and Supermicro: Pioneering AI Solutions

Bringing AMD Instinct to the Forefront

In the constantly evolving landscape of AI and machine learning, the synergy between hardware and software is paramount. Enter AMD and Supermicro, two industry titans who have joined forces to empower organizations in the new world of AI with cutting-edge solutions. Their shared vision? To enable organizations to unlock the full potential of AI workloads, from training massive language models to accelerating complex simulations.

The AMD Instinct MI300 Series: Changing The AI Acceleration Paradigm

At the heart of this collaboration lies the AMD Instinct MI300 Series, a family of accelerators designed to redefine performance boundaries. These accelerators combine high-performance AMD EPYC™ 9004 series CPUs with the powerful AMD Instinct™ MI300X GPU accelerators and 192GB of HBM3 memory, creating a formidable force for AI, HPC, and technical computing.

Supermicro’s H13 Generation of GPU Servers

Supermicro’s H13 generation of GPU Servers serves as the canvas for this technological masterpiece. Optimized for leading-edge performance and efficiency, these servers integrate seamlessly with the AMD Instinct MI300 Series. Let’s explore the highlights:

8-GPU Systems for Large-Scale AI Training:

  • Supermicro’s 8-GPU servers, equipped with the AMD Instinct MI300X OAM accelerator, offer raw acceleration power. The AMD Infinity Fabric™ Links enable up to 896GB/s of peak theoretical P2P I/O bandwidth, while the 1.5TB HBM3 GPU memory fuels large-scale AI models.
  • These servers are ideal for LLM inference and for training language models with trillions of parameters, minimizing training time and inference latency, lowering the TCO and maximizing throughput.

Benchmarking Excellence

But what about real-world performance? Fear not! Supermicro’s ongoing testing and benchmarking efforts have yielded remarkable results. The continued engagement between AMD and Supermicro performance teams enabled Supermicro to test pre-release ROCm versions with the latest performance optimizations and publicly released optimizations such as Flash Attention 2 and vLLM. The Supermicro AMD-based system AS -8125GS-TNMR2 showcases AI inference prowess, especially on models like Llama-2 70B, Llama-2 13B, and Bloom 176B. The performance? Equal to or better than AMD’s published results from the Dec. 6 Advancing AI event.
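
To put the memory figures in context, here is a back-of-the-envelope sketch of how much HBM3 the weights of those models would occupy. The 192GB per accelerator and 8 accelerators per server come from the text; the 2 bytes per parameter (16-bit weights) is an assumption, and the estimate ignores the KV cache, activations and framework overhead, so real deployments need more headroom.

```python
import math

HBM_PER_GPU_GB = 192   # per MI300X accelerator, from the article
GPUS_PER_SERVER = 8    # from the article


def weights_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    Assumption: 2 bytes per parameter, i.e. 16-bit weights.
    """
    return params_billions * bytes_per_param


def min_gpus_for_weights(params_billions: float, bytes_per_param: int = 2) -> int:
    """Smallest number of accelerators whose combined HBM holds the weights alone."""
    return math.ceil(weights_gb(params_billions, bytes_per_param) / HBM_PER_GPU_GB)


for name, size in [("Llama-2 13B", 13), ("Llama-2 70B", 70), ("Bloom 176B", 176)]:
    print(f"{name}: ~{weights_gb(size):.0f} GB of weights, "
          f"at least {min_gpus_for_weights(size)} accelerator(s)")

# Total HBM3 in one 8-GPU server: 1536 GB, the "1.5TB" quoted above.
print(GPUS_PER_SERVER * HBM_PER_GPU_GB)
```

By this estimate, the weights of Llama-2 70B fit on a single accelerator, while Bloom 176B has to be split across at least two.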

Charles Liang’s Vision

In the words of Charles Liang, President and CEO of Supermicro:

“We are very excited to expand our rack scale Total IT Solutions for AI training with the latest generation of AMD Instinct accelerators. Our proven architecture allows for fully integrated liquid cooling solutions, giving customers a competitive advantage.”

Conclusion

The AMD-Supermicro partnership isn’t just about hardware and software stacks; it’s about pushing boundaries, accelerating breakthroughs, and shaping the future of AI. So, as we raise our virtual glasses, let’s toast to innovation, collaboration, and the relentless pursuit of performance and excellence.

10 best practices for scaling the CSP data center — Part 1

Cloud service providers, here are best practices—courtesy of Supermicro—to help you design and deploy rack-scale data centers. 

Cloud service providers, here are 10 best practices—courtesy of Supermicro—that you can follow for designing and deploying rack-scale data centers. All are based on Supermicro’s real-world experience with customers around the world.

Best Practice No. 1: First standardize, then scale

First, select a configuration of compute, storage and networking. Then scale these configurations up and down into setups you designate as small, medium and large.

Later, you can deploy these standard configurations at various data centers with different numbers of users, workload sizes and growth estimates.
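
One way to picture "standardize, then scale" is as a small catalog of configurations derived from a single base unit. The sketch below is purely illustrative: the base numbers, the scaling factors and the `pick_config` helper are all hypothetical, not Supermicro recommendations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RackConfig:
    name: str
    servers: int
    storage_tb: int
    network_gbps: int


def scale(base: RackConfig, name: str, factor: int) -> RackConfig:
    """Derive a larger standard configuration by multiplying the base unit."""
    return RackConfig(name, base.servers * factor,
                      base.storage_tb * factor, base.network_gbps * factor)


# Hypothetical base unit; real values depend on your workloads.
SMALL = RackConfig("small", servers=8, storage_tb=100, network_gbps=100)
CATALOG = [SMALL, scale(SMALL, "medium", 2), scale(SMALL, "large", 4)]


def pick_config(servers_needed: int, storage_tb_needed: int) -> Optional[RackConfig]:
    """Return the smallest standard configuration that covers a site's needs."""
    for config in CATALOG:  # ordered from smallest to largest
        if config.servers >= servers_needed and config.storage_tb >= storage_tb_needed:
            return config
    return None  # nothing standard fits; deploy multiple units or revisit the catalog


print(pick_config(12, 150).name)  # medium
print(pick_config(40, 100))       # None
```

Because every site draws from the same three definitions, ordering, installation and support stay predictable as the number of data centers grows.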

Best Practice No. 2: Optimize the configuration

Good as Best Practice No. 1 is, it may not work if you handle a very wide range of workloads. If that’s the case, then you may want to instead optimize the configuration.

Here’s how. First, run the software on the rack configuration to determine the best mix of CPUs, including cores, memory, storage and I/O. Then consider setting up different sets of optimized configurations.

For example, you might send AI training workloads to GPU-optimized servers, but run a database application on a standard 2-socket CPU system.

Best Practice No. 3: Plan for tech refreshes 

When it comes to technology, the only constant is change itself. That doesn’t mean you can just wait around for the latest, greatest upgrade. Instead, do some strategic planning.

That might mean talking with key suppliers about their road maps. What are their plans for transitions, costs, supply chains and more?

Also consider that leading suppliers now let you upgrade some server components without having to replace the entire chassis. That reduces waste. It could also help you get more performance from your current racks, servers and power budget.

Best Practice No. 4: Look for new architectures

New architectures can help you increase power at lower cost. For example, AMD and Supermicro offer data-center accelerators that let you run AI workloads on a mix of GPUs and CPUs, a less costly alternative to all-GPU setups.

To find out if you could benefit from new architectures, talk with your suppliers about running proof-of-concept (PoC) trials of their new technologies. In other words, try before you buy.

Best Practice No. 5: Create a support plan

Sure, you need to run 24x7, but that doesn’t mean you have to pay third parties for all of that. Instead, determine what level of support you can provide in-house. For what remains, you can either staff up or outsource.

When you do outsource, make sure your supplier has tested your software stack before. You want to be sure that, should you have a problem, the supplier will be able to respond quickly and correctly.

10 best practices for scaling the CSP data center — Part 2

Cloud service providers, here are more best practices—courtesy of Supermicro—that you can follow for designing and deploying rack-scale data centers. 

Cloud service providers, here are 5 more best practices—courtesy of Supermicro—that you can follow for designing and deploying rack-scale data centers. All are based on Supermicro’s real-world experience with customers around the world.

Best Practice No. 6: Design at the data-center level

Consider your entire data center as a single unit, complete with its range of both strengths and weaknesses. This will help you tackle such macro-level issues as the separation of hot and cold aisles, forced air cooling, and the size of chillers and fans.

If you’re planning an entirely new data center, remember to include a discussion of cooling tech. Why? Because the physical infrastructure needed for an air-cooled center is quite different than that needed for liquid cooling.

Best Practice No. 7: Understand & consider liquid cooling

We’re approaching the limits of air cooling. A new approach, one based on liquid cooling, promises to keep processors and accelerators running within their design limits.

Liquid cooling can also reduce a data center’s Power Usage Effectiveness (PUE) ratio, which compares the total energy a facility consumes with the energy used by its computing equipment alone. This cooling tech can also minimize the need for HVAC cooling power.
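
PUE is conventionally calculated as total facility energy divided by the energy delivered to IT equipment, so a value closer to 1.0 means less overhead. The short sketch below shows the calculation; the kilowatt-hour figures are hypothetical and only illustrate how lower cooling overhead moves the ratio.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical example: the same 1,000 kWh IT load with two cooling setups.
air_cooled = pue(total_facility_kwh=1500, it_equipment_kwh=1000)
liquid_cooled = pue(total_facility_kwh=1200, it_equipment_kwh=1000)

print(air_cooled)     # 1.5
print(liquid_cooled)  # 1.2
```

In this made-up case, the facility overhead on top of the IT load drops from 50% to 20%.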

Best Practice No. 8: Measure what matters

You can’t improve what you don’t measure. So make sure you are measuring such important factors as your data center’s CPU, storage and network utilization.

Good tools are available that can take these measurements at the cluster level. These tools can also identify both bottlenecks and levels of component over- or under-use.
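
As a rough illustration of what such a tool does with its measurements, the sketch below averages utilization samples per resource and flags likely bottlenecks and under-used components. The thresholds and sample values are hypothetical; commercial monitoring tools are considerably more sophisticated.

```python
from statistics import mean
from typing import Dict, List


def classify(samples: List[float], low: float = 0.20, high: float = 0.85) -> str:
    """Label a resource from its average utilization (0.0 to 1.0).

    The 20% and 85% thresholds are illustrative assumptions.
    """
    average = mean(samples)
    if average >= high:
        return "bottleneck"
    if average <= low:
        return "under-used"
    return "ok"


# Hypothetical cluster-level utilization samples.
cluster = {
    "cpu": [0.91, 0.88, 0.95],
    "storage": [0.40, 0.55, 0.47],
    "network": [0.05, 0.10, 0.08],
}

report: Dict[str, str] = {name: classify(values) for name, values in cluster.items()}
print(report)  # {'cpu': 'bottleneck', 'storage': 'ok', 'network': 'under-used'}
```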

Best Practice No. 9: Manage jobs better

A CSP’s data center is typically used simultaneously by many customers. That pretty much means using a job-management scheduler tool.

One tricky issue is over-demand. That is, what do you do if you lack enough resources to satisfy all requests for compute, storage or networking? A job scheduler can help here, too.
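
Here is a minimal sketch of how a scheduler might handle over-demand: it admits jobs in priority order and queues whatever does not fit. This greedy approach is an assumption made for illustration; production schedulers also weigh fairness, preemption, quotas and more. The job names and core counts are hypothetical.

```python
import heapq
from typing import List, Tuple

Job = Tuple[int, str, int]  # (priority, name, cores needed); lower number = higher priority


def schedule(jobs: List[Job], capacity: int) -> Tuple[List[str], List[str]]:
    """Admit jobs in priority order until cores run out; queue the rest."""
    heap = list(jobs)
    heapq.heapify(heap)
    running: List[str] = []
    queued: List[str] = []
    free = capacity
    while heap:
        _priority, name, cores = heapq.heappop(heap)
        if cores <= free:
            running.append(name)
            free -= cores
        else:
            queued.append(name)  # over-demand: wait for capacity to free up
    return running, queued


jobs = [
    (2, "batch-report", 32),
    (1, "customer-db", 48),
    (3, "ai-training", 64),
    (2, "web-tier", 16),
]

running, queued = schedule(jobs, capacity=96)
print(running)  # ['customer-db', 'batch-report', 'web-tier']
print(queued)   # ['ai-training']
```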

Best Practice No. 10: Simplify your supply chain

Sure, competition across the industry is a good thing, driving higher innovation and lower prices. But within a single data center, standardizing on just a single supplier could be the winning ticket.

This approach simplifies ordering, installation and support. And if something should go wrong, then you’ll have only the proverbial “one throat to choke.”

Can you still use third-party hardware as appropriate? Sure. And with a single main supplier, that integration should be simpler, too.

Data-center service providers: ready for transformation?

An IDC researcher argues that providers of data-center hosting services face new customer demands that require them to create new infrastructure stacks. Key elements will include rack-scale integration, accelerators and new CPU cores. 

If your organization provides data-center hosting services, brace yourself. Due to changing customer demands, you’re about to need an entirely new infrastructure stack.

So argues Chris Drake, a senior research director at market watcher IDC, in a recently published white paper sponsored by Supermicro and AMD, The Power of Now: Accelerate the Datacenter.

In his white paper, Drake asserts that this new data center infrastructure stack will include new CPU cores, accelerated computing, rack-scale integration, a software-defined architecture, and the use of a micro-services application environment.

Key drivers

That’s a challenging list. So what’s driving the need for this new infrastructure stack? According to Drake, changing customer requirements.

More specifically, a growing need for hosted IT requirements. For reasons related to cost, security and performance, many IT shops are choosing to retain proprietary workloads on premises and in private-cloud environments.

While some of these IT customers have sufficient capacity in their data centers to host these workloads on prem, many don’t. They’ll rely instead on service providers for a range of hosted IT requirements. To meet this demand, Drake says, service providers will need to modernize.

Another driver: growing customer demand for raw compute power, a direct result of their adoption of new, advanced computing tools. These include analytics, media streaming, and of course the various flavors of artificial intelligence, including machine learning, deep learning and generative AI.

IDC predicts that spending on servers ranging in price from $10K to $250K will rise from a global total of $50.9 billion in 2022 to $97.4 billion in 2027. That would mark a 5-year compound annual growth rate of nearly 14%.
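
The growth-rate figure can be checked with the standard compound annual growth rate (CAGR) formula, using only the two spending totals and the 5-year span given above.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# IDC figures quoted in the article, in billions of US dollars.
rate = cagr(start_value=50.9, end_value=97.4, years=5)
print(f"{rate:.1%}")  # 13.9%
```

The result, about 13.9%, is consistent with the "nearly 14%" quoted in the text.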

Under the hood

What will building this new infrastructure stack entail? Drake points to 5 key elements:

  • Higher-performing CPU cores: These include chiplet-based CPU architectures that enable the deployment of composable hardware architectures. Along with distributed and composable hardware architectures, these can enable more efficient use of shared resources and more scalable compute performance.
  • Accelerated computing: Core CPU processing will increasingly be supplemented by hardware accelerators, including those for AI. They’ll be needed to support today’s—and tomorrow’s—increasingly diverse range of high-performance and data-intensive workloads.
  • Rack-scale integration: Pre-tested racks can facilitate faster deployment, integration and expansion. They can also enable a converged-infrastructure approach to building and scaling a data center.
  • Software-defined data center technology: In this approach, virtualization concepts such as abstraction and pooling are extended to a data center’s compute, storage, networking and other resources. The benefits include increased efficiency, better management and more flexibility.
  • A microservices application architecture: This approach divides large applications into smaller, independently functional units. In so doing, it enables a highly modular and agile way for applications to be developed, maintained and upgraded.

Plan for change

Rome wasn’t built in a day. Modernizing a data center will take time, too.

To help service providers implement a successful modernization, Drake of IDC offers this 6-point action plan:

1. Develop a transformation road map: Aim to strike a balance between harnessing new technology opportunities on the one hand and being realistic about your time frames, costs and priorities on the other.

2. Work with a full-stack portfolio vendor: You want a solution that’s tailored for your needs, not just an off-the-rack package. “Full stack” here means a complete offering of servers, hardware accelerators, storage and networking equipment—as well as support services for all of the above.

3. Match accelerators to your workloads: You don’t need a Formula 1 race car to take the kids to school. Same with your accelerators. Sure, you may have workloads that require super-low latency and equally high throughput. But you’re also likely to be supporting workloads that can take advantage of more affordable CPU-GPU combos. Work with your vendors to match their hardware with your workloads.

4. Seek suppliers with the right experience: Work with tech vendors that know what you need. Look for those with proven track records of helping service providers to transform and scale their infrastructures.

5. Select providers with supply-chain ownership: Ideally, your tech vendors will fully own their supply chains for boards, systems and rack designs such as liquid-cooling systems. That includes managing the vertical integration needed to combine these elements. The right supplier could help you save costs and get to market faster.

6. Create a long-term plan: Plan for the short term, but also look ahead into the future. Technology isn’t sitting still, and neither should you. Plan for technology refreshes. Ask your vendors for their road maps, and review them. Decide what you can support in-house versus what you’ll probably need to hand off to partners.

At MWC, Supermicro intros edge server, AMD demos tech advances

Learn what Supermicro and AMD showed at the big mobile world conference in Barcelona. 

This year’s MWC Barcelona, held Feb. 27 - 29, was a really big show. Over 101,000 people attended from 205 countries and territories. More than 2,700 organizations either exhibited, partnered or sponsored. And over 1,100 subject-matter experts made presentations.

Among those many exhibitors were Supermicro and AMD.

Supermicro showed off the company’s new AS -1115SV, a cost-optimized, single-AMD-processor server for the edge data center.

And AMD offered demos on AI engines, cryptography for quantum computing and more.

Supermicro AS -1115SV

Okay, Supermicro’s full SKU for this system is A+ Server AS -1115SV-WTNRT. That’s a mouthful, but the essence is simple: It’s a 1U short-depth server, powered by a single AMD processor, and designed for the edge data center.

The single CPU in question is an AMD EPYC 8004 Series processor with up to 64 cores. Memory maxes out at 576 GB of DDR5, and you also get 3 PCIe 5.0 x16 slots and up to 10 hot-swappable 2.5-inch drive bays.

The server’s intended applications include virtualization, firewall, edge computing, cloud services, and database/storage. Supermicro says the server’s high efficiency and low power envelope make it ideal for both telco and edge applications.

AMD’s MWC demos

AMD gave a slew of demos from its MWC booth. Here are three:

  • 5G advanced & AI integrated on the same device: To meet today’s requirements, both 5G advanced and 6G wireless communication systems require that intensive signal processing and novel AI algorithms can be implemented on the same device and AI engine. AMD demo’d its AI Engines, power-efficient, general-purpose processors that can be programmed to address both signal-processing and AI requirements in future wireless systems.
  • High-performance quantum-safe cryptography: Quantum computing threatens the security of existing asymmetric or public-key cryptographic algorithms. This demo showed some powerful alternatives on AMD devices: Kyber, Dilithium and PQShield.
  • GreenRAN 5G on EPYC 8004 Series processors: GreenRAN is an open RAN (radio access network) solution from Parallel Wireless. It’s designed to operate seamlessly across various general-purpose CPUs, including, as this demo showed, the AMD EPYC 8004 family.

Supermicro Adds AI-Focused Systems to H13 JumpStart Program

Supermicro is now letting you validate, test and benchmark AI workloads on its AMD-based H13 systems right from your browser. 

Supermicro has added new AI-workload-optimized GPU systems to its popular H13 JumpStart program. This means you and your customers can validate, test and benchmark AI workloads on a Supermicro H13 system right from your PC’s browser.

The JumpStart program offers remote sessions to fully configured Supermicro systems with SSH, VNC, and web IPMI. These systems feature the latest AMD EPYC 9004 Series Processors with up to 128 ‘Zen 4c’ cores per socket, DDR5 memory, PCIe 5.0, and CXL 1.1 peripherals support.

In addition to previously available models, Supermicro has added the H13 4U GPU System with dual AMD EPYC 9334 processors and Nvidia L40S AI-focused universal GPUs. This H13 configuration is designed for heavy AI workloads, including applications that leverage machine learning (ML) and deep learning (DL).

3 simple steps

The engineers at Supermicro know the value of your customers’ time. So, they made it easy to initiate a session and get down to business. The process is as simple as 1, 2, 3:

  • Select a system: Go to the main H13 JumpStart page, then scroll down and click one of the red “Get Access” buttons to browse available systems. Then click “Select Access” to pick a date and time slot. On the next page, select the configuration and press “Schedule” and then “Confirm.”
  • Sign in: Log in with a Supermicro SSO account to access the JumpStart program. If you or your customers don’t already have an account, creating one is free and easy.
  • Initiate secure access: When the scheduled time arrives, begin the session by visiting the JumpStart page. Each server will include documentation and instructions to help you get started quickly.

So very secure

Security is built into the program. For instance, the server is not on a public IP address, nor is it directly reachable from the internet. Instead, Supermicro sets up a jump server as a proxy, which provides access only to the server you or your customer are authorized to test.

And there’s more. After your JumpStart session ends, the server is manually secure-erased, the BIOS and firmware are re-flashed, and the OS is reinstalled with new credentials. That way, you can be sure any data you’ve sent to the H13 system will disappear once the session ends.

Supermicro is serious about its security policies. However, the company still warns users to keep sensitive data to themselves. The JumpStart program is meant for benchmarking, testing and validation only. In their words, “processing sensitive data on the demo server is expressly prohibited.”
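
For readers unfamiliar with jump servers, the proxy arrangement can be pictured with OpenSSH’s standard `-J` (ProxyJump) option, which routes a connection through an intermediate host. The sketch below is an illustration only: the host names, user name and helper function are hypothetical, and the real connection details come from the documentation supplied with each JumpStart session.

```python
from typing import List, Optional


def build_jump_ssh_command(
    user: str,
    target_host: str,
    jump_host: str,
    jump_user: Optional[str] = None,
) -> List[str]:
    """Build an ssh argument list that reaches target_host via jump_host.

    Uses OpenSSH's -J (ProxyJump) option. All names passed in are
    placeholders; nothing here reflects Supermicro's actual setup.
    """
    jump = f"{jump_user}@{jump_host}" if jump_user else jump_host
    return ["ssh", "-J", jump, f"{user}@{target_host}"]


# Hypothetical example values, not real JumpStart endpoints.
command = build_jump_ssh_command(
    user="tester",
    target_host="h13-node.example.internal",
    jump_host="jump.example.com",
)
print(" ".join(command))
```

The point of the design is that the test server never needs a public address: only the jump host is exposed, and it decides which internal server each user may reach.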

Keep up with the times

Supermicro’s expertly designed H13 systems are at the core of the JumpStart program, with new models added regularly to address typical workloads.

In addition to the latest GPU systems, the program also features hardware focused on evolving data center roles. This includes the Supermicro H13 CloudDC system, an all-in-one rackmount platform for cloud data centers. Supermicro CloudDC systems include a single AMD EPYC 9004 Series processor and up to 10 hot-swap NVMe/SATA/SAS drives.

You can also initiate JumpStart sessions on Supermicro Hyper Servers. These multi-use machines are optimized for tasks including cloud, 5G core, edge, telecom and hyperconverged storage.

Supermicro Hyper Servers included in the company’s JumpStart program offer single or dual processor configurations featuring AMD EPYC 9004 processors and up to 8TB of DDR5 memory in a 1U or 2U form factor.

Helping your customers test and validate a Supermicro H13 system for AI is now easy. Just get a JumpStart.

AMD CTO: ‘AI across our entire portfolio’

In a presentation for industry analysts, AMD chief technology officer Mark Papermaster laid out the company’s vision for artificial intelligence everywhere, from PC and edge endpoints to the largest hyperscale servers.

The current buildout of the artificial intelligence infrastructure is an event as big as the original launch of the internet.

AI, now mainly an expense, will soon be monetized. Thousands of AI applications are coming.

And AMD plans to embed AI across its entire product portfolio. That will include components and software on everything from PCs and edge sensors to the largest servers used by the big cloud hyperscalers.

These were among the comments of Mark Papermaster, AMD’s executive VP and CTO, during a recent fireside chat hosted by stock research firm Arete Research. During the hour-long virtual presentation, Papermaster answered questions from moderator Brett Simpson of Arete and attending stock analysts. Here are the highlights.

The overall AI market

AMD has said it believes the total addressable market (TAM) for AI through 2027 is $400 billion. “That surprised a lot of people,” Papermaster said, but AMD believes a huge AI infrastructure is needed.

That will begin with the major hyperscalers. AWS, Google Cloud and Microsoft Azure are among those looking at massive AI buildouts.

But there’s more. AI is not only in the domain of these massive clusters. Individual businesses will be looking for AI applications that can drive productivity and enhance the customer experience.

The models for these kinds of AI systems are typically smaller. They can be run on smaller clusters, too, whether on-premises or in the cloud.

AI will also make its way into endpoint devices. They’ll include PCs, embedded devices, and edge sensors.

Also, AI is more than just compute. AI systems also require robust memory, storage and networking.

“We’re thrilled to bring AI across our entire product portfolio,” Papermaster said.

Looking at the overall AI market, AMD expects to see a compound annual growth rate of 70%. “I know that seems huge,” Papermaster said. “But we are investing to capture that growth.”
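
To put a 70% compound annual growth rate in perspective, here is a minimal sketch of the compounding arithmetic. The four-year horizon is an assumption chosen for illustration; the article gives only the rate and the 2027 endpoint, not a base year or starting market size.

```python
def compound_growth_multiple(annual_rate: float, years: int) -> float:
    """Return how many times larger a quantity is after compounding.

    annual_rate is a fraction (0.70 means 70% per year).
    """
    return (1.0 + annual_rate) ** years


# Assumed horizon of four years, for illustration only.
for years in range(1, 5):
    multiple = compound_growth_multiple(0.70, years)
    print(f"After {years} year(s): {multiple:.2f}x the starting size")
```

At that rate, a market grows to roughly 8.35 times its starting size in four years, which is why the figure looks so large at first glance.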

AI pricing

Pricing considerations need to take into account more than just the price of a GPU, Papermaster argued. You really have to look at the total cost of ownership (TCO).

The market is operating with an underlying premise: Demand for AI compute is insatiable. That will drive more and more compute into a smaller area, lowering the power needed per floating-point operation. (FLOPS, or floating-point operations per second, is the most common measure of AI compute performance.)

Right now, the AI compute model is dominated by a single player. But AMD is now bringing the competition. That includes the recently announced MI300 accelerator. But as Papermaster pointed out, there’s more, too. “We have the right technology for the right purpose,” he said.

That includes using not only GPUs, but also (where appropriate) CPUs. These workloads can include AI inference, edge computing, and PCs. In this way, user organizations can better manage their overall CapEx spend.

As moderator Simpson reminded him, Papermaster is fond of saying that customers buy road maps. So naturally he was asked about AMD’s plans for the AI future. Papermaster mainly deferred, saying more details will be forthcoming. But he also reminded attendees that AMD’s investments in AI go back several years and include its ROCm software enablement stack.

Training vs. inference

Training and inference are currently the two biggest AI workloads. Papermaster believes we’ll see the AI market bifurcate along these two lines.

Training depends on raw computational power in a vast cluster. For example, the popular ChatGPT generative AI tool uses a model with over a trillion parameters. That’s where AMD’s MI300 comes into play, Papermaster said, “because it scales up.”

This trend will continue, because for large language models (LLMs), the issue is latency. How quickly can you get a response? That requires not only fast processors, but also equally fast memory.

More specific inferencing applications, typically run after training is completed, are a different story, Papermaster said, adding: “Essentially, it’s ‘I’ve trained my model; now I want to organize it.’” These workloads are more concise and less demanding of both power and compute, meaning they can run on more affordable GPU-CPU combinations.

Power needs for AI

User organizations face a challenge: While running an AI system requires a lot of power, many data centers are what Papermaster called “power-gated.” In other words, they’re unable to drive up compute capacity to AI levels using current technology.

AMD is on the case. The company has committed itself to driving a 30x improvement in energy efficiency, measured from a 2020 baseline, for its processors and accelerators used in AI training and HPC by 2025. Papermaster said the company is still on track to deliver that.

To do so, he added, AMD is thinking in terms of “holistic design.” That means optimizing not just the hardware but the entire stack, all the way up through the application.

One promising area involves AI workloads that can use AI approximation. These are applications that, unlike HPC workloads, do not need incredible levels of accuracy. As a result, performance is better for lower-precision arithmetic than it is for high-precision. “Not all AI models are created equally,” Papermaster said. “You’ll need smaller models, too.”
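
The trade-off behind lower-precision arithmetic can be seen with a small sketch using only Python’s standard `struct` module. It stores the same value in IEEE 754 half, single and double precision and reports the storage size and rounding error of each. This illustrates the number formats only; it does not model any AMD hardware or measure performance.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store value in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1
formats = (
    ("half (16-bit)", "<e"),
    ("single (32-bit)", "<f"),
    ("double (64-bit)", "<d"),
)
for name, fmt in formats:
    stored = round_trip(value, fmt)
    size = struct.calcsize(fmt)
    print(f"{name}: {size} bytes, error {abs(stored - value):.3e}")
```

Each halving of precision halves the bytes moved per value at the cost of a larger rounding error, which is acceptable for workloads that tolerate approximation.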

AMD is among those who have been surprised by the speed of AI adoption. In response, AMD has increased its projection of AI sales this year from $2 billion to $3.5 billion, what Papermaster called the fastest ramp AMD has ever seen.
