Project Description
The client required a turnkey, high-performance platform for training and inference of neural networks, capable of handling large language models (LLMs), generative AI, and heavy HPC workloads.
- Client objective: maximum computational density in limited rack space, with industrial-grade 24/7 reliability.
- Key requirement: support for high-speed cluster networking over InfiniBand NDR, with full subsystem compatibility out of the box.
What was done
- Delivery of the server in fully factory-ready condition (rails, cables, cooling system)
- Verification that the configuration is fully operational
- Commissioning (optionally on the client’s site, with setup and testing)
Tasks Solved and Applications
- Training large language models (LLMs) and generative networks
Unified memory across 8 HGX H200 GPUs (141 GB per accelerator) allows models with hundreds of billions of parameters to be placed and trained on a single node without complex parallelization, while high throughput within the HGX platform accelerates gradient synchronization (see the memory sketch after this list).
- Real-time inference
With 2 TB of RAM and high-frequency Intel Xeon Platinum CPUs, the server handles peak loads for AI services (chatbots, recommendation systems, content generation) with minimal latency.
- Scientific and engineering calculations (HPC)
The Hopper architecture, with FP64 support and the Transformer Engine, delivers high performance in simulations, computational fluid dynamics, molecular modeling, and other resource-intensive tasks.
- Working as part of an AI cluster
NVIDIA ConnectX‑7 NDR 400G and BlueField‑3 adapters allow the server to be integrated into a high-speed InfiniBand network, enabling horizontal scaling by combining multiple nodes into a single distributed training cluster (see the initialization sketch after this list).
- Enterprise analytics and virtualization
2 TB of memory and powerful CPUs make the platform suitable for workload consolidation, large dataset processing, and deployment of high-load databases.
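To make the single-node claim concrete, here is a back-of-the-envelope sketch of how aggregate HBM relates to model size. The bytes-per-parameter figures are common rules of thumb rather than measurements from this deployment, and the helper functions are hypothetical; real usage also depends on precision, activations, and framework overhead.

```python
# Rough GPU memory estimates for the 8x H200 node (rules of thumb only;
# helper names are illustrative, not taken from any framework).

HBM_PER_GPU_GB = 141
NUM_GPUS = 8
TOTAL_HBM_GB = HBM_PER_GPU_GB * NUM_GPUS  # 1128 GB aggregate

def inference_weights_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Weight memory for FP16/BF16 inference (~2 bytes per parameter)."""
    return params_billions * bytes_per_param

def training_state_gb(params_billions: float) -> float:
    """Mixed-precision training with Adam: ~16 bytes per parameter for
    weights, gradients, and optimizer states, before activations."""
    return params_billions * 16.0

for n in (70, 180, 400):
    print(f"{n}B params: ~{inference_weights_gb(n):.0f} GB to serve, "
          f"~{training_state_gb(n):.0f} GB of training state "
          f"(aggregate HBM: {TOTAL_HBM_GB} GB)")
```

On these rough numbers, models with hundreds of billions of parameters fit in aggregate HBM for serving, while training the largest models on one node typically also relies on lower precision (e.g., FP8), activation checkpointing, or optimizer-state sharding.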
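For the cluster scenario, the sketch below shows what joining a multi-node data-parallel job looks like from this server's side, assuming PyTorch with the NCCL backend, which uses InfiniBand transports automatically when the fabric is present. The model and tensor sizes are placeholders.

```python
# Minimal multi-node DDP sketch (placeholder model; assumes launch via
# torchrun, which sets RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR/PORT).
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # NCCL handles intra-node NVLink and inter-node InfiniBand transparently.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(torch.nn.Linear(4096, 4096).cuda(), device_ids=[local_rank])

    x = torch.randn(8, 4096, device="cuda")
    loss = model(x).sum()
    loss.backward()  # gradients all-reduced across every GPU in the job

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched on each node with, e.g., `torchrun --nnodes=2 --nproc_per_node=8 train.py`, two such servers behave as a single 16-GPU training job over the NDR fabric.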
Why This Configuration Was Chosen
The client required an all-in-one solution with balanced computational power, memory bandwidth, storage speed, and network capabilities. The Lenovo ThinkSystem SR680a V3 with 8× HGX H200 is an NVIDIA-certified AI and HPC platform, ensuring stable operation of drivers, frameworks (PyTorch, TensorFlow, Megatron), and orchestration tools (Kubernetes with GPU support).
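On the orchestration side, GPUs on a node like this are exposed to Kubernetes by the NVIDIA device plugin and requested like any other resource. Below is a minimal sketch using the official Kubernetes Python client; the image name and pod details are hypothetical placeholders.

```python
# Sketch: requesting all 8 GPUs of the node for a training pod
# (assumes the NVIDIA device plugin is installed in the cluster).
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(name="llm-train"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="registry.example.com/llm-trainer:latest",  # placeholder
                command=["python", "train.py"],
                resources=client.V1ResourceRequirements(
                    # GPUs are a schedulable resource via the device plugin.
                    limits={"nvidia.com/gpu": "8"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```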
Result for the Client
A ready-to-use platform was delivered, reducing time-to-production for AI services. Models that previously required multi-node clusters now fit on a single server, and the platform is prepared for scaling: the NDR 400G network subsystem supports cluster integration without hardware replacement.
About Our AMPERE Server Line
In addition to supplying ready-made solutions from leading vendors, we develop our own AMPERE server line for AI, HPC, and enterprise infrastructure. AMPERE servers are based on the same key components as NVIDIA HGX, Intel, and AMD reference platforms but offer several advantages:
- Configuration flexibility — servers are built precisely for client needs: number of GPUs (1–8), accelerator types (H200, H100, A100, L40S, RTX), memory size and speed, disk subsystem composition, network adapters (InfiniBand, Ethernet, RoCE).
- Optimal delivery times — our own manufacturing base allows faster delivery than the long supply chains of major vendors.
- Full quality control — each server undergoes extended burn-in testing, including GPU, memory, NVMe, and network interface checks under conditions close to real operation.
- Adaptation to client infrastructure — we can implement custom power, cooling, and form factors, and supply servers without extraneous components (e.g., no brand stickers) for integration into existing environments.
The AMPERE line includes models from compact single-CPU inference platforms to powerful 8-GPU nodes, fully equivalent in performance and reliability to enterprise-grade solutions. All servers come with warranty, technical support, and post-warranty service options.
Thus, the client has a choice: use a certified Lenovo solution (as in this case) or rely on our in-house AMPERE line for unique configurations, shorter delivery times, or specific integration requirements.
Need a similar AI infrastructure?
We deliver certified vendor servers and custom AMPERE solutions for AI, HPC, and enterprise workloads. Contact us to discuss your target configuration and deployment timeline.




