Cloudflare has announced the deployment of its 12th generation servers, powered by AMD EPYC 9684X Genoa-X processors, delivering improved performance and efficiency across its infrastructure.
The new processor has 96 cores, 192 threads, and a massive 1152MB of L3 cache – three times that of AMD’s standard Genoa processors.
This substantial cache boost helps reduce latency and improve performance in data-intensive applications, with Cloudflare saying Genoa-X delivers a 22.5% improvement over other AMD EPYC models.
Updated AI developer products
According to the cloud provider, the new Gen 12 servers can handle up to 145% more requests per second (RPS) and offer a 63% increase in power efficiency compared to the previous Gen 11 models. The updated thermal-mechanical design and expanded GPU support offer enhanced capabilities for AI and machine learning workloads.
The new servers are equipped with 384GB of DDR5-4800 memory across 12 channels, 16TB of NVMe storage, and dual 25 GbE network connectivity. This configuration enables Cloudflare to support higher memory throughput and faster storage access, optimizing performance for a range of computationally intensive tasks. Additionally, each server is powered by dual 800W Titanium-grade power supply units, providing greater energy efficiency across its global data centers.
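To put the memory figures in context, here is a back-of-the-envelope sketch in Python. It assumes each DDR5 channel moves 8 bytes per transfer (the standard 64-bit data width) and uses the 96-core count quoted earlier. The result is a theoretical peak, not a bandwidth Cloudflare has measured or published.

```python
# Rough arithmetic on the quoted Gen 12 memory configuration.
# Assumption: 64-bit (8-byte) data width per DDR5 channel; real-world
# throughput will be lower than this theoretical ceiling.
CHANNELS = 12
TRANSFERS_PER_SEC = 4800e6   # DDR5-4800 means 4800 mega-transfers per second
BYTES_PER_TRANSFER = 8       # assumed 64-bit channel width

MEMORY_GB = 384
CORES = 96

# Theoretical peak bandwidth across all channels, in GB/s
peak_gb_s = CHANNELS * TRANSFERS_PER_SEC * BYTES_PER_TRANSFER / 1e9

# Memory available per core if divided evenly
mem_per_core_gb = MEMORY_GB / CORES

print(f"Theoretical peak bandwidth: {peak_gb_s:.1f} GB/s")
print(f"Memory per core: {mem_per_core_gb:.1f} GB")
```

On these assumptions the configuration works out to roughly 460 GB/s of theoretical bandwidth and 4GB of memory per core.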
Cloudflare is keen to stress these improvements are not just about raw power but also about delivering more efficient performance. The company says the move from a 1U to a 2U form factor, along with improved airflow design, reduced fan power consumption by 150W, contributing to the server’s overall efficiency gains. The Gen 12 server’s power consumption is 600W at typical operating conditions, a notable increase from the Gen 11’s 400W but justified by the significant performance improvements.
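The throughput and power figures quoted in this article appear to be consistent with one another. The short Python sketch below assumes that "power efficiency" means requests per second per watt and that the 600W and 400W typical figures are the relevant baselines. Both are assumptions made for illustration, not definitions taken from Cloudflare.

```python
def efficiency_gain(rps_increase_pct, new_watts, old_watts):
    """Percentage change in requests-per-second per watt (assumed metric)."""
    rps_ratio = 1 + rps_increase_pct / 100   # 145% more RPS -> 2.45x
    power_ratio = new_watts / old_watts      # 600W vs 400W -> 1.5x
    return (rps_ratio / power_ratio - 1) * 100

# Figures quoted in the article: +145% RPS, 600W (Gen 12) vs 400W (Gen 11)
gain = efficiency_gain(145, 600, 400)
print(f"Estimated efficiency gain: {gain:.1f}%")
```

Under those assumptions the calculation gives about 63%, matching the efficiency improvement Cloudflare reports and showing how a server can draw more power while still doing more work per watt.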
The new generation also includes enhanced security features with hardware root of trust (HRoT) and Data Center Secure Control Module (DC-SCM 2.0) integration. This setup ensures boot firmware integrity and modular security, protecting against firmware attacks and reducing vulnerabilities.
The Gen 12 servers are designed with GPU scalability in mind, supporting up to two PCIe add-in cards for AI inference and other specialized workloads. This design allows Cloudflare to deploy GPUs strategically to minimize latency in regions with high demand for AI processing. Looking ahead, Cloudflare says it has begun testing 5th generation AMD EPYC “Turin” CPUs for its future Gen 13 servers.
Separately, Cloudflare has introduced major upgrades to its AI developer products. Workers AI now runs on more powerful GPUs across its network of over 180 cities, allowing it to handle larger models like Meta’s Llama 3.1 70B and Llama 3.2, and tackle more complex AI tasks. AI Gateway, a tool for monitoring and optimizing AI deployments, has been upgraded with persistent logs (currently in beta) that enable detailed performance analysis using search, tagging, and annotation features. Finally, Vectorize, Cloudflare’s vector database, has reached general availability, supporting indexes of up to five million vectors and significantly lowering latency. Cloudflare has also shifted to a simpler unit-based pricing structure for all three products, making cost management clearer.