Arm has introduced the Lumex Compute Subsystem (CSS) platform, a comprehensive solution designed to accelerate on-device artificial intelligence (AI) experiences in mobile and consumer devices. This platform integrates advanced CPU and GPU architectures, system interconnects, and software tools to deliver enhanced performance and energy efficiency.
At the core of the Lumex CSS platform is the Armv9.3 C1 CPU cluster, featuring Scalable Matrix Extension 2 (SME2) units. These units provide up to five times faster AI processing and three times better efficiency compared with previous generations. The cluster can be configured with cores such as C1-Ultra and C1-Pro, tailored to the demands of flagship devices. Micro-architectural improvements across the cores contribute to an average 30% performance uplift and a 12% reduction in power consumption for daily mobile workloads.
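As a rough illustration of how the two headline CPU figures combine, the arithmetic can be sketched as follows. This assumes, which the announcement does not state, that the 30% uplift and the 12% power reduction apply to the same workload at the same operating point; the function name is illustrative only.

```python
def perf_per_watt_gain(perf_uplift: float, power_reduction: float) -> float:
    """Return the relative performance-per-watt ratio (1.0 means no change).

    perf_uplift:      fractional performance gain, e.g. 0.30 for +30%
    power_reduction:  fractional power saving, e.g. 0.12 for -12%
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    # Assumption: both figures describe the same workload and operating point.
    return (1.0 + perf_uplift) / (1.0 - power_reduction)


ratio = perf_per_watt_gain(0.30, 0.12)
print(f"{ratio:.2f}x")  # prints "1.48x"
```

If the two figures were measured under different conditions, the combined ratio would not hold, so it is best read as an upper-level estimate rather than a published result.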
The Mali G1-Ultra GPU introduces next-generation ray tracing capabilities, offering a twofold improvement in ray tracing performance over previous mobile GPUs. This enables high-quality gaming and entertainment experiences on mobile devices. In addition, the GPU delivers a 20% increase in graphics performance across key benchmarks and gaming applications, supporting both visual fidelity and efficient power usage.
The Lumex platform incorporates the SI L1 System Interconnect and the MMU L1 System Memory Management Unit to support demanding AI and compute-heavy workloads. The SI L1 Interconnect features an advanced system-level cache (SLC) with a 71% reduction in leakage compared to standard compiled RAM, minimizing idle power consumption. The MMU L1 provides secure, cost-efficient virtualization, scaling across a broad range of mobile and consumer devices. These features allow multiple AI and application workloads to operate concurrently without bottlenecking performance.
Arm offers a comprehensive suite of software tools for developers building applications on the Lumex platform. The KleidiAI library, integrated with major AI frameworks such as PyTorch ExecuTorch and Microsoft ONNX Runtime, provides SME2 acceleration for AI workloads. Developers can utilize top-down telemetry to analyze performance, identify bottlenecks, and optimize algorithms. RenderDoc support and unified observability tools like Vulkan counters, Streamline, and Perfetto allow real-time workload analysis and latency tuning. By providing robust support for widely used AI frameworks, the platform makes it easier for software engineers to deliver sophisticated AI applications on mobile devices.
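Since the acceleration described above depends on the CPU exposing SME2, a developer may want a quick diagnostic to confirm the feature is present. The sketch below parses text in the format of Linux's /proc/cpuinfo and looks for an `sme2` entry on the Features line; it assumes the kernel reports the flag under that name, and the helper names are hypothetical. The article indicates that acceleration is delivered through the frameworks themselves, so a check like this would be for diagnostics rather than a required step.

```python
def cpu_features(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags listed on every 'Features' line."""
    features: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_sme2(cpuinfo_text: str) -> bool:
    """Return True if the (assumed) 'sme2' flag is reported."""
    return "sme2" in cpu_features(cpuinfo_text)


# Illustrative sample, not captured from real hardware.
sample = "processor\t: 0\nFeatures\t: fp asimd sve2 sme sme2\n"
print(has_sme2(sample))  # prints "True"
```

On a device, the same function could be fed the contents of /proc/cpuinfo, for example with `has_sme2(open("/proc/cpuinfo").read())`.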
The Lumex CSS platform also ships with production-ready CPU and GPU implementations optimized for 3nm process nodes. These implementations are available through multiple foundries, allowing silicon and OEM partners to achieve competitive power, performance, and area (PPA) characteristics. This approach is intended to improve the odds of first-time silicon success when transitioning to the latest process nodes, supporting both performance and energy efficiency goals.
Extended AI use cases and responsiveness
The increased processing capability of SME2-enabled CPUs expands the range of AI use cases that can be handled directly on the device. Applications such as advanced natural language processing, augmented reality overlays, computer vision for image recognition, and generative audio models can now operate with reduced latency. By performing these tasks locally, Lumex CSS minimizes dependency on cloud computation, improving responsiveness and user experience even in low-bandwidth or offline scenarios.
The Lumex CSS platform includes built-in mechanisms to optimize energy usage and manage heat generation. Power gating and dynamic frequency scaling allow the system to adjust processing based on workload demand, reducing unnecessary energy consumption. These enhancements ensure devices can sustain high-performance AI and graphics processing without overheating or compromising battery life, which is critical for mobile devices and portable consumer electronics.
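The general idea behind dynamic frequency scaling and power gating can be sketched with a toy governor. This is not Arm's implementation: the frequency steps, the headroom threshold, and the function name are all hypothetical, chosen only to show how a system might pick the lowest operating point that still covers the current demand.

```python
# Hypothetical operating points in MHz; real devices define their own.
FREQ_STEPS_MHZ = [600, 1200, 1800, 2400]


def pick_frequency(load: float, steps=FREQ_STEPS_MHZ, headroom: float = 0.8) -> int:
    """Pick the lowest frequency that keeps utilization under `headroom`.

    load is the demand as a fraction of capacity at the highest frequency.
    A return value of 0 stands for a power-gated (switched-off) core.
    """
    if load <= 0.0:
        return 0  # nothing to run: gate the core
    max_freq = steps[-1]
    for freq in steps:
        # Demand expressed in MHz must fit within this step's usable capacity.
        if load * max_freq <= freq * headroom:
            return freq
    return max_freq  # demand exceeds all headroom: run flat out


for load in (0.0, 0.1, 0.5, 0.9):
    print(load, pick_frequency(load))
```

Keeping some headroom below full utilization lets the governor absorb short bursts without immediately stepping up, at the cost of running slightly faster than strictly necessary.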
Flagship devices built on the Lumex CSS platform are projected to achieve six consecutive years of double-digit instructions per cycle (IPC) performance gains. This sustained growth allows devices to manage increasingly complex workloads over time, extending their useful life without the need for frequent hardware upgrades.
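To give a sense of what six years of double-digit gains compound to, the calculation below assumes the smallest double-digit figure, 10% per year. The actual yearly gains are not given in the article, so the result is a floor under that assumption, not a reported number.

```python
def compounded_gain(annual_gain: float, years: int) -> float:
    """Return the cumulative multiplier after `years` of equal annual gains."""
    return (1.0 + annual_gain) ** years


# Assumption: exactly 10% IPC gain in each of six consecutive years.
print(f"{compounded_gain(0.10, 6):.2f}x")  # prints "1.77x"
```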
By combining high-performance CPU and GPU capabilities with extensive developer support, Lumex CSS raises the ceiling on what mobile and consumer devices can run locally. Manufacturers can deploy AI-driven applications more efficiently while maintaining responsiveness, energy efficiency, and battery longevity. This sets the stage for smartphones, tablets, and next-generation PCs that can support increasingly sophisticated software.
The Arm Lumex CSS platform represents a significant step forward in mobile computing, emphasizing high-performance AI processing, advanced graphics, energy-efficient design, and integration with popular AI frameworks such as Microsoft ONNX Runtime. With SME2-enabled CPUs, the Mali G1-Ultra GPU, and developer-ready software, Lumex CSS provides a foundation for next-generation devices capable of delivering advanced AI experiences without compromising efficiency or responsiveness.