
This blog presents a telecom quality of service (QoS) Gen AI Agentic RAG solution powered by the AMD Instinct MI300X accelerators on Dell PowerEdge XE9680 servers.
The introduction of the AMD Instinct MI300X accelerator, now integrated with Dell's flagship PowerEdge XE9680 server, provides a robust platform for high-performance AI applications. Leveraging this powerful combination, we have developed a Telecom Issue Management solution to demonstrate the value of Generative AI in optimizing network operations for telecom companies and their enterprise clients. This solution is crucial for minimizing network downtime, ensuring consistent service quality, and enabling telecom operators to make informed decisions about network investments and improvements.
In this blog, we offer insights into the solution architecture built with industry-leading software and hardware components, showcasing the following:
- How to utilize LLM-based agentic RAG to build a telecom issue management solution
- How to deploy a cutting-edge language model, embeddings model, and vector database on a Dell PowerEdge XE9680 server equipped with eight AMD Instinct MI300X accelerators
- How to navigate an up-to-date and intelligent dashboard that tracks and recommends actions for telecom issues, with a focus on base station performance
| Understanding the Telecom Landscape

The expected data volume over cellular networks is projected to exceed hundreds of exabytes per month by 2025, driven by human and machine data, representing tens of billions of devices. This explosive growth poses a challenge in ensuring that networks remain robust and resilient while handling increased demand for data throughput and low-latency services.
| Solution Architecture

We selected the Dell PowerEdge XE9680 equipped with AMD Instinct MI300X accelerators for our solution due to its exceptional performance and memory capacity. With 192GB of HBM3 memory per accelerator, we can comfortably run the entire Llama 3.1 70B model on a single accelerator.

This solution leverages the following technologies:
- Utilization of LLM Agents: LLM agents are AI systems that leverage large language models to understand and respond to natural language inputs.
- Plug-and-Play Architecture: This architecture enables the seamless integration of essential components, including vector databases, embedding models, and LLMs, into existing systems.
- Extensibility: The system is designed to efficiently process thousands of messages daily across hundreds of RRHs on a single server.
The software stack includes:
- vLLM (v0.5.3.post1), an industry-standard library for optimized open-source large language model (LLM) serving, with support for AMD ROCm 6.1.
- llama-agents, an async-first framework for building, iterating, and productionizing multi-agent systems.
- Llama 3.1 70B Model, an industry-leading open-weight language model with 70 billion parameters.
- LlamaIndex, a popular open-source retrieval augmented generation framework.
- bge-large-en embeddings model, one of the top ranked text embeddings models.
- MilvusDB, an open-source vector database with high performance embedding and similarity search.
- kafka broker, a key component of Apache Kafka's distributed architecture.
| Solution Overview

This solution is centered on real-time monitoring and management of telecom services, highlighting the potential of Generative AI in creating detailed incident reports and recommending appropriate responses.

The image above illustrates each segment of the workflow, and details how the RAG-based agentic workload and vector database interact with the simulated telecom network data.
In this blog, we demonstrated how enterprises deploying applied AI can leverage their proprietary data to harness multimodal RAG capabilities in the context of a telecom issue management tool. We explored the capabilities of the Dell PowerEdge XE9680 server equipped with AMD Instinct MI300X accelerators, achieving the following milestones:
- Developed a telecom issue management solution using LLM-based agentic RAG.
- Deployed cutting-edge language model, embeddings model, and vector database on Dell PowerEdge XE9680 server with eight AMD Instinct MI300X accelerators.
- Integrated an intelligent, real-time dashboard that monitors and recommends actions for telecom issues, with a focus on base station performance.
To learn more, please request access to our reference code by contacting us at contact@metrum.ai.
Copyright © 2024 Metrum AI, Inc. All Rights Reserved. This project was commissioned by Dell Technologies. Dell and other trademarks are trademarks of Dell Inc. or its subsidiaries. AMD, AMD Instinct, AMD ROCm, and combinations thereof are trademarks of Advanced Micro Devices, Inc. All other product names are the trademarks of their respective owners.
DISCLAIMER - Performance varies by hardware and software configurations, including testing conditions, system settings, application complexity, the quantity of data, batch sizes, software versions, libraries used, and other factors. The results of performance testing provided are intended for informational purposes only and should not be considered as a guarantee of actual performance.