
Interoperability

The AIFO integration requires the following Infrastructure Observability (IO) version:

  • Infrastructure Observability 8.0.0 or later, properly installed and configured.

Each AI Factory host must run the AIFO Collector service, which scrapes the data and makes it available for polling. The AIFO Collector is designed to run as a container in a Linux container environment, such as Docker or Kubernetes, and can pull metrics from NVIDIA GPUs using existing NVIDIA tools and packages. On each AI Factory host, the nv-hostengine service must be up and running, and reachable from the container. The following package versions have been tested:

  • nvidia-smi: 560.35.03

  • NVML: 560.35

  • DRIVER: 560.35.03

  • CUDA: 12.6

  • dcgmi: 3.3.9 or later, but earlier than 4.0.0
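
As a rough pre-flight aid, the two host-side requirements above (a dcgmi version inside the tested range, and an nv-hostengine that the container can reach) could be checked with a small script. The following is a minimal sketch and not part of the AIFO Collector. The function names are hypothetical, and the port 5555 default is an assumption that should be replaced with whatever port nv-hostengine listens on in your environment.

```python
import socket

# Tested dcgmi range from the list above: 3.3.9 inclusive, 4.0.0 exclusive.
DCGM_MIN = (3, 3, 9)
DCGM_MAX_EXCLUSIVE = (4, 0, 0)


def parse_version(text):
    """Turn a dotted version such as '3.3.9' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def dcgm_version_supported(text):
    """Return True if the version lies in the tested range [3.3.9, 4.0.0)."""
    return DCGM_MIN <= parse_version(text) < DCGM_MAX_EXCLUSIVE


def hostengine_reachable(host, port=5555, timeout=2.0):
    """Return True if a TCP connection to nv-hostengine can be opened.

    The port default of 5555 is an assumption; adjust it to your setup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Example version strings only; substitute the version reported on your host.
    for candidate in ("3.3.9", "3.4.0", "4.0.0"):
        print(candidate, dcgm_version_supported(candidate))
```

Running `hostengine_reachable("<host address>")` from inside the collector container gives a quick indication of whether the nv-hostengine service is reachable. A successful TCP connection only shows that something is listening on that port, not that it is a compatible DCGM host engine.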