system_monitoring_probe
qoa4ml.probes.system_monitoring_probe
Classes
SystemMonitoringProbe
SystemMonitoringProbe monitors system resources (CPU, GPU, and memory) and creates reports from the collected usage statistics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | SystemProbeConfig | Configuration settings for the system monitoring probe. | required |
connector | BaseConnector | Connector to send the report data. | required |
client_info | Optional[ClientInfo] | Information about the client, default is None. | None |
Attributes:
Name | Type | Description |
---|---|---|
config | SystemProbeConfig | The system monitoring probe configuration. |
node_name | str | The name of the node being monitored. |
environment | EnvironmentEnum | The environment in which the node is running. |
cpu_metadata | dict | Metadata about the CPU. |
gpu_metadata | dict | Metadata about the GPU. |
mem_metadata | dict | Metadata about the memory. |
metadata | dict | General metadata about the node. |
Methods:
Name | Description |
---|---|
get_cpu_metadata | Get metadata about the CPU. |
get_cpu_usage | Get the CPU usage of the system. |
get_gpu_metadata | Get metadata about the GPU. |
get_gpu_usage | Get the GPU usage of the system. |
get_mem_metadata | Get metadata about the memory. |
get_mem_usage | Get the memory usage of the system. |
create_report | Create a JSON report based on system resource usage statistics. |
Functions
__init__(config, connector, client_info=None)
Initialize an instance of SystemMonitoringProbe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | SystemProbeConfig | Configuration settings for the system monitoring probe. | required |
connector | BaseConnector | Connector to send the report data. | required |
client_info | Optional[ClientInfo] | Information about the client, default is None. | None |
create_report()
Create a JSON report based on system resource usage statistics.
Returns:
Type | Description |
---|---|
str | JSON-encoded report containing system resource usage statistics. |
Notes
- This method collects CPU, GPU, and memory usage statistics for the system.
- The report is built differently depending on the environment (HPC or other).
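Because create_report() returns a JSON-encoded string, a caller will usually decode it before inspecting the values. The following is a minimal sketch of that step using only the standard library. The `sample_report` string and all of its field names are hypothetical stand-ins, since the actual report layout is not specified here, and `load_report` is an illustrative helper rather than part of qoa4ml.

```python
import json

# Hypothetical stand-in for the string returned by create_report().
# The real field names depend on the probe and are assumptions here.
sample_report = json.dumps(
    {
        "metadata": {"node_name": "node-1"},
        "cpu": {"usage": {"core_0": 12.5}},
        "mem": {"usage": {"used": 2048.0}},
    }
)


def load_report(report: str) -> dict:
    """Decode a JSON-encoded report string into a dictionary."""
    parsed = json.loads(report)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    return parsed


report = load_report(sample_report)
# List the top-level sections present in this (hypothetical) report.
print(sorted(report))
```

Checking that the decoded value is a dictionary keeps later key lookups from failing in confusing ways if the report format ever changes.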
get_cpu_metadata()
Get metadata about the CPU.
Returns:
Type | Description |
---|---|
dict | Dictionary containing metadata about the CPU. |
get_cpu_usage()
Get the CPU usage of the system.
Returns:
Type | Description |
---|---|
dict | Dictionary containing the CPU usage as percentages. |
get_gpu_metadata()
Get metadata about the GPU.
Returns:
Type | Description |
---|---|
dict | Dictionary containing metadata about the GPU. |
get_gpu_usage()
Get the GPU usage of the system.
Returns:
Type | Description |
---|---|
dict | Dictionary containing the GPU usage information. |
get_mem_metadata()
Get metadata about the memory.
Returns:
Type | Description |
---|---|
dict | Dictionary containing memory metadata in gigabytes. |
get_mem_usage()
Get the memory usage of the system.
Returns:
Type | Description |
---|---|
dict | Dictionary containing the memory usage in megabytes. |
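Note that get_mem_metadata() reports values in gigabytes while get_mem_usage() reports values in megabytes, so the two need a unit conversion before they can be compared. The sketch below shows one way to do that. The dictionary keys (`total_gb`, `used_mb`) are hypothetical, and the factor of 1024 megabytes per gigabyte is an assumption; if the probe uses decimal units, pass 1000 instead.

```python
# Assumption: binary units (1 GB = 1024 MB). Use 1000 for decimal units.
MB_PER_GB = 1024


def mb_to_gb(value_mb: float, mb_per_gb: int = MB_PER_GB) -> float:
    """Convert a value in megabytes to gigabytes."""
    return value_mb / mb_per_gb


# Hypothetical stand-ins for the dictionaries returned by
# get_mem_metadata() and get_mem_usage(); the real keys may differ.
mem_metadata = {"total_gb": 16.0}
mem_usage = {"used_mb": 4096.0}

# Bring both values to gigabytes before dividing.
used_fraction = mb_to_gb(mem_usage["used_mb"]) / mem_metadata["total_gb"]
print(f"{used_fraction:.1%}")  # prints "25.0%"
```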