We talked about logs a lot in the previous blog. Normally, logs and events are stored in a system called a SIEM for analysis. In today's world, it's not uncommon for organisations to use multiple security information and event management (SIEM) platforms to manage their security operations. Each SIEM platform may have its own query language, making it difficult to share information between platforms. However, with some understanding of the syntax and features of each platform, it is possible to convert queries from one platform to another.

Let's see if ChatGPT can do this as well!

Here's a sample Splunk query to count the number of times a 404 status code appears in nginx access logs:
sourcetype=nginx status=404 | stats count

In this query, we're filtering for events with a status code of 404 using status=404. We're also using the nginx sourcetype to match the format of the nginx access logs.

Let's see how we can convert this query into other platforms using ChatGPT!

User:
I want you to act as a query converter. I will give you A json input in the following format
{
"source": "{source platform}",
"query": "{query}",
"destination_platforms": "{destination platforms}",
"description": "{short_description}"
}.
query will have the query which needs to be converted, source will have the source platform, and description is a short description of the query. destination_platforms will have a comma-separated list of platforms into which the query needs to be converted. Using these inputs, I want you to convert the query into the destination platforms. The output should be in bullet points, one bullet point for each destination platform. I don't want large explanations in replies. Prompt with the sample JSON required.
ChatGPT:

Sure, I can help you with that. Please provide me with the JSON input and I'll convert the query into the destination platforms.
User:
{
  "source": "splunk",
  "query": "sourcetype=nginx status=404 | stats count",
  "destination_platforms": "elasticsearch, loki",
  "description": "count number of 404 status codes from nginx logs"
}
ChatGPT:

Based on the input you provided, here is the converted query for each destination platform:

Elasticsearch:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "sourcetype": "nginx"
          }
        },
        {
          "term": {
            "status": "404"
          }
        }
      ]
    }
  },
  "aggs": {
    "404_status_count": {
      "value_count": {
        "field": "status"
      }
    }
  }
}
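As a side note, if you want to try the converted query, it can be sent to Elasticsearch's _search API against whichever index holds the nginx logs. The endpoint, index pattern, and file name below are my own illustrative assumptions, not part of ChatGPT's output:

curl -s -X GET 'http://localhost:9200/nginx-*/_search' \
  -H 'Content-Type: application/json' \
  -d @converted_query.json

Here converted_query.json contains the JSON body above saved to a file.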


Loki:

{
  "query": "{source=\"nginx\"} | json | status=404 | count()",
  "limit": 1
}


Please note that the Elasticsearch query is in JSON format and the Loki query is in Splunk Search Processing Language (SPL) format. Also, the conversion might not be 100% accurate and might require further adjustment based on your specific use case.
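Two things are worth flagging in that note: Loki's query language is actually LogQL, not SPL, and the count() at the end of the generated query isn't valid LogQL syntax. A working equivalent, assuming the nginx logs carry a source="nginx" stream label, are JSON-formatted, and that counting over the last 5 minutes is acceptable (the window is my assumption), would look like:

count_over_time({source="nginx"} | json | status = "404" [5m])

Here count_over_time is LogQL's range aggregation for counting log lines, the closest equivalent to Splunk's stats count.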

Barring those small issues, the output for the prompt was convincing. Yes, ChatGPT can convert queries, even if the results need a quick review before use.

Natural language querying

We discussed the importance of queries in the last section. By converting queries from one platform to another, we learned that ChatGPT can understand observability query languages. Let's make a use case out of this. Natural language querying (NLQ) is a type of computer interface that allows users to ask questions or make requests using natural language, such as English, rather than a structured query language (SQL) or another programming language. ChatGPT can understand human language, and it can write queries as well.

So let's see if it is able to convert English into queries!

User:

I want you to act as a query generator. I will give you a JSON input in the following format { "destination_platforms": "{destination platforms}", "description": "{short_description}" }. destination_platforms will have a comma-separated list of platforms for which the query needs to be created. Using the inputs, I want you to create the query in the given destination platforms. The output should be in bullet points, one bullet point for each destination platform. I don't want large explanations in replies. Prompt with the sample JSON required.
ChatGPT:

Sure, I can help you generate queries for the desired destination platforms based on the provided input. Please provide me with the JSON input.
User:

{
  "destination_platforms": "splunk, loki",
  "description": "count number of 500 status codes from nginx logs stored in nginx index"
}
ChatGPT:

Based on the provided input, here is the query generation for the desired destination platforms:

Splunk:
sourcetype=nginx index=nginx status=500 | stats count

Loki:

{
  "query": "{source=\"nginx\"} | json | status=500 | count()",
  "limit": 1
}
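As with the 404 example, the generated Loki query needs a small correction before it will actually run. A working LogQL version, under the same assumptions as before (a source="nginx" stream label, JSON log lines, and a 5-minute window), would be:

count_over_time({source="nginx"} | json | status = "500" [5m])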

With that caveat, the outputs were as expected. Yes, ChatGPT can make an observability engineer's day better.

Capacity planning

Capacity planning typically involves estimating the expected workload and performance requirements of an application or system, and then selecting the appropriate hardware, software, and other resources to meet those requirements. This includes analyzing factors such as CPU, memory, disk storage, network bandwidth, and other resources required to support the application or system.

As ChatGPT was able to suggest a monitoring stack, let's see whether it can do the capacity planning as well.

User:

I want you to act as an observability engineer. I will give you a JSON input in the following format { "stack": "{stack}", "daily_ingestion": "{ingestion_rate}", "retention": "{retention}", "extra_parameters": "{parameters}" }. "stack" will contain the stack for which you need to do capacity planning, "daily_ingestion" will contain the daily ingestion volume in MB/GB/TB, and "retention" will contain the retention in days. "extra_parameters" will contain comma-separated keywords which need to be considered during capacity planning. I want you to do capacity planning based on the values in the input JSON, and the output should be in an essay format covering the following topics: Growth rate, Hardware requirements, Storage capacity, Network bandwidth, Fault tolerance, Disaster recovery. The output values should be calculated based on the input JSON. Wait for the JSON input to start processing. Prompt with the sample JSON required.
ChatGPT:

Sure, please provide me with the input JSON to start capacity planning.
User:
{"stack" : "elastic". "daily ingestion": "500MB" "retention" : "30 days"
"extra parameters": "High availability"}

ChatGPT:

Based on the input JSON, here is the capacity planning information for the Elastic stack:

Growth Rate:

  • The daily ingestion volume is 500MB, which means the monthly ingestion is 15,000MB (500MB * 30 days).
  • To account for future growth, we recommend planning for at least a 20% increase in ingestion volume in the next 6-12 months.

Hardware Requirements:

  • The Elastic stack requires a minimum of 16GB of RAM, 4 CPU cores, and 50GB of disk space for a single node.
  • To ensure high availability, it is recommended to have at least 3 nodes, each with the same minimum hardware requirements.

Storage Capacity:

  • With a daily ingestion volume of 500MB and a retention period of 30 days, the required storage capacity for the Elastic stack would be 15,000MB (500MB * 30 days).
  • To account for future growth, we recommend having at least 50% additional storage capacity.

Network Bandwidth:

  • The required network bandwidth for the Elastic stack would depend on the ingestion rate and the number of nodes in the cluster.
  • Assuming three nodes with a daily ingestion volume of 500MB, the required network bandwidth would be at least 1.5GB (500MB * 3 nodes).

Fault Tolerance:

  • To ensure fault tolerance in the Elastic stack, we recommend having at least 3 nodes in the cluster with replication enabled.
  • In addition, it is recommended to have a load balancer in front of the Elastic cluster to distribute traffic and ensure high availability.

ChatGPT was able to take input from the JSON and provided a base architecture for my stack.
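One thing worth double-checking in an answer like this is the storage math. The 15,000MB figure ChatGPT gives is raw ingest only; if the cluster keeps one replica per shard, which is a common setup when high availability is requested (an assumption on my part, not something stated in the output), the on-disk footprint roughly doubles. A quick back-of-the-envelope check:

raw data: 500MB/day * 30 days = 15,000MB (~15GB)
with 1 replica: 15GB * 2 = ~30GB
with 50% growth headroom: 30GB * 1.5 = ~45GB

So the cluster should be sized for roughly 45GB of usable storage rather than the 15GB the raw numbers suggest, before accounting for indexing overhead.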

Writing alert rules

Well, as we saw, ChatGPT can suggest stacks and do capacity planning even for observability tools. Now let's see if ChatGPT can help engineers create alerts, considering its text-processing capability. Installing tools doesn't make a system fault tolerant. For completeness we need proactive monitoring and alerts. Alerting systems are very important to the creation of an observable platform. Alerts can keep you informed about possible problems by continuously watching particular metrics that define the performance and health of the system, and alerting systems can notify you about them, so you don't have to always look for them manually.

Almost every observability tool has alerting built in. Normally alerts are created manually, but nowadays many systems have anomaly detection capabilities which don't need manual alert creation.

Let's see whether ChatGPT can assist us in creating alerts in various platforms.

User:

I want you to act as an alert rule engine which has knowledge of all available monitoring systems. I will give you a sample JSON as input, and you should give me an alert rule which I can import into the target system, or steps to follow for the alert creation. I will give you a JSON input in the following format { "target_system": "{target_system}", "alert_description": "{description}" }. "target_system" is the observability tool in which the rule needs to be created. "alert_description" will have details about what alert to create. The output should be the alert rule or steps to follow. Wait for the JSON input to start processing. Prompt with the sample JSON required.
ChatGPT:

Sure, I can help you with that. Please provide me with the sample JSON input. Here's an example JSON input for your reference:

{
  "target_system": "Prometheus",
  "alert_description": "Alert when the CPU usage of a server exceeds 90% for last 5 minutes"
}

Please provide me with a similar JSON input and I will provide you with the corresponding alert rule or steps to follow.
User:


{
  "target_system": "Prometheus",
  "alert_description": "Alert when the CPU usage of a server exceeds 90% for last 5 minutes"
}
ChatGPT:

Based on the provided JSON input, here's a sample Prometheus alert rule that you can import into your target system:

groups:
  - name: CPUUsageAlert
    rules:
      - alert: HighCPUUsage
        expr: (100 - (avg by (instance) (irate(node_cpu_seconds_total{mode='idle'}[5m])) * 100)) > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected on {{ $labels.instance }}"
          description: "CPU usage has been above 90% for the past 5 minutes on {{ $labels.instance }}"

This rule will trigger an alert named "HighCPUUsage" if the CPU usage of a server exceeds 90% for more than 5 minutes. The severity of the alert will be "critical", and the summary and description of the alert will contain the name of the server where the high CPU usage was detected.
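A practical tip when working with generated rules like this: Prometheus ships with promtool, which can validate a rule file's YAML and PromQL syntax before you load it. Assuming the rule above is saved to a file (cpu_alert.yml here is just an illustrative name):

promtool check rules cpu_alert.yml

This catches malformed or truncated expressions, which generated output is prone to, before they ever reach the server.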

The outputs were promising, and it gave a good brief on creating alerts.

These are not the only use cases; more will emerge as the model evolves into something bigger. For now, we can conclude here. Observability engineers are responsible for ensuring that computer systems and applications run smoothly without downtime, and AI models like ChatGPT can help them automate repetitive tasks, identify problems faster, and provide better solutions for complex issues. This can ultimately lead to improved system performance and better user experiences.