
Docs/specs/ZBXNEXT-444

From Zabbix.org

Sending of "lastlogsize", "mtime" and "state" to the server

ZBXNEXT-444

Status: 1.0

Owner: Aleksandrs

Summary

There are three problems this specification attempts to solve:

  1. Currently, the agent only sends "lastlogsize" and "mtime" when it finds a matching line in the log file. When the log file receives significant traffic and matching lines occur only rarely, this causes a performance overhead, because after a restart the agent rescans the log file from the position of the last match. This was reported in ZBXNEXT-444.
  2. When a log file gets rotated away, the log item becomes unsupported. When the log file appears again, the log item stays unsupported indefinitely, until a matching line appears. This was reported in ZBXNEXT-444, too.
  3. When a log file gets truncated, the agent should notify the server about this fact, so that no data is lost in case the log file grows beyond "lastlogsize" during agent downtime. This was reported in ZBXNEXT-91.

What follows is a way for the agent to periodically send "lastlogsize", "mtime" and "state" updates to the server and proxy.

Specification (agent)

The active agent sends "lastlogsize", "mtime" and "state" updates at the same time as it usually sends historical data. If it has data to send for the log item, it sends "lastlogsize" and "mtime" together with that data, as before. If it has no data to send, or "lastlogsize" and "mtime" have increased since the last data was acquired, or "state" has changed, it sends a "lastlogsize", "mtime" and "state" update with the "value" JSON tag omitted (note that if the "state" tag is omitted, it defaults to "0"):

// regular message
{
    "host":"Server 115",
    "key":"logrt[/var/log/.*messages,linux]",
    "value":"Jan 26 18:26:48 linux-h5fr dhcpcd[3410]: eth0: broadcasting for a lease",
    "lastlogsize":225,
    "mtime":1360297600,
    "clock":1360317696,
    "ns":103174218
}

// "lastlogsize" and "mtime" changed
{
    "host":"Server 115",
    "key":"logrt[/var/log/.*messages,linux]",
    "lastlogsize":375,
    "mtime":1360297610,
    "clock":1360317697,
    "ns":471824374
}

// item became not supported
{
    "host":"Server 115",
    "key":"logrt[/var/log/.*messages,linux]",
    "value":"Cannot open file \"/var/log/messages\": [13] Permission denied",
    "lastlogsize":375,
    "mtime":1360297610,
    "clock":1360317712,
    "ns":44846123,
    "state":1
}

// item became supported
{
    "host":"Server 115",
    "key":"logrt[/var/log/.*messages,linux]",
    "lastlogsize":412,
    "mtime":1360297715,
    "clock":1360317718,
    "ns":536131222
}
...
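
The message shapes above can be sketched as a single record builder. This is only an illustration in Python, not the agent's actual code: the function name build_log_update and its parameters are hypothetical, and the field names are taken from the examples above.

```python
import json

def build_log_update(host, key, lastlogsize, mtime, clock, ns,
                     value=None, state=0):
    """Build one record for a log item (hypothetical helper).

    "value" is left out entirely for a meta information update, and
    "state" is left out when it is 0, since an omitted tag defaults to 0.
    """
    record = {"host": host, "key": key}
    if value is not None:
        record["value"] = value          # matching line or error message
    record["lastlogsize"] = lastlogsize
    if mtime is not None:
        record["mtime"] = mtime          # only logrt[] items carry "mtime"
    record["clock"] = clock
    record["ns"] = ns
    if state != 0:
        record["state"] = state
    return record

# Meta information update: no "value", no "state".
meta_update = build_log_update(
    "Server 115", "logrt[/var/log/.*messages,linux]",
    375, 1360297610, 1360317697, 471824374)
print(json.dumps(meta_update, indent=4))
```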

The rationale for omitting the "value" tag (rather than setting it to "null") is that Zabbix 2.4 parses a JSON "null" value as an empty string, which can be saved into the database. If we omit the tag entirely, the entry will be skipped by a Zabbix 2.4 server. While we do not require that an older server works with a newer agent, this prevents the negative consequences of configuration mistakes (a lot of empty values in history).
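
The difference between the two encodings can be illustrated with a small model of the older behaviour described above. This is not Zabbix server code; the function name, the item key and the flat list payload are assumptions made for the example.

```python
import json

def values_saved_by_legacy_server(payload: str) -> list:
    """Model of the behaviour described above (hypothetical):
    a null "value" is read as an empty string and saved,
    while a record without the "value" tag is skipped."""
    saved = []
    for record in json.loads(payload):
        if "value" not in record:
            continue                     # tag omitted: entry skipped
        value = record["value"]
        saved.append("" if value is None else value)
    return saved

with_null = '[{"key": "log[/tmp/app.log]", "value": null, "lastlogsize": 375}]'
omitted = '[{"key": "log[/tmp/app.log]", "lastlogsize": 375}]'

print(values_saved_by_legacy_server(with_null))
print(values_saved_by_legacy_server(omitted))
```

In this model only the "null" encoding leaves an empty value behind in history.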

Note that, according to the protocol, log[] and eventlog[] items only send "lastlogsize", whereas logrt[] items send both "lastlogsize" and "mtime" (see protocol documentation). The server and proxy should account for that and only update the necessary fields.
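
A minimal sketch of this per-key field selection follows. The helper names are hypothetical, and items are modelled as plain dictionaries rather than real cache structures.

```python
def meta_fields_for_key(key: str) -> tuple:
    """Return the meta fields that an item key is expected to send."""
    name = key.split("[", 1)[0]          # "logrt[...]" -> "logrt"
    if name == "logrt":
        return ("lastlogsize", "mtime")
    if name in ("log", "eventlog"):
        return ("lastlogsize",)
    return ()

def apply_meta(item: dict, record: dict) -> dict:
    """Copy only the fields relevant to the key from record into item."""
    updated = dict(item)
    for field in meta_fields_for_key(record["key"]):
        if field in record:
            updated[field] = record[field]
    return updated
```

With this approach a stray "mtime" in a log[] record is ignored instead of overwriting the stored value.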

The agent should send a single meta information update packet per check. That is, if multiple non-matching lines are found during a check, the agent should not generate a separate packet for every line.
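
The coalescing rule can be sketched as follows. This is an illustration only: the function run_check is hypothetical, lines are given as (text, size in bytes) pairs, and the sketch assumes a meta update is needed only when the read position has advanced past the last record already queued.

```python
import re

def run_check(lines, pattern, start_offset):
    """Scan new log lines once and return the records to send.

    Matching lines produce one record each; all trailing non-matching
    lines are covered by at most one meta information update.
    """
    regex = re.compile(pattern)
    records = []
    offset = start_offset
    last_sent = start_offset             # position covered by queued records
    for text, size in lines:
        offset += size
        if regex.search(text):
            records.append({"value": text, "lastlogsize": offset})
            last_sent = offset
    if offset != last_sent:
        records.append({"lastlogsize": offset})   # single meta update
    return records

new_lines = [("debug a", 8), ("ERROR disk full", 16), ("debug b", 8), ("debug c", 8)]
for record in run_check(new_lines, "ERROR", 0):
    print(record)
```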

Database changes

The following fields should be added to "proxy_history" table:

 FIELD		|lastlogsize	|t_bigint	|'0'	|NOT NULL	|0
 FIELD		|mtime		|t_integer	|'0'	|NOT NULL	|0
 FIELD		|meta		|t_integer	|'0'	|NOT NULL	|0

Here, a "meta" value of 1 indicates that no data is associated with this history entry; only meta information such as "lastlogsize" and "mtime" should be updated. If "meta" is 0 (the default), the entry is processed in the same way as before.
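
The effect of the "meta" flag can be tried out with a reduced, hypothetical version of the table. The sketch below uses SQLite from Python purely for illustration: only the three new fields come from this specification, and the remaining columns and their types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE proxy_history (
        id          INTEGER PRIMARY KEY,
        itemid      INTEGER NOT NULL,
        value       TEXT    NOT NULL DEFAULT '',
        lastlogsize BIGINT  NOT NULL DEFAULT 0,
        mtime       INTEGER NOT NULL DEFAULT 0,
        meta        INTEGER NOT NULL DEFAULT 0
    )""")

# Regular entry: "meta" keeps its default of 0 and "value" carries data.
conn.execute(
    "INSERT INTO proxy_history (itemid, value, lastlogsize, mtime) VALUES (?, ?, ?, ?)",
    (1, "Jan 26 18:26:48 linux-h5fr dhcpcd[3410]: eth0: broadcasting for a lease",
     225, 1360297600))

# Meta-only entry: "value" stays an empty string (never NULL), "meta" is 1.
conn.execute(
    "INSERT INTO proxy_history (itemid, lastlogsize, mtime, meta) VALUES (?, ?, ?, 1)",
    (1, 375, 1360297610))

data_rows = conn.execute(
    "SELECT value FROM proxy_history WHERE meta = 0").fetchall()
meta_rows = conn.execute(
    "SELECT lastlogsize, mtime FROM proxy_history WHERE meta = 1").fetchall()
print(len(data_rows), meta_rows)
```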

The rationale for adding an extra "meta" field, rather than making "value" field accept NULL, is twofold: (1) we have a convention not to make string fields in the database NULL, (2) Oracle does not distinguish between NULL and empty strings.

Specification (server and proxy)

When the server receives a message from an agent without the "value" tag, it should update the "lastlogsize", "mtime" and "state" meta information. This should be done through the history cache, so that the update does not overtake historical data received earlier. However, no history should be stored in the database: only the above meta information should be updated in the configuration cache and the database.
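
The ordering requirement can be pictured with a simple first-in, first-out queue standing in for the history cache. The sketch below is a model under stated assumptions, not the server's implementation: flush_history_cache, the item dictionaries and the "error" field are hypothetical.

```python
from collections import deque

def flush_history_cache(cache, items, history):
    """Process queued records strictly in arrival order.

    Every record updates the item's meta information; only records that
    carry a "value" for a supported item add a row to history.
    """
    while cache:
        record = cache.popleft()
        item = items[record["key"]]
        for field in ("lastlogsize", "mtime"):
            if field in record:
                item[field] = record[field]
        item["state"] = record.get("state", 0)   # omitted tag defaults to 0
        if item["state"] == 1:
            # Assumption: the error text is kept with the item, not in history.
            item["error"] = record.get("value", "")
        elif "value" in record:
            history.append((record["key"], record["value"]))

key = "logrt[/var/log/.*messages,linux]"
items = {key: {"lastlogsize": 0, "mtime": 0, "state": 0}}
history = []
cache = deque([
    {"key": key, "value": "eth0: broadcasting for a lease",
     "lastlogsize": 225, "mtime": 1360297600},
    {"key": key, "lastlogsize": 375, "mtime": 1360297610},   # meta only
])
flush_history_cache(cache, items, history)
print(items[key], history)
```

Because the meta-only record sits behind the data record in the same queue, the stored position never moves ahead of data that is still waiting to be written.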

When a proxy receives such a message, it should update its configuration cache and database, and also write a record into the "proxy_history" table with the updated meta information fields and a "meta" value of "1". This applies only to meta information updates for items in the supported state. For items in the unsupported state, records are inserted into "proxy_history" as before, with a "state" of "1" and "value" containing the error message.

When a proxy sends its data to the server, it should send the meta records in the same manner as the agent sends meta information updates, i.e. with the "value" tag omitted. The server should then process these records in the same way as it processes such records from the agent.
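
A sketch of how a proxy might turn stored rows into outgoing records is shown below. The function name and the row layout (a dictionary with "host", "key", "clock" and "ns" alongside the fields from this specification) are assumptions for illustration.

```python
def proxy_row_to_record(row: dict) -> dict:
    """Convert one "proxy_history" row into an outgoing record.

    Rows with "meta" set to 1 are sent without the "value" tag, in the
    same way as the agent sends meta information updates.
    """
    record = {"host": row["host"], "key": row["key"]}
    if row["meta"] == 0:
        record["value"] = row["value"]
    record["lastlogsize"] = row["lastlogsize"]
    record["mtime"] = row["mtime"]
    record["clock"] = row["clock"]
    record["ns"] = row["ns"]
    if row.get("state", 0) != 0:
        record["state"] = row["state"]
    return record

meta_row = {"host": "Server 115", "key": "logrt[/var/log/.*messages,linux]",
            "value": "", "lastlogsize": 375, "mtime": 1360297610,
            "clock": 1360317697, "ns": 471824374, "meta": 1, "state": 0}
print(proxy_row_to_record(meta_row))
```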

Documentation

ChangeLog

  • N/A