logpp

Section: User Commands (1)
Updated: August 2008
 

NAME

logpp - log preprocessor  

SYNOPSIS

logpp
[-b <IO block size>]
[-d]
[-f <logging facility>]
[-h]
[-i <input buffer size>]
[-l <logging level>]
[-p <pid file>]
[-r <reopen interval>]
[-s <sleep interval>]
[-t <logging tag>]
[-v]
<config file> ...
 

DESCRIPTION

Logpp is a tool for preprocessing event logs and feeding relevant information to other programs for storage or in-depth analysis. During its work, logpp reads lines appended to input files (like tail(1) in -f mode), matches the lines with patterns (e.g., regular expressions), converts matching lines according to given templates, and writes the results to given destinations. Logpp supports multi-line matching and several types of output destinations like regular files, FIFOs, external programs, and the system logger. Therefore, logpp can act as a filter in front of a more complex event log analysis system and increase that system's performance by weeding out irrelevant log data; it can work as a syslog gateway between the system logger and an application that doesn't use syslog(3); and it can convert multi-line log messages to shorter single-line messages and accomplish other log preprocessing tasks.

OPTIONS

-b N
Logpp attempts to read N bytes at once and writes at most N bytes at once during all IO operations. This implies that lines from configuration and input files longer than N bytes will be split. Also, the size of an output message written to a destination can't exceed N bytes. N must be a positive integer and defaults to 8192.
-d
Logpp runs as a daemon and employs syslog(3) for logging its own messages (otherwise the messages are written to standard error).
-f S
Logpp uses facility S for logging its own syslog messages (e.g., about its internal errors). S must be a string value (e.g., local0) and defaults to user. See syslog(3) for valid facility values.
-h
Print usage information.
-i N
Logpp creates an input buffer of N lines for each input source. The buffer contains the last N lines that have been read from a given source, and in order to match input from the source, the content of the buffer is compared with patterns. The user can therefore write patterns that match at most N lines. N must be a positive integer and defaults to 10.
-l S
Logpp uses verbosity level S for logging its own syslog/stderr messages - if the level of a message is lower than S, the message will not be logged. S must be a string value (e.g., err) and defaults to info. See syslog(3) for valid level values.
-p S
Logpp stores its process ID to file S.
-r T
If an input or output file is not open (e.g., logpp failed to open it at startup or it was closed due to an IO error), logpp attempts to reopen the file every T seconds until the open succeeds. T must be a positive integer; the default behavior is "no reopen".
-s T
If no data were read from input sources, logpp sleeps for T microseconds before attempting to read again. T must be a positive integer and defaults to 1000000 (1 second).
-t S
Logpp uses S as a tag (or "program name") for all syslog-style logging. S must be a string value and defaults to logpp.
-v
Print version information.
 

CONFIGURATION FILE

A logpp configuration file consists of input, output, filter, and flow definitions. A line with a keyword, definition name, and an opening brace ({) starts a definition, and a line with a closing brace (}) ends the definition. Lines between the first and the last line are keyword-value pairs, with whitespace separating the keyword from the value. Lines which begin with an octothorpe (#) are treated as comments and ignored (whitespace may precede the octothorpe). Empty lines or lines consisting of whitespace are also ignored.
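The syntax rules above can be illustrated with a short Python sketch. It is not logpp's own parser; the function name parse_config and the returned tuple layout are hypothetical, and the sketch strips trailing whitespace from values, which logpp itself may or may not do.

```python
def parse_config(text):
    """Parse definitions into (keyword, name, [(keyword, value), ...]) tuples.

    Illustrative only: follows the rules stated in this manual page,
    not the actual logpp source code.
    """
    definitions = []
    current = None  # the definition being read, or None when outside one
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty lines, whitespace-only lines, and comments are ignored
        if current is None:
            parts = line.split()
            if len(parts) != 3 or parts[2] != "{":
                raise ValueError(f"expected '<keyword> <name> {{': {raw!r}")
            current = (parts[0], parts[1], [])
        elif line == "}":
            definitions.append(current)
            current = None
        else:
            # whitespace separates the keyword from the value
            parts = line.split(None, 1)
            value = parts[1] if len(parts) > 1 else ""
            current[2].append((parts[0], value))
    if current is not None:
        raise ValueError("definition is missing its closing brace")
    return definitions
```

Feeding the sketch an input definition with two file lines yields one tuple whose third element lists both keyword-value pairs in file order.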

The input definitions are used for setting up input sources for logpp. Each definition starts with the input keyword and in the body of the definition, the file <filename> keyword-value pairs specify individual input sources which must be regular files, FIFOs, or standard input if <filename> is -. During its work, logpp reads lines appended to regular file sources and written to FIFO sources and standard input. It stores the lines without terminating newlines to input buffers of corresponding sources, and processes the stored lines according to flow definitions (see below).

If the input file is recreated or truncated, logpp reopens the file and continues to read it from the beginning (i.e., it does not follow files by i-node but rather by name). If an IO error occurs when reading from a file, the file will be closed, but logpp attempts to reopen it if the -r command line option was given.
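Following a file by name can be sketched in Python as shown below. This is an assumption about how such behavior is commonly implemented, not a description of logpp internals: the class name NameFollower is hypothetical, and the checks (a changed device/i-node pair, or a file size smaller than the current read offset) are one plausible way of noticing that a file was recreated or truncated.

```python
import os

class NameFollower:
    """Follow a file by name; restart from the beginning after
    truncation or re-creation (illustrative sketch only)."""

    def __init__(self, path):
        self.path = path
        self.fh = open(path, "rb")
        self.fh.seek(0, os.SEEK_END)  # like tail -f: only appended lines matter

    def _reopen_if_replaced(self):
        try:
            on_disk = os.stat(self.path)
        except FileNotFoundError:
            return  # the name is temporarily missing; keep the old handle
        current = os.fstat(self.fh.fileno())
        recreated = (on_disk.st_dev, on_disk.st_ino) != (current.st_dev, current.st_ino)
        truncated = on_disk.st_size < self.fh.tell()
        if recreated or truncated:
            self.fh.close()
            self.fh = open(self.path, "rb")  # a fresh handle starts at offset 0

    def read_lines(self):
        """Return the complete lines that appeared since the last call."""
        self._reopen_if_replaced()
        lines = []
        while True:
            pos = self.fh.tell()
            line = self.fh.readline()
            if not line.endswith(b"\n"):
                self.fh.seek(pos)  # incomplete line: try again on the next poll
                break
            lines.append(line[:-1].decode("utf-8", "replace"))
        return lines
```

A caller would invoke read_lines() in a loop and sleep between polls when nothing was returned, in the spirit of the -s option. Note that a file truncated and then refilled beyond the old offset between two polls would go unnoticed by this simple size check.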

Here are example input definitions:

input var-log-messages {
  file /var/log/messages
}

input httpd-accesslogs {
  file /var/log/httpd/access_log
  file /var/log/httpd/ssl_access_log
}

input var-cron-log {
  file /var/cron/log
}

The output definitions are used for setting up output destinations for logpp. Each definition starts with the output keyword and in the body of the definition, the file <filename>, syslog <priority>, and exec <commandline> keyword-value pairs specify individual output destinations. The file keyword tells logpp to write its output to a file, the syslog keyword to log its output to the system logger, and the exec keyword to pipe its output to an external program.

The <filename> must be a regular file or FIFO. If <filename> is a regular file, logpp writes to it in append mode; if <filename> does not exist at logpp startup, it is created as a regular file. If an IO error occurs when writing to a file, the file will be closed, but logpp attempts to reopen it if the -r command line option was given.

The <priority> is a syslog facility.level pair (e.g., mail.err) that logpp will use when logging its output to the system logger. The <commandline> is a command line that is executed as a separate process, with its standard input connected to logpp (for each output operation a new process is created).

Here are example output definitions:

output var-log-logpp {
  file /var/log/logpp
}

output syslog-warning {
  syslog daemon.warning
}

output syslog-crit-and-mail {
  syslog auth.crit
  exec /bin/mail -s "logpp message" root@localhost
}

The filter definitions are used for setting up input matching and conversion schemes for logpp. Each definition starts with the filter keyword and in the body of the definition, the regexp<num> <regular_expression>, nregexp<num> <regular_expression>, tvalue <truth_value>, and template <conversion_string> keyword-value pairs define the matching and conversion scheme.

The regexp<num> keyword is used for specifying a regular expression for matching <num> lines (if <num> is omitted, it defaults to 1). If <num> is 1, the last line from an input source is taken from the source's input buffer and compared with the regular expression. If <num> is greater than 1, <num> last lines from an input source are taken from its buffer, concatenated with the newline acting as separator between lines, and the result is compared with the regular expression. Thus, the -i command line option sets an upper limit for the value of <num>.
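The relationship between the input buffer and regexp<num> can be sketched in Python as follows. The class InputBuffer and its methods are hypothetical names used for illustration, Python's re module stands in for the PCRE or POSIX engine logpp is built with, and returning "no match" while fewer than <num> lines have been read is an assumption. The sample cron lines in the usage note are invented.

```python
import re
from collections import deque

class InputBuffer:
    """Keep the last `size` lines of one input source (cf. the -i option)."""

    def __init__(self, size=10):
        self.lines = deque(maxlen=size)  # older lines fall out automatically

    def push(self, line):
        # lines are stored without their terminating newline
        self.lines.append(line.rstrip("\n"))

    def match(self, pattern, num=1):
        """Compare the last `num` lines, joined by newlines, with `pattern`."""
        if num > self.lines.maxlen:
            raise ValueError("num exceeds the input buffer size")
        if len(self.lines) < num:
            return None  # assumption: not enough lines read yet means no match
        window = "\n".join(list(self.lines)[-num:])
        return re.search(pattern, window)
```

With the lines "> CMD: /usr/bin/backup" and "> root 4242 c Mon" pushed in that order, match(r"^>\s*CMD: (.+)\n>\s*(\S+)\s+(\d+)", num=2) succeeds, while the same call with num=1 fails because only the last line is examined.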

The nregexp<num> keyword is used for specifying a negative regular expression: the lines are considered matching if the expression itself does not match them. The truth value given with tvalue matches all lines if the value is true, and matches no lines if the value is false.

The template keyword defines a conversion string for the preceding regexp, nregexp, or tvalue keyword. The conversion string may contain $<num> match variables that are set by bracketing constructs inside the regular expressions. The $0 match variable is set to the line(s) that took part in the matching operation; the $~ match variable is set to the name of the input file the line(s) were read from. Note that $0 and $~ are the only variables that are set for the nregexp and tvalue keywords.
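Match variable substitution can be sketched in Python as follows. The function name expand and its parameters are hypothetical, and leaving an unset $<num> variable untouched in the output is an assumption, since this page does not say what logpp does in that case. The sample access log line in the usage note is invented.

```python
import re

def expand(template, match, lines, source):
    """Substitute $<num> and $~ in a conversion string.

    match  - re.Match object, or None for nregexp and tvalue patterns
    lines  - the line(s) that took part in the matching operation ($0)
    source - name of the input file the line(s) were read from ($~)
    """
    def repl(m):
        token = m.group(1)
        if token == "~":
            return source
        if token == "0":
            return lines
        index = int(token)
        if match is None or index > match.re.groups:
            return m.group(0)  # assumption: unset variables are left as-is
        return match.group(index) or ""

    return re.sub(r"\$(~|\d+)", repl, template)
```

For the line 192.168.0.7 - - "GET /index.php HTTP/1.1" 200 and the regular expression of the httpd-php-access-192.168.0 example filter, the template "Host $1 accessed the PHP script $2" expands to "Host 192.168.0.7 accessed the PHP script /index.php".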

The patterns given with regexp, nregexp, and tvalue keywords are compared with the content of the input buffer in the order they are specified, and if a pattern matches, the search for further matches stops. If the matching pattern has a conversion string, its match variables are substituted with their values and the result is written to output destinations given with flow definitions (see below). If there is no conversion string, the pattern produces no output and acts as a suppression condition.
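The first-match-wins evaluation order within one filter can be sketched in Python as follows. The function apply_filter and the rule tuples are hypothetical; only regexp and nregexp patterns matching a single line are modelled, and Python's re module stands in for logpp's regular expression engine.

```python
import re

def apply_filter(rules, text):
    """Evaluate rules in order; return the converted text or None.

    rules - list of (kind, pattern, template) tuples in configuration order,
            where kind is "regexp" or "nregexp" and template may be None.
    """
    for kind, pattern, template in rules:
        found = re.search(pattern, text)
        matched = (found is not None) if kind == "regexp" else (found is None)
        if not matched:
            continue
        if template is None:
            return None  # suppression: the search stops and nothing is written
        # $0 is the input text; $1.. exist only for a positive regexp match
        groups = (text,) + (found.groups() if found else ())
        return re.sub(r"\$(\d+)", lambda m: groups[int(m.group(1))] or "", template)
    return None  # no pattern matched
```

With the two rules of the cisco-cpu example filter, a CPUHOG message from 192.168.1.111 hits the first rule and produces nothing, whereas the same message from another device falls through to the second rule and yields "Device <address> cpu hog".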

Here are example filter definitions:

filter cisco-cpu {
  # messages from device 192.168.1.111 are ignored
  regexp 192\.168\.1\.111

  # cpu hog events from other devices produce output
  regexp ([0-9\.]+) [0-9]+: %SYS-3-CPUHOG
  template Device $1 cpu hog
}

filter cisco-link {
  # link down events produce output
  regexp ([0-9\.]+) [0-9]+: %LINK-3-UPDOWN: Interface (.+), changed state to down
  template Device $1 link $2 down

  # link up events produce output
  regexp ([0-9\.]+) [0-9]+: %LINK-3-UPDOWN: Interface (.+), changed state to up
  template Device $1 link $2 up
}

filter httpd-php-access-192.168.0 {
  # messages for other nets than 192.168.0 are ignored
  nregexp ^192\.168\.0\.

  # PHP script accesses from 192.168.0 produce output
  regexp ^([0-9\.]+).*"GET (.+\.php) HTTP/[0-9\.]+"
  template Host $1 accessed the PHP script $2
}

filter cron-cmd-started {
  # match cron "command started" messages that span over
  # two lines and convert them into single line messages
  # (the regular expression is written in Perl dialect)
  regexp2 ^>\s*CMD: (.+)\n>\s*(\S+)\s+(\d+)
  template Cron started command $1 (user $2 pid $3)
}

filter 192.168.7.113 {
  # lines with 192.168.7.113 are important in all logs
  regexp 192\.168\.7\.113
  template $0
}

The flow definitions are used for setting up processing flows for logpp. Each definition starts with the flow keyword and in the body of the definition, the input <name>, filter <name>, and output <name> keyword-value pairs define the flow's inputs, the matching and conversion schemes that are applied to the inputs, and the outputs where the results of the matching and conversion are written. The <name> parameter for all keywords must be the name of a previously defined input, filter, or output. Note that if more than one filter has been specified, a matching pattern in one filter does not prevent line(s) from being matched by patterns in other filters.

Here are example flow definitions:

# this flow accepts lines from /var/log/messages as input;
# it writes cisco "cpu hog" messages from other hosts than
# 192.168.1.111 to /var/log/logpp, and cisco "link down" and
# "link up" messages from all hosts to /var/log/logpp

flow cisco {
  input var-log-messages
  filter cisco-cpu
  filter cisco-link
  output var-log-logpp
}

# this flow accepts lines from httpd access logs as input;
# it generates a syslog warning-level message when a PHP
# script is accessed from the 192.168.0 network

flow php {
  input httpd-accesslogs
  filter httpd-php-access-192.168.0
  output syslog-warning
}

# this flow accepts lines from cron daemon log as input;
# it writes messages about started commands to /var/log/logpp

flow cron {
  input var-cron-log
  filter cron-cmd-started
  output var-log-logpp
}

# this flow accepts lines from /var/log/messages and httpd access
# logs as input; it generates a syslog crit-level message and sends
# an e-mail to the local root user if a line with the IP address
# 192.168.7.113 appears in the logs

flow 192.168.7.113 {
  input var-log-messages
  input httpd-accesslogs
  filter 192.168.7.113
  output syslog-crit-and-mail
}

SIGNALS

SIGHUP
Logpp will close all inputs, outputs and the connection to the system logger, reread the configuration and reinitialize itself.
SIGUSR1
Logpp will write its status information to syslog or stderr.
SIGUSR2
Logpp will reopen its outputs and the connection to the system logger.
SIGTERM
Logpp will terminate gracefully.
 

NOTES

Logpp can be built with the support for Perl-compatible regular expressions (described in perlre(1)) if the local system has the PCRE library (see pcre(3)).

If logpp has been built with the POSIX regular expression support, the regular expressions are compiled with REG_EXTENDED | REG_NEWLINE flags (see regcomp(3) for details).

In order to prevent itself from blocking during calls to write(2), logpp opens FIFOs and pipes in non-blocking mode. If the consumer of the FIFO or pipe is not reading the data fast enough, write(2) to the FIFO or pipe will fail, and logpp will not attempt to write again.  
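The effect of non-blocking mode on a slow consumer can be demonstrated with a short Python sketch. It is an illustration of the general write(2) behavior on a full pipe, not logpp code: the function name try_write is hypothetical, and treating a failed write as a dropped message is one reading of the paragraph above.

```python
import os

def try_write(fd, data):
    """Attempt a single write; never block.

    Returns True if the write was accepted and False if the pipe or FIFO
    was full, in which case the data is dropped and not retried.
    """
    try:
        os.write(fd, data)
        return True
    except BlockingIOError:  # EAGAIN: the consumer is not reading fast enough
        return False

def open_nonblocking_pipe():
    """Create a pipe whose write end is in non-blocking mode."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    return read_fd, write_fd
```

If nothing reads from the pipe, repeated calls to try_write() succeed until the kernel's pipe buffer is full and then return False immediately instead of stalling the caller.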

AUTHOR

Risto Vaarandi <ristov at users d0t s0urcef0rge d0t net>  

SEE ALSO

pcre(3), perlre(1), regcomp(3), syslog(3), tail(1), write(2)


 
