Proper end-to-end syslog logging with Apache httpd

Table of content

  • Context
  • Objectives
  • Folder structures and relevant files
  • Configurations
    • /etc/httpd/conf.d/httpd_base_module.conf
    • /etc/httpd/conf.d/zz-03-example.com.conf
    • /opt/fcrouzat/syslog/httpd-log.sh
    • /etc/rsyslog.d/zz-10-httpd.conf
    • Remote logging: /etc/rsyslog.d/zz-09-remote-httpd.conf
  • Caveats
  • Conclusions

Context

For an unknown reason, Apache httpd only supports syslog since 2.4 and only for error logs.
On top of that, default log format is very poor and (to me) doesn’t mean anything.
Add these two informations and you’ll give any sysadmin a headache.

Fortunately there is a way around these issues by clever usage of the named pipes. This W/A is not so complicated but requires to be careful and a properly configured stack.

Objectives

Using this workaround I will demonstrate how to perform proper end-to-end syslog logging with both httpd 2.2 and 2.4.
This is very helpful in PCI-DSS compliant environment because it allow you to easily centralize Apache httpd logs. As a reminder: unlike many other regulations and standards, PCI DSS explicitly requires organization to implement log management. Requirement 10 explicitly requires organizations stating: logging mechanisms and the ability to track user activities are critical in preventing, detecting, or minimizing the impact of a data compromise. The presence of logs in all environments allows thorough tracking, alerting, and analysis when something does go wrong. Determining the cause of a compromise is very difficult, if not impossible, without system activity logs.[1]

In addition to proper logging through syslog (both local and to remote hosts), and since we are about to modify our Apache httpd configuration for the better, we will use this opportunity to improve various things that always bugged me with httpd: having a maintained and useful LogFormat and having proper filenames for the local log files.

Testing and validation have been done under the following configurations:

  • RHEL 6.x + httpd-2.2
  • RHEL 7.x + httpd-2.4

Depending on how you might have already tweaked your LogFormat, your logs should looks like this:

15.211.201.81 - - [21/Jan/2016:16:42:25 +0100] "GET /wp-content/ HTTP/1.1" 200 13776 "http://floriancrouzat.net/2011/09/pacemaker-beginner-tips/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0"

It is not syslog-compliant, date format is quite unexpected, there is no app-name, and it is probably missing interesting informations. In addition, this is for example the combined LogFormat, but there are others and they are all different, you cannot find a discriminating way to filter them or manipulate them: remember, no app-name …
Assuming you’d like to parse or analyze these logs, you’d have to write very specific decoder because you cannot just use a filter on the app-name. The same goes for sending these logs to remote hosts over syslog, you cannot easily filter to only send Apache httpd logs without writing complex decoders…

We want at least to do the following: remove the poorly written datetime and add a valid syslog header with a deterministic app-name in front of the untouched message. A log line would then look like this:

Jan 15 03:28:07 http01.example.com httpd: 15.211.201.81 - - "GET /wp-content/ HTTP/1.1" 200 13776 "http://floriancrouzat.net/2011/09/pacemaker-beginner-tips/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0"

But while doing so, why not also improve everything by replacing the default LogFormat (combined) with one of our own that is a lot more powerful. Of course, feel free to adapt my “powerful” LogFormat to your needs. A log line would then look like this:

Jan 15 03:28:07 http01.example.com httpd: 192.168.1.1:43573 floriancrouzat.net:443 TLSv1.2(ECDHE-RSA-AES256-GCM-SHA384) "GET /index.html HTTP/1.1" 200 11301B "- - -"

Folder structures and relevant files

You can obviously organize your folder structure as you wish and name your files as you intend to. But, to be consistent I’ll provide the filenames I have been using.
All my configuration is done in /etc/httpd/conf.d/ and all *.conf files are included from httpd’ default configuration /etc/httpd/conf/httpd.conf. I never edit this package-shipped file. Same applies for rsyslog with /etc/rsyslog.d/ included from package-shipped /etc/rsyslog.conf

Here are the relevant files we are going to use:

  • /etc/httpd/conf.d/httpd_base_module.conf : define LogFormat and stuff module dependent
  • /etc/httpd/conf.d/zz-03-example.com.conf : virtualhost file for example.com
  • /etc/rsyslog.d/zz-10-httpd.conf : define rsyslog httpd-related configurations
  • /opt/fcrouzat/syslog/httpd-log.sh : syslog wrapper for httpd

Configuration

/etc/httpd/conf.d/httpd_base_module.conf

First, create a new LogFormat that match your needs, feel free to define your own but do not includes date, time or app-name. This has to be the raw message. The syslog header will be added later.

[...]
    LogFormat "%h:%{remote}p %{Host}i:%{canonical}p %{SSL_PROTOCOL}x(%{SSL_CIPHER}x) \"%m %U %H\" %>s %BB \"%!200,302,304{Referer}i %!200,302,304{User-Agent}i %!200,302,304{cookie}i\"" default
[...]

/etc/httpd/conf.d/zz-03-example.com.conf

We have to modify our virtualhost(s) to use the new LogFormat and also the named pipe.
Apache httpd must never write directly to disk anymore, everything has to be piped.
This named pipe execute a custom-made script that takes 3 arguments:

  • Virtualhost FQDN
  • Log type: value can be access or error
  • Encryption: value can be http or https
[...]
    CustomLog "|/opt/fcrouzat/syslog/httpd-log.sh example.com access https" default
    ErrorLog "|/opt/fcrouzat/syslog/httpd-log.sh example.com error https"
[...]

/opt/fcrouzat/syslog/httpd-log.sh

This is the wrapper used as a named pipe by httpd. Every single line of log is piped to this script via stdout.
This script uses logger to address rsyslog.
Argument 2 is used to determine if we are handling access or error logs.
Argument 3 is used to determine if we are handling http or https logs.
Argument 1 is passed after the fixed part to have the knowledge of the virtualhost name in rsyslog as you’ll see after
This is useful to use because we will use that to define filename pattern for our logs such as:

  • /var/log/httpd/<fqdn>.access.log
  • /var/log/httpd/<fqdn>.error.log
  • /var/log/httpd/<fqdn>.ssl.access.log
  • /var/log/httpd/<fqdn>.ssl.error.log

This is purely cosmetic and is not really syslog-related but having properly named logfile has shown to be useful.

#!/bin/bash

# Log through syslog using logger and set the "app-name" protocol header with
# "httpd-(access|error)-http[s]-".

while read line ; do
    [[ $2 == access ]] && [[ $3 == http ]]  && /bin/logger -p daemon.info  -t "httpd-access-http-${1}"
    [[ $2 == access ]] && [[ $3 == https ]] && /bin/logger -p daemon.info  -t "httpd-access-https-${1}"
    [[ $2 == error ]]  && [[ $3 == http ]]  && /bin/logger -p daemon.error -t "httpd-error-http-${1}"
    [[ $2 == error ]]  && [[ $3 == https ]] && /bin/logger -p daemon.error -t "httpd-error-https-${1}"
done

/etc/rsyslog.d/zz-10-httpd.conf

Here’s the deal: we now have httpd logging through a named piped that uses logger to perform syslog. We also have relevant informations from the app-name that has been set with the -t flag from logger. We can use all these properties to add the relevant syslog header on httpd raw message and log to a properly named logfile.

$PreserveFQDN on

#
# Log Format template: add a proper syslog header and append %msg%
#

$template HttpdFormat,"%timegenerated% %hostname% httpd: %msg%\n"
$template HttpdRemoteFormat,"<%PRI%>%timegenerated% %hostname% httpd: %msg%\n"

#
# Dynamic app-name logfile templates
#

# Remove "httpd-access-http-" from %app-name% to extract the virtualhost name
$template HttpAccessLog,"/var/log/httpd/%app-name:19:$:%.access.log"
# Remove "httpd-access-https-" from %app-name% to extract the virtualhost name
$template HttpsAccessLog,"/var/log/httpd/%app-name:20:$:%.ssl.access.log"
# Remove "httpd-error-http-" from %app-name% to extract the virtualhost name
$template HttpErrorLog,"/var/log/httpd/%app-name:18:$:%.error.log"
# Remove "httpd-error-https-" from %app-name% to extract the virtualhost name
$template HttpsErrorLog,"/var/log/httpd/%app-name:19:$:%.ssl.error.log"

#
# Syslog appname-based routing with proper template depending on access/error and http/https
#

if $app-name startswith 'httpd-access-http-' then ?HttpAccessLog;HttpdFormat
& stop

# HTTPS ACCESS LOG
if $app-name startswith 'httpd-access-https-' then ?HttpsAccessLog;HttpdFormat
& stop

# HTTP ERROR LOG
if $app-name startswith 'httpd-error-http-' then ?HttpErrorLog;HttpdFormat
& stop

# HTTPS ERROR LOG
if $app-name startswith 'httpd-error-https-' then ?HttpsErrorLog;HttpdFormat
& stop

Remote logging: /etc/rsyslog.d/zz-09-remote-httpd.conf

If you want to forward httpd logs to a remote host now that they are properly formatted, just add the following configuration prior to the & stop instructions, hence the numbering on the filename. We are also using a dedicated syslog template for remote messages that preserve the syslog priority by using <%PRI%>.
This setup is using TCP syslog forward with failover.

# APACHE
if $app-name startswith 'httpd-' then {
    *.* @@sysloghost01.example.com:601;HttpdRemoteFormat
    $ActionExecOnlyWhenPreviousIsSuspended on
    & @@sysloghost02.example.com:601;HttpdRemoteFormat
    $ActionExecOnlyWhenPreviousIsSuspended off
}
# DEFAULT (all)
else {
    *.* @@sysloghost01.example.com:601
    $ActionExecOnlyWhenPreviousIsSuspended on
    & @@sysloghost02.example.com:601
    $ActionExecOnlyWhenPreviousIsSuspended off
}

Caveat(s)

There is at least one caveat with this setup: for an unknown reason, with RHEL6+httpd-2.2 there is an extra leading space in the %msg% part of the log when piped. Because of that, you have to replace httpd: %msg%\n" with httpd:%msg%\n" in the LogFormat section of the rsyslog configuration to have a properly formatted message on the wire and in the logs.

Conclusions

Besides TL;DR and for those who survived, as we have seen in the introduction, having a proper end-to-end configuration for httpd and syslog is not available out-of-the-box and requires many adjustments and maybe skills. Thanks to this setup we have been able to make httpd log through syslog and also do a little cleanup of the configuration: we have properly named logfiles and relevant content, we can filter locally or remotely on the app-name and write decoders from there.

As for stressing this setup: I am currently running this on different servers with approx. 2,000,000 lines of log every day each and without any latency on RHEL7.x + httpd-2.4 and SELinux=enforcing.

[1] – https://jluby.blogs.balabit.com/2014/01/08/pci-dss-3-0-continues-to-emphasize-the-importance-of-log-management/

No comments yet, d'oh!

Share your thoughts

*