Apache access logs in Kibana

I needed a more convenient way to view my Apache access logs than tailing the access log files on my webserver. Why not use Kibana for this? It not only shows you the access log lines, it also lets you create nice graphs about visitors, response codes, user agents, etcetera.

Also see my update in Apache access logs in Kibana – part 2.

First I had to install Kibana. While looking into how to do this, I came across a really clear set of instructions by Mitchell Anicas. It explains how to install Elasticsearch, Logstash and Kibana (also known as ELK) on Ubuntu, and it also works for my Debian installation.

With Kibana up and running, I needed Apache to ship its logs to Kibana (actually to Logstash, which collects logs for Elasticsearch, which in turn is used by Kibana. Still following? 🙂 ). This could be done with the logstash-forwarder, but I used a different approach, which I describe below.

First I had to modify the patterns used by Logstash, since it only ships a standard pattern for Apache’s “combined” log format. As I run multiple virtual hosts on my webserver, I use the “vhost_combined” log format, which prefixes each line with the virtual host name and port. So let’s add a pattern for that, so Logstash understands it.

We add a new line to grok-patterns which is located in /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-0.3.0/patterns (that is /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.2/patterns when using Logstash 2.0).

COMBINEDAPACHE_VHOST %{HOSTNAME:vhost}\:%{NUMBER:port} %{COMBINEDAPACHELOG}
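To get a feel for what this pattern captures, here is a rough Python approximation of it. This is only a sketch: the regular expression is my own simplification and not the real grok definition, the field names follow the ones COMBINEDAPACHELOG produces (clientip, response, agent and so on), and the sample line with its host name and IP address is made up.

```python
import re

# Simplified stand-in for:
#   %{HOSTNAME:vhost}\:%{NUMBER:port} %{COMBINEDAPACHELOG}
# The real grok patterns are stricter; this only illustrates the structure.
VHOST_COMBINED = re.compile(
    r'(?P<vhost>[\w.-]+):(?P<port>\d+) '
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[\d.]+))?" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = VHOST_COMBINED.match(line)
    return match.groupdict() if match else None


# Made-up example line in the vhost_combined layout.
sample = (
    'www.example.org:80 203.0.113.7 - - [10/Oct/2015:13:55:36 +0200] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.example.org/" "Mozilla/5.0"'
)

fields = parse_line(sample)
print(fields["vhost"], fields["port"], fields["clientip"], fields["response"])
```

The vhost and port groups at the front are the only difference from a plain “combined” line; everything after the first space is what COMBINEDAPACHELOG already handles.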

I noticed this didn’t work really well after upgrading Logstash. After every upgrade the pattern had to be added again, which really drove me crazy. So instead, let’s put the pattern directly in the Logstash configuration file.

Now, let’s create an input for Logstash where Apache can send its logs to.

In /etc/logstash/conf.d, create a file called 50-apache_vhost.conf (well, it could be anything, as long as it comes later than the 30-lumberjack-output.conf file). Its contents will be:

input {
  udp {
    port => 10515
    type => "apache_log"
  }
}

filter {
  grok {
    match => { "message" => "%{HOSTNAME:vhost}\:%{NUMBER:port} %{COMBINEDAPACHELOG}" }
  }

  geoip {
    source => "clientip"
    target => "geoip"
    database => "/etc/logstash/GeoLiteCity.dat"
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
  }
  # do GeoIP lookup for the ASN/ISP information.
  geoip {
    database => "/etc/logstash/GeoIPASNum.dat"
    source => "clientip"
  }
}

output {
  elasticsearch { host => localhost }
}

Note: When using Logstash 2.0, the output section should look like this:

output {
  elasticsearch { hosts => localhost }
}

This lets Logstash listen on UDP port 10515. Perform a service logstash restart to activate it. Now we turn to Apache to have it send its logs to Logstash.
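Before touching Apache, it can be handy to push a single test line at the new input by hand. The snippet below is a minimal sketch of that, assuming Logstash runs on the same machine (127.0.0.1) and listens on port 10515 as configured above. The function name send_test_line and the sample log line are made up for illustration. Keep in mind that UDP is connectionless, so a successful send says nothing about whether Logstash received the line; you still have to look in Kibana.

```python
import socket


def send_test_line(line, host="127.0.0.1", port=10515):
    """Send one log line as a single UDP datagram and return the bytes sent."""
    payload = line.encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Made-up line in the vhost_combined layout, so the grok filter can parse it.
    test_line = (
        'www.example.org:80 203.0.113.7 - - [10/Oct/2015:13:55:36 +0200] '
        '"GET /test HTTP/1.1" 200 123 "-" "manual-test"'
    )
    send_test_line(test_line)
```

If the line shows up in Kibana with a vhost field, the input and the grok filter are both doing their job.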

Actually it’s quite easy to have Apache send its logs to Logstash. Just add this line to the definition of every virtual host:

CustomLog "| nc -u host.domain 10515" vhost_combined

Replace host.domain with the hostname of the server running Logstash.
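What happens here is that Apache starts the command after the pipe and writes every log line to its standard input, and nc passes that on over UDP. If netcat is not available, a small script could play the same role. The following is only a sketch of that idea, not what I run myself: the function name forward_lines and the script name are made up, and unlike nc it sends exactly one datagram per log line.

```python
import socket
import sys


def forward_lines(stream, host, port):
    """Send every non-empty line from stream as its own UDP datagram.

    Returns the number of datagrams sent.
    """
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in stream:
            line = line.rstrip("\n")
            if line:
                sock.sendto(line.encode("utf-8"), (host, port))
                sent += 1
    return sent


if __name__ == "__main__":
    # Hypothetical invocation: python3 udp_forward.py host.domain 10515
    if len(sys.argv) == 3:
        forward_lines(sys.stdin, sys.argv[1], int(sys.argv[2]))
```

Saved under a name like udp_forward.py (a hypothetical name), it would take the place of the nc command in the CustomLog line, with the same host and port as arguments.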

Well, that’s about it. Restart Apache with service apache2 restart and you should see the logs appear in Kibana.

Now all that remains is creating a nice dashboard to visualize the data. Let your creativity run riot, but here’s a little example:

I removed the domain names to anonymize the data, but it still shows some possibilities for a dashboard.

And this is what it looks like in Kibana 4.2 using the dark theme:

[Screenshot: kibana42_apache]

