In this post we will set up a pipeline that uses Filebeat to ship our Nginx web server's access logs to Logstash. Logstash will filter the data according to a defined pattern, enrich it with MaxMind's GeoIP data, and then push it to Elasticsearch.

Our Environment:

Note that all commands are run as root. In this environment, our resources will look like the following:

filebeat/nginx -> 10.21.5.120
logstash -> 10.21.5.5
elasticsearch -> 10.21.5.190

Elasticsearch:

If you don't have Elasticsearch running yet, a post on the deployment of Elasticsearch can be found here

Prepare the Logstash Environment:

Get the repositories:

$ apt update && apt upgrade -y
$ apt install wget apt-transport-https gzip -y
$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-5.x.list
$ apt update

Get the dependencies and install Logstash:

$ apt install openjdk-8-jdk -y
$ echo JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64" >> /etc/environment
$ source /etc/environment
$ apt install logstash -y

Backup Logstash Config:

$ mkdir /opt/backups/logstash -p
$ mv /etc/logstash/logstash.yml /opt/backups/logstash/logstash.yml.BAK

Get the Latest GeoIP Databases:

$ cd /etc/logstash/
$ wget http://geolite.maxmind.com/download/geoip/database/GeoLite2-City.mmdb.gz
$ gunzip GeoLite2-City.mmdb.gz

Setup Logstash Main Config:

$ cat > /etc/logstash/logstash.yml << EOF
path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
path.logs: /var/log/logstash
EOF

Configure Logstash Application Config:

$ cat > /etc/logstash/conf.d/logstash-nginx-es.conf << EOF
input {
  beats {
    host => "0.0.0.0"
    port => 5400
  }
}

filter {
  grok {
    match => [ "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}"]
    overwrite => [ "message" ]
  }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
    convert => ["responsetime", "float"]
  }
  geoip {
    source => "clientip"
    target => "geoip"
    add_tag => [ "nginx-geoip" ]
  }
  date {
    match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    remove_field => [ "timestamp" ]
  }
  useragent {
    source => "agent"
  }
}

output {
  elasticsearch {
    hosts => ["10.21.5.190:9200"]
    index => "weblogs-%{+YYYY.MM.dd}"
    document_type => "nginx_logs"
  }
  stdout { codec => rubydebug }
}
EOF

Enable Logstash on Boot and Start Logstash:

$ systemctl enable logstash
$ systemctl restart logstash

Prepare Filebeat:

Filebeat is a lightweight log shipper, which will reside on the same instance as the Nginx Web Server(s):

$ apt update && apt upgrade -y
$ apt install wget apt-transport-https -y

Setup Nginx Web Server:

$ apt install nginx -y
$ systemctl enable nginx
$ systemctl restart nginx

Get the repositories:

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list

Setup the Dependencies:

$ apt update
$ apt install openjdk-8-jdk -y
$ echo JAVA_HOME=$(find /usr/lib/jvm/ -name "*openjdk*" -type d) >> /etc/environment
$ source /etc/environment

Install Filebeat:

$ apt install filebeat -y

Backup Filebeat configuration:

$ mkdir /opt/backups/filebeat -p
$ mv /etc/filebeat/filebeat.yml /opt/backups/filebeat/filebeat.yml.BAK

Create the Filebeat configuration, and specify the Logstash outputs:

$ cat > /etc/filebeat/filebeat.yml << EOF
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/*.log
  exclude_files: ['\.gz$']

output.logstash:
  hosts: ["10.21.5.5:5400"]
EOF

Enable Filebeat on Boot and Start Filebeat:

$ systemctl enable filebeat
$ systemctl restart filebeat
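
If Filebeat does not seem to publish anything, a quick first check is whether the Logstash beats port is reachable from the Nginx host. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available; `port_open` is a hypothetical helper name, and the host and port defaults are simply the ones used in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Filebeat or Logstash): returns success
# if a TCP connection to host:port can be opened within 3 seconds.
port_open() {
  local host="$1" port="$2"
  # /dev/tcp is a bash feature; timeout bounds how long we wait.
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Defaults match this post's environment; override via environment variables.
LOGSTASH_HOST="${LOGSTASH_HOST:-10.21.5.5}"
LOGSTASH_PORT="${LOGSTASH_PORT:-5400}"

if port_open "$LOGSTASH_HOST" "$LOGSTASH_PORT"; then
  echo "reachable: ${LOGSTASH_HOST}:${LOGSTASH_PORT}"
else
  echo "not reachable: ${LOGSTASH_HOST}:${LOGSTASH_PORT}"
fi
```

A successful connection only shows that something is listening on the port; it does not prove that Logstash is parsing events correctly.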

Testing:

With Nginx, Logstash, Filebeat and Elasticsearch all running, we can test our deployment by accessing our Nginx web server. We left the defaults as-is, so we expect the default page to respond, which is fine.

But before accessing your web server, tail your logs:

$ tail -f /var/log/nginx/access.log /var/log/filebeat/filebeat

Now access your Web Server:

==> /var/log/nginx/access.log <==
165.1.2.3 - - [06/Jun/2017:21:53:35 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"

==> /var/log/filebeat/filebeat <==
2017-06-06T21:53:43Z INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=6 libbeat.logstash.publish.write_bytes=464 libbeat.logstash.published_and_acked_events=2 libbeat.publisher.published_events=2 publish.events=2 registrar.states.update=2 registrar.writes=1
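
To get a feel for which pieces of the access log line end up as fields in Elasticsearch, here is a rough sketch that splits the sample line on whitespace with awk. This is only an illustration and not how grok's `COMBINEDAPACHELOG` pattern works internally; `parse_access_line` is a hypothetical helper, and the field names are borrowed from the Logstash config earlier in this post.

```shell
#!/usr/bin/env bash
# Sample access log line, copied from the tail output in this post.
line='165.1.2.3 - - [06/Jun/2017:21:53:35 +0000] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"'

# Hypothetical helper: prints a few name=value pairs from one log line.
# It assumes the request path contains no spaces.
parse_access_line() {
  printf '%s\n' "$1" | awk '{
    gsub(/"/, "", $6)   # strip the leading double quote from the verb
    print "clientip=" $1
    print "verb=" $6
    print "request=" $7
    print "response=" $9
    print "bytes=" $10
  }'
}

parse_access_line "$line"
```

For the sample line this prints the client IP, `GET`, `/`, `200` and `396`, which are the same values that appear as `clientip`, `verb`, `request`, `response` and `bytes` in the indexed document.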

Having a look at Elasticsearch:

$ curl http://10.21.5.190:9200/_cat/indices?v
health status index              uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   weblogs-2017.06.06 KwfrPYnsRmiQ8EvrpHJ1-g   5   1          6            0    286.6kb        130.4kb
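
The index name comes from the `index => "weblogs-%{+YYYY.MM.dd}"` setting in the Logstash output, which is filled in from each event's `@timestamp` (kept in UTC). As a small sketch of that mapping, the hypothetical helper `index_for` below derives the daily index name from an ISO 8601 UTC timestamp using plain string handling; the sample timestamp is an assumption that corresponds to the request time in the access log line above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a UTC ISO 8601 timestamp to the daily index name
# that the "weblogs-%{+YYYY.MM.dd}" setting would produce.
index_for() {
  local day="${1%%T*}"                        # keep only the date part, e.g. 2017-06-06
  printf 'weblogs-%s\n' "$(printf '%s' "$day" | tr '-' '.')"
}

# Request time from the access log (06/Jun/2017:21:53:35 +0000) as ISO 8601.
index_for '2017-06-06T21:53:35.000Z'
```

This is handy when you want to query a specific day's index directly: a request logged late in the evening local time may land in the next day's index, because the date is taken in UTC.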

Search our index to retrieve details about our document:

$ curl -XGET http://10.21.5.190:9200/weblogs-2017.06.06/_search?pretty

{
  "took" : 68,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "weblogs-2017.06.06",
        "_type" : "nginx_logs",
        "_id" : "AVx_ZiLN-RV1_0gc9l6o",
        "_score" : 1.0,
        "_source" : {
          "request" : "/",
          "agent" : "\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\"",
          "minor" : "0",
          "auth" : "-",
          "ident" : "-",
          "source" : "/var/log/nginx/access.log",
          "type" : "log",
          "patch" : "3029",
          "major" : "58",
          "clientip" : "165.1.2.3",
          "@version" : "1",
          "beat" : {
            "hostname" : "nginx-web-01",
            "name" : "nginx-web-01",
            "version" : "5.4.1"
          },
          "host" : "nginx-web-01",
          "geoip" : {
            "timezone" : "Africa/Johannesburg",
            "ip" : "165.1.2.3",
            "latitude" : -33.935462,
            "continent_code" : "AF",
            "city_name" : "Cape Town",
            "country_code2" : "ZA",
            "country_name" : "South Africa",
            "country_code3" : "ZA",
            "region_name" : "Province of the Western Cape",
            "location" : [ 18.377256, -33.935462 ],
            "postal_code" : "7945",
            "longitude" : 18.377256,
            "region_code" : "WC"
          },
          "offset" : 196,
          "os" : "Windows 10",
          "input_type" : "log",
          "verb" : "GET",
          "message" : "165.1.2.3 - - [06/Jun/2017:21:53:35 +0000] \"GET / HTTP/1.1\" 200 396 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\"",
          "tags" : [ "beats_input_codec_plain_applied", "nginx-geoip" ],
          "referrer" : "\"-\"",
          "@timestamp" : "2017-06-06T21:53:35.000Z",
          "response" : 200,
          "bytes" : 396,
          "name" : "Chrome",
          "os_name" : "Windows 10",
          "httpversion" : "1.1",
          "device" : "Other"
        }
      }
    ]
  }
}

And when we take the coordinates and place them into Google, we can see that I am chilling at the beach! :)

Google Maps:
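
One thing to watch when copying the coordinates: the geoip `location` array in the document is ordered `[longitude, latitude]`, while Google Maps expects latitude first. The sketch below builds a search link from the values in the search result; `maps_url` is a hypothetical helper, and the URL shape is assumed to follow Google's documented Maps URLs format (`api=1` with a `query` parameter).

```shell
#!/usr/bin/env bash
# Values taken from the geoip "location" array: [ longitude, latitude ].
lon='18.377256'
lat='-33.935462'

# Hypothetical helper: takes longitude then latitude (the geoip array order)
# and prints a Google Maps search URL with latitude first.
# %%2C is a URL-encoded comma between the two values.
maps_url() {
  printf 'https://www.google.com/maps/search/?api=1&query=%s%%2C%s\n' "$2" "$1"
}

maps_url "$lon" "$lat"
```

Pasting the printed URL into a browser should drop a pin at the geoip estimate, which is typically only accurate to city or suburb level.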

Elastic most definitely have their game on when it comes to awesome software! You can visualize this further by adding Kibana to your stack.