Are you a developer who has been looking for a tool to capture, search, monitor, and archive logs from your application? Logstash is a great open source program that can scale with your business, yet it's simple enough to use in even a small application. I'll outline how to set up and manage your Logstash application in this post.

When dealing with log scaling and analytics, you'll encounter a number of problems: there are multiple log formats, there's no easy way to search logs, and there's no easy method to gather statistics. You'll also run into problems logging to a database or filesystem, and you'll notice that logging places a load on both.

First, let’s define Logstash. Logstash is open source, released under the Apache license. It was written in JRuby, and as luck would have it, plugins are easily written in Ruby. Logstash runs on the JVM and is part of the Elasticsearch family.

Logstash is extremely scalable. Its companion product, Elasticsearch, can be used for indexing, search, and retrieval. You can process multiple log formats, receive logs from multiple sources, and output logs to multiple destinations.

Furthermore, Kibana provides a web interface for search and analytics. And as mentioned above, Logstash can easily be extended with plugins written in Ruby.

Logstash has a unique architecture that starts with the shipper, moves to the broker, then the indexer, then search/storage, and ultimately the web interface. Logstash also processes events through a unique pipeline that begins with an input, passes through filters, and ends with an output, each running in separate threads. Filters are applied in the order they appear in the config file, and outputs are processed in config-file order as well.
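To make this concrete, here's a minimal sketch of a pipeline config; the file path, type, and tag are illustrative, not from the original presentation:

```
# Minimal pipeline sketch: one input, one filter, one output.
input {
  file {
    path => "/var/log/myapp/app.log"   # hypothetical application log
    type => "myapp"
  }
}

filter {
  # Filters run in the order they appear in this file.
  mutate {
    add_tag => ["processed"]
  }
}

output {
  # Outputs also run in config-file order.
  stdout { }
}
```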

There are four types of Logstash plugins: inputs, codecs, filters, and outputs. Inputs read an input stream, such as the file, log4j, redis, and syslog inputs. Codecs decode log messages, such as the json and multiline codecs. Filters process messages: csv defines the fields in a CSV, date defines date field formats, mutate changes data types, xml extracts fields from XML, and grok parses arbitrary text. The last of the four, outputs, write events to destinations such as elasticsearch, elasticsearch_http, mongodb, email, and nagios.
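As a sketch of how several of these filters combine (the Apache-log example is mine, not from the original talk), a filter block might look like this:

```
filter {
  # grok parses arbitrary text into named fields; COMBINEDAPACHELOG is a
  # pattern that ships with Logstash.
  grok {
    match => ["message", "%{COMBINEDAPACHELOG}"]
  }
  # date tells Logstash which field holds the event's real timestamp
  # and what format that field is in.
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
  # mutate changes data types, e.g. making the response size numeric.
  mutate {
    convert => ["bytes", "integer"]
  }
}
```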

The indexer sends messages to Elasticsearch for indexing, writing to a new index each day. Each index is split into five shards by default. The original message is stored in the index, and each field is indexed.
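In config terms, the daily-index behavior of the elasticsearch output looks roughly like this; the host is a placeholder, and `logstash-%{+YYYY.MM.dd}` is the default index naming scheme:

```
output {
  # One index per day, named by date.
  elasticsearch {
    host  => "es1.example.com"           # hypothetical Elasticsearch node
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
```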

Let’s take a moment to define Elasticsearch. Elasticsearch is a search engine built on Apache Lucene, with an index made up of multiple shards. Each shard is itself a Lucene index. Each primary shard has at least one replica by default, and shards are moved between servers as servers are added or removed.

To configure Elasticsearch, there are three modes of node discovery: multicast, unicast, and a combination of the two. Multicast is the simplest to use if all nodes are on the same network. Unicast is best when you can provide an explicit list of servers.
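For example, a unicast setup might disable multicast and list the servers explicitly in elasticsearch.yml (hostnames hypothetical):

```
# elasticsearch.yml -- unicast discovery sketch
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es1.example.com", "es2.example.com"]
```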

Elasticsearch is great because adding more nodes improves indexing and search times. Documents are indexed on the primary shard first, then on the replicas. The number of shards is fixed when the index is created, but the number of replicas is configurable at any time.
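As a sketch: shard and replica defaults can be set in elasticsearch.yml, and the replica count can later be changed on a live index through the settings API (index name and counts are illustrative):

```
# elasticsearch.yml -- defaults applied to newly created indices
index.number_of_shards: 5
index.number_of_replicas: 1
```

```
# Raise the replica count on an existing daily index.
curl -XPUT 'http://localhost:9200/logstash-2014.01.01/_settings' -d '
{ "index": { "number_of_replicas": 2 } }'
```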

Kibana, on the other hand, is a browser-based analytics tool for time-stamped data. It’s included in the logstash jar and connects to the Logstash server on port 9292. To avoid overloading the server, Kibana splits its queries into multiple requests.

The Log4j-to-Logstash architecture flows from the app to a Logstash shipper, to Redis, back to a Logstash indexer, and ultimately to the Elasticsearch cluster.
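On the application side, one common approach (my sketch, not from the original talk) is to point Log4j's SocketAppender at the Logstash shipper; host and port here are hypothetical:

```
# log4j.properties sketch -- ship events to the Logstash log4j input
log4j.rootLogger=INFO, logstash
log4j.appender.logstash=org.apache.log4j.net.SocketAppender
log4j.appender.logstash.RemoteHost=logstash.example.com
log4j.appender.logstash.Port=4560
```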

Configuring Logstash is a really simple three-step process. Step one: set up an input that acts as a Log4j server. Step two: set up a broker (Redis) to store the Log4j events. Step three: set up an indexer that reads from Redis and writes to Elasticsearch. Now we’re done with the basic setup; sketches of the shipper and indexer configs follow below. If you want to scale bigger, simply add more brokers or indexers.
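Here's a minimal sketch of those pieces as two config files; the hostnames and the Redis key are placeholders:

```
# shipper.conf -- receives Log4j events and pushes them to the broker.
input {
  log4j {
    port => 4560
  }
}
output {
  redis {
    host      => "redis1.example.com"   # hypothetical broker host
    data_type => "list"
    key       => "logstash"
  }
}
```

```
# indexer.conf -- pops events off the broker, writes to Elasticsearch.
input {
  redis {
    host      => "redis1.example.com"
    data_type => "list"
    key       => "logstash"
  }
}
output {
  elasticsearch {
    host => "es1.example.com"           # hypothetical Elasticsearch node
  }
}
```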

To scale the broker, add another server (redis2) to the list of Redis hosts the indexer reads from.

The following is an example of scaling, where the indexer now reads from multiple Redis servers.
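The original example was a slide diagram; here's a sketch of what the scaled indexer input might look like, with one redis block per broker (hostnames hypothetical):

```
# indexer.conf -- scaled out: one redis input block per broker.
input {
  redis {
    host      => "redis1.example.com"
    data_type => "list"
    key       => "logstash"
  }
  redis {
    host      => "redis2.example.com"   # the newly added broker
    data_type => "list"
    key       => "logstash"
  }
}
output {
  elasticsearch {
    host => "es1.example.com"
  }
}
```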

To get a quick start, you can run Logstash, Elasticsearch, and Kibana from the single logstash jar: download and untar it, start the agent with bin/logstash agent -f config.file, then start the web interface with bin/logstash web.
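Spelled out as commands (the archive name and version are illustrative):

```
# Quick start: everything runs from the single logstash distribution.
tar -zxvf logstash-1.x.tar.gz        # download and untar
cd logstash-1.x
bin/logstash agent -f config.file    # start the agent with your config
bin/logstash web                     # start the embedded Kibana UI on port 9292
```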

Rich originally presented this topic at our ‘All Things API’ Meetup in Denver, CO.