Logging is the difference between mediocre and great products. Learn why this is the case and how to make it all fit together. Logging, like security, is a critical component of web applications (or applications in general) that is often overlooked due to ingrained habits and a lack of foresight.
What many consider to be useless reams of digital tape are actually powerful tools for inspecting your applications, correcting errors, strengthening weak areas, and delighting customers. Let’s take a look at why logging is so important before moving on to centralized logging.
As a professional developer, I’ve seen many instances where the app’s observed behavior perplexed everyone for days, but the key was always in the logs. Every piece of software we use generates (or, at the very least, should generate) logs that tell us what it was doing when the problem occurred.
Now, in my opinion, there are two types of logging: auto-generated logs and programmer-generated logs. Please keep in mind that this isn’t a textbook distinction, and citing me on this terminology will get you into trouble.
If you check the slow query log on a regular basis, you’ll learn which operations are taking the longest and, as a result, you’ll be able to identify small but critical areas that require attention. Often, a minor adjustment like this is preferable to doubling the hardware capacity.
It’s impossible to count how many ways a good logging system can assist you. The best argument is that it’s an automated activity that, once set up, requires no monitoring and will one day save you from ruin.
Now that we’ve cleared that up, let’s take a look at some of the best open source log collectors (unified logging tools) available. In case you missed it, we talked about commercial cloud-based inventory tools in a previous post.
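To make the programmer-generated kind of log concrete, here is a minimal sketch using Python's standard logging module. The `charge` function, the logger name, and the order IDs are made up for illustration; the point is only that the code itself records what it was doing, so the answer is waiting in the logs when something goes wrong.

```python
import io
import logging


def build_logger(stream, name="checkout"):
    """Return a logger that writes timestamped, levelled lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def charge(logger, order_id, amount):
    """Hypothetical business function that logs each decision it makes."""
    logger.info("charging order=%s amount=%.2f", order_id, amount)
    if amount <= 0:
        logger.error("rejected order=%s reason=non_positive_amount", order_id)
        return False
    return True


# An in-memory buffer stands in for a log file here.
buffer = io.StringIO()
log = build_logger(buffer)
charge(log, "A-1001", 19.99)
charge(log, "A-1002", 0)
print(buffer.getvalue(), end="")
```

In a real application the handler would usually point at a file or at standard output, where a log collector can pick the lines up.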
Top 10 Log Collectors for Free Logging
Let’s take a look at some of the most popular open source log collectors and all-in-one logging tools in 2021.
When it comes to industry-grade logging and visualisation capabilities, Graylog is one of the best-known names around. It also stands out in that it scans your collected logs for signs of security flaws and alerts you right away. Graylog is a centralised logging system with the flexibility you require, allowing you to customise alerts, dashboards, and other features.
Graylog is open-source, but if your needs are more complex, there is an enterprise plan available. Graylog is a tool you can trust with your eyes closed, with clients like SAP, Cisco, and LinkedIn on its roster.
Logstash is worth looking into if you’re a fan or user of the Elastic stack (the ELK stack is already a thing, in case you didn’t know). Logstash, like the other logging tools on this list, is completely open-source, allowing you to deploy and use it however you want. But don’t be fooled: Logstash is a mothership with far more capabilities than any simple logging tool.
It can collect large amounts of data from multiple platforms, run the data pipelines you define, and interpret unstructured log dumps, among other things. Of course, it is built first and foremost around the Elastic suite of products, but if you’re just getting started and want to scale quickly, Logstash is the way to go!
Fluentd is a first among equals among centralized logging tools that serve as a middle layer for data ingestion. Thanks to an excellent library of plugins, Fluentd can capture data from virtually any production system, knead it into the desired structure, run it through a custom pipeline, and feed it to your favourite analytics platform, be it MongoDB or Elasticsearch.
Fluentd is an open-source data collector, written largely in Ruby, that is widely used due to its flexibility and modularity. Fluentd has nothing to prove, with major companies like Microsoft, Atlassian, and Twilio using the platform.
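The capture, restructure, and forward idea can be sketched in a few lines of plain Python. To be clear, this is not Fluentd's configuration or plugin API; the log format, field names, and sink names are all assumptions made up for the example.

```python
import json
import re

# Hypothetical access-log format, assumed for illustration only:
#   <host> <METHOD> <path> <status>
LINE_RE = re.compile(
    r"(?P<host>\S+) (?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})"
)


def parse(line):
    """Turn a raw line into a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    return record


def enrich(record):
    """Add a derived field without mutating the input record."""
    out = dict(record)
    out["is_error"] = record["status"] >= 500
    return out


def route(records, sinks):
    """Append each record, serialised as JSON, to the matching sink."""
    for record in records:
        key = "errors" if record["is_error"] else "access"
        sinks[key].append(json.dumps(record, sort_keys=True))


raw = ["10.0.0.1 GET /index.html 200", "garbage", "10.0.0.2 POST /pay 503"]
sinks = {"access": [], "errors": []}  # stand-ins for real destinations
parsed = (parse(line) for line in raw)
route((enrich(r) for r in parsed if r is not None), sinks)
print(sinks)
```

The unparseable line is simply dropped here; a production pipeline would more likely send it to a dead-letter destination so nothing is lost silently.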
The syslog-ng tool was created to process Syslog data in real time (Syslog is a well-known client-server protocol for system logging). However, it has evolved to support a variety of data formats, including unstructured, SQL, and NoSQL. The following diagram pretty much sums up how the Syslog protocol works.
syslog-ng is a production-ready, dependable log collection and classification tool written in C that has a long track record in the industry. The best part is that it can be extended with plugins written in C, Python, Java, Lua, or Perl.
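One detail of the Syslog protocol that is easy to show in code is the priority value at the start of every message: a number in angle brackets equal to facility times 8 plus severity. The sketch below encodes and decodes that value in plain Python; the function names are my own, and it deliberately ignores the rest of the message format.

```python
import re

# Severity names in numeric order, 0 (most severe) to 7 (least severe).
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]


def encode_pri(facility, severity):
    """Combine facility (0-23) and severity (0-7) into a syslog PRI value."""
    if not 0 <= facility <= 23 or not 0 <= severity <= 7:
        raise ValueError("facility must be 0-23 and severity 0-7")
    return facility * 8 + severity


def decode_pri(message):
    """Split '<PRI>rest' into (facility, severity_name, rest)."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, SEVERITIES[severity], message[match.end():]


# Facility 4 (security/auth) with severity 2 (critical) gives 34.
print(encode_pri(4, 2))
print(decode_pri("<34>Oct 11 22:14:15 mymachine su: 'su root' failed"))
```

Collectors such as syslog-ng use this one number to decide where a message came from and how urgent it is before looking at anything else.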
If you’re dealing with extremely large data sets and eventually want to feed everything into something like Hadoop, Flume is one of the best options available. It’s a “pure” open source project in the sense that it’s maintained by our beloved Apache Foundation, so no enterprise plan is required.
This might or might not be exactly what you’re looking for. Flume’s source code is completely open, and it’s written in Java (which continues to amaze me when it comes to groundbreaking technology). If you need a distributed, fault-tolerant data ingestion platform for heavy-duty applications, Flume is the way to go.
Octopussy gets a zero out of ten for product naming, but it’s a good option if your needs are simple and you’re unsure what all the fuss about pipelines, ingestion, aggregation, and so on is about. Octopussy, in my opinion, meets the needs of the vast majority of products (estimated statistics are useless, but if I had to guess, it covers 80% of real-world use cases).
Octopussy doesn’t have a particularly appealing user interface (see here), but it makes up for it in terms of speed and lack of bloat. As expected, the source code is available on GitHub, and I believe it is well worth investigating.
Rsyslog stands for “rocket-fast system for log processing”. It’s a utility for Unix-like operating systems. In technical terms, it’s a highly configurable message router with dynamically loadable inputs and outputs. It can take data from a variety of sources, transform it, and send the results to a variety of destinations.
Rsyslog can deliver over a million messages per second to local destinations. It also offers a Windows agent that integrates seamlessly with the Linux side and links the two environments together: the agent forwards Windows event logs and can be configured to monitor files.
Rsyslog also has the following features:
- Flexible configuration
- Multi-threading
- Log signatures and encryption to protect log files from manipulation
- Support for Big Data platforms
- Content-based filtering
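The message-router idea with content-based filtering can be illustrated with a small Python sketch. Rsyslog has its own configuration language, so this is not Rsyslog syntax; the rules, file names, and sample messages are assumptions chosen for the example.

```python
def make_router(rules, default):
    """Build a router from (predicate, destination) pairs.

    The first rule whose predicate accepts the message wins;
    messages that match no rule go to `default`.
    """
    def router(message):
        for predicate, destination in rules:
            if predicate(message):
                return destination
        return default
    return router


# Hypothetical rules: filter on message content, pick a destination.
rules = [
    (lambda m: "error" in m.lower(), "errors.log"),
    (lambda m: m.startswith("sshd"), "auth.log"),
]
route_message = make_router(rules, default="messages.log")

messages = [
    "sshd[811]: Accepted publickey for deploy",
    "nginx: upstream timed out, ERROR 504",
    "cron[120]: job finished",
]

# Group messages by destination instead of writing real files.
destinations = {}
for msg in messages:
    destinations.setdefault(route_message(msg), []).append(msg)

for name, lines in destinations.items():
    print(name, lines)
```

Rule order matters in this sketch: an sshd message containing the word "error" would land in errors.log because that rule is checked first.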
Lnav (short for Log Navigator) is a single-machine, single-directory, pure-terminal tool. It’s for those who want to filter and display real-time logs from a single source or who have their logging consolidated into a single directory. You’d be mistaken if you thought lnav was just a fancy version of tail -f | grep.
Time-series view, pretty-printing (for JSON and other formats), color-coded log sources, powerful filters, and the ability to understand several log formats are just a few of the features that will make you fall in love with it. Sometimes you just want a no-fuss, no-setup, possibly-temporary logging layer, and lnav is perfect for that!
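To get a feel for what a merged, filtered view of several log sources involves, here is a rough Python sketch. It is not how lnav is implemented; it assumes each source is already sorted, that every line is JSON with `ts`, `level`, and `msg` fields, and that timestamps are ISO 8601 strings that sort correctly as text.

```python
import heapq
import json


def read_source(name, lines):
    """Yield (timestamp, source, record) for each JSON log line."""
    for line in lines:
        record = json.loads(line)
        yield record["ts"], name, record


def merged_view(sources, level=None):
    """Merge already-sorted sources by timestamp, optionally keeping one level."""
    streams = [read_source(name, lines) for name, lines in sources.items()]
    # key= compares timestamps only, so records themselves are never compared.
    for ts, name, record in heapq.merge(*streams, key=lambda item: item[0]):
        if level is None or record["level"] == level:
            yield f"{ts} [{name}] {record['level'].upper():5} {record['msg']}"


# Made-up sample data standing in for two log files.
sources = {
    "web": [
        '{"ts": "2021-05-01T10:00:00", "level": "info", "msg": "GET /"}',
        '{"ts": "2021-05-01T10:00:02", "level": "error", "msg": "GET /pay 500"}',
    ],
    "db": [
        '{"ts": "2021-05-01T10:00:01", "level": "error", "msg": "deadlock detected"}',
    ],
}

view = list(merged_view(sources))
errors = list(merged_view(sources, level="error"))
print("\n".join(view))
```

Getting this far takes a fair amount of glue code, which is exactly the work a tool like lnav saves you.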
Grafana Loki is a multi-tenant log aggregation solution inspired by Prometheus. Loki is a cost-effective solution that only indexes metadata and can be integrated into popular systems such as Kubernetes, Prometheus, Linux, SQL, and others. You can follow the instructions in this getting started guide to install it and see how it works.
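The "only indexes metadata" idea is easier to see in a toy example. The class below is a conceptual sketch, not Loki's API or storage format: log lines are grouped into streams keyed by their labels, only the labels are used for lookup, and the text of the lines is scanned at query time. All names in it are hypothetical.

```python
from collections import defaultdict


class LabelIndexedStore:
    """Toy log store that indexes label sets, never the log text itself."""

    def __init__(self):
        # One stream of lines per unique label set.
        self._streams = defaultdict(list)

    def push(self, labels, line):
        """Append a line to the stream identified by `labels`."""
        self._streams[frozenset(labels.items())].append(line)

    def query(self, selector, contains=""):
        """Pick streams whose labels include `selector`, then scan their lines."""
        wanted = set(selector.items())
        hits = []
        for stream_labels, lines in self._streams.items():
            if wanted <= stream_labels:  # cheap check on metadata only
                hits.extend(l for l in lines if contains in l)  # brute-force scan
        return hits


store = LabelIndexedStore()
store.push({"app": "web", "env": "prod"}, "GET /pay 500 timeout")
store.push({"app": "web", "env": "prod"}, "GET / 200")
store.push({"app": "db", "env": "prod"}, "deadlock detected timeout")

print(store.query({"app": "web"}, contains="timeout"))
print(store.query({"env": "prod"}))
```

The trade-off is that the index stays small and cheap to maintain, while text searches only stay fast if the label selector narrows things down first.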
I’m sure some of us don’t want all the pomp and circumstance that comes with a “unified,” “centralised” logging system. Some businesses run on single servers and just need a way to monitor their log files quickly and efficiently. If that’s you, say hello to Logwatch.
Logwatch can scan your system logs and generate any report you want once it’s installed. It is, however, an old (read: “reliable”) piece of software written in Perl. To run it, you’ll need Perl 5.6 or higher on your server. I don’t have any screenshots to share because it’s a daemonized command line process. You’ll love Logwatch if you’re a CLI addict who prefers the old-school way of doing things.
LOGalyze was originally a commercial product that was recently released as open source. Though I couldn’t find the project on GitHub, they do provide a Windows installer as well as all source code. If you want to start a community, you can sign up for a mailing list here.
LOGalyze is a relatively flexible and powerful offering that will work well for single-system deployments that want to combine logging from known sources such as Postfix, Apache, and others, and output it in CSV, PDF, HTML, or other formats. No, it doesn’t do everything, but as you’d expect from a former commercial product, what it does, it does quite well.
That’s all there is to it! To be honest, compiling this list was difficult because logging isn’t as popular as, say, content management, and three or four tools appear to have monopolized the market. Nonetheless, everyone’s needs are unique, and I’ve attempted to address them all. It’s all here, from silly command-line no-setup tools to full-fledged data juggernauts!