Log analysis using GoAccess

Updated January 2021 for latest GoAccess release (v1.4.5)

GoAccess is a great tool for getting some useful insights from web server access logs, with hardly any effort.

In this two-part series, I'll explain how to install GoAccess and configure it with an Nginx web server, to build an attractive web-based analytics report:

  1. In this post, we'll install GoAccess and explore how to use it for quick reports from the command line, as well as generating some one-off HTML reports.
  2. The next post focuses on more advanced usage: how to serve a real-time HTML report with Nginx and run GoAccess as a service. We'll also reverse-proxy the GoAccess WebSocket server to support secure real-time updates to the report.

Let's start with the basics.

What is an access.log anyway?

If you have a website, it's likely that your web server will be diligently logging all the requests it receives as users (and non-human visitors like bots and search engine crawlers) access the various resources that make up the site.
Even with the default access log settings, the logs can still give some useful insights into your site and its visitors:

  • How many users are visiting your site?
  • What are the most popular browsers and operating systems?
  • Where are your visitors located (or where do they VPN to)?
  • Which sites have links to your site?
  • What are the busiest times of day for your site, on average?

Checking the logs manually (e.g. using grep and sed) is possible, of course. For some very specific searches, those tools might still be the best way, but for visualising trends over time or aggregating the data... not so much.
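
As a rough illustration of the manual approach, here is a minimal sketch that counts requests per client IP and tallies 404 responses. It uses awk, sort and uniq rather than grep and sed, and it runs against a small made-up sample log (the addresses and requests are invented for the example), assuming the default combined log format:

```shell
# Build a tiny sample log so the pipeline can be tried without touching real data
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
203.0.113.7 - - [12/Jan/2021:10:15:32 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
203.0.113.7 - - [12/Jan/2021:10:15:40 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
198.51.100.23 - - [12/Jan/2021:10:16:02 +0000] "GET / HTTP/1.1" 404 162 "-" "curl/7.68.0"
EOF

# Count requests per client IP (field 1), busiest first, top 10 only
top_ips="$(awk '{ print $1 }' "$sample_log" | sort | uniq -c | sort -rn | head -n 10)"
echo "$top_ips"

# Count requests that returned a 404 (the status is field 9 in the combined format)
not_found="$(awk '$9 == 404' "$sample_log" | wc -l)"
echo "404 responses: $not_found"

rm -f "$sample_log"
```

To try it on a real log, point sample_log at your own access log and drop the lines that create and delete the temporary file. It answers one question at a time, which is exactly where this approach starts to feel like hard work.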

GoAccess reads an access log and analyses the data to produce useful statistical summaries and pretty visualisations that are very easy to interpret.

Unlike Google Analytics, Matomo (formerly Piwik) and similar JavaScript-based analytics tools, GoAccess doesn't require any extra code on your web pages to work. Bonus.

Before we begin

A few assumptions about your existing setup:

  • You already have Nginx configured to serve (or reverse-proxy) one or more websites/services.
  • Your server is Debian or Ubuntu (or Raspbian) based.
  • It's your server and you have an account with sudo rights.

If any of the above don't apply to you, parts of this article (the package names and file paths in particular) might need adapting.

Right then, let's go!

Installation

  1. Update existing packages:
sudo apt-get update && sudo apt-get upgrade
  2. Install dependencies:
sudo apt-get install libncursesw5-dev libgeoip-dev build-essential -y
  3. Download, build and install:
# create a folder for building packages, if it's not there already
mkdir -p ~/build/
cd ~/build/

# download the latest stable version (1.4.5 as at January 2021)
wget https://tar.goaccess.io/goaccess-1.4.5.tar.gz

# expand the package files into a subfolder
tar -xzvf goaccess-1.4.5.tar.gz
cd goaccess-1.4.5/

# compile and install
./configure --enable-utf8 --enable-geoip=legacy
make
sudo make install
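
Before moving on, it may be worth confirming that the build ended up on your PATH. This is just a small sketch of one way to check; the install_status variable is my own name for illustration, not anything GoAccess defines:

```shell
# Check whether the goaccess binary is on the PATH and, if so, print its version
if command -v goaccess >/dev/null 2>&1; then
    goaccess --version
    install_status="found"
else
    echo "goaccess not found on PATH; check the output of 'sudo make install'" >&2
    install_status="missing"
fi
echo "goaccess: $install_status"
```

If the version reported isn't the one you just built, an older copy installed from a package may be taking precedence on the PATH.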

Command-line usage

Let's check that GoAccess is working as expected:

# amend the path if your access log lives somewhere else
goaccess /var/log/nginx/access.log -c

You should be prompted to choose a log format.

If you haven't customised your Nginx server's access log format, it will use the 'NCSA Combined Log Format'. Press Space to select the first option, then press Enter to proceed. (Or you can customise the log string, date and time format before proceeding, if your server has been configured differently from the defaults.)
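
If you're not sure whether your log really follows the combined format, a rough check like the one below can help. It's only a sketch: the regular expression is my own approximation of the format (client address, two placeholder fields, bracketed timestamp, quoted request, status, bytes, quoted referrer, quoted user agent), and the two sample lines are invented for the example.

```shell
# Two sample lines: the first is in combined format, the second lacks the
# referrer and user-agent fields (the shorter "common" format)
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
203.0.113.7 - - [12/Jan/2021:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 5120 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"
198.51.100.23 - - [12/Jan/2021:10:16:02 +0000] "GET / HTTP/1.1" 404 162
EOF

# Approximate pattern for one combined-format line (an assumption, not an official definition)
combined_re='^[^ ]+ [^ ]+ [^ ]+ \[[0-9]{2}/[A-Za-z]{3}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} [^]]+\] "[^"]*" [0-9]{3} [0-9-]+ "[^"]*" "[^"]*"$'

# Compare the number of matching lines with the total number of lines
total="$(wc -l < "$sample_log")"
matched="$(grep -Ec "$combined_re" "$sample_log")"
echo "$matched of $total lines look like the combined format"

rm -f "$sample_log"
```

If most lines of your real log match, the first option in the prompt is very likely the right one. If hardly any do, your Nginx log_format has probably been customised and you'll want to adjust the format strings instead.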

After the log has been parsed, you'll see an initial dashboard of statistics.

You can navigate using the cursor keys. After you've had a look around, press q to quit. Here's the full list of interactive controls, in case you want to explore further before quitting:

  • F1 or h: Main help
  • F5: Redraw main window
  • q: Quit the program, current window or collapse active module
  • o or ENTER: Expand selected module or open window
  • 0-9 and Shift + 0: Set selected module to active
  • j: Scroll down within expanded module
  • k: Scroll up within expanded module
  • c: Set or change scheme color
  • Ctrl + f: Scroll forward one screen within active module
  • Ctrl + b: Scroll backward one screen within active module
  • TAB: Iterate modules (forward)
  • SHIFT + TAB: Iterate modules (backward)
  • s: Sort options for active module
  • /: Search across all modules (regex allowed)
  • n: Find position of the next occurrence
  • g: Move to the first item or top of screen
  • G: Move to the last item or bottom of screen

Configuration

We should set some global defaults for our GoAccess settings, so we won't need to specify them manually. This will be useful later on.

Firstly, check the location of the current default config file using the goaccess --dcf command.

In my case, the example config file installed earlier (by the sudo make install command) is listed as the default, located at /usr/local/etc/goaccess/goaccess.conf.

Alternatively, you might find that no default config file is reported, with a result like this:

No default config file found.
You may specify one with `-p /path/goaccess.conf`

In that case, you can copy the example file to its parent folder, where GoAccess can find it:

sudo cp /usr/local/etc/goaccess/goaccess.conf /usr/local/etc/

In my case the default file was already being picked up, so I just made a backup copy of it so that I could revert easily if I accidentally broke something:

sudo cp /usr/local/etc/goaccess/goaccess.conf /usr/local/etc/goaccess/goaccess.conf.default

If you try the goaccess --dcf command again now, it should report the correct location.

Now let's edit the file so we have a few sensible default options for our logs:

sudo nano /usr/local/etc/goaccess/goaccess.conf

We want to uncomment the appropriate lines for time-format, date-format and log-format, so find the lines below and remove the # from the start of each of these lines:

time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"

Press Ctrl-x to exit nano and follow the prompts to save your changes (keep the existing name).
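
If you'd rather not edit the file by hand, the same three lines could be uncommented with a script. The sketch below works on a throwaway stand-in file with shortened, made-up contents, and it assumes each setting is commented out with a single leading # and no space after it, so check how your own goaccess.conf looks before relying on it:

```shell
# Work on a throwaway stand-in for goaccess.conf (hypothetical, shortened contents)
conf="$(mktemp)"
cat > "$conf" <<'EOF'
#time-format %H:%M:%S
#time-format %s
#date-format %d/%b/%Y
#log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
#log-format %h %^[%d:%t %^] "%r" %s %b
EOF

# Strip the leading '#' only from lines that match the three settings exactly;
# every other line is printed unchanged
awk '
  $0 == "#time-format %H:%M:%S" ||
  $0 == "#date-format %d/%b/%Y" ||
  $0 == "#log-format %h %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\"" { sub(/^#/, "") }
  { print }
' "$conf" > "$conf.new"

# Show which format settings are now active
enabled="$(grep -c '^[a-z-]*-format ' "$conf.new")"
grep '^[a-z-]*-format ' "$conf.new"

rm -f "$conf" "$conf.new"
```

I've used exact string comparison in awk rather than a sed pattern because the format string is full of characters like [ and ^ that would otherwise need escaping. To apply it for real, you'd point conf at /usr/local/etc/goaccess/goaccess.conf, compare the two files, and copy the result back into place with sudo.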

Static HTML reports

For a self-contained one-off report, you can use a command like this:

goaccess /var/log/nginx/access.log -o report.html

# or use this if you've not saved a custom default config as per the previous section
goaccess /var/log/nginx/access.log --log-format=COMBINED -o report.html

After running that command, just open the output report.html in a browser and you'll get some great high-level insights into your website and its readers - and some pretty graphs too. :)

The static reports output by GoAccess are totally self-contained (the scripts and styles are all inlined in the file with the HTML), so you can easily copy or share/email them without any fuss.

Next steps

One-off reports are great, but wouldn't it be good if we could set up an always-up-to-date version, with the statistics updating in real time as visitors browse our site?

That's just what we'll be looking at in the next post of this two-part series.

See you there!


Photo by VizAforMemories on Unsplash