Log File Analysis for SEO: What It Is & How to Do It

Carlos Silva

Aug 14, 2023 | 4 min read

What Are Log Files?

A log file is a document that records every request made to your server, including details about how people and search engines interact with your site.

Here’s what a log file looks like:

an example of a log file document
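
To make the anatomy of an entry concrete, here is a minimal Python sketch that splits one log line into named fields. It assumes the widely used "combined" log format, which many Apache and Nginx setups write by default. Your server may use a different layout. The sample line, including its IP address and URL, is made up for illustration.

```python
import re

# Pattern for the "combined" log format (an assumption; check your server config).
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


# Hypothetical entry, invented for this example.
SAMPLE = (
    '203.0.113.7 - - [14/Aug/2023:10:15:32 +0000] '
    '"GET /blog/log-file-analysis/ HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

entry = parse_line(SAMPLE)
print(entry["ip"], entry["path"], entry["status"], entry["user_agent"])
```

Each entry tells you who made the request (IP address and user agent), when, what they asked for, and how the server responded.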

As you can see, log files contain a wealth of information. So, it’s important to understand them and how to use that information.

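In this guide, we’ll look at:

  • What log file analysis is
  • What log file analysis is used for in SEO
  • How to access and analyze your log files
  • How to make crawlability a priority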

Tip: Create a free Semrush account (no credit card needed) to follow along. 

What Is Log File Analysis?

Log file analysis is the process of downloading and auditing your site’s log file to proactively identify bugs, crawling issues, and other technical SEO problems. 

Your site’s log file is stored on your server. It records every request the server gets from people, search engines, and other bots. 

By analyzing these logs, you can see how Google and other search engines interact with your site. And identify and fix any issues that might affect your site’s performance and visibility in search results.

What Is Log File Analysis Used for in SEO?

Log file analysis is a game-changer for improving your technical SEO.

Why?

Because it shows you how Google crawls your site. And when you know how Google crawls your site, you can optimize it for better organic performance. 

For example, log file analysis can help you:

  • See how often Google crawls your site (and its most important pages)
  • Identify the pages Google crawls the most
  • Monitor spikes and drops in crawl frequency
  • Measure how quickly your server responds to Google’s requests
  • Check the HTTP status codes for every page Google requests
  • Discover if you have any crawl issues or redirects

In short: Log file analysis gives you data you can use to improve your site’s SEO. 
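
As one example of putting that data to work, here is a rough Python sketch for spotting spikes and drops in crawl frequency. It assumes you already have Googlebot hits counted per day. The numbers below are invented, and the "twice the median" threshold is an arbitrary choice, not an established rule.

```python
from statistics import median


def flag_crawl_anomalies(daily_hits, ratio=2.0):
    """Flag days whose hits are `ratio` times above or below the median."""
    typical = median(daily_hits.values())
    if typical == 0:
        return {}  # nothing meaningful to compare against
    flagged = {}
    for day, hits in daily_hits.items():
        if hits >= typical * ratio:
            flagged[day] = "spike"
        elif hits <= typical / ratio:
            flagged[day] = "drop"
    return flagged


# Hypothetical Googlebot hits per day.
hits = {
    "2023-08-10": 410,
    "2023-08-11": 395,
    "2023-08-12": 1250,
    "2023-08-13": 402,
    "2023-08-14": 90,
}
anomalies = flag_crawl_anomalies(hits)
print(anomalies)  # {'2023-08-12': 'spike', '2023-08-14': 'drop'}
```

A flagged day isn't a problem by itself. It's a prompt to check what changed on your site or server around that date.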

How to Analyze Log Files

Now that we've taken a look at some of the benefits of log file analysis in SEO, let's look at how to do it. 

You’ll need:

  • Your website's server log file
  • Access to a log file analyzer

Note: We’ll be showing you how to do a log file analysis using Semrush’s Log File Analyzer.

Access Log Files

First, you need to obtain a copy of your site’s log file. 

Log files are stored on your web server. And you'll need access to that server to download a copy. A common way of accessing the server is through a file transfer protocol (FTP) client like FileZilla.

You can download FileZilla for free on their website.

download FileZilla for free

You’ll need to set up a new connection to your server in the FTP client and authorize it by entering your login credentials.

Once you've connected, you’ll need to find the server log file. Where it’s located will depend on the server type. 

Here are three of the most common servers and locations where you can find the logs:

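  • Apache: /var/log/apache2/access.log (Debian, Ubuntu) or /var/log/httpd/access_log (RHEL, CentOS)
  • Nginx: /var/log/nginx/access.log on most Linux packages, or logs/access.log relative to the install prefix when built from source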
  • IIS: %SystemDrive%\inetpub\logs\LogFiles
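
If you (or your developer) have shell access, a quick way to check which of these files exist is a short script run on the server itself. The sketch below is Python. The paths in it are typical package defaults on Linux, listed here as assumptions. A custom server config can point the logs somewhere else entirely.

```python
import os

# Typical default locations (assumptions; your server config may override them).
CANDIDATE_LOGS = [
    "/var/log/apache2/access.log",  # Apache on Debian/Ubuntu
    "/var/log/httpd/access_log",    # Apache on RHEL/CentOS
    "/var/log/nginx/access.log",    # Nginx packages on Linux
]


def find_existing_logs(candidates=CANDIDATE_LOGS):
    """Return the candidate paths that exist as regular files on this machine."""
    return [path for path in candidates if os.path.isfile(path)]


if __name__ == "__main__":
    found = find_existing_logs()
    print(found or "No log file found in the usual locations")
```

If nothing turns up, the log location is usually set in the server's configuration file, so that's the next place to look.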

But retrieving your site's log file isn't always so simple. 

Common challenges include:

  • Finding that log files have been disabled by a server admin and aren’t available
  • Huge file sizes
  • Log files that only store recent data (based either on a number of days or entries—also called “hits”)
  • Partial data if you use multiple servers and content delivery networks (CDNs)

That said, you can easily solve most issues by working with a developer or server admin. 

And if you don't have server access, you’ll need to ask your developer or IT team to share a copy anyway. 
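
Huge or rotated log files are easier to handle if you read them one line at a time instead of opening them whole. The Python sketch below streams lines from plain and gzip-compressed files and keeps only those that mention Googlebot. The file names and log lines in the demo are made up. The filter is a plain text match on the user agent, so it doesn't verify that a request really came from Google.

```python
import gzip
import os
import tempfile


def iter_log_lines(paths):
    """Yield lines one at a time from plain or gzip-compressed log files."""
    for path in paths:
        opener = gzip.open if str(path).endswith(".gz") else open
        with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                yield line.rstrip("\n")


def googlebot_lines(paths):
    """Keep only lines that mention Googlebot (a plain substring match)."""
    return (line for line in iter_log_lines(paths) if "Googlebot" in line)


# Demo with two tiny, made-up files: one plain, one gzip-compressed.
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "access.log")
    rotated = os.path.join(tmp, "access.log.1.gz")
    with open(plain, "w", encoding="utf-8") as handle:
        handle.write('GET /a "Googlebot/2.1"\nGET /b "SomeBrowser/1.0"\n')
    with gzip.open(rotated, "wt", encoding="utf-8") as handle:
        handle.write('GET /c "Googlebot/2.1"\n')
    matches = list(googlebot_lines([rotated, plain]))

print(len(matches))  # 2
```

The same approach works for logs collected from several servers: pass all the file paths in one list and process them as a single stream.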

Analyze Log Files

Now that you have your log file, it’s time to analyze it. 

You can analyze log files manually using Google Sheets and other tools. But it’s tiresome. And it can get messy. Quickly. 
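
If you do want to go the manual route, the first step is usually turning raw log lines into a CSV you can import into a spreadsheet. Here is a minimal Python sketch of that step. It assumes the "combined" log format, and the sample lines are invented. Lines that don't match the pattern are skipped.

```python
import csv
import io
import re

# Pattern for the "combined" log format (an assumption; adjust to your server).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) "[^"]*" "(?P<user_agent>[^"]*)"'
)
FIELDS = ["ip", "time", "method", "path", "status", "size", "user_agent"]


def logs_to_csv(lines, out):
    """Write matching log lines to `out` as CSV and return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    written = 0
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            writer.writerow(match.groupdict())
            written += 1
    return written


# Hypothetical input: one valid line and one malformed line.
lines = [
    '203.0.113.7 - - [14/Aug/2023:10:15:32 +0000] '
    '"GET /blog/ HTTP/1.1" 200 5123 "-" "Googlebot/2.1"',
    "this line is malformed and gets skipped",
]
buffer = io.StringIO()
rows = logs_to_csv(lines, buffer)
print(rows)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, open it with `open("access.csv", "w", newline="", encoding="utf-8")` and pass that as `out`.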

We recommend using our Log File Analyzer.

First, make sure your log file is unarchived and in the access.log, W3C, or Kinsta file format. 

Then, drag and drop it into the tool and click “Start Log File Analyzer.” 

Log File Analyzer tool

You’ll see a chart displaying Googlebot activity. 

It shows daily hits, a breakdown of status codes, and the file types Googlebot has requested. 

"Googlebot activity" section in Log File Analyzer tool

You can use these insights to understand:

  • How many requests Google makes to your site each day
  • Which HTTP status codes Google encounters each day
  • Which file types Google crawls each day
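
For a sense of what sits behind a chart like this, here is a rough Python sketch that computes the same three breakdowns from parsed log entries. This is not how the tool itself works. It's an illustration under assumptions: each entry is a (timestamp, path, status) tuple, timestamps use the combined-log style with English month names, and the sample data is invented.

```python
import os
from collections import Counter, defaultdict
from datetime import datetime
from urllib.parse import urlsplit


def summarize_by_day(entries):
    """Count hits, status codes, and file extensions per day."""
    summary = defaultdict(
        lambda: {"hits": 0, "statuses": Counter(), "file_types": Counter()}
    )
    for timestamp, path, status in entries:
        day = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z").date().isoformat()
        # Drop any query string, then take the extension (".js", ".png", ...).
        extension = os.path.splitext(urlsplit(path).path)[1].lower() or "(none)"
        summary[day]["hits"] += 1
        summary[day]["statuses"][status] += 1
        summary[day]["file_types"][extension] += 1
    return dict(summary)


# Hypothetical Googlebot requests.
entries = [
    ("14/Aug/2023:10:15:32 +0000", "/blog/", "200"),
    ("14/Aug/2023:10:16:01 +0000", "/assets/app.js?v=3", "200"),
    ("14/Aug/2023:11:02:45 +0000", "/old-page", "404"),
    ("15/Aug/2023:09:00:10 +0000", "/images/logo.png", "200"),
]
daily = summarize_by_day(entries)
for day, stats in sorted(daily.items()):
    print(day, stats["hits"], dict(stats["statuses"]), dict(stats["file_types"]))
```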

If you scroll down, you’ll see a table with insights for specific pages and folders.

"Hits by Pages" table in Log File Analyzer tool

You can sort by the “Crawl Frequency” column to see how Google is spending its crawl budget. 

“Crawl Frequency” column highlighted in the table
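
A similar view can be sketched in a few lines of Python: count how often each path appears among Googlebot requests and show each path's share of the total. The paths below are hypothetical, and this only approximates the idea behind the column, not the tool's own calculation.

```python
from collections import Counter


def crawl_frequency(paths, top=3):
    """Return (path, hits, share of all hits in %) for the most crawled paths."""
    counts = Counter(paths)
    total = sum(counts.values())
    return [
        (path, hits, round(hits / total * 100, 1))
        for path, hits in counts.most_common(top)
    ]


# Hypothetical paths requested by Googlebot.
googlebot_paths = ["/", "/blog/", "/", "/pricing/", "/blog/", "/", "/tag/old/?page=9"]
report = crawl_frequency(googlebot_paths)
for path, hits, share in report:
    print(f"{path}: {hits} hits ({share}%)")
```

If low-value URLs like old tag or parameter pages take up a large share, that's crawl budget not being spent on your important pages.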

Or, click the “Inconsistent status codes” button to see paths that return different status codes from one request to the next.

For example, a path might return a 404 (page not found) on one request and a 301 (permanent redirect) on another. 

an example of “Inconsistent status codes” section
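
The underlying check is simple to sketch in Python: group status codes by path and keep the paths that show more than one. The (path, status) pairs below are made up, and this is an illustration of the idea rather than the tool's implementation.

```python
from collections import defaultdict


def inconsistent_status_paths(hits):
    """Return {path: sorted status codes} for paths seen with more than one status."""
    seen = defaultdict(set)
    for path, status in hits:
        seen[path].add(status)
    return {path: sorted(codes) for path, codes in seen.items() if len(codes) > 1}


# Hypothetical (path, status) pairs taken from Googlebot requests.
hits = [
    ("/old-page", "404"),
    ("/old-page", "301"),
    ("/blog/", "200"),
    ("/blog/", "200"),
]
flagged = inconsistent_status_paths(hits)
print(flagged)  # {'/old-page': ['301', '404']}
```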

Using the tool makes server log analysis simple and straightforward. So you can spend time optimizing your site, not analyzing data.

Ensure Crawlability Is a Priority

Now you know how to access and analyze your log file. But don’t stop there. 

You need to take proactive steps to make sure your site is optimized for crawlability. 

This means going beyond log files and auditing your site to get even more data. 

For example, you can run your site through Site Audit to see a dashboard with important recommendations like this one:

"Overview" dashboard in the Site Audit tool

Head to the “Issues” tab and select “Crawlability” in the “Category” drop-down. 

filter “Crawlability” in the “Category” drop-down menu in Site Audit tool

These are all the issues affecting your site’s crawlability. 

If you don’t know what an issue means or how to address it, click on “Why and how to fix it” to learn more. 

an example of “Why and how to fix it” section explaining 4xx error and how to fix it

Run an audit like this on a monthly basis. And iron out any issues that pop up. 

You need to make sure Google and other search engines can crawl and index your webpages in order to rank them.
