Practical Example: Archiving Old Log Files

๐Ÿท๏ธ Operating System and System Operations / The shutil Module

๐Ÿง  Context Introduction

Log files are essential for troubleshooting and monitoring, but they can quickly consume disk space if left unchecked. Engineers often need to archive or compress older log files to free up storage while preserving them for future reference. This practical example demonstrates how to use Python's shutil module to automate the process of identifying and archiving old log files into a compressed archive.


โš™๏ธ The Goal

The objective is to create a script that:

  • Scans a specified directory for log files (e.g., files ending with .log)
  • Identifies files older than a certain number of days (e.g., 7 days)
  • Compresses those old files into a single archive (e.g., .zip or .tar.gz)
  • Removes the original log files after successful archiving

๐Ÿ› ๏ธ Key Tools from the shutil Module

The shutil module provides high-level file operations. For this task, we rely on:

  • shutil.make_archive() โ€“ Creates a compressed archive (zip, tar, etc.) from a collection of files
  • shutil.rmtree() โ€“ Removes a directory and all its contents (useful if logs are stored in subdirectories)
  • os.path.getmtime() โ€“ Retrieves the last modification time of a file to determine its age

๐Ÿ“Š Comparison: Archive Formats

Format Compression Platform Support Use Case
zip Moderate Cross-platform General purpose, easy to open on any OS
tar.gz High (gzip) Linux/macOS preferred Better compression, common in server environments
tar.bz2 Very high Linux/macOS preferred Maximum compression, slower to create

For this example, we will use zip format for simplicity and broad compatibility.


๐Ÿ•ต๏ธ Step-by-Step Script Logic

Step 1: Define the target directory and age threshold

  • Set a variable for the directory containing log files (e.g., /var/log/myapp)
  • Set a variable for the maximum age in days (e.g., 7)

Step 2: List all files in the directory

  • Use os.listdir() to get all entries in the directory
  • Filter for files ending with .log using str.endswith()

Step 3: Check file age

  • For each log file, use os.path.getmtime() to get the last modification timestamp
  • Compare the timestamp with the current time minus the age threshold (using time.time())
  • If the file is older than the threshold, add it to a list of files to archive

Step 4: Create the archive

  • Use shutil.make_archive() with the base name (e.g., old_logs_archive), format (zip), and the list of file paths
  • The archive will be created in the current working directory or a specified location

Step 5: Remove the original files

  • After successful archiving, use os.remove() to delete each original log file
  • Optionally, log the names of archived and removed files for auditing

๐Ÿงช Example Script Structure (Inline)

The script begins by importing os, shutil, and time. It then defines the target directory and age threshold. A function called archive_old_logs() is created to encapsulate the logic.

Inside the function:

  • A list called files_to_archive is initialized as empty
  • The script loops through all entries in the target directory using os.listdir()
  • For each entry that ends with .log, it checks the file's modification time
  • If the file is older than the threshold, its full path is appended to files_to_archive
  • After the loop, if there are files to archive, shutil.make_archive() is called with a base name like archived_logs, format zip, and the root directory set to the target directory. The files to archive are passed as a list of relative paths
  • Finally, each original file in files_to_archive is deleted using os.remove()

The script prints a message for each file archived and removed, and a summary at the end.


โœ… Expected Outcome

When the script runs:

  • It scans the target directory and identifies all .log files older than 7 days
  • It creates a single .zip file named archived_logs.zip containing those old log files
  • It deletes the original log files from the directory
  • The console output shows something like:

Archived: /var/log/myapp/error_20231001.log
Archived: /var/log/myapp/access_20230928.log
Removed: /var/log/myapp/error_20231001.log
Removed: /var/log/myapp/access_20230928.log
Archiving complete. 2 files processed.


๐Ÿงน Best Practices for Engineers

  • Always test the script on a copy of your log directory first to avoid accidental data loss
  • Use absolute paths to avoid confusion about the current working directory
  • Add error handling (e.g., try-except blocks) to manage permission issues or missing files
  • Schedule the script using cron (Linux) or Task Scheduler (Windows) for regular automated cleanup
  • Consider logging the script's actions to a separate file for audit trails

๐Ÿ” Summary

This practical example shows how the shutil module simplifies file archiving tasks that would otherwise require complex shell commands or manual effort. By combining shutil.make_archive() with file age checks, engineers can build a reliable log rotation system that keeps storage under control without losing historical data. The same approach can be extended to archive other file types or to compress entire directories.


This example shows how to use Python's shutil module to archive old log files into compressed archives for storage or cleanup.


๐Ÿ“ฆ Example 1: Creating a simple ZIP archive of a single file

This example demonstrates how to compress one log file into a ZIP archive.

import shutil
import os

log_file = "server.log"
archive_name = "server_log_backup"

shutil.make_archive(archive_name, "zip", ".", log_file)

๐Ÿ“ค Output: server_log_backup.zip created in current directory


๐Ÿ“ฆ Example 2: Archiving an entire directory of log files

This example shows how to compress a folder containing multiple log files into one archive.

import shutil

log_directory = "logs"
archive_name = "logs_backup"

shutil.make_archive(archive_name, "zip", log_directory)

๐Ÿ“ค Output: logs_backup.zip created containing all files from logs/


๐Ÿ“ฆ Example 3: Archiving only files older than 7 days

This example demonstrates how to filter and archive log files based on their age.

import shutil
import os
import time

log_directory = "logs"
archive_name = "old_logs_backup"
temp_dir = "temp_old_logs"

os.makedirs(temp_dir, exist_ok=True)
seven_days_ago = time.time() - (7 * 24 * 60 * 60)

for file_name in os.listdir(log_directory):
    file_path = os.path.join(log_directory, file_name)
    if os.path.getmtime(file_path) < seven_days_ago:
        shutil.copy2(file_path, temp_dir)

shutil.make_archive(archive_name, "zip", temp_dir)
shutil.rmtree(temp_dir)

๐Ÿ“ค Output: old_logs_backup.zip created with files older than 7 days


๐Ÿ“ฆ Example 4: Archiving and deleting original log files after backup

This example shows how to archive old logs and then remove the original files to free up space.

import shutil
import os
import time

log_directory = "logs"
archive_name = "archived_logs"
temp_dir = "temp_archive"

os.makedirs(temp_dir, exist_ok=True)
seven_days_ago = time.time() - (7 * 24 * 60 * 60)

for file_name in os.listdir(log_directory):
    file_path = os.path.join(log_directory, file_name)
    if os.path.getmtime(file_path) < seven_days_ago:
        shutil.copy2(file_path, temp_dir)
        os.remove(file_path)

shutil.make_archive(archive_name, "zip", temp_dir)
shutil.rmtree(temp_dir)

๐Ÿ“ค Output: archived_logs.zip created, original old log files deleted


๐Ÿ“ฆ Example 5: Archiving logs with date-stamped archive names

This example demonstrates how to create archives with the current date in the filename for better organization.

import shutil
import os
from datetime import datetime

log_directory = "logs"
current_date = datetime.now().strftime("%Y-%m-%d")
archive_name = f"logs_archive_{current_date}"

shutil.make_archive(archive_name, "zip", log_directory)

๐Ÿ“ค Output: logs_archive_2024-01-15.zip created (date will vary)


๐Ÿ“Š Comparison Table: Archive Methods

Method Use Case Preserves Original Files Creates Archive
make_archive with single file Quick backup of one log Yes Yes
make_archive with directory Backup entire log folder Yes Yes
Filter by age + archive Archive old logs only Yes Yes
Archive + delete originals Free up disk space No Yes
Date-stamped archive name Organized backups Yes Yes

๐Ÿง  Context Introduction

Log files are essential for troubleshooting and monitoring, but they can quickly consume disk space if left unchecked. Engineers often need to archive or compress older log files to free up storage while preserving them for future reference. This practical example demonstrates how to use Python's shutil module to automate the process of identifying and archiving old log files into a compressed archive.


โš™๏ธ The Goal

The objective is to create a script that:

  • Scans a specified directory for log files (e.g., files ending with .log)
  • Identifies files older than a certain number of days (e.g., 7 days)
  • Compresses those old files into a single archive (e.g., .zip or .tar.gz)
  • Removes the original log files after successful archiving

๐Ÿ› ๏ธ Key Tools from the shutil Module

The shutil module provides high-level file operations. For this task, we rely on:

  • shutil.make_archive() โ€“ Creates a compressed archive (zip, tar, etc.) from a collection of files
  • shutil.rmtree() โ€“ Removes a directory and all its contents (useful if logs are stored in subdirectories)
  • os.path.getmtime() โ€“ Retrieves the last modification time of a file to determine its age

๐Ÿ“Š Comparison: Archive Formats

Format Compression Platform Support Use Case
zip Moderate Cross-platform General purpose, easy to open on any OS
tar.gz High (gzip) Linux/macOS preferred Better compression, common in server environments
tar.bz2 Very high Linux/macOS preferred Maximum compression, slower to create

For this example, we will use zip format for simplicity and broad compatibility.


๐Ÿ•ต๏ธ Step-by-Step Script Logic

Step 1: Define the target directory and age threshold

  • Set a variable for the directory containing log files (e.g., /var/log/myapp)
  • Set a variable for the maximum age in days (e.g., 7)

Step 2: List all files in the directory

  • Use os.listdir() to get all entries in the directory
  • Filter for files ending with .log using str.endswith()

Step 3: Check file age

  • For each log file, use os.path.getmtime() to get the last modification timestamp
  • Compare the timestamp with the current time minus the age threshold (using time.time())
  • If the file is older than the threshold, add it to a list of files to archive

Step 4: Create the archive

  • Use shutil.make_archive() with the base name (e.g., old_logs_archive), format (zip), and the list of file paths
  • The archive will be created in the current working directory or a specified location

Step 5: Remove the original files

  • After successful archiving, use os.remove() to delete each original log file
  • Optionally, log the names of archived and removed files for auditing

๐Ÿงช Example Script Structure (Inline)

The script begins by importing os, shutil, and time. It then defines the target directory and age threshold. A function called archive_old_logs() is created to encapsulate the logic.

Inside the function:

  • A list called files_to_archive is initialized as empty
  • The script loops through all entries in the target directory using os.listdir()
  • For each entry that ends with .log, it checks the file's modification time
  • If the file is older than the threshold, its full path is appended to files_to_archive
  • After the loop, if there are files to archive, shutil.make_archive() is called with a base name like archived_logs, format zip, and the root directory set to the target directory. The files to archive are passed as a list of relative paths
  • Finally, each original file in files_to_archive is deleted using os.remove()

The script prints a message for each file archived and removed, and a summary at the end.


โœ… Expected Outcome

When the script runs:

  • It scans the target directory and identifies all .log files older than 7 days
  • It creates a single .zip file named archived_logs.zip containing those old log files
  • It deletes the original log files from the directory
  • The console output shows something like:

Archived: /var/log/myapp/error_20231001.log
Archived: /var/log/myapp/access_20230928.log
Removed: /var/log/myapp/error_20231001.log
Removed: /var/log/myapp/access_20230928.log
Archiving complete. 2 files processed.


๐Ÿงน Best Practices for Engineers

  • Always test the script on a copy of your log directory first to avoid accidental data loss
  • Use absolute paths to avoid confusion about the current working directory
  • Add error handling (e.g., try-except blocks) to manage permission issues or missing files
  • Schedule the script using cron (Linux) or Task Scheduler (Windows) for regular automated cleanup
  • Consider logging the script's actions to a separate file for audit trails

๐Ÿ” Summary

This practical example shows how the shutil module simplifies file archiving tasks that would otherwise require complex shell commands or manual effort. By combining shutil.make_archive() with file age checks, engineers can build a reliable log rotation system that keeps storage under control without losing historical data. The same approach can be extended to archive other file types or to compress entire directories.

Interactive Views

You are currently in ๐Ÿ“š All-in-One mode. Use the tabs at the top to switch to ๐Ÿ“– Theory Only or ๐Ÿ’ป Code Only views.

This example shows how to use Python's shutil module to archive old log files into compressed archives for storage or cleanup.


๐Ÿ“ฆ Example 1: Creating a simple ZIP archive of a single file

This example demonstrates how to compress one log file into a ZIP archive.

import shutil
import os

log_file = "server.log"
archive_name = "server_log_backup"

shutil.make_archive(archive_name, "zip", ".", log_file)

๐Ÿ“ค Output: server_log_backup.zip created in current directory


๐Ÿ“ฆ Example 2: Archiving an entire directory of log files

This example shows how to compress a folder containing multiple log files into one archive.

import shutil

log_directory = "logs"
archive_name = "logs_backup"

shutil.make_archive(archive_name, "zip", log_directory)

๐Ÿ“ค Output: logs_backup.zip created containing all files from logs/


๐Ÿ“ฆ Example 3: Archiving only files older than 7 days

This example demonstrates how to filter and archive log files based on their age.

import shutil
import os
import time

log_directory = "logs"
archive_name = "old_logs_backup"
temp_dir = "temp_old_logs"

os.makedirs(temp_dir, exist_ok=True)
seven_days_ago = time.time() - (7 * 24 * 60 * 60)

for file_name in os.listdir(log_directory):
    file_path = os.path.join(log_directory, file_name)
    if os.path.getmtime(file_path) < seven_days_ago:
        shutil.copy2(file_path, temp_dir)

shutil.make_archive(archive_name, "zip", temp_dir)
shutil.rmtree(temp_dir)

๐Ÿ“ค Output: old_logs_backup.zip created with files older than 7 days


๐Ÿ“ฆ Example 4: Archiving and deleting original log files after backup

This example shows how to archive old logs and then remove the original files to free up space.

import shutil
import os
import time

log_directory = "logs"
archive_name = "archived_logs"
temp_dir = "temp_archive"

os.makedirs(temp_dir, exist_ok=True)
seven_days_ago = time.time() - (7 * 24 * 60 * 60)

for file_name in os.listdir(log_directory):
    file_path = os.path.join(log_directory, file_name)
    if os.path.getmtime(file_path) < seven_days_ago:
        shutil.copy2(file_path, temp_dir)
        os.remove(file_path)

shutil.make_archive(archive_name, "zip", temp_dir)
shutil.rmtree(temp_dir)

๐Ÿ“ค Output: archived_logs.zip created, original old log files deleted


๐Ÿ“ฆ Example 5: Archiving logs with date-stamped archive names

This example demonstrates how to create archives with the current date in the filename for better organization.

import shutil
import os
from datetime import datetime

log_directory = "logs"
current_date = datetime.now().strftime("%Y-%m-%d")
archive_name = f"logs_archive_{current_date}"

shutil.make_archive(archive_name, "zip", log_directory)

๐Ÿ“ค Output: logs_archive_2024-01-15.zip created (date will vary)


๐Ÿ“Š Comparison Table: Archive Methods

Method Use Case Preserves Original Files Creates Archive
make_archive with single file Quick backup of one log Yes Yes
make_archive with directory Backup entire log folder Yes Yes
Filter by age + archive Archive old logs only Yes Yes
Archive + delete originals Free up disk space No Yes
Date-stamped archive name Organized backups Yes Yes