Handling Non-Standard Delimiters (Semicolons, Pipes)


When working with CSV files in real-world scenarios, you'll often encounter files that don't use commas as delimiters. Many systems export data using semicolons (;) or pipes (|) instead. This is especially common in European locales where commas are used as decimal separators, or in legacy systems that prefer pipe-delimited formats. Python's csv module handles these non-standard delimiters gracefully with just a small configuration change.


Why Non-Standard Delimiters Exist

  • Semicolons are commonly used when data contains commas (e.g., addresses like "New York, NY" or numbers like "1,234.56").
  • Pipes are preferred when data may contain both commas and semicolons, as pipes rarely appear in normal text.
  • Some enterprise systems use tabs or spaces as delimiters for specific export formats.

Reading Files with Custom Delimiters

The key to handling non-standard delimiters is the delimiter parameter of the csv.reader function. Instead of the default comma, you specify the single character that separates your fields.

For a semicolon-delimited file:

  • Create a reader object with csv.reader(file, delimiter=';')
  • Each row is split at every semicolon that is not inside a quoted field, and is returned as a list of strings
  • This works exactly like a comma-delimited file, just with a different separator

For a pipe-delimited file:

  • Use csv.reader(file, delimiter='|')
  • The pipe character acts as the field separator
  • All other CSV rules (quoting, escaping) still apply
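To illustrate that quoting still applies with a custom delimiter, here is a minimal sketch. The sample text is made up for illustration and is read from an in-memory buffer rather than a real file.

```python
import csv
import io

# Hypothetical sample data: the last field of the second line is quoted
# because it contains the delimiter itself.
sample = io.StringIO(
    "Name;City;Note\n"
    'Alice;Berlin;"likes tea; hates coffee"\n'
)

rows = list(csv.reader(sample, delimiter=";"))
for row in rows:
    print(row)
```

The semicolon inside the quoted field is kept as data, so the second row still has three fields.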


Comparison Table: Delimiter Types

Delimiter   Python Parameter   Common Use Case                      Example Row
Comma       delimiter=','      Standard CSV format                  Name,Age,City
Semicolon   delimiter=';'      European locales, comma-heavy data   Name;Age;City
Pipe        delimiter='|'      Legacy systems, safe separator       Name|Age|City
Tab         delimiter='\t'     TSV files, spreadsheet exports       Name\tAge\tCity

Detecting Delimiters Automatically

When you don't know the delimiter in advance, you can use Python's csv.Sniffer class to detect it automatically:

  • The Sniffer analyzes a sample of your file to determine the delimiter, quote character, and other formatting details
  • Call csv.Sniffer().sniff(sample), where sample is a string containing the first part of your file's content; sniff raises csv.Error if it cannot determine a delimiter
  • The returned dialect object contains a delimiter attribute that reveals the detected character
  • This is extremely useful when processing files from unknown sources or when the delimiter may vary
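The detection steps above might look like the following sketch. The helper name detect_delimiter, its candidate list, and its comma fallback are assumptions made for illustration; only csv.Sniffer, sniff, and csv.Error come from the standard library.

```python
import csv

def detect_delimiter(sample, candidates=";|,\t", default=","):
    """Guess the delimiter of a text sample (hypothetical helper).

    Restricting the candidates keeps the Sniffer from picking an
    unrelated character; the default is returned if detection fails.
    """
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        return default
    return dialect.delimiter

# Made-up samples standing in for the first lines of unknown files.
pipe_sample = "Item|Quantity|Price\nWidget|150|2.50\nGadget|75|8.00\n"
semicolon_sample = "Name;Age;City\nAlice;30;Paris\nBob;25;Rome\n"

print(repr(detect_delimiter(pipe_sample)))
print(repr(detect_delimiter(semicolon_sample)))
```

With a real file, the sample would typically come from file.read(1024) followed by file.seek(0), after which the detected delimiter is passed to csv.reader.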

Writing Files with Custom Delimiters

Writing files with non-standard delimiters follows the same pattern as reading:

  • Create a csv.writer object with the desired delimiter parameter
  • Use writer.writerow() to write each row as a list of values
  • The writer will automatically insert your chosen delimiter between fields
  • You can also pass lineterminator if you need different line endings; csv.writer ends each row with '\r\n' by default
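As a rough sketch of these options together, the snippet below writes pipe-delimited rows with a custom lineterminator. It writes to an in-memory buffer so the result can be inspected directly; the row values are invented for illustration.

```python
import csv
import io

# The buffer stands in for a file opened with newline="".
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|", lineterminator="\n")

writer.writerow(["Item", "Note"])
# This value contains the delimiter, so the writer quotes it.
writer.writerow(["Widget", "size: 3|4"])

output = buffer.getvalue()
print(output)
```

The second row is written as Widget|"size: 3|4", which a reader using the same delimiter splits back into two fields.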

Practical Tips for Delimiter Handling

  • Always check the first few lines of a file manually to confirm the delimiter before writing code
  • When using semicolons, be aware that some European CSV files also use commas as decimal separators within numbers
  • For pipe-delimited files, ensure your data doesn't contain pipe characters, or use quoting to escape them
  • The csv.Sniffer works best with a representative sample of at least a few hundred characters
  • If your data contains the delimiter character within fields, those fields must be wrapped in quotes; csv.writer does this automatically with its default quoting mode (csv.QUOTE_MINIMAL)
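The decimal-comma tip can be sketched as follows. The helper parse_decimal_comma and the sample data are hypothetical, and the conversion assumes the numbers contain no thousands separators.

```python
import csv
import io

def parse_decimal_comma(text):
    """Convert a string such as '899,99' to a float (hypothetical helper).

    Assumes the comma is the decimal separator and that the value
    has no thousands separators.
    """
    return float(text.replace(",", "."))

# Made-up semicolon-delimited data with decimal commas.
sample = io.StringIO(
    "Product;Price\n"
    "Laptop;899,99\n"
    "Mouse;24,5\n"
)

reader = csv.DictReader(sample, delimiter=";")
prices = {row["Product"]: parse_decimal_comma(row["Price"]) for row in reader}
print(prices)
```

The csv module always returns strings, so any numeric conversion like this one is a separate step after parsing.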

Common Pitfalls to Avoid

  • Forgetting to specify the delimiter when reading a non-standard file typically yields a single field per row, because the reader finds no commas to split on
  • Using the wrong delimiter character (e.g., a lowercase L instead of a pipe) will split data incorrectly
  • Assuming all files from a system use the same delimiter (always verify with a sample)
  • Mixing delimiters within the same file (e.g., some rows using commas, others using semicolons) does not raise an error; the csv module silently returns rows with the wrong number of fields
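Because these mistakes fail silently, a field-count check can help catch them early. The sketch below is one possible approach; the helper inconsistent_rows and the sample text are assumptions for illustration.

```python
import csv
import io

# Made-up data: the last line mixes a semicolon with a pipe.
text = "Name;Age;City\nGrace;32;Boston\nHenry;28|Dallas\n"

# Pitfall: with the default comma delimiter nothing is split,
# so every row comes back as a single field.
wrong = list(csv.reader(io.StringIO(text)))
print(wrong[0])

def inconsistent_rows(text, delimiter):
    """Return (line_number, row) pairs whose field count differs
    from that of the first row (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    expected = len(rows[0])
    return [
        (number, row)
        for number, row in enumerate(rows, start=1)
        if len(row) != expected
    ]

bad = inconsistent_rows(text, ";")
print(bad)
```

Rows reported by such a check can then be inspected by hand or repaired before further processing.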

Summary

Handling non-standard delimiters in Python is straightforward once you understand the delimiter parameter. Whether you're working with semicolons, pipes, tabs, or any other single character, the csv.reader and csv.writer functions take the same parameter. For unknown formats, csv.Sniffer offers heuristic detection that helps your code cope with diverse data sources. Always test with sample data and verify your delimiter choice before processing large files.


This topic shows how to read and write CSV files that use semicolons (;) or pipes (|) instead of commas as delimiters.


Example 1: Reading a CSV file with semicolon delimiter

This example reads a simple CSV file where columns are separated by semicolons.

import csv

# newline="" lets the csv module handle line endings itself
with open("employees.csv", "r", newline="") as file:
    reader = csv.reader(file, delimiter=";")
    for row in reader:
        print(row)  # each row is a list of strings

Output:
['Alice', 'Engineer', '60000']
['Bob', 'Technician', '45000']
['Carol', 'Analyst', '52000']


Example 2: Reading a CSV file with pipe delimiter

This example reads a CSV file where columns are separated by pipe symbols.

import csv

# newline="" lets the csv module handle line endings itself
with open("inventory.csv", "r", newline="") as file:
    reader = csv.reader(file, delimiter="|")
    for row in reader:
        print(row)  # the first row printed is the header

Output:
['Item', 'Quantity', 'Price']
['Widget', '150', '2.50']
['Gadget', '75', '8.00']
['Doodad', '200', '1.20']


Example 3: Writing a CSV file with semicolon delimiter

This example writes data to a CSV file using semicolons as the delimiter.

import csv

data = [
    ["Name", "Department", "Salary"],
    ["Dave", "Engineering", 72000],
    ["Eve", "Marketing", 58000],
    ["Frank", "Sales", 63000]
]

with open("departments.csv", "w", newline="") as file:
    writer = csv.writer(file, delimiter=";")
    for row in data:
        writer.writerow(row)

Output (nothing is printed; departments.csv contains):
Name;Department;Salary
Dave;Engineering;72000
Eve;Marketing;58000
Frank;Sales;63000


Example 4: Reading a semicolon-delimited file with header row

This example reads a semicolon-delimited file and accesses data by column name using DictReader. It assumes a version of employees.csv whose first line is a header row that includes the Name and Role columns.

import csv

with open("employees.csv", "r", newline="") as file:
    # DictReader uses the first row as the dictionary keys
    reader = csv.DictReader(file, delimiter=";")
    for row in reader:
        print(row["Name"], "works as", row["Role"])

Output:
Alice works as Engineer
Bob works as Technician
Carol works as Analyst


Example 5: Converting a pipe-delimited file to a list of dictionaries

This example reads a pipe-delimited file and stores each row as a dictionary for easier data access.

import csv

records = []

with open("inventory.csv", "r", newline="") as file:
    # Each row becomes a dict keyed by the header names
    reader = csv.DictReader(file, delimiter="|")
    for row in reader:
        records.append(row)

# All values are strings, exactly as they appear in the file
for item in records:
    print(f"{item['Item']}: {item['Quantity']} units at ${item['Price']} each")

Output:
Widget: 150 units at $2.50 each
Gadget: 75 units at $8.00 each
Doodad: 200 units at $1.20 each


Example 6: Writing a pipe-delimited file from a list of dictionaries

This example writes data from a list of dictionaries to a pipe-delimited CSV file.

import csv

data = [
    {"Product": "Laptop", "Stock": 30, "Price": 899.99},
    {"Product": "Mouse", "Stock": 120, "Price": 24.99},
    {"Product": "Keyboard", "Stock": 85, "Price": 49.99}
]

with open("products.csv", "w", newline="") as file:
    fieldnames = ["Product", "Stock", "Price"]
    writer = csv.DictWriter(file, fieldnames=fieldnames, delimiter="|")
    writer.writeheader()
    for row in data:
        writer.writerow(row)

Output (nothing is printed; products.csv contains):
Product|Stock|Price
Laptop|30|899.99
Mouse|120|24.99
Keyboard|85|49.99


Example 7: Handling mixed delimiters in a single file

This example handles lines that mix semicolons and pipes by normalizing every pipe to a semicolon before parsing. The lines are held in a list here; the same approach works on lines read from a file, provided neither character appears inside a field value.

import csv

lines = [
    "Name;Age;City",
    "Grace|32|Boston",
    "Henry;28|Dallas",
    "Iris|35;Miami"
]

# Replace every pipe with a semicolon so all lines share one delimiter
normalized = (line.replace("|", ";") for line in lines)

# csv.reader accepts any iterable of strings, not only file objects
for row in csv.reader(normalized, delimiter=";"):
    print(row)

Output:
['Name', 'Age', 'City']
['Grace', '32', 'Boston']
['Henry', '28', 'Dallas']
['Iris', '35', 'Miami']


Comparison Table: Delimiter Types

Feature            Comma (,)                        Semicolon (;)                  Pipe (|)
Common use         Standard CSV                     European locale data           Log files, system exports
Risk of conflict   High (data may contain commas)   Low                            Very low
Readability        Good for simple data             Good when commas are in data   Excellent for complex data
Python parameter   delimiter="," (default)          delimiter=";"                  delimiter="|"
