Python Learning Guide: File Handling & I/O

Who this guide is for

  • Learners building scripts that read and write real files
  • Developers handling text, JSON, CSV, and binary data
  • Anyone moving from toy examples to practical data workflows

What you’ll learn

  • Safe file operations using context managers (with)
  • Reading and writing text, JSON, and CSV files
  • The difference between text and binary file handling
  • Why pathlib is preferred over manual path strings
  • Common file I/O errors and robust handling strategies

Why this topic matters

Most production scripts interact with files: configuration, logs, exports, imports, and reports. Reliable I/O is essential for automation, data engineering, backend systems, and testing workflows.

File handling mistakes can corrupt data or crash workflows. This guide focuses on safe defaults so your scripts stay stable and portable across operating systems.

Core concepts

Always use context managers for files

with ensures files close properly, even if errors occur.

with open("notes.txt", "w", encoding="utf-8") as file:
	file.write("Hello file I/O\n")

This avoids leaked file handles and incomplete writes.
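Reading follows the same pattern. A minimal sketch that writes and reads back the same file (the notes.txt name is illustrative):

```python
# Write, then read back, each inside a context manager.
with open("notes.txt", "w", encoding="utf-8") as file:
    file.write("Hello file I/O\n")

with open("notes.txt", "r", encoding="utf-8") as file:
    content = file.read()

print(content)  # -> Hello file I/O
```

Both handles are closed as soon as their with block exits, even if an exception is raised inside it.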

Structured formats: JSON and CSV

Use dedicated modules instead of manual string parsing.

JSON example:

import json

payload = {"name": "Ava", "active": True}
with open("user.json", "w", encoding="utf-8") as file:
	json.dump(payload, file, indent=2)

CSV example:

import csv

rows = [["name", "score"], ["A", 80], ["B", 92]]
with open("scores.csv", "w", newline="", encoding="utf-8") as file:
	writer = csv.writer(file)
	writer.writerows(rows)

YAML example (requires the third-party PyYAML package, imported as yaml):

import yaml

config = {"env": "dev", "debug": True}
with open("config.yaml", "w", encoding="utf-8") as file:
    yaml.safe_dump(config, file)

pathlib for modern path operations

pathlib improves readability and cross-platform compatibility.

from pathlib import Path

report_path = Path("data") / "report.txt"
report_path.parent.mkdir(parents=True, exist_ok=True)
report_path.write_text("Report generated", encoding="utf-8")

This is safer than string concatenation for paths.

pathlib vs os.path quick comparison:

  • pathlib: object-oriented and usually easier to read
  • os.path: older functional style, still widely used in legacy code
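The same join-and-inspect operation in both styles, for comparison (the paths are illustrative):

```python
import os.path
from pathlib import Path

# Legacy functional style: strings in, strings out.
legacy = os.path.join("data", "report.txt")

# Modern object-oriented style: the / operator joins path segments.
modern = Path("data") / "report.txt"

print(legacy)                        # data/report.txt (data\report.txt on Windows)
print(modern.suffix)                 # .txt
print(os.path.splitext(legacy)[1])  # .txt
```

Both produce the correct separator for the current operating system; pathlib simply keeps the result as an object with useful attributes like .suffix, .name, and .parent.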

Binary file example:

binary_data = bytes([80, 89, 84, 72, 79, 78])

with open("sample.bin", "wb") as file:
	file.write(binary_data)

with open("sample.bin", "rb") as file:
	loaded = file.read()

print(loaded)

Expected output:

b'PYTHON'

Step-by-step walkthrough

Step 1 — Read and write plain text safely

from pathlib import Path

file_path = Path("example.txt")
file_path.write_text("line 1\nline 2\n", encoding="utf-8")

content = file_path.read_text(encoding="utf-8")
print(content)

This pattern is concise and reliable for many scripts.

Step 2 — Work with JSON data

import json
from pathlib import Path

settings = {"theme": "dark", "autosave": True}
Path("settings.json").write_text(json.dumps(settings, indent=2), encoding="utf-8")

loaded = json.loads(Path("settings.json").read_text(encoding="utf-8"))
print(loaded["theme"])

You now have structured data persistence with minimal code.
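A common follow-on pattern is read-modify-write: load the JSON, change one key, and save it back. A minimal sketch reusing the settings.json idea from above (file name and keys are illustrative):

```python
import json
from pathlib import Path

path = Path("settings.json")

# Seed the file, as in the step above.
path.write_text(json.dumps({"theme": "dark", "autosave": True}, indent=2), encoding="utf-8")

# Read-modify-write round trip.
settings = json.loads(path.read_text(encoding="utf-8"))
settings["theme"] = "light"  # change one key
path.write_text(json.dumps(settings, indent=2), encoding="utf-8")

print(json.loads(path.read_text(encoding="utf-8"))["theme"])  # -> light
```

Because the whole document is rewritten each time, this pattern suits small config files rather than large or concurrently edited data.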

Step 3 — Handle I/O errors gracefully

from pathlib import Path

path = Path("missing.txt")

try:
	print(path.read_text(encoding="utf-8"))
except FileNotFoundError:
	print("File not found. Please check the path.")
except PermissionError:
	print("Permission denied when accessing the file.")

Explicit handling gives user-friendly failures.
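When the same fallback logic is needed in several places, it can be wrapped in a small helper. A sketch with an illustrative name and default:

```python
from pathlib import Path

def read_text_or_default(path, default=""):
    """Return a file's contents, or a default when the file is missing."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return default

print(read_text_or_default("missing.txt", "(no content)"))  # -> (no content)
```

Note that the helper deliberately catches only FileNotFoundError; a PermissionError still propagates, since silently swallowing it would hide a real configuration problem.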

Practical examples

Example 1 — Merge multiple text files into one

from pathlib import Path

source_dir = Path("logs")
output = Path("merged.log")

with output.open("w", encoding="utf-8") as out:
	for file_path in sorted(source_dir.glob("*.log")):
		out.write(file_path.read_text(encoding="utf-8"))
		out.write("\n")

print("Merge completed")

Expected output:

Merge completed

Example 2 — Read CSV and compute average

import csv

total = 0
count = 0

with open("scores.csv", "r", encoding="utf-8") as file:
	reader = csv.DictReader(file)
	for row in reader:
		total += int(row["score"])
		count += 1

print(total / count if count else 0)

Expected output (sample):

86.0

Pandas CSV/Excel example (requires the third-party pandas package plus an Excel engine such as openpyxl):

import pandas as pd

df = pd.read_csv("scores.csv")
df.to_excel("scores.xlsx", index=False)
print(df.head())

Common mistakes and how to avoid them

  • Forgetting encoding for text files -> Use encoding="utf-8" explicitly.
  • Writing CSV without newline="" on Windows -> Include newline="" when opening CSV for writing.
  • Manual path string concatenation -> Use pathlib.Path joins.
  • Ignoring exceptions around file access -> Handle FileNotFoundError and permission errors.
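The safe defaults from that list can be applied together in one short sketch (the export directory and file name are illustrative):

```python
import csv
from pathlib import Path

# pathlib join instead of string concatenation.
out_path = Path("export") / "report.csv"
out_path.parent.mkdir(parents=True, exist_ok=True)

# newline="" for CSV writing, plus an explicit encoding.
with out_path.open("w", newline="", encoding="utf-8") as file:
    csv.writer(file).writerow(["name", "score"])

print(out_path.read_text(encoding="utf-8"))
```

Each line above corresponds to one bullet in the mistakes list, which makes the pattern easy to reuse as a checklist.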

Quick practice

  • Write a script that reads a text file and prints line count + word count.
  • Create a JSON config file, load it, and print one selected key.
  • Build a CSV reader that outputs min, max, and average for a numeric column.
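As a starting point for the first exercise, here is one possible sketch (the sample file and function name are illustrative, and splitting on whitespace is only one reasonable definition of a "word"):

```python
from pathlib import Path

def count_lines_and_words(path):
    """Return (line_count, word_count) for a text file."""
    text = Path(path).read_text(encoding="utf-8")
    return len(text.splitlines()), len(text.split())

# Create a tiny sample file, then count it.
sample = Path("sample.txt")
sample.write_text("one two\nthree\n", encoding="utf-8")
print(count_lines_and_words(sample))  # -> (2, 3)
```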

Key takeaways

  • Safe file handling depends on with blocks and explicit encoding.
  • Use built-in modules (json, csv) for structured formats.
  • pathlib is the preferred path API for modern Python projects.
  • Robust scripts handle I/O failures predictably.

Next step

Continue to Testing in Python. In the next guide, you will learn how to verify code behavior with unittest, pytest, and coverage tooling.
