Who this guide is for
- Learners building scripts that read and write real files
- Developers handling text, JSON, CSV, and binary data
- Anyone moving from toy examples to practical data workflows
What you’ll learn
- Safe file operations using context managers (`with`)
- Reading and writing text, JSON, and CSV files
- The difference between text and binary file handling
- Why `pathlib` is preferred over manual path strings
- Common file I/O errors and robust handling strategies
Why this topic matters
Most production scripts interact with files: configuration, logs, exports, imports, and reports. Reliable I/O is essential for automation, data engineering, backend systems, and testing workflows.
File handling mistakes can corrupt data or crash workflows. This guide focuses on safe defaults so your scripts stay stable and portable across operating systems.
Core concepts
Always use context managers for files
The `with` statement ensures files close properly, even if errors occur.
with open("notes.txt", "w", encoding="utf-8") as file:
    file.write("Hello file I/O\n")
This avoids leaked file handles and incomplete writes.
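Reading works the same way. A minimal sketch, reusing the `notes.txt` file from above:

```python
# Write, then read back, each inside its own context manager
with open("notes.txt", "w", encoding="utf-8") as file:
    file.write("Hello file I/O\n")

with open("notes.txt", "r", encoding="utf-8") as file:
    content = file.read()

print(content.strip())  # -> Hello file I/O
```

Both file handles are closed by the time each `with` block ends, so there is nothing to clean up manually.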
Structured formats: JSON and CSV
Use dedicated modules instead of manual string parsing.
JSON example:
import json
payload = {"name": "Ava", "active": True}
with open("user.json", "w", encoding="utf-8") as file:
    json.dump(payload, file, indent=2)
CSV example:
import csv
rows = [["name", "score"], ["A", 80], ["B", 92]]
with open("scores.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerows(rows)
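Reading a CSV back is symmetric. A short sketch, assuming a `scores.csv` written as above (note that `csv` reads every field back as a string):

```python
import csv

# Write a small CSV, then read it back as a list of rows
rows = [["name", "score"], ["A", "80"], ["B", "92"]]
with open("scores.csv", "w", newline="", encoding="utf-8") as file:
    csv.writer(file).writerows(rows)

with open("scores.csv", newline="", encoding="utf-8") as file:
    loaded = list(csv.reader(file))

print(loaded[0])  # header row: ['name', 'score']
```

If you need numeric values, convert them explicitly (for example, `int(row[1])`) after reading.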
YAML example (requires the third-party PyYAML package):
import yaml
config = {"env": "dev", "debug": True}
with open("config.yaml", "w", encoding="utf-8") as file:
    yaml.safe_dump(config, file)
pathlib for modern path operations
pathlib improves readability and cross-platform compatibility.
from pathlib import Path
report_path = Path("data") / "report.txt"
report_path.parent.mkdir(parents=True, exist_ok=True)
report_path.write_text("Report generated", encoding="utf-8")
This is safer than string concatenation for paths.
pathlib vs os.path quick comparison:
- pathlib: object-oriented and usually easier to read
- os.path: older functional style, still widely used in legacy code
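The two styles produce the same path string, so migrating is low risk. A quick side-by-side sketch:

```python
import os.path
from pathlib import Path

# Same join expressed in the legacy and the modern style
legacy = os.path.join("data", "report.txt")
modern = Path("data") / "report.txt"

# Both resolve to the same platform-specific path string
print(legacy == str(modern))  # -> True
```

The `/` operator on `Path` objects handles separators for you, which is the main readability win over `os.path.join`.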
Binary file example:
binary_data = bytes([80, 89, 84, 72, 79, 78])
with open("sample.bin", "wb") as file:
    file.write(binary_data)
with open("sample.bin", "rb") as file:
    loaded = file.read()
print(loaded)
Expected output:
b'PYTHON'
Step-by-step walkthrough
Step 1 — Read and write plain text safely
from pathlib import Path
file_path = Path("example.txt")
file_path.write_text("line 1\nline 2\n", encoding="utf-8")
content = file_path.read_text(encoding="utf-8")
print(content)
This pattern is concise and reliable for many scripts.
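When you need individual lines rather than one big string, `splitlines()` is a convenient follow-up. A small sketch, reusing the `example.txt` file from this step:

```python
from pathlib import Path

file_path = Path("example.txt")
file_path.write_text("line 1\nline 2\n", encoding="utf-8")

# splitlines() drops the trailing newlines for you
for line in file_path.read_text(encoding="utf-8").splitlines():
    print(line)
```

For very large files, prefer iterating over an open file handle instead of reading everything into memory at once.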
Step 2 — Work with JSON data
import json
from pathlib import Path
settings = {"theme": "dark", "autosave": True}
Path("settings.json").write_text(json.dumps(settings, indent=2), encoding="utf-8")
loaded = json.loads(Path("settings.json").read_text(encoding="utf-8"))
print(loaded["theme"])
You now have structured data persistence with minimal code.
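Malformed JSON is a common failure when files are edited by hand. A sketch of catching it, using a deliberately truncated string:

```python
import json

raw = '{"theme": "dark", "autosave": '  # truncated on purpose

try:
    settings = json.loads(raw)
except json.JSONDecodeError as exc:
    # exc.msg and exc.pos describe what failed and where
    print(f"Invalid JSON at position {exc.pos}: {exc.msg}")
```

Catching `json.JSONDecodeError` specifically keeps unrelated errors from being silently swallowed.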
Step 3 — Handle I/O errors gracefully
from pathlib import Path
path = Path("missing.txt")
try:
    print(path.read_text(encoding="utf-8"))
except FileNotFoundError:
    print("File not found. Please check the path.")
except PermissionError:
    print("Permission denied when accessing the file.")
Explicit handling gives user-friendly failures.
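An alternative is to check before reading. A sketch of the look-before-you-leap style:

```python
from pathlib import Path

path = Path("missing.txt")

# Check first; note this can race with another process deleting
# the file between the check and the read, so try/except is the
# more robust default
if path.exists():
    print(path.read_text(encoding="utf-8"))
else:
    print("File not found. Please check the path.")
```

Prefer the `try`/`except` form for anything production-facing; `exists()` checks are best kept to quick scripts.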
Practical examples
Example 1 — Merge multiple text files into one
from pathlib import Path
source_dir = Path("logs")
output = Path("merged.log")
with output.open("w", encoding="utf-8") as out:
    for file_path in sorted(source_dir.glob("*.log")):
        out.write(file_path.read_text(encoding="utf-8"))
        out.write("\n")
print("Merge completed")
Expected output:
Merge completed
Example 2 — Read CSV and compute average
import csv
total = 0
count = 0
with open("scores.csv", "r", encoding="utf-8") as file:
    reader = csv.DictReader(file)
    for row in reader:
        total += int(row["score"])
        count += 1
print(total / count if count else 0)
Expected output (sample):
86.0
Pandas CSV/Excel example (requires the third-party pandas and openpyxl packages):
import pandas as pd
df = pd.read_csv("scores.csv")
df.to_excel("scores.xlsx", index=False)
print(df.head())
Common mistakes and how to avoid them
- Forgetting encoding for text files -> Use `encoding="utf-8"` explicitly.
- Writing CSV without `newline=""` on Windows -> Include `newline=""` when opening CSV files for writing.
- Manual path string concatenation -> Use `pathlib.Path` joins.
- Ignoring exceptions around file access -> Handle `FileNotFoundError` and permission errors.
Quick practice
- Write a script that reads a text file and prints line count + word count.
- Create a JSON config file, load it, and print one selected key.
- Build a CSV reader that outputs min, max, and average for a numeric column.
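One possible sketch for the first exercise, assuming a hypothetical `sample.txt` created for the demonstration:

```python
from pathlib import Path

# Create a sample file so the script is self-contained
path = Path("sample.txt")
path.write_text("hello world\nsecond line here\n", encoding="utf-8")

text = path.read_text(encoding="utf-8")
lines = text.splitlines()
words = text.split()

print(f"lines: {len(lines)}, words: {len(words)}")  # -> lines: 2, words: 5
```

Try adapting it to take the file name from `sys.argv` so it works on any file.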
Key takeaways
- Safe file handling depends on `with` blocks and explicit encoding.
- Use built-in modules (`json`, `csv`) for structured formats.
- `pathlib` is the preferred path API for modern Python projects.
- Robust scripts handle I/O failures predictably.
Next step
Continue to Testing in Python. In the next guide, you will learn how to verify code behavior with unittest, pytest, and coverage tooling.