psd-tools: Compression module has unguarded zlib decompression, missing dimension validation, and hardening gaps

Summary

A security review of the psd_tools.compression module (conducted against the fix/invalid-rle-compression branch, commits 7490ffa–2a006f5) identified the following pre-existing issues. The two findings introduced and fixed by those commits (Cython buffer overflow, IndexError on lone repeat header) are excluded from this report.

Findings

1. Unguarded `zlib.decompress` — ZIP bomb / memory exhaustion (Medium)

Location: src/psd_tools/compression/__init__.py, lines 159 and 162

result = zlib.decompress(data)          # Compression.ZIP
decompressed = zlib.decompress(data)    # Compression.ZIP_WITH_PREDICTION

zlib.decompress is called with no limit on output size. The one-shot function has no output-size parameter; only the streaming decompressobj().decompress() method accepts max_length. A crafted PSD file containing a ZIP-compressed channel whose payload expands to gigabytes would exhaust process memory before any limit is enforced. The RLE path is not vulnerable to this because the decoder pre-allocates exactly row_size × height bytes; the ZIP path has no equivalent ceiling.

Impact: Denial-of-service / OOM crash when processing untrusted PSD files.

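Suggested mitigation: Decompress through zlib.decompressobj() and pass max_length to its decompress() method (zlib.decompress itself accepts no output limit). Derive the cap from the expected width * height * max(1, depth // 8) byte count already computed in decompress().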
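
A minimal sketch of a bounded decompression helper, assuming the caller already knows the expected byte count. The name bounded_zlib_decompress and its expected_length parameter are hypothetical and not part of psd-tools. The sketch uses zlib.decompressobj(), because the one-shot zlib.decompress() takes no output limit.

```python
import zlib


def bounded_zlib_decompress(data: bytes, expected_length: int) -> bytes:
    """Decompress `data`, never materialising more than expected_length + 1 bytes.

    Hypothetical helper for illustration; not the psd-tools implementation.
    """
    decompressor = zlib.decompressobj()
    # Request one byte more than expected so that an over-long stream is
    # detectable without letting it expand any further.
    result = decompressor.decompress(data, expected_length + 1)
    if len(result) > expected_length:
        raise ValueError(
            "decompressed data exceeds expected length of %d bytes" % expected_length
        )
    return result
```

A stream that decompresses to fewer bytes than expected is returned as-is here, so the existing length check in decompress() would still be needed to catch truncated data.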

2. No upper-bound validation on image dimensions before allocation (Low)

Location: src/psd_tools/compression/__init__.py, lines 138 and 193

length = width * height * max(1, depth // 8)   # decompress()
row_size = max(width * depth // 8, 1)           # decode_rle()

None of width, height, or depth is range-checked before these values drive memory allocation. The PSD format (version 2 / PSB) permits dimensions up to 300,000 × 300,000 pixels; at 32 bits per channel that is about 360 GB for a single channel, or roughly 1.44 TB for a 4-channel image. While the OS/Python allocator will reject such a request, there is no early, explicit guard that produces a clean, user-facing error.

Impact: Uncontrolled allocation attempt from a malformed or adversarially crafted PSB file; hard crash rather than a recoverable error.

Suggested mitigation: Validate width, height, and depth against known PSD/PSB limits before entering decompression, and raise a descriptive ValueError early.
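
One possible shape for such a guard is sketched below. The function name validate_channel_size, the version argument, and the max_bytes budget (defaulting to 1 GiB) are assumptions made for illustration, not existing psd-tools API. The limits used are 30,000 pixels per side for PSD, 300,000 for PSB, and channel depths of 1, 8, 16, or 32 bits.

```python
# Assumed limits: 30,000 px per side for PSD (version 1), 300,000 px per side
# for PSB (version 2), and depths of 1, 8, 16, or 32 bits per channel.
MAX_SIDE = {1: 30_000, 2: 300_000}
VALID_DEPTHS = frozenset({1, 8, 16, 32})


def validate_channel_size(width, height, depth, version=2, max_bytes=1 << 30):
    """Return the expected channel byte count, or raise ValueError.

    Hypothetical helper; `max_bytes` is an arbitrary allocation budget.
    """
    if version not in MAX_SIDE:
        raise ValueError("unsupported version: %r" % (version,))
    if depth not in VALID_DEPTHS:
        raise ValueError("unsupported depth: %r" % (depth,))
    limit = MAX_SIDE[version]
    # Zero is allowed here on the assumption that empty channels can occur.
    if not (0 <= width <= limit and 0 <= height <= limit):
        raise ValueError(
            "dimensions %dx%d exceed limit of %d" % (width, height, limit)
        )
    # Same formula as the length computation quoted from decompress().
    length = width * height * max(1, depth // 8)
    if length > max_bytes:
        raise ValueError(
            "channel needs %d bytes, budget is %d" % (length, max_bytes)
        )
    return length
```

The byte budget matters because the format limits alone still admit a 360 GB channel; a caller processing untrusted files would likely want a much lower ceiling than the format maximum.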

3. `assert` used as a runtime integrity check (Low)

Location: src/psd_tools/compression/__init__.py, line 170

assert len(result) == length, "len=%d, expected=%d" % (len(result), length)

This assertion can be silently disabled by running the interpreter with -O (or -OO), which strips all assert statements. If the assertion ever becomes relevant (e.g., after future refactoring), disabling it would allow a length mismatch to propagate silently into downstream image compositing.

Impact: Loss of an integrity guard in optimised deployments.

Suggested mitigation: Replace with an explicit if + raise ValueError(...).
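
As a sketch, the check could look like the following. The wrapper function check_length is hypothetical and exists only to make the snippet self-contained; the error message mirrors the one in the existing assertion.

```python
def check_length(result: bytes, length: int) -> bytes:
    """Return `result` unchanged, or raise if its size is not `length`.

    Unlike `assert`, this check is still executed under `python -O`.
    """
    if len(result) != length:
        raise ValueError("len=%d, expected=%d" % (len(result), length))
    return result
```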

4. `cdef int` indices vs. `Py_ssize_t size` type mismatch in Cython decoder (Low)

Location: src/psd_tools/compression/_rle.pyx, lines 18–20

cdef int i = 0
cdef int j = 0
cdef int length = data.shape[0]

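All loop indices are C signed int (32 bits on mainstream platforms). The size parameter is Py_ssize_t (64-bit on modern platforms). The comparison j < size promotes j to Py_ssize_t, so the comparison itself is well defined. The hazard is that j can never reach a size above INT_MAX (2,147,483,647, about 2.1 GB of row data), and incrementing it past INT_MAX is signed integer overflow, which is undefined behaviour in C. Likewise, cdef int length = data.shape[0] narrows the input length to int, which typically wraps for inputs longer than INT_MAX. In practice, row sizes are bounded by PSD/PSB dimension limits and such values are unreachable; however, the mismatch is a latent defect if the function is ever called directly with large synthetic inputs.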

Impact: Theoretical infinite loop or UB at >2 GB row sizes; not reachable from standard PSD/PSB parsing.

Suggested mitigation: Change cdef int i, j, length to cdef Py_ssize_t.
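
The narrowing can be illustrated from pure Python, using ctypes.c_int32 as a stand-in for a 32-bit C int (an assumption about the platform's int width). This models only the typical wrap-around when an oversized value is stored in an int; it does not reproduce the undefined behaviour of overflowing an index inside the compiled decoder.

```python
import ctypes

INT_MAX = 2**31 - 1


def as_c_int32(value: int) -> int:
    """Return `value` as a 32-bit signed integer would typically store it.

    ctypes performs no overflow checking, so out-of-range values wrap.
    """
    return ctypes.c_int32(value).value


largest_safe = as_c_int32(INT_MAX)       # fits: stays 2147483647
one_past_limit = as_c_int32(INT_MAX + 1)  # wraps to a negative value
print(largest_safe, one_past_limit)
```

A length that wraps negative would make a loop condition such as i < length false immediately, so an oversized buffer would be treated as empty rather than decoded.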

5. Silent data degradation not surfaced to callers (Informational)

Location: src/psd_tools/compression/__init__.py, lines 144–157

The tolerant RLE decoder (introduced in 2a006f5) replaces malformed channel data with zero-padded (black) pixels and emits a logger.warning. This is the correct trade-off over crashing, but the warning is only observable if the caller has configured a log handler. The public PSDImage API does not surface channel-level decode failures to the user in any other way.

Impact: A user parsing a silently corrupt file gets a visually wrong image with no programmatic signal to check.

Suggested mitigation: Consider exposing a per-channel decode-error flag or raising a distinct warning category that users can filter or escalate via the warnings module.
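
A sketch of the warning-category option follows. The class ChannelDecodeWarning and the function report_decode_failure are hypothetical names, not existing psd-tools API; only the standard warnings module is used.

```python
import warnings


class ChannelDecodeWarning(UserWarning):
    """Hypothetical category for channels replaced with zero-filled data."""


def report_decode_failure(channel_index: int, reason: str) -> None:
    """Emit a filterable warning instead of (or alongside) a log record."""
    warnings.warn(
        "channel %d could not be decoded (%s); zero-filled" % (channel_index, reason),
        ChannelDecodeWarning,
        stacklevel=2,
    )


def strict_call(func, *args):
    """Run `func`, escalating any ChannelDecodeWarning to an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", ChannelDecodeWarning)
        return func(*args)
```

With a dedicated category, a caller that cannot tolerate degraded output can escalate the warning to an exception, as strict_call does, while other callers keep the tolerant behaviour.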

6. `encode()` zero-length return type inconsistency in Cython (Informational)

Location: src/psd_tools/compression/_rle.pyx, lines 66–67

if length == 0:
    return data   # returns a memoryview, not an explicit std::string

All other return paths return an explicit cdef string result. This path returns data (a const unsigned char[:] memoryview) and relies on Cython's implicit coercion to bytes. It is functionally equivalent today but is semantically inconsistent and fragile if Cython's coercion rules change in a future version.

Impact: Potential silent breakage in future Cython versions; not a current security issue.

Suggested mitigation: Replace return data with return result (the already-declared empty string).

Environment

Branch: fix/invalid-rle-compression
Reviewed commits: 7490ffa, 2a006f5
Python: 3.x (Cython extension compiled for CPython)

References

kyamagu published to psd-tools/psd-tools Feb 25, 2026

Published by the National Vulnerability Database Feb 26, 2026

Published to the GitHub Advisory Database Feb 26, 2026

Reviewed Feb 26, 2026

Last updated Feb 26, 2026

Weaknesses

Integer Overflow or Wraparound

Improper Handling of Highly Compressed Data (Data Amplification)

Reachable Assertion

Incorrect Type Conversion or Cast

Improper Handling of Exceptional Conditions

Memory Allocation with Excessive Size Value
