Summary
A security review of the psd_tools.compression module (conducted against the fix/invalid-rle-compression branch, commits 7490ffa–2a006f5) identified the following pre-existing issues. The two findings introduced and fixed by those commits (Cython buffer overflow, IndexError on lone repeat header) are excluded from this report.
Findings
1. Unguarded zlib.decompress — ZIP bomb / memory exhaustion (Medium)
Location: src/psd_tools/compression/__init__.py, lines 159 and 162
result = zlib.decompress(data) # Compression.ZIP
decompressed = zlib.decompress(data) # Compression.ZIP_WITH_PREDICTION
zlib.decompress is called with no limit on its output size. The one-shot function has no max_length parameter; only decompression objects created with zlib.decompressobj() accept one. A crafted PSD file containing a ZIP-compressed channel whose payload expands to gigabytes would exhaust process memory before any limit is enforced. The RLE path is not vulnerable to this because the decoder pre-allocates exactly row_size × height bytes; the ZIP path has no equivalent ceiling.
Impact: Denial-of-service / OOM crash when processing untrusted PSD files.
Suggested mitigation: Decompress through zlib.decompressobj().decompress(data, max_length), with max_length derived from the expected byte count (length) already computed in decompress(), and reject any stream that produces more than that.
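A minimal sketch of a bounded decompression helper follows. It goes through zlib.decompressobj(), since the one-shot zlib.decompress accepts only wbits and bufsize and has no max_length argument. The helper name bounded_zlib_decompress and its expected_length parameter are hypothetical and are not part of psd_tools.

```python
import zlib


def bounded_zlib_decompress(data: bytes, expected_length: int) -> bytes:
    """Inflate `data`, refusing to produce more than `expected_length` bytes.

    Hypothetical helper: the name and signature are assumptions.
    """
    decompressor = zlib.decompressobj()
    # Request one byte more than expected so an oversized stream is
    # detectable without ever inflating the whole payload.
    result = decompressor.decompress(data, expected_length + 1)
    if len(result) > expected_length:
        raise ValueError(
            "decompressed data exceeds expected %d bytes" % expected_length
        )
    return result
```

With this shape, a stream that would inflate to gigabytes is cut off after expected_length + 1 bytes. A stream that is too short is not rejected here and would still need the existing length check in decompress().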
2. No upper-bound validation on image dimensions before allocation (Low)
Location: src/psd_tools/compression/__init__.py, lines 138 and 193
length = width * height * max(1, depth // 8) # decompress()
row_size = max(width * depth // 8, 1) # decode_rle()
None of width, height, or depth is range-checked before these values drive memory allocation. The PSB format (PSD version 2) permits dimensions up to 300,000 × 300,000 pixels; at that size a single 32-bit channel needs about 360 GB, and a 4-channel 32-bit image about 1.44 TB. While the OS or Python allocator will usually reject such a request, there is no early, explicit guard that produces a clean, user-facing error.
Impact: Uncontrolled allocation attempt from a malformed or adversarially crafted PSB file; hard crash rather than a recoverable error.
Suggested mitigation: Validate width, height, and depth against known PSD/PSB limits before entering decompression, and raise a descriptive ValueError early.
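One possible shape for such a guard is sketched below. The limits are assumptions taken from Adobe's published file format specification (30,000 pixels per side for PSD, 300,000 for PSB, bit depths of 1, 8, 16, or 32). The function name and the version argument are hypothetical. Zero-sized dimensions are allowed here on the assumption that empty layers can legitimately produce them.

```python
# Assumed limits (not taken from psd_tools): max pixels per side by version.
MAX_DIMENSION = {1: 30_000, 2: 300_000}
# Assumed set of valid bits-per-channel values.
VALID_DEPTHS = (1, 8, 16, 32)


def validate_channel_geometry(
    width: int, height: int, depth: int, version: int = 1
) -> int:
    """Check the geometry and return the expected decoded byte count.

    Hypothetical helper; raises ValueError on out-of-range input.
    """
    if version not in MAX_DIMENSION:
        raise ValueError("unsupported PSD version: %r" % (version,))
    limit = MAX_DIMENSION[version]
    if depth not in VALID_DEPTHS:
        raise ValueError("unsupported bit depth: %r" % (depth,))
    for name, value in (("width", width), ("height", height)):
        if not 0 <= value <= limit:
            raise ValueError("%s=%r outside 0..%d" % (name, value, limit))
    # Same formula as the length computation quoted from decompress().
    return width * height * max(1, depth // 8)
```

Even a request that passes these checks can be very large, so a caller handling untrusted files may want an additional, application-specific ceiling on the returned byte count.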
3. assert used as a runtime integrity check (Low)
Location: src/psd_tools/compression/__init__.py, line 170
assert len(result) == length, "len=%d, expected=%d" % (len(result), length)
This assertion can be silently disabled by running the interpreter with -O (or -OO), which strips all assert statements. If the assertion ever becomes relevant (e.g., after future refactoring), disabling it would allow a length mismatch to propagate silently into downstream image compositing.
Impact: Loss of an integrity guard in optimised deployments.
Suggested mitigation: Replace with an explicit if + raise ValueError(...).
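The replacement could look roughly like the sketch below, which keeps the original message format but raises unconditionally, so it survives -O. The wrapper function check_decoded_length is hypothetical; in practice the if block would likely sit inline in decompress().

```python
def check_decoded_length(result: bytes, length: int) -> bytes:
    """Return `result` unchanged, or raise if its size is not `length`.

    Hypothetical wrapper around the check; unlike `assert`, this is not
    stripped when the interpreter runs with -O or -OO.
    """
    if len(result) != length:
        raise ValueError("len=%d, expected=%d" % (len(result), length))
    return result
```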
4. cdef int indices vs. Py_ssize_t size type mismatch in Cython decoder (Low)
Location: src/psd_tools/compression/_rle.pyx, lines 18–20
cdef int i = 0
cdef int j = 0
cdef int length = data.shape[0]
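All loop indices are C signed int (32-bit on common platforms), while the size parameter is Py_ssize_t (64-bit on modern platforms). In the comparison j < size, j is promoted to Py_ssize_t, so the comparison itself is well defined. The defect is that j cannot represent values above INT_MAX (about 2.1 billion, i.e. a row of roughly 2 GiB): incrementing it past INT_MAX is signed integer overflow, which is undefined behaviour in C, and on typical platforms j wraps negative so the loop may never terminate. Likewise, cdef int length = data.shape[0] silently truncates any input longer than INT_MAX. In practice, PSD/PSB dimension limits keep row sizes far below this scale; however, the mismatch is a latent defect if the function is ever called directly with large synthetic inputs.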
Impact: Theoretical infinite loop or UB at >2 GB row sizes; not reachable from standard PSD/PSB parsing.
Suggested mitigation: Change cdef int i, j, length to cdef Py_ssize_t.
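The actual fix is a one-line Cython change, but the narrowing it removes can be modelled in plain Python. The sketch below uses ctypes.c_int32 to show what a 32-bit signed variable would hold for a given length, assuming the two's-complement wrap seen on typical platforms. The helper as_c_int is purely illustrative.

```python
import ctypes

INT_MAX = 2**31 - 1


def as_c_int(value: int) -> int:
    """Return `value` as a 32-bit signed C int would store it.

    Illustrative only: models two's-complement truncation, which is what
    common compilers produce when a Py_ssize_t is assigned to an int.
    """
    return ctypes.c_int32(value).value


# A buffer of exactly INT_MAX bytes still fits...
fits = as_c_int(INT_MAX)
# ...but one byte more is stored as a negative length.
wrapped = as_c_int(INT_MAX + 1)
```

With a negative length, a loop condition such as i < length is false from the start, so the decoder would silently process nothing rather than fail loudly.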
5. Silent data degradation not surfaced to callers (Informational)
Location: src/psd_tools/compression/__init__.py, lines 144–157
The tolerant RLE decoder (introduced in 2a006f5) replaces malformed channel data with zero-padded (black) pixels and emits a logger.warning. This is the correct trade-off over crashing, but the warning is only observable if the caller has configured a log handler. The public PSDImage API does not surface channel-level decode failures to the user in any other way.
Impact: A user parsing a silently corrupt file gets a visually wrong image with no programmatic signal to check.
Suggested mitigation: Consider exposing a per-channel decode-error flag, or issuing the message through the warnings module under a distinct warning category that users can filter or escalate to an exception.
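A rough sketch of the warning-category option is shown below. The class ChannelDecodeWarning and the function report_decode_failure are hypothetical names and do not exist in psd_tools; only warnings.warn and the standard filter functions are real APIs.

```python
import warnings


class ChannelDecodeWarning(UserWarning):
    """Hypothetical category: a channel was replaced with zero padding."""


def report_decode_failure(channel_index: int, reason: str) -> None:
    """Emit a filterable warning instead of (or alongside) a log record."""
    warnings.warn(
        "channel %d could not be decoded (%s); substituting zeros"
        % (channel_index, reason),
        ChannelDecodeWarning,
        stacklevel=2,  # attribute the warning to the caller of this helper
    )
```

A caller that must not accept degraded output could then wrap loading in warnings.catch_warnings() and call warnings.simplefilter("error", ChannelDecodeWarning), which turns the warning into an exception. Other callers keep the current tolerant behaviour.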
6. encode() zero-length return type inconsistency in Cython (Informational)
Location: src/psd_tools/compression/_rle.pyx, lines 66–67
if length == 0:
return data # returns a memoryview, not an explicit std::string
All other return paths return an explicit cdef string result. This path returns data (a const unsigned char[:] memoryview) and relies on Cython's implicit coercion to bytes. It is functionally equivalent today but is semantically inconsistent and fragile if Cython's coercion rules change in a future version.
Impact: Potential silent breakage in future Cython versions; not a current security issue.
Suggested mitigation: Replace return data with return result (the already-declared empty string).
Environment
- Branch: fix/invalid-rle-compression
- Reviewed commits: 7490ffa, 2a006f5
- Python: 3.x (Cython extension compiled for CPython)