Bug: Failure to Escape Square Brackets [/] in Link Text Causes Markdown Parsing Errors #1302

@stallboy

🔍 Description

When MarkItDown processes hyperlinks with square brackets [ or ] in their link text (e.g., [Learn [GPT]]), it fails to escape these characters in the output Markdown. This violates the CommonMark specification (Section 6.1), leading to:
Broken link rendering (Unmatched ']' errors)

Truncated link text (e.g., [Learn [GPT]] → parsed as two separate links)

Corruption of downstream LLM/document processing pipelines

🧪 Steps to Reproduce
Input: Convert a document containing a link with the text Example [Text] (e.g., HTML: <a href="https://url">Example [Text]</a>, or a Word hyperlink).

Conversion: Run MarkItDown to generate Markdown.

Output: [Example [Text]](https://url) # UNSAFE: Unescaped brackets

Observed Result:

GitHub/VSCode preview: Link text truncates to Example [Text (ignores ])

Markdown parsers (e.g., markdown-it): Throw syntax errors

✅ Expected Behavior

Per CommonMark rules, square brackets in link text must be escaped:
[Example \[Text\]](https://url) # CORRECT: Escaped brackets

Renders as: Example [Text] with functional link.

🌐 Impact
Critical: Breaks all workflows where link texts include [ ] (common in tech/docs).

Affected Components:

Markdown hyperlinks ([text](url))

Reference-style links ([text][id])

Image alt-text (![alt](img.png))

🛠 Suggested Fix

Implement escaping during link serialization:
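// Pseudo-code (link renderer logic)
function escapeLinkText(text: string): string {
  // Match either bracket and prefix it with a backslash ("$&" is the matched character)
  return text.replace(/[\[\]]/g, "\\$&"); // Escapes [ → \[ , ] → \]
}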

Standards Compliance:
CommonMark: https://spec.commonmark.org/0.30/#backslash-escapes

GFM: Identical escaping rules
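
Since MarkItDown itself is written in Python, the same idea can be sketched there. This is only a minimal illustration, not MarkItDown's actual code: `escape_link_text` and `render_link` are hypothetical names, and the sketch assumes the input is plain link text taken from the source document, not Markdown that has already been serialized.

```python
import re

# Matches a single '[' or ']' anywhere in the text.
_BRACKET = re.compile(r"[\[\]]")


def escape_link_text(text: str) -> str:
    """Backslash-escape every square bracket in plain link text.

    Hypothetical helper: assumes `text` has not been escaped before,
    so calling it twice on the same string would double-escape.
    """
    # "\\\\" in the template is one literal backslash; \g<0> is the match.
    return _BRACKET.sub(r"\\\g<0>", text)


def render_link(text: str, url: str) -> str:
    """Serialize an inline Markdown link with escaped link text."""
    return f"[{escape_link_text(text)}]({url})"


if __name__ == "__main__":
    print(render_link("Example [Text]", "https://url"))
    # prints: [Example \[Text\]](https://url)
```

Escaping every bracket, balanced or not, keeps the helper simple and mirrors the pseudo-code above; a real fix would also need to apply the same escaping to image alt text and reference-style links.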

🚧 Workarounds

Users must currently add a backslash before each bracket by hand after conversion, which does not fit automated pipelines.
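
Until a fix lands, the manual step could in principle be scripted as a post-processing pass over the generated Markdown. The sketch below is an assumption-laden illustration, not part of MarkItDown: `escape_brackets_in_links` is a hypothetical name, it only handles inline links and images of the form `[text](dest)` whose nested brackets are balanced, and it does not skip code spans or fenced code blocks.

```python
import re

# Either an existing backslash escape (kept as-is) or a bare bracket.
_ESCAPE_OR_BRACKET = re.compile(r"\\.|[\[\]]")


def _escape_inner(text: str) -> str:
    """Escape bare brackets in link text, leaving existing escapes alone."""
    def repl(match: re.Match) -> str:
        token = match.group(0)
        return token if token.startswith("\\") else "\\" + token
    return _ESCAPE_OR_BRACKET.sub(repl, text)


def _find_matching_bracket(s: str, start: int) -> int:
    """Return the index of the ']' that balances s[start] == '[', or -1."""
    depth = 0
    i = start
    while i < len(s):
        if s[i] == "\\":
            i += 2  # skip the escaped character
            continue
        if s[i] == "[":
            depth += 1
        elif s[i] == "]":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1


def escape_brackets_in_links(markdown: str) -> str:
    """Escape nested brackets inside the text of inline links and images."""
    out = []
    i, n = 0, len(markdown)
    while i < n:
        ch = markdown[i]
        if ch == "\\" and i + 1 < n:
            out.append(markdown[i:i + 2])  # keep existing escapes untouched
            i += 2
            continue
        if ch == "[":
            close = _find_matching_bracket(markdown, i)
            # Only treat it as a link if "(" directly follows the closing "]".
            if close != -1 and markdown.startswith("(", close + 1):
                out.append("[" + _escape_inner(markdown[i + 1:close]) + "]")
                i = close + 1
                continue
        out.append(ch)
        i += 1
    return "".join(out)
```

For example, `escape_brackets_in_links("See [Example [Text]](https://url) now")` returns `See [Example \[Text\]](https://url) now`, and running it again on that result leaves it unchanged. Link text with unbalanced brackets cannot be recovered this way, because the scanner can no longer tell where the link text ends.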


Environment: MarkItDown v0.9+, all input formats (PDF/Word/HTML).
Priority: High (blocking tech/docs use cases).
Tags: bug, markdown, links, escaping
