The Document Processing tool provides intelligent document conversion capabilities for PDF, DOCX, XLSX, PPTX, HTML, CSV, PNG, and JPG files using the powerful Docling library.
Note: This tool is disabled by default. To enable it, set the ENABLE_ADDITIONAL_TOOLS environment variable to include process_document.
Convert documents to structured Markdown while preserving formatting, extracting tables, images, and metadata. The tool offers processing profiles for different use cases, from simple text extraction to advanced diagram analysis with AI models.
Note: mcp-devtools also provides a PDF extraction tool that is less capable but faster and does not require Docling; see PDF Processing for more details.
This tool is experimental and actively developed.
- Multi-format Support: PDF, DOCX, XLSX, PPTX, HTML, CSV, PNG, JPG
- Processing Profiles: Simplified interface with preset configurations
- Intelligent Conversion: Preserves document structure and formatting
- OCR Support: Extract text from scanned documents
- Hardware Acceleration: Supports MPS (macOS), CUDA, and CPU processing
- Caching System: Avoids reprocessing identical documents
- Metadata Extraction: Document metadata (title, author, page count, etc.)
- Table & Image Extraction: Preserves tables and images in markdown
- Diagram Analysis: Advanced diagram detection using vision models
- Mermaid Generation: Convert diagrams to editable Mermaid syntax
- Auto-Save: Automatically saves processed content to files
First, enable the tool by setting the environment variable:
ENABLE_ADDITIONAL_TOOLS="process_document"
Then ensure docling is installed in the environment you'll be running the MCP Server from:
pip install -U pip docling
You can simply prompt the agent using the tool, e.g.: "Use your document processing tool to convert and save /path/to/document.pdf to markdown".
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf"
}
}
This uses the default text-and-image profile and saves to /path/to/document.md.
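As a rough illustration of the call above, the sketch below builds the same request in Python and derives the default output path by swapping the source extension for .md. The helper names `build_request` and `default_save_path` are hypothetical, and the path rule is inferred from the example above rather than taken from the tool's source.

```python
import json
from pathlib import PurePosixPath
from typing import Optional


def build_request(source: str, profile: Optional[str] = None) -> dict:
    """Build a process_document tool call (hypothetical helper)."""
    arguments = {"source": source}
    if profile is not None:
        # Omitting "profile" leaves the tool on its default (text-and-image).
        arguments["profile"] = profile
    return {"name": "process_document", "arguments": arguments}


def default_save_path(source: str) -> str:
    """Assumed rule: same directory and file name, with a .md extension."""
    return str(PurePosixPath(source).with_suffix(".md"))


if __name__ == "__main__":
    print(json.dumps(build_request("/path/to/document.pdf"), indent=2))
    print(default_save_path("/path/to/document.pdf"))
```

Passing a profile name as the second argument, for example `build_request("/path/to/document.pdf", "basic")`, adds the `profile` field to the arguments.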
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf",
"profile": "basic"
}
}
- Text extraction only
- Fastest processing
- No image or diagram analysis
- Best for: Simple text documents, quick content extraction
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf",
"profile": "text-and-image"
}
}
- Text and image extraction
- Table processing
- Good balance of speed and features
- Best for: Most document types, general use
{
"name": "process_document",
"arguments": {
"source": "/path/to/scanned-document.pdf",
"profile": "scanned"
}
}
- Optimised for scanned documents
- OCR enabled by default
- Best for: Image-based PDFs, scanned documents
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf",
"profile": "llm-smoldocling"
}
}
- Enhanced with SmolDocling vision model
- Diagram detection and description
- Chart data extraction
- No external LLM required
- Best for: Documents with diagrams and charts
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf",
"profile": "llm-external"
}
}
- Full diagram-to-Mermaid conversion
- Requires LLM environment variables
- Most advanced processing capabilities
- Best for: Complex documents with many diagrams
- Requires: LLM configuration (see setup below)
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf"
}
}
- Saves to /path/to/document.md
- Images saved in same directory
- Returns success message with file path
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf",
"save_to": "/custom/path/output.md"
}
}
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf",
"return_inline_only": true
}
}
- Python 3.10+ (ideally 3.13+)
- Docling (auto-installed if missing)
The tool will attempt to install Docling automatically if not found.
DOCLING_PYTHON_PATH="/path/to/python" # Auto-detected if not set
The tool automatically detects Python installations with Docling in the following order:
- DOCLING_PYTHON_PATH environment variable (highest priority)
- .python-version file in current directory or home directory
- Cached Python path from previous detection
- Common Python installation paths
.python-version Support:
The tool respects .python-version files (used by pyenv, asdf, and other version managers) for automatic Python version selection:
- Checks current working directory first
- Falls back to home directory if not found in working directory
- Supports version formats like 3.11.5 or 3.11
- Automatically resolves Python paths from:
  - pyenv: ~/.pyenv/versions/
  - asdf: ~/.asdf/installs/python/
  - UV: ~/.local/share/uv/python/
  - System: Homebrew and standard paths
Example .python-version file:
3.11.5
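The .python-version lookup described above can be sketched as follows. This is a minimal illustration under assumptions, not the tool's actual implementation: the function names are hypothetical, and taking the first non-empty line of the file as the version is an assumed convention.

```python
from pathlib import Path
from typing import Optional


def read_python_version(cwd: Path, home: Path) -> Optional[str]:
    """Return the version pinned in .python-version, or None if absent.

    Lookup order follows the text above: working directory first,
    then the home directory.
    """
    for directory in (cwd, home):
        candidate = directory / ".python-version"
        if candidate.is_file():
            lines = candidate.read_text(encoding="utf-8").splitlines()
            # Use the first non-empty line, e.g. "3.11.5" or "3.11".
            for line in lines:
                if line.strip():
                    return line.strip()
    return None


def candidate_roots(home: Path) -> list[Path]:
    """Version-manager directories listed above, in the same order."""
    return [
        home / ".pyenv" / "versions",
        home / ".asdf" / "installs" / "python",
        home / ".local" / "share" / "uv" / "python",
    ]
```

Calling `read_python_version(Path.cwd(), Path.home())` mirrors the documented order; how the tool then maps a version such as 3.11 to a concrete interpreter under those directories is not covered by this sketch.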
DOCLING_CACHE_DIR="~/.mcp-devtools/docling-cache"
DOCLING_CACHE_ENABLED="true"
DOCLING_HARDWARE_ACCELERATION="auto" # auto, mps, cuda, cpu
DOCLING_TIMEOUT="300" # Processing timeout in seconds (default: 300 = 5 minutes)
DOCLING_MAX_FILE_SIZE="100" # Maximum file size in MB (default: 100 MB)
DOCLING_MAX_MEMORY_LIMIT="5368709120" # Memory limit in bytes (default: 5GB)
MCP_DEVTOOLS_MEMORY_LIMIT="5368709120" # Go application memory limit in bytes (default: 5GB)
The tool implements memory limits to prevent runaway memory usage during document processing:
- Go Application Limit: Set via MCP_DEVTOOLS_MEMORY_LIMIT (default: 5GB)
  - Soft limit enforced by Go runtime's garbage collector
  - Automatically triggers more aggressive GC when approaching limit
- Python Process Limit: Set via DOCLING_MAX_MEMORY_LIMIT (default: 5GB)
  - Hard limit enforced by OS resource limits
  - Process terminated if limit exceeded
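Both limits are given in bytes, and the documented default of 5368709120 corresponds to 5 × 1024³. The short sketch below converts a size in binary gigabytes to the byte value these variables expect; the helper names are hypothetical and exist only for illustration.

```python
GIB = 1024 ** 3  # Bytes per binary gigabyte.


def gib_to_bytes(gib: int) -> int:
    """Convert a whole number of GiB to bytes."""
    return gib * GIB


def format_limit(name: str, gib: int) -> str:
    """Render an environment assignment such as NAME="5368709120"."""
    return f'{name}="{gib_to_bytes(gib)}"'


if __name__ == "__main__":
    print(format_limit("MCP_DEVTOOLS_MEMORY_LIMIT", 5))
    print(format_limit("DOCLING_MAX_MEMORY_LIMIT", 5))
```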
Example configuration for stricter limits:
# Limit to 2GB for both Go and Python
MCP_DEVTOOLS_MEMORY_LIMIT="2147483648"
DOCLING_MAX_MEMORY_LIMIT="2147483648"
DOCLING_OCR_LANGUAGES="en,fr,de"
DOCLING_VLM_API_URL="http://localhost:11434/v1" # OpenAI-compatible endpoint
DOCLING_VLM_MODEL="granite_docling" # Vision-capable model (default: granite_docling)
DOCLING_VLM_API_KEY="your-api-key-here" # API key
For environments with MITM proxies:
DOCLING_EXTRA_CA_CERTS="/path/to/mitm-ca-bundle.pem"
OCR Disabled (Default):
- Best for: Digital documents (native PDFs, Word documents)
- Advantages: Faster, perfect accuracy, preserves formatting
- How it works: Extracts text directly from document structure
OCR Enabled (scanned profile):
- Best for: Scanned documents, image-based PDFs, photos
- Advantages: Processes any document type, handles handwritten text
- How it works: Uses computer vision to recognise text from images
{
"name": "process_document",
"arguments": {
"profile": "scanned",
"ocr_languages": ["en", "fr", "de", "es"]
}
}
Supported languages: English (en), French (fr), German (de), Spanish (es), Italian (it), Portuguese (pt), Dutch (nl), Russian (ru), Chinese (zh), Japanese (ja), Korean (ko), and many others.
The llm-smoldocling profile uses built-in vision models:
- Automatic diagram detection
- Type classification with confidence scores
- Element extraction
- No external services required
The llm-external profile converts diagrams to Mermaid syntax:
- Ollama (local): http://localhost:11434/v1
- LM Studio (local): http://localhost:1234/v1
- OpenAI: https://api.openai.com/v1
- OpenRouter: https://openrouter.ai/api/v1
DOCLING_VLM_API_URL="http://localhost:11434/v1"
DOCLING_VLM_MODEL="granite_docling" # Default VLM model (qwen2.5vl:7b-q8_0, or any other vision-capable model)
DOCLING_VLM_API_KEY="your-api-key"
DOCLING_LLM_MAX_TOKENS="16384"
DOCLING_LLM_TEMPERATURE="0.1"
DOCLING_LLM_TIMEOUT="240"
- Automatic Detection: Identifies flowcharts, architecture diagrams, charts
- Mermaid Conversion: Generates valid Mermaid syntax
- AWS Colour Coding: Consistent colour schemes for architecture diagrams
- Validation: Validates generated Mermaid syntax
- Fallback Handling: Graceful degradation if LLM unavailable
{
"success": true,
"message": "Content successfully exported to file",
"save_path": "/path/to/document.md",
"source": "/path/to/document.pdf",
"cache_hit": false,
"metadata": {
"file_size": 15420,
"document_title": "Document Title",
"document_author": "Author Name",
"page_count": 10,
"word_count": 1500
},
"processing_info": {
"processing_mode": "advanced",
"processing_method": "advanced+vision:standard",
"hardware_acceleration": "mps",
"ocr_enabled": false,
"processing_time": 2.5,
"timestamp": "2025-07-09T22:12:15+10:00"
}
}
{
"source": "/path/to/document.pdf",
"content": "# Document Title\n\nDocument content in markdown...",
"cache_hit": false,
"metadata": {
"title": "Document Title",
"author": "Author Name",
"page_count": 10
},
"images": [
{
"id": "image_1",
"type": "picture",
"caption": "Figure 1",
"file_path": "/path/to/extracted/image_1.png"
}
],
"diagrams": [
{
"id": "diagram_1",
"type": "flowchart",
"description": "Process flow diagram showing...",
"mermaid_code": "flowchart TD\n A[Start] --> B[Process]\n B --> C[End]",
"confidence": 0.95
}
]
}
- basic: 1-3 seconds
- text-and-image: 3-10 seconds
- scanned: 10-30 seconds
- llm-smoldocling: 5-15 seconds
- llm-external: 15-60 seconds
- CPU: Baseline performance
- MPS (macOS): 2-5x faster on Apple Silicon
- CUDA: 3-10x faster on NVIDIA GPUs
Intelligent caching based on:
- Document source and modification time
- Processing parameters and profile
- 24-hour TTL by default
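One way to picture a cache keyed on these inputs is the sketch below. It is a hypothetical scheme, not the tool's actual key derivation: the hashing approach, the function names, and the shape of `params` are all assumptions, and only the 24-hour TTL comes from the list above.

```python
import hashlib
import json
import time
from typing import Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL, as stated above.


def cache_key(source: str, mtime: float, profile: str, params: dict) -> str:
    """Hash the inputs that should invalidate a cached result when they change."""
    payload = json.dumps(
        {"source": source, "mtime": mtime, "profile": profile, "params": params},
        sort_keys=True,  # Stable ordering so equal inputs give equal keys.
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_fresh(created_at: float, now: Optional[float] = None) -> bool:
    """Return True while a cache entry is younger than the TTL."""
    now = time.time() if now is None else now
    return (now - created_at) < CACHE_TTL_SECONDS
```

Because the modification time is part of the key, editing the source file produces a different key, so a stale result is never served for changed content even before the TTL expires.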
{
"name": "process_document",
"arguments": {
"source": "/path/to/research-paper.pdf",
"profile": "llm-smoldocling"
}
}
{
"name": "process_document",
"arguments": {
"source": "/path/to/scanned-invoice.pdf",
"profile": "scanned"
}
}
{
"name": "process_document",
"arguments": {
"source": "/path/to/architecture-doc.pdf",
"profile": "llm-external"
}
}
{
"name": "process_document",
"arguments": {
"source": "/path/to/simple-doc.pdf",
"profile": "basic"
}
}
"Python path is required but not found"
- Install Python 3.10+ and ensure it's in PATH
- Set DOCLING_PYTHON_PATH environment variable
- Or create a .python-version file in your project directory or home directory
- Supported version managers: pyenv, asdf, UV
"Docling not available"
- Install: pip install docling
- Verify: python -c "import docling; print('OK')"
"Processing timeout"
- Increase DOCLING_TIMEOUT environment variable
- Use faster profile (basic instead of llm-external)
"Hardware acceleration not working"
- Install appropriate PyTorch version
- Check: python -c "import torch; print(torch.backends.mps.is_available())"
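The one-line check above only covers MPS. A slightly broader sketch, which reports whichever backend PyTorch can see, is shown below. The function name is hypothetical, and the preference order (CUDA, then MPS, then CPU) is an assumption rather than a description of what DOCLING_HARDWARE_ACCELERATION="auto" does internally.

```python
def detect_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # No PyTorch installed, so no GPU backend to report.
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # Absent on older PyTorch builds.
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(detect_device())
```

If this prints cpu on a machine with a supported GPU, the installed PyTorch build most likely lacks the matching backend.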
"LLM external profile not available"
- Set all DOCLING_LLM_* environment variables
- Verify LLM endpoint accessibility
- Ensure model supports vision input
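Before retrying, it can help to confirm which settings are actually present in the environment. The sketch below checks the endpoint, model, and key variables that the configuration examples in this document list under DOCLING_VLM_* names. Treating all three as required is an assumption (the model has a documented default), and `missing_vars` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Assumed required set, taken from the configuration examples in this document.
REQUIRED_VARS = ("DOCLING_VLM_API_URL", "DOCLING_VLM_MODEL", "DOCLING_VLM_API_KEY")


def missing_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars()
    print("All set" if not missing else "Missing: " + ", ".join(missing))
```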
{
"name": "process_document",
"arguments": {
"source": "/path/to/document.pdf",
"debug": true
}
}
For technical implementation details, see the Document Processing source documentation.