Document Storage Key Patterns
When a document is uploaded to Papra, the system generates a storage key that determines where and how the file is stored in the underlying storage backend (filesystem, S3, Azure Blob Storage, etc.). By default, Papra uses a legacy system that generates opaque storage keys based on the document ID. With storage key patterns, you can define a human-readable, customizable naming scheme for your stored files.
How It Works
A storage key pattern is a string template containing expressions enclosed in double curly braces ({{ }}). When a document is uploaded, each expression is evaluated using the document’s metadata and replaced with the resulting value.
For example, the default pattern:
{{organization.id}}/{{document.name}}
For a document named invoice-2025.pdf in organization org_123456789012345678901234, this produces:
org_123456789012345678901234/invoice-2025.pdf
Storage Key Persistence
The generated storage key is stored in the database alongside each document record. This means that changing the pattern configuration does not affect existing documents; only newly uploaded documents use the updated pattern. Existing documents retain their original storage keys and continue to be served from their original location.
Enabling Storage Key Patterns
At the moment, Papra uses the legacy storage key system for backward compatibility. To enable the new pattern-based system, set the following environment variable:
DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM=false
Then configure your desired pattern:
DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{document.name}}
Docker Compose Example
services:
  papra:
    container_name: papra
    image: ghcr.io/papra-hq/papra:latest
    restart: unless-stopped
    environment:
      # ... other environment variables ...
      - DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM=false
      - DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{document.name}}
    volumes:
      - ./app-data:/app/app-data
    ports:
      - "1221:1221"
Pattern Syntax
A pattern is a string that can contain:
- Literal text: Any text outside of {{ }} is kept as-is (e.g., path separators /, prefixes, etc.)
- Expressions: Placeholders enclosed in {{ and }} that are evaluated at upload time
- Transformers: Optional functions applied to expression values using the pipe | operator
The general syntax for an expression with transformers is:
{{expression | transformer1 | transformer2 arg1 arg2}}
Multiple transformers can be chained, and each transformer receives the output of the previous one.
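The syntax above can be sketched as a small parser. This is only an illustration, not Papra's actual implementation: `parseExpression`, `TransformerCall`, and `ParsedExpression` are hypothetical names, and arguments are split on whitespace only (quoted arguments are not handled here).

```typescript
// Hypothetical sketch: splits the inside of a {{ ... }} placeholder into an
// expression name and a chain of transformer calls. Not Papra's real parser.
type TransformerCall = { transformer: string; args: string[] };
type ParsedExpression = { expression: string; transformers: TransformerCall[] };

function parseExpression(inner: string): ParsedExpression {
  // Segments are separated by the pipe operator; the first one is the expression.
  const segments = inner.split("|").map((segment) => segment.trim());
  const expression = segments[0];
  const transformers = segments.slice(1).map((segment) => {
    // Assumption: arguments are separated by whitespace; quoting is ignored.
    const tokens = segment.split(/\s+/);
    return { transformer: tokens[0], args: tokens.slice(1) };
  });
  return { expression, transformers };
}

const parsedExample = parseExpression("document.name | lowercase | padEnd 20 _");
console.log(JSON.stringify(parsedExample, null, 2));
```

Keeping the parsed chain as an ordered list mirrors the rule that each transformer receives the output of the previous one.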
Available Expressions
| Expression | Description | Example Output |
|---|---|---|
| document.id | The unique document identifier | doc_123456789012345678901234 |
| document.name | The original file name (sanitized for safe filesystem use) | invoice-2025.pdf |
| organization.id | The organization identifier | org_123456789012345678901234 |
| currentDate | The current date and time in ISO 8601 format | 2025-06-15T14:30:00.000Z |
| currentDate.yyyy | Current year (4 digits) | 2025 |
| currentDate.MM | Current month (2 digits, zero-padded) | 06 |
| currentDate.dd | Current day of month (2 digits, zero-padded) | 15 |
| currentDate.HH | Current hour (2 digits, 24-hour, zero-padded) | 14 |
| currentDate.mm | Current minute (2 digits, zero-padded) | 30 |
| currentDate.ss | Current second (2 digits, zero-padded) | 00 |
| currentDate.SSS | Current millisecond (3 digits, zero-padded) | 000 |
| random | A random 8-character alphanumeric string | k9x2m4pq |
Example:
{{organization.id}}/{{currentDate.yyyy}}/{{currentDate.MM}}/{{document.name}}
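As a rough illustration of how the date expressions relate to a single timestamp, the sketch below derives each zero-padded part from one `Date`. The helper names (`zeroPad`, `dateParts`) are hypothetical, and the use of UTC is an assumption: the source does not say which time zone Papra applies.

```typescript
// Hypothetical sketch: derives the zero-padded date parts from one timestamp.
// Assumption: UTC is used; the source does not state which time zone applies.
function zeroPad(value: number, width: number): string {
  let text = String(value);
  while (text.length < width) {
    text = "0" + text;
  }
  return text;
}

function dateParts(date: Date): Record<string, string> {
  return {
    "currentDate": date.toISOString(),
    "currentDate.yyyy": zeroPad(date.getUTCFullYear(), 4),
    "currentDate.MM": zeroPad(date.getUTCMonth() + 1, 2), // months are 0-based in JS
    "currentDate.dd": zeroPad(date.getUTCDate(), 2),
    "currentDate.HH": zeroPad(date.getUTCHours(), 2),
    "currentDate.mm": zeroPad(date.getUTCMinutes(), 2),
    "currentDate.ss": zeroPad(date.getUTCSeconds(), 2),
    "currentDate.SSS": zeroPad(date.getUTCMilliseconds(), 3),
  };
}

const sampleParts = dateParts(new Date("2025-06-15T14:30:00.000Z"));
const sampleKey = [
  "org_123456789012345678901234",
  sampleParts["currentDate.yyyy"],
  sampleParts["currentDate.MM"],
  "invoice-2025.pdf",
].join("/");
console.log(sampleKey); // org_123456789012345678901234/2025/06/invoice-2025.pdf
```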
Available Transformers
Transformers modify the value of an expression. They are applied using the pipe (|) operator after the expression name.
| Transformer | Arguments | Description | Example |
|---|---|---|---|
| uppercase | None | Converts the value to uppercase | {{document.name \| uppercase}} |
| lowercase | None | Converts the value to lowercase | {{document.name \| lowercase}} |
| formatDate | Optional format string (default: {yyyy}-{MM}-{dd}) | Formats a date value | {{currentDate \| formatDate {yyyy}/{MM}}} |
| padStart | Target length, optional pad character (default: space) | Pads the start of the value | {{currentDate.MM \| padStart 2 0}} |
| padEnd | Target length, optional pad character (default: space) | Pads the end of the value | {{document.id \| padEnd 20 _}} |
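One way to picture the transformers in the table is as a registry of small string functions applied in order. The sketch below is an assumption-laden illustration rather than Papra's code: `transformerRegistry`, `padValue`, and `applyChain` are hypothetical names, the pad character is assumed to be a single character, and `formatDate` is left out because its exact behavior is not specified here.

```typescript
// Hypothetical sketch of a transformer registry; not Papra's actual code.
type Transformer = (value: string, args: string[]) => string;

function padValue(value: string, args: string[], atStart: boolean): string {
  const targetLength = parseInt(args[0], 10);
  // The pad character defaults to a space, as listed in the table.
  // Assumption: only the first character of the argument is used.
  const padChar = args.length > 1 && args[1].length > 0 ? args[1].charAt(0) : " ";
  let result = value;
  while (result.length < targetLength) {
    result = atStart ? padChar + result : result + padChar;
  }
  return result;
}

const transformerRegistry: Record<string, Transformer> = {
  uppercase: (value) => value.toUpperCase(),
  lowercase: (value) => value.toLowerCase(),
  padStart: (value, args) => padValue(value, args, true),
  padEnd: (value, args) => padValue(value, args, false),
};

function applyChain(value: string, chain: { transformer: string; args: string[] }[]): string {
  let current = value;
  for (let i = 0; i < chain.length; i++) {
    const step = chain[i];
    if (!Object.prototype.hasOwnProperty.call(transformerRegistry, step.transformer)) {
      throw new Error("Unknown transformer: " + step.transformer);
    }
    // Each transformer receives the output of the previous one.
    current = transformerRegistry[step.transformer](current, step.args);
  }
  return current;
}

console.log(applyChain("6", [{ transformer: "padStart", args: ["2", "0"] }])); // 06
```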
Transformer Chaining
Transformers can be chained to apply multiple transformations in sequence:
{{document.name | lowercase | padEnd 20 _}}
Transformer Arguments
Arguments are space-separated after the transformer name. If an argument contains spaces, wrap it in double quotes:
{{currentDate | formatDate {yyyy}-{MM}-{dd}}}
Pattern Examples
Here are some common patterns for different use cases:
Organize by Organization and Document Name (Default)
DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{document.name}}
Organize by Date
DOCUMENT_STORAGE_KEY_PATTERN={{currentDate.yyyy}}/{{currentDate.MM}}/{{document.name}}
Organize by Date with Unique ID
DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{currentDate.yyyy}}/{{currentDate.MM}}/{{document.id}}-{{document.name}}
Conflict Resolution
When using patterns that don’t guarantee uniqueness (e.g., patterns based on document.name alone), two documents could generate the same storage key. Papra handles this automatically to prevent data loss using a two-stage conflict resolution mechanism.
Stage 1: Incremental Suffix
When a storage key already exists, Papra appends an incrementing numeric suffix before the file extension:
org_abc123/invoice.pdf    # original (already exists)
org_abc123/invoice_1.pdf  # first conflict
org_abc123/invoice_2.pdf  # second conflict
org_abc123/invoice_3.pdf  # third conflict
...
org_abc123/invoice_9.pdf  # ninth conflict (default max)
The suffix is inserted before the file extension and after the file name, separated by an underscore. For files without an extension, the suffix is appended at the end (e.g., README becomes README_1).
By default, up to 9 incremental suffix attempts are made (configurable, see below). This means a total of 10 possible slots for a given storage key: the original plus 9 suffixed variants.
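The suffix placement described above can be sketched as follows. This is a hypothetical illustration: `withSuffix` and `incrementalCandidates` are invented names, and two details are assumptions not confirmed by the source: only the last dot is treated as the extension separator, and a leading dot (as in `.env`) is not treated as an extension.

```typescript
// Hypothetical sketch: inserts "_<suffix>" before the file extension of a key.
function withSuffix(storageKey: string, suffix: string): string {
  const lastSlash = storageKey.lastIndexOf("/");
  const directory = storageKey.slice(0, lastSlash + 1);
  const fileName = storageKey.slice(lastSlash + 1);
  const lastDot = fileName.lastIndexOf(".");
  // Assumption: no dot, or a dot at position 0, means "no extension".
  if (lastDot <= 0) {
    return directory + fileName + "_" + suffix;
  }
  return directory + fileName.slice(0, lastDot) + "_" + suffix + fileName.slice(lastDot);
}

// Lists the original key followed by its incrementally suffixed variants.
function incrementalCandidates(storageKey: string, maxAttempts: number): string[] {
  const candidates = [storageKey];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    candidates.push(withSuffix(storageKey, String(attempt)));
  }
  return candidates;
}

const slots = incrementalCandidates("org_abc123/invoice.pdf", 9);
console.log(slots.length); // 10 slots: the original plus 9 suffixed variants
```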
Stage 2: Random Suffix Fallback
If all incremental suffix attempts are exhausted (i.e., all 10 slots are taken), Papra falls back to appending a random 8-character alphanumeric string as a suffix:
org_abc123/invoice_k9x2m4pq.pdf  # random suffix fallback
This provides a very high probability of finding an available key: with an 8-character random suffix, more than 17 million files with the same name and path would be needed for a 50% chance of collision. If even this random suffix collides (extremely unlikely), the upload fails with an error rather than overwriting existing data.
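The 17 million figure is consistent with the standard birthday approximation, n ≈ sqrt(2 · ln 2 · N), where N is the number of possible suffixes. The sketch below reproduces it under the assumption that each of the 8 characters is drawn uniformly from 62 symbols (a-z, A-Z, 0-9); the alphabet Papra actually uses is not stated here, and `fiftyPercentCollisionCount` is a hypothetical helper name.

```typescript
// Birthday-bound estimate: how many random suffixes must be drawn before the
// chance of at least one collision among them reaches about 50%.
// Assumption: suffixes are uniform over `alphabetSize` symbols per character.
function fiftyPercentCollisionCount(alphabetSize: number, suffixLength: number): number {
  const keySpace = Math.pow(alphabetSize, suffixLength);
  return Math.sqrt(2 * Math.log(2) * keySpace);
}

const estimate62 = fiftyPercentCollisionCount(62, 8);
console.log(Math.round(estimate62 / 1e6) + " million");
```

Under a 36-symbol alphabet (lowercase letters and digits only), the same estimate drops to roughly 2 million, so the threshold depends strongly on the alphabet size.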
Conflict Resolution Configuration
| Variable | Description | Default |
|---|---|---|
| DOCUMENT_STORAGE_PATTERN_MAX_INCREMENTAL_SUFFIX_ATTEMPTS | How many incremental suffixes to try (e.g., _1, _2, …). Set to 0 to skip incremental suffixes entirely. | 9 |
| DOCUMENT_STORAGE_PATTERN_ENABLE_RANDOM_SUFFIX_FALLBACK | Whether to try a random suffix if all incremental attempts are exhausted | true |
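To show how these two settings interact, here is a minimal sketch of the two-stage resolution. It is not Papra's implementation: `resolveStorageKey`, `addSuffix`, and `ResolutionConfig` are hypothetical names, the existence check and random generator are injected as plain functions, and the check is synchronous for brevity.

```typescript
// Hypothetical sketch of the two-stage conflict resolution; illustrative only.
type ResolutionConfig = {
  maxIncrementalSuffixAttempts: number; // DOCUMENT_STORAGE_PATTERN_MAX_INCREMENTAL_SUFFIX_ATTEMPTS
  enableRandomSuffixFallback: boolean; // DOCUMENT_STORAGE_PATTERN_ENABLE_RANDOM_SUFFIX_FALLBACK
};

function addSuffix(key: string, suffix: string): string {
  const slash = key.lastIndexOf("/");
  const dot = key.lastIndexOf(".");
  // Only a dot inside the file name (and not at its start) marks an extension.
  if (dot <= slash + 1) return key + "_" + suffix;
  return key.slice(0, dot) + "_" + suffix + key.slice(dot);
}

function resolveStorageKey(
  key: string,
  keyExists: (candidate: string) => boolean,
  randomSuffix: () => string,
  config: ResolutionConfig
): string {
  if (!keyExists(key)) return key;
  // Stage 1: incremental suffixes _1, _2, ... up to the configured maximum.
  for (let attempt = 1; attempt <= config.maxIncrementalSuffixAttempts; attempt++) {
    const candidate = addSuffix(key, String(attempt));
    if (!keyExists(candidate)) return candidate;
  }
  // Stage 2: a single random-suffix attempt, if enabled.
  if (config.enableRandomSuffixFallback) {
    const candidate = addSuffix(key, randomSuffix());
    if (!keyExists(candidate)) return candidate;
  }
  // Fail rather than overwrite existing data.
  throw new Error("Storage key conflict could not be resolved: " + key);
}

const takenKeys: Record<string, boolean> = {
  "org_abc123/invoice.pdf": true,
  "org_abc123/invoice_1.pdf": true,
};
const resolvedKey = resolveStorageKey(
  "org_abc123/invoice.pdf",
  (candidate) => takenKeys[candidate] === true,
  () => "k9x2m4pq",
  { maxIncrementalSuffixAttempts: 9, enableRandomSuffixFallback: true }
);
console.log(resolvedKey); // org_abc123/invoice_2.pdf
```

With `maxIncrementalSuffixAttempts` set to 0 and the fallback disabled, the function throws on the first conflict.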
Disabling Conflict Resolution
If you want to disable conflict resolution and reject uploads that would cause a storage key collision (not sure why you would want this, but it’s possible), set:
DOCUMENT_STORAGE_PATTERN_MAX_INCREMENTAL_SUFFIX_ATTEMPTS=0
DOCUMENT_STORAGE_PATTERN_ENABLE_RANDOM_SUFFIX_FALLBACK=false
Configuration Reference
| Variable | Description | Default |
|---|---|---|
| DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM | Use the legacy storage key format ({orgId}/originals/{docId}.{ext}). Set to false to enable patterns. | true |
| DOCUMENT_STORAGE_KEY_PATTERN | The pattern template for generating storage keys | {{organization.id}}/{{document.name}} |
| DOCUMENT_STORAGE_PATTERN_MAX_INCREMENTAL_SUFFIX_ATTEMPTS | Maximum number of incremental suffix attempts for conflict resolution | 9 |
| DOCUMENT_STORAGE_PATTERN_ENABLE_RANDOM_SUFFIX_FALLBACK | Enable random suffix fallback when all incremental suffixes are exhausted | true |
Legacy System
The legacy storage key system generates keys in the format:
{{organization.id}}/originals/{{document.id}}
For example: org_abc123/originals/doc_123456789012345678901234
Since the legacy system uses the unique document ID as the file name, conflicts are impossible and no suffix mechanism is needed. At the moment, this system remains the default while the new pattern-based system is being developed and tested.
To keep using the legacy system, either leave the configuration unchanged or explicitly set:
DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM=true
Pattern Validation
Storage key patterns are validated at application startup. If a pattern contains an invalid expression or an unknown transformer, Papra will refuse to start and report the error. This prevents misconfigured patterns from causing issues at upload time.
A pattern is invalid if:
- It references an expression that does not exist (e.g., {{unknown.field}})
- It uses a transformer that does not exist (e.g., {{document.name | nonexistent}})
- It ends with a / (trailing slashes are not allowed)
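The three rules above can be sketched as a simple check. This is a hypothetical illustration, not Papra's validator: `validatePattern`, `knownExpressions`, and `knownTransformers` are invented names, the lists are copied from the tables in this page, and placeholders are matched with a simplified regular expression that stops at the first closing `}}`.

```typescript
// Hypothetical sketch of startup validation; names and messages are illustrative.
const knownExpressions = [
  "document.id", "document.name", "organization.id", "currentDate",
  "currentDate.yyyy", "currentDate.MM", "currentDate.dd", "currentDate.HH",
  "currentDate.mm", "currentDate.ss", "currentDate.SSS", "random",
];
const knownTransformers = ["uppercase", "lowercase", "formatDate", "padStart", "padEnd"];

function validatePattern(pattern: string): string[] {
  const errors: string[] = [];
  if (pattern.charAt(pattern.length - 1) === "/") {
    errors.push("Pattern must not end with a trailing slash");
  }
  // Simplification: a placeholder runs from "{{" to the first following "}}".
  const placeholder = /\{\{(.*?)\}\}/g;
  let match = placeholder.exec(pattern);
  while (match !== null) {
    const segments = match[1].split("|").map((segment) => segment.trim());
    if (knownExpressions.indexOf(segments[0]) === -1) {
      errors.push("Unknown expression: " + segments[0]);
    }
    for (let i = 1; i < segments.length; i++) {
      const transformerName = segments[i].split(/\s+/)[0];
      if (knownTransformers.indexOf(transformerName) === -1) {
        errors.push("Unknown transformer: " + transformerName);
      }
    }
    match = placeholder.exec(pattern);
  }
  return errors;
}

console.log(validatePattern("{{organization.id}}/{{document.name | nonexistent}}/"));
```

A caller could run this once at startup and abort if the returned list is non-empty, which matches the "refuse to start" behavior described above.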