Document Storage Key Patterns

When a document is uploaded to Papra, the system generates a storage key that determines where and how the file is stored in the underlying storage backend (filesystem, S3, Azure Blob Storage, etc.). By default, Papra uses a legacy system that generates opaque storage keys based on the document ID. With storage key patterns, you can define a human-readable, customizable naming scheme for your stored files.

A storage key pattern is a string template containing expressions enclosed in double curly braces ({{ }}). When a document is uploaded, each expression is evaluated using the document’s metadata and replaced with the resulting value.

For example, consider the default pattern:

{{organization.id}}/{{document.name}}

For a document named invoice-2025.pdf in organization org_123456789012345678901234, this produces:

org_123456789012345678901234/invoice-2025.pdf
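
As a rough illustration of this substitution step, the following Python sketch replaces each expression with a value from a lookup table. The render_pattern helper and the metadata dictionary are hypothetical names chosen for this example, not Papra's actual implementation, and the sketch deliberately ignores transformers.

```python
import re

# Hypothetical helper, not Papra's code: matches a bare {{ expression }}.
EXPRESSION = re.compile(r"\{\{\s*([^{}|]+?)\s*\}\}")


def render_pattern(pattern: str, values: dict[str, str]) -> str:
    """Replace every {{ expression }} with its value from the lookup table."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"unknown expression: {name}")
        return values[name]

    return EXPRESSION.sub(substitute, pattern)


# Example metadata taken from the document described above.
metadata = {
    "organization.id": "org_123456789012345678901234",
    "document.name": "invoice-2025.pdf",
}
key = render_pattern("{{organization.id}}/{{document.name}}", metadata)
print(key)  # org_123456789012345678901234/invoice-2025.pdf
```

Literal text such as the / separator is left untouched because the regular expression only matches the double-brace expressions.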

The generated storage key is stored in the database alongside each document record. As a result, changing the pattern configuration does not affect existing documents: only newly uploaded documents use the updated pattern. Existing documents retain their original storage keys and continue to be served from their original location.

At the moment, Papra uses the legacy storage key system for backward compatibility. To enable the new pattern-based system, set the following environment variable:

DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM=false

Then configure your desired pattern:

DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{document.name}}

The same settings in docker-compose.yml:

services:
  papra:
    container_name: papra
    image: ghcr.io/papra-hq/papra:latest
    restart: unless-stopped
    environment:
      # ... other environment variables ...
      - DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM=false
      - DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{document.name}}
    volumes:
      - ./app-data:/app/app-data
    ports:
      - "1221:1221"

A pattern is a string that can contain:

  • Literal text: Any text outside of {{ }} is kept as-is (e.g., path separators /, prefixes, etc.)
  • Expressions: Placeholders enclosed in {{ and }} that are evaluated at upload time
  • Transformers: Optional functions applied to expression values using the pipe | operator

The general syntax for an expression with transformers is:

{{expression | transformer1 | transformer2 arg1 arg2}}

Multiple transformers can be chained, and each transformer receives the output of the previous one.
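
To make the chaining rule concrete, here is a minimal Python sketch of how the text between {{ and }} could be evaluated. It is an illustration under assumptions, not Papra's parser: the evaluate function and the TRANSFORMERS table are hypothetical, formatDate is left out, and quoted arguments are not handled.

```python
# Hypothetical transformer implementations; the names mirror the documented ones.
def pad_start(value: str, length: str, char: str = " ") -> str:
    return value.rjust(int(length), char)


def pad_end(value: str, length: str, char: str = " ") -> str:
    return value.ljust(int(length), char)


TRANSFORMERS = {
    "uppercase": lambda value: value.upper(),
    "lowercase": lambda value: value.lower(),
    "padStart": pad_start,
    "padEnd": pad_end,
}


def evaluate(expression_body: str, values: dict[str, str]) -> str:
    """Evaluate 'name | t1 | t2 arg1 arg2', feeding each output into the next step."""
    name, *steps = [part.strip() for part in expression_body.split("|")]
    result = values[name]
    for step in steps:
        transformer, *args = step.split()  # space-separated arguments
        result = TRANSFORMERS[transformer](result, *args)
    return result


values = {"document.name": "Invoice.pdf", "currentDate.MM": "6"}
print(evaluate("document.name | lowercase | padEnd 14 _", values))  # invoice.pdf___
print(evaluate("currentDate.MM | padStart 2 0", values))            # 06
```

The loop is the whole point of the sketch: each transformer receives the previous result as its first argument, followed by its own arguments.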

Expression        Description                                                 Example Output
----------------  ----------------------------------------------------------  ----------------------------
document.id       The unique document identifier                              doc_123456789012345678901234
document.name     The original file name (sanitized for safe filesystem use)  invoice-2025.pdf
organization.id   The organization identifier                                 org_123456789012345678901234
currentDate       The current date and time in ISO 8601 format                2025-06-15T14:30:00.000Z
currentDate.yyyy  Current year (4 digits)                                     2025
currentDate.MM    Current month (2 digits, zero-padded)                       06
currentDate.dd    Current day of month (2 digits, zero-padded)                15
currentDate.HH    Current hour (2 digits, 24-hour, zero-padded)               14
currentDate.mm    Current minute (2 digits, zero-padded)                      30
currentDate.ss    Current second (2 digits, zero-padded)                      00
currentDate.SSS   Current millisecond (3 digits, zero-padded)                 000
random            A random 8-character alphanumeric string                    k9x2m4pq

Example:

{{organization.id}}/{{currentDate.yyyy}}/{{currentDate.MM}}/{{document.name}}

For a document uploaded in June 2025, this produces:

org_123456789012345678901234/2025/06/invoice-2025.pdf
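
The date-based expressions can be pictured as zero-padded fields taken from a single timestamp. The Python sketch below is illustrative only: date_expressions is a hypothetical helper, and the use of UTC is an assumption, since the text does not say which time zone Papra applies.

```python
from datetime import datetime, timezone


def date_expressions(now: datetime) -> dict[str, str]:
    """Build the currentDate.* values for one instant (illustrative, not Papra's code)."""
    return {
        "currentDate.yyyy": f"{now.year:04d}",
        "currentDate.MM": f"{now.month:02d}",
        "currentDate.dd": f"{now.day:02d}",
        "currentDate.HH": f"{now.hour:02d}",
        "currentDate.mm": f"{now.minute:02d}",
        "currentDate.ss": f"{now.second:02d}",
        "currentDate.SSS": f"{now.microsecond // 1000:03d}",
    }


# Assumption: timestamps are taken in UTC.
now = datetime(2025, 6, 15, 14, 30, 0, tzinfo=timezone.utc)
parts = date_expressions(now)
key = "/".join([
    "org_123456789012345678901234",
    parts["currentDate.yyyy"],
    parts["currentDate.MM"],
    "invoice-2025.pdf",
])
print(key)  # org_123456789012345678901234/2025/06/invoice-2025.pdf
```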

Transformers modify the value of an expression. They are applied using the pipe (|) operator after the expression name.

Transformer  Arguments                                               Description                       Example
-----------  ------------------------------------------------------  --------------------------------  ----------------------------------------
uppercase    None                                                    Converts the value to uppercase   {{document.name | uppercase}}
lowercase    None                                                    Converts the value to lowercase   {{document.name | lowercase}}
formatDate   Optional format string (default: {yyyy}-{MM}-{dd})      Formats a date value              {{currentDate | formatDate {yyyy}/{MM}}}
padStart     Target length, optional pad character (default: space)  Pads the start of the value       {{currentDate.MM | padStart 2 0}}
padEnd       Target length, optional pad character (default: space)  Pads the end of the value         {{document.id | padEnd 20 _}}

Transformers can be chained to apply multiple transformations in sequence:

{{document.name | lowercase | padEnd 20 _}}

Arguments are space-separated after the transformer name. If an argument contains spaces, wrap it in double quotes. An argument without spaces, as in the following example, needs no quotes:

{{currentDate | formatDate {yyyy}-{MM}-{dd}}}

Here are some common patterns for different use cases:

Organize by Organization and Document Name (Default)

DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{document.name}}

This produces keys such as org_abc123/invoice-2025.pdf.

To group documents by year and month instead:

DOCUMENT_STORAGE_KEY_PATTERN={{currentDate.yyyy}}/{{currentDate.MM}}/{{document.name}}

This produces keys such as 2025/06/invoice-2025.pdf.

To combine the organization, the upload date, and the document ID:

DOCUMENT_STORAGE_KEY_PATTERN={{organization.id}}/{{currentDate.yyyy}}/{{currentDate.MM}}/{{document.id}}-{{document.name}}

This produces keys such as org_123456789012345678901234/2025/06/doc_123456789012345678901234-invoice-2025.pdf.

When using patterns that don’t guarantee uniqueness (e.g., patterns based on document.name alone), two documents could generate the same storage key. Papra handles this automatically to prevent data loss using a two-stage conflict resolution mechanism.

When a storage key already exists, Papra appends an incrementing numeric suffix before the file extension:

org_abc123/invoice.pdf # original (already exists)
org_abc123/invoice_1.pdf # first conflict
org_abc123/invoice_2.pdf # second conflict
org_abc123/invoice_3.pdf # third conflict
...
org_abc123/invoice_9.pdf # ninth conflict (default max)

The suffix is inserted before the file extension and after the file name, separated by an underscore. For files without an extension, the suffix is appended at the end (e.g., README becomes README_1).
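
The suffix placement described above can be sketched in a few lines of Python. The with_suffix function is a hypothetical name, and treating only the last dot-separated part as the extension (so archive.tar.gz would become archive.tar_1.gz) is an assumption; the text does not say how multi-part extensions are handled.

```python
import posixpath


def with_suffix(storage_key: str, suffix: str) -> str:
    """Insert _<suffix> after the file name and before the last extension."""
    directory, file_name = posixpath.split(storage_key)
    stem, extension = posixpath.splitext(file_name)  # extension is "" if there is none
    return posixpath.join(directory, f"{stem}_{suffix}{extension}")


print(with_suffix("org_abc123/invoice.pdf", "1"))  # org_abc123/invoice_1.pdf
print(with_suffix("org_abc123/README", "1"))       # org_abc123/README_1
```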

By default, up to 9 incremental suffix attempts are made (configurable, see below). This means a total of 10 possible slots for a given storage key: the original plus 9 suffixed variants.

If all incremental suffix attempts are exhausted (i.e., all 10 slots are taken), Papra falls back to appending a random 8-character alphanumeric string as a suffix:

org_abc123/invoice_k9x2m4pq.pdf # random suffix fallback

This provides a very high probability of finding an available key: with an 8-character random suffix, more than 17 million files with the same name and path would be needed to reach a 50% chance of a collision. If even this random suffix collides (extremely unlikely), the upload fails with an error rather than overwriting existing data.
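
The 17 million figure can be checked with the standard birthday approximation. The sketch below assumes a 62-symbol alphabet (a-z, A-Z, 0-9); the text only says "alphanumeric", so the alphabet size is an assumption that happens to match the stated number.

```python
import math


def files_for_collision_chance(alphabet_size: int, length: int, chance: float = 0.5) -> float:
    """Birthday approximation: n = sqrt(2 * N * ln(1 / (1 - p))) for N possible suffixes."""
    key_space = alphabet_size ** length
    return math.sqrt(2 * key_space * math.log(1 / (1 - chance)))


# Assumption: 62 possible characters per position, 8 positions.
estimate = files_for_collision_chance(alphabet_size=62, length=8)
print(f"{estimate / 1e6:.1f} million")  # 17.4 million
```

With a smaller alphabet the threshold drops accordingly, so the estimate should be read as an order of magnitude rather than an exact guarantee.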

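Variable                                                  Description                                                                                              Default
--------------------------------------------------------  -------------------------------------------------------------------------------------------------------  -------
DOCUMENT_STORAGE_PATTERN_MAX_INCREMENTAL_SUFFIX_ATTEMPTS  How many incremental suffixes to try (e.g., _1, _2, …). Set to 0 to skip incremental suffixes entirely.  9
DOCUMENT_STORAGE_PATTERN_ENABLE_RANDOM_SUFFIX_FALLBACK    Whether to try a random suffix if all incremental attempts are exhausted                                 true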
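
To show how these two settings drive the two stages, here is a minimal Python sketch of the conflict resolution flow. It is not Papra's implementation: resolve_key, with_suffix, and the exists callback are hypothetical names, and the 62-character alphabet for the random suffix is an assumption.

```python
import posixpath
import secrets
import string
from typing import Callable

# Assumption: the random suffix draws from a-z, A-Z and 0-9.
ALPHABET = string.ascii_letters + string.digits


def with_suffix(key: str, suffix: str) -> str:
    """Insert _<suffix> before the file extension (or at the end if there is none)."""
    directory, file_name = posixpath.split(key)
    stem, extension = posixpath.splitext(file_name)
    return posixpath.join(directory, f"{stem}_{suffix}{extension}")


def resolve_key(
    key: str,
    exists: Callable[[str], bool],
    max_incremental_attempts: int = 9,
    enable_random_fallback: bool = True,
) -> str:
    if not exists(key):
        return key
    # Stage 1: _1, _2, ... up to the configured maximum.
    for attempt in range(1, max_incremental_attempts + 1):
        candidate = with_suffix(key, str(attempt))
        if not exists(candidate):
            return candidate
    # Stage 2: a single random 8-character suffix.
    if enable_random_fallback:
        random_part = "".join(secrets.choice(ALPHABET) for _ in range(8))
        candidate = with_suffix(key, random_part)
        if not exists(candidate):
            return candidate
    # Nothing free: fail instead of overwriting existing data.
    raise FileExistsError(f"no free storage key for {key}")


taken = {"org_abc123/invoice.pdf", "org_abc123/invoice_1.pdf"}
print(resolve_key("org_abc123/invoice.pdf", taken.__contains__))
# org_abc123/invoice_2.pdf
```

With max_incremental_attempts=0 and enable_random_fallback=False, both stages are skipped and any collision raises immediately.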

If you want to disable conflict resolution and reject uploads that would cause a storage key collision (not sure why you would want this, but it’s possible), set:

DOCUMENT_STORAGE_PATTERN_MAX_INCREMENTAL_SUFFIX_ATTEMPTS=0
DOCUMENT_STORAGE_PATTERN_ENABLE_RANDOM_SUFFIX_FALLBACK=false

All storage key settings at a glance:

Variable                                                   Description                                                                                            Default
---------------------------------------------------------  -----------------------------------------------------------------------------------------------------  -------------------------------------
DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM  Use the legacy storage key format ({orgId}/originals/{docId}.{ext}). Set to false to enable patterns.  true
DOCUMENT_STORAGE_KEY_PATTERN                               The pattern template for generating storage keys                                                       {{organization.id}}/{{document.name}}
DOCUMENT_STORAGE_PATTERN_MAX_INCREMENTAL_SUFFIX_ATTEMPTS   Maximum number of incremental suffix attempts for conflict resolution                                  9
DOCUMENT_STORAGE_PATTERN_ENABLE_RANDOM_SUFFIX_FALLBACK     Enable random suffix fallback when all incremental suffixes are exhausted                              true

The legacy storage key system generates keys in the format:

{{organization.id}}/originals/{{document.id}}

For example: org_abc123/originals/doc_123456789012345678901234

Since the legacy system uses the unique document ID as the file name, conflicts are impossible and no suffix mechanism is needed. At the moment, this system remains the default while the new pattern-based system is being developed and tested.

To keep using the legacy system, either leave the configuration unchanged or explicitly set:

DOCUMENT_STORAGE_USE_LEGACY_STORAGE_KEY_DEFINITION_SYSTEM=true

Storage key patterns are validated at application startup. If a pattern contains an invalid expression or an unknown transformer, Papra will refuse to start and report the error. This prevents misconfigured patterns from causing issues at upload time.

A pattern is invalid if:

  • It references an expression that does not exist (e.g., {{unknown.field}})
  • It uses a transformer that does not exist (e.g., {{document.name | nonexistent}})
  • It ends with a / (trailing slashes are not allowed)
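
The three rules above can be approximated with a short startup check. The Python sketch below is an assumption-laden illustration rather than Papra's validator: validate_pattern is a hypothetical function, and the sets of known names are copied from the expression and transformer tables in this page.

```python
import re

KNOWN_EXPRESSIONS = {
    "document.id", "document.name", "organization.id", "random", "currentDate",
    "currentDate.yyyy", "currentDate.MM", "currentDate.dd", "currentDate.HH",
    "currentDate.mm", "currentDate.ss", "currentDate.SSS",
}
KNOWN_TRANSFORMERS = {"uppercase", "lowercase", "formatDate", "padStart", "padEnd"}

# Matches {{ ... }}; the lookahead lets a body end with "}" (as in formatDate {yyyy}).
EXPRESSION = re.compile(r"\{\{(.*?)\}\}(?!\})")


def validate_pattern(pattern: str) -> list[str]:
    """Return a list of problems; an empty list means the pattern passed these checks."""
    errors = []
    if pattern.endswith("/"):
        errors.append("pattern must not end with '/'")
    for body in EXPRESSION.findall(pattern):
        name, *steps = [part.strip() for part in body.split("|")]
        if name not in KNOWN_EXPRESSIONS:
            errors.append(f"unknown expression: {name}")
        for step in steps:
            transformer = step.split()[0] if step else ""
            if transformer not in KNOWN_TRANSFORMERS:
                errors.append(f"unknown transformer: {transformer}")
    return errors


for candidate in [
    "{{organization.id}}/{{document.name}}",
    "{{unknown.field}}",
    "{{document.name | nonexistent}}",
    "{{organization.id}}/",
]:
    print(candidate, "->", validate_pattern(candidate) or "ok")
```

An application following the behavior described here would run such a check once at startup and exit with the reported errors instead of accepting uploads.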