Astonous

Uploading a file to a Salesforce record looks simple from the outside. Beneath that simplicity is a content management system with a three-object architecture, a versioning model, a sharing framework, multiple API surfaces, and governor limits that interact in ways that are not always obvious until they cause a problem in production. For developers building integrations, automations, or large-scale data migrations, understanding how Salesforce file storage actually works is not optional — it is the foundation for building solutions that scale, perform predictably, and do not quietly accumulate storage costs that nobody planned for.


The Object Model — Three Objects, One Architecture

Salesforce Files are implemented across three core objects that work together to handle storage, versioning, and sharing. Understanding what each one does — and what happens when you interact with any of them — is the starting point for everything else.


ContentDocument

ContentDocument is the logical representation of a file. There is one ContentDocument per file regardless of how many versions of that file exist. It holds the file's metadata — title, file type, content size, and a pointer to the latest published version. The critical thing developers need to know: deleting a ContentDocument deletes every version of that file simultaneously. There is no partial deletion at the document level.


ContentVersion

ContentVersion is where the actual binary data lives. Every time a file is updated, a new ContentVersion record is created — the old version is not overwritten, it is retained. The VersionData field holds the binary content. Each ContentVersion carries its own VersionNumber, IsLatest flag, ContentLocation, and Checksum fields. The storage implication here is significant and frequently overlooked: every ContentVersion consumes file storage independently. An actively versioned file does not occupy one slot in your storage allocation — it occupies one slot per version. In high-volume environments, unmanaged versioning is one of the most common sources of unexpected storage consumption.


ContentDocumentLink

ContentDocumentLink is the sharing layer. It controls which records, users, and Chatter groups have access to a file by linking a ContentDocument to a target entity. The LinkedEntityId field identifies what the file is linked to. ShareType defines the permission level — Viewer, Collaborator, or Inferred. Visibility controls whether the file is accessible to all users, internal users only, or only those it has been explicitly shared with. The architectural advantage here is meaningful: one file can be linked to multiple records without duplicating the underlying storage. A contract document can appear on an Account, an Opportunity, and a Case simultaneously — one ContentDocument, three ContentDocumentLink records, no additional storage cost.
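
The relationship between the three objects can be sketched as a toy model outside Salesforce. The Python classes below are illustrative stand-ins rather than Salesforce APIs, and the record ids and file sizes are made-up placeholders. The point is only that each retained version adds to storage, while each link does not.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentVersion:
    version_number: int
    content_size: int  # bytes consumed by this version alone


@dataclass
class ContentDocument:
    title: str
    versions: List[ContentVersion] = field(default_factory=list)
    # One entry per ContentDocumentLink pointing at this document.
    linked_entity_ids: List[str] = field(default_factory=list)

    def add_version(self, content_size: int) -> ContentVersion:
        # A new upload appends a version; earlier versions are retained.
        version = ContentVersion(len(self.versions) + 1, content_size)
        self.versions.append(version)
        return version

    def link_to(self, entity_id: str) -> None:
        # Linking shares the existing file; no binary data is copied.
        self.linked_entity_ids.append(entity_id)

    def storage_used(self) -> int:
        # Every retained version counts; links contribute nothing.
        return sum(v.content_size for v in self.versions)


contract = ContentDocument("Contract")
contract.add_version(1_000_000)
contract.add_version(1_200_000)
for record_id in ("account-1", "opportunity-1", "case-1"):  # placeholder ids
    contract.link_to(record_id)
print(contract.storage_used())  # 2200000
```

Two versions and three links give 2.2 MB of consumption in this model: the links change who can see the file, not how much space it takes.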

APIs for File Management


REST API

The REST API is the right choice for modern integrations and external system interactions. It handles file uploads, metadata queries, and binary content downloads through two primary endpoint patterns: /services/data/vXX.X/sobjects/ContentVersion for record-level operations and /services/data/vXX.X/connect/files for the file resources of the Connect REST API (not to be confused with the Files Connect feature, which surfaces external repositories). For any integration that needs to move files between Salesforce and an external system, REST is the primary tool.
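
As a rough illustration of the record-level endpoint, the Python sketch below builds (but does not send) a ContentVersion upload request with the file content base64-encoded in a JSON body. The function name, the instance URL, and the token are hypothetical; the API version is left as a parameter because the article does not name one. The base64-in-JSON form is assumed here to be suitable for small files only, with larger uploads typically sent as multipart requests.

```python
import base64
import json
import urllib.request


def build_upload_request(instance_url, api_version, access_token,
                         title, path_on_client, data,
                         first_publish_location_id=None):
    """Build a POST request that would create a ContentVersion (hypothetical helper)."""
    body = {
        "Title": title,
        "PathOnClient": path_on_client,
        # Blob fields travel as base64 text inside a JSON body.
        "VersionData": base64.b64encode(data).decode("ascii"),
    }
    if first_publish_location_id:
        body["FirstPublishLocationId"] = first_publish_location_id
    url = f"{instance_url}/services/data/{api_version}/sobjects/ContentVersion"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
request = build_upload_request(
    "https://example.my.salesforce.com", "vXX.X", "<access token>",
    "Contract", "Contract.pdf", b"Sample Content",
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would perform the upload; authentication and error handling are deliberately left out of the sketch.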

Apex

For server-side file operations, Apex handles ContentVersion creation directly. A basic file upload looks like this:

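// accountId is assumed to hold the Id of an existing Account record.
ContentVersion cv = new ContentVersion();
cv.Title = 'Contract';
cv.PathOnClient = 'Contract.pdf';                 // file name, including extension
cv.VersionData = Blob.valueOf('Sample Content');  // binary content of the file
cv.FirstPublishLocationId = accountId;            // record the file is first shared with
insert cv;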

Inserting a ContentVersion with a FirstPublishLocationId automatically creates the associated ContentDocument and the ContentDocumentLink to the target record: three objects managed through a single insert operation. This is the standard pattern for file creation in Apex automations and triggers.


Bulk API

The Bulk API supports large data migration scenarios but has limited native support for binary payloads. In practice, high-volume file migrations typically use the Bulk API for record metadata and REST for the actual binary content, combining both surfaces to handle scale.

"Every ContentVersion you create consumes storage independently — which means every developer who does not understand file architecture is quietly accumulating a cost nobody budgeted for. Salesforce Files are not just documents attached to records. They are a content management system that rewards the developers who treat them like one."

Storage Limits and What Developers Often Miss

Salesforce file storage allocation starts at 10 GB per org and adds 2 GB per user license. An org with 100 users has a total allocation of 210 GB. That sounds generous until you consider that each ContentVersion counts against the allocation independently, and that organizations with long version histories, high document volumes, or legacy Attachment records can reach limits faster than expected.
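
The allocation arithmetic above is simple enough to write down directly. The Python sketch below uses the figures quoted in this article (10 GB base, 2 GB per user license) as defaults; the function names are illustrative, and the per-user figure should be checked against your own edition before relying on it.

```python
BASE_GB = 10      # per-org base allocation quoted in this article
PER_USER_GB = 2   # per-user-license allocation quoted in this article


def file_storage_allocation_gb(user_licenses, base_gb=BASE_GB, per_user_gb=PER_USER_GB):
    """Total file storage allocation in GB for a given number of user licenses."""
    if user_licenses < 0:
        raise ValueError("user_licenses must not be negative")
    return base_gb + per_user_gb * user_licenses


def percent_used(used_gb, user_licenses):
    """Share of the allocation already consumed, as a percentage."""
    return 100 * used_gb / file_storage_allocation_gb(user_licenses)


print(file_storage_allocation_gb(100))  # 210
print(percent_used(105, 100))           # 50.0
```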

Developers frequently overlook three things on this front. First, older versions keep consuming storage for as long as their ContentDocument exists, so reclaiming space means either trimming version history or deleting the document, which removes every version at once. Second, deleting a ContentDocumentLink does not delete the underlying file: orphaned files that are no longer linked to any record continue consuming storage indefinitely unless explicitly identified and removed. Third, legacy Attachment records from Classic-era implementations count separately and are often forgotten during storage audits.

SOQL Patterns for Storage Management

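Finding your largest files by storage consumption:

SELECT ContentDocumentId, ContentSize
FROM ContentVersion
WHERE IsLatest = true
ORDER BY ContentSize DESC

Finding the files linked to a specific record:

SELECT ContentDocumentId
FROM ContentDocumentLink
WHERE LinkedEntityId = :recordId

Finding files that have no links at all:

SELECT Id
FROM ContentDocument
WHERE Id NOT IN (
  SELECT ContentDocumentId FROM ContentDocumentLink
)

Note that queries against ContentDocumentLink normally have to filter on ContentDocumentId or LinkedEntityId, so the unfiltered subquery in the last query may be rejected as written. If it is, treat the query as a statement of intent and collect the links in batches of Ids instead.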


Running these queries as part of regular storage governance prevents the accumulation of unlinked files and helps identify version bloat before it becomes a storage problem.
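
Once document Ids and link rows have been exported from queries like these, the orphan check itself can be done offline. The Python sketch below is one possible approach, not a Salesforce API: the function name and the shortened sample Ids are hypothetical. It also assumes that a link pointing only at a user (Ids beginning with the 005 key prefix) should not count as "linked to a record", since files generally keep a link to their owner.

```python
USER_KEY_PREFIX = "005"  # key prefix of User record Ids


def find_unlinked_documents(document_ids, links):
    """Return the document ids that are not linked to any non-user record.

    links is an iterable of (content_document_id, linked_entity_id) pairs,
    for example rows exported from ContentDocumentLink.
    """
    linked_to_record = {
        doc_id
        for doc_id, entity_id in links
        if not entity_id.startswith(USER_KEY_PREFIX)
    }
    # Preserve the input order so results line up with the export.
    return [doc_id for doc_id in document_ids if doc_id not in linked_to_record]


# Shortened placeholder Ids, for illustration only.
documents = ["069A", "069B"]
link_rows = [("069A", "001X"), ("069A", "005U"), ("069B", "005U")]
print(find_unlinked_documents(documents, link_rows))  # ['069B']
```

Whether user-only links should count as orphaned is a governance decision; dropping the prefix filter gives the stricter "no links at all" reading.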


Security and Access Control

File access in Salesforce is governed at three levels that interact with each other. ContentDocumentLink.Visibility determines the broadest access boundary: AllUsers, InternalUsers, or SharedUsers. Record-level sharing rules apply to files linked to records, meaning a file linked to a restricted Account inherits the Account's sharing configuration. User permissions at the profile and permission set level provide the third layer of control.

ShareType on the ContentDocumentLink determines what a linked entity can do with the file. V grants view access. C grants collaborator access, allowing edits. I is system-managed inferred sharing that Salesforce assigns automatically in certain scenarios. For integrations that create ContentDocumentLink records programmatically, explicitly setting ShareType and Visibility is required, because relying on defaults can produce access configurations that do not match the intended behavior.
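
One way an integration can enforce explicit access settings is to refuse to build a ContentDocumentLink payload without them. The Python sketch below is a hypothetical helper, not part of any Salesforce library; it only uses the ShareType and Visibility values listed above, and the sample Ids are shortened placeholders.

```python
SHARE_TYPES = {"V", "C", "I"}  # Viewer, Collaborator, Inferred
VISIBILITIES = {"AllUsers", "InternalUsers", "SharedUsers"}


def build_content_document_link(content_document_id, linked_entity_id,
                                *, share_type, visibility):
    """Build a ContentDocumentLink payload with no implicit defaults."""
    if share_type not in SHARE_TYPES:
        raise ValueError(f"unknown ShareType: {share_type!r}")
    if visibility not in VISIBILITIES:
        raise ValueError(f"unknown Visibility: {visibility!r}")
    return {
        "ContentDocumentId": content_document_id,
        "LinkedEntityId": linked_entity_id,
        "ShareType": share_type,
        "Visibility": visibility,
    }


link = build_content_document_link(
    "069A", "001X", share_type="V", visibility="InternalUsers"
)
print(link["ShareType"], link["Visibility"])  # V InternalUsers
```

Making both arguments keyword-only and mandatory means a caller cannot fall back on a default by accident, which is the failure mode described above.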


Performance and Governor Limits

Large VersionData payloads have a direct impact on Apex heap size: loading binary file content into memory in a synchronous context is one of the fastest ways to hit heap limits in a trigger or scheduled job. The correct pattern for large file downloads is streaming via REST rather than loading VersionData into an Apex Blob.

Triggers on ContentVersion are technically supported but should be used with caution. High-volume file environments generate ContentVersion records at scale, and a poorly optimized trigger on this object can produce significant governor limit pressure. When triggers are genuinely necessary (for virus scanning integration, automated metadata tagging, or file validation), ensure they are bulkified correctly and consider whether the use case can be handled asynchronously.

For any file processing that involves meaningful computation, such as format conversion, content extraction, or downstream notifications, use Queueable or Batch Apex rather than synchronous processing. The synchronous context is not the right place for file operations at scale.
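
On the client side, the streaming download pattern mentioned above amounts to copying the response in fixed-size chunks instead of reading it whole. The Python sketch below shows only that copy loop, using in-memory streams so it runs without a network; the function name and chunk size are illustrative choices. In a real integration the source would be the HTTP response for a ContentVersion's VersionData resource, for example /services/data/vXX.X/sobjects/ContentVersion/<id>/VersionData.

```python
import io


def stream_copy(source, destination, chunk_size=64 * 1024):
    """Copy a binary stream in chunks; returns the number of bytes copied.

    Only one chunk is held in memory at a time, regardless of file size.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for an HTTP response body and an output file.
fake_response = io.BytesIO(b"x" * 200_000)
output = io.BytesIO()
print(stream_copy(fake_response, output))  # 200000
```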


External Storage Integration

For organizations handling high file volumes or large individual files, Salesforce should function as a metadata and relationship layer rather than the primary file repository. Storing binary content in Amazon S3, Azure Blob Storage, or a similar object store — and maintaining a URL or external identifier in Salesforce — keeps storage costs predictable and keeps large binary payloads out of the governor limit context entirely. Salesforce Connect provides the framework for external object integration. Named Credentials handle authentication to external storage services in a way that is both secure and maintainable. This pattern is well-established in enterprise Salesforce implementations and is worth evaluating early for any project where file volume is expected to be significant.


Migration Considerations

File migrations into Salesforce follow a consistent four-step pattern. Insert the ContentVersion with the binary content and relevant metadata. Capture the ContentDocumentId that Salesforce generates automatically. Create the ContentDocumentLink records to associate the file with the appropriate records. Validate ownership, visibility settings, and ShareType to confirm that access configuration matches the intended state.

Order matters here. Attempting to create ContentDocumentLink records before the ContentDocument exists will fail. And validating access configuration after migration, rather than assuming defaults are correct, prevents the kind of silent access misconfigurations that only surface when a user reports they cannot see a file they should have access to.
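
The ordering of the four steps can be illustrated with a small simulation. Everything in the Python sketch below is hypothetical: FakeOrg is an in-memory stand-in that only mimics the ordering constraint, and migrate_file is an illustrative helper, not a Salesforce API. In a real migration the ContentDocumentId is typically read back from the inserted ContentVersion rather than returned directly.

```python
import itertools


class FakeOrg:
    """In-memory stand-in for an org; enforces that links need an existing document."""

    def __init__(self):
        self.documents = {}  # ContentDocumentId -> title
        self.links = []
        self._ids = itertools.count(1)

    def insert_content_version(self, title, data):
        # Steps 1 and 2: the insert creates the document and yields its id.
        doc_id = f"doc-{next(self._ids)}"
        self.documents[doc_id] = title
        return doc_id

    def insert_link(self, doc_id, entity_id, share_type, visibility):
        if doc_id not in self.documents:
            raise LookupError(f"no ContentDocument {doc_id!r}")
        self.links.append({"ContentDocumentId": doc_id, "LinkedEntityId": entity_id,
                           "ShareType": share_type, "Visibility": visibility})


def migrate_file(org, title, data, target_ids, share_type, visibility):
    """Run the four steps in order; returns (doc_id, list of mismatched links)."""
    doc_id = org.insert_content_version(title, data)
    for entity_id in target_ids:  # step 3: link only after the document exists
        org.insert_link(doc_id, entity_id, share_type, visibility)
    # Step 4: validate what was actually created against what was intended.
    problems = [link for link in org.links
                if link["ContentDocumentId"] == doc_id
                and (link["ShareType"], link["Visibility"]) != (share_type, visibility)]
    return doc_id, problems


org = FakeOrg()
doc_id, problems = migrate_file(org, "Contract", b"...", ["acct-1", "opp-1"], "V", "AllUsers")
print(doc_id, len(org.links), problems)  # doc-1 2 []
```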


Official References

ContentDocument Object Reference: developer.salesforce.com/docs/atlas.en-us.object_reference.meta/object_reference/sforce_api_objects_contentdocument.htm

ContentVersion Object Reference: developer.salesforce.com/docs/atlas.en-us.object_reference.meta/object_reference/sforce_api_objects_contentversion.htm

ContentDocumentLink Object Reference: developer.salesforce.com/docs/atlas.en-us.object_reference.meta/object_reference/sforce_api_objects_contentdocumentlink.htm

Migrating Attachments to Salesforce Files: help.salesforce.com/s/articleView?id=sf.collab_admin_files_migrate.htm

Salesforce Connect Overview: developer.salesforce.com/docs/atlas.en-us.platform_connect.meta/platform_connect

Named Credentials: developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/apex_named_credentials.htm

Salesforce File Storage: A Developer-Focused Deep Dive Into Architecture, APIs, and Limits