locally hosted web application that allows you to perform various operations on PDF files
Balázs Szücs 4d349c047b
[V2] feat(delete-form,modify-form,fill-form,extract-forms): add delete, modify, fill, and extract form functionality (#4830)
# Description of Changes

TLDR
- Adds `/api/v1/form/fields`, `/fill`, `/modify-fields`, and
`/delete-fields` endpoints for end-to-end AcroForm workflows.
- Centralizes form field detection, filling, modification, and deletion
logic in `FormUtils` with strict type handling.
- Introduces `FormPayloadParser` for resilient JSON parsing across
legacy flat payloads and new structured payloads.
- Reuses and extends `FormCopyUtils` plus `FormFieldTypeSupport` to
create, clone, and normalize widget properties when transforming forms.

### Implementation Details
- `FormFillController` exposes the new multipart APIs and streams
updated documents or metadata responses.
- `FormUtils` now owns extraction, template building, value application
(including flattening strategies), and field CRUD helpers used by the
controller endpoints.
- `FormPayloadParser` normalizes request bodies: accepts flat key/value
maps, combined `fields` arrays, or nested templates, returning
deterministic LinkedHashMap ordering for repeatable fills.
- `FormFieldTypeSupport` encapsulates per-type creation, value copying,
default appearance, and option handling; utilized by both modification
flows and `FormCopyUtils` transformations.
- `FormCopyUtils` exposes shared routines for creating widgets across
documents.
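
As a rough illustration of the normalization that `FormPayloadParser` is described as doing, the Python sketch below reduces three payload shapes to one insertion-ordered name-to-value map. It is not the Java implementation: the key names `template`, `fields`, `name`, and `value`, the precedence of `template` over `fields`, and the function name `normalize_payload` are all assumptions made for the example.

```python
import json
from typing import Any


def normalize_payload(raw: str) -> dict[str, Any]:
    """Reduce three assumed payload shapes to one ordered name -> value map."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")

    values: dict[str, Any] = {}  # dicts keep insertion order, like LinkedHashMap

    # Assumed shape: nested template under a "template" key.
    template = data.get("template")
    if isinstance(template, dict):
        values.update(template)

    # Assumed shape: "fields" array of {"name": ..., "value": ...} records.
    fields = data.get("fields")
    if isinstance(fields, list):
        for entry in fields:
            if isinstance(entry, dict) and "name" in entry and "value" in entry:
                # Assumption: a template entry wins over a fields entry.
                values.setdefault(entry["name"], entry["value"])

    # Otherwise treat the whole object as a flat key/value map.
    if not isinstance(template, dict) and not isinstance(fields, list):
        values.update(data)
    return values
```

Because the result preserves insertion order, feeding the same payload twice yields the same fill order, which is the property the bullet above attributes to the `LinkedHashMap` return type.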

### API Surface (Multipart Form Data)
- `POST /api/v1/form/fields` -> returns `FormUtils.FormFieldExtraction`
with ordered `FormFieldInfo` records plus a fill template.
- `POST /api/v1/form/fill` -> applies parsed values via
`FormUtils.applyFieldValues`; optional `flatten` renders appearances
while respecting strict validation.
- `POST /api/v1/form/modify-fields` -> updates existing fields in-place
using `FormUtils.modifyFormFields` with definitions parsed from
`updates` payloads.
- `POST /api/v1/form/delete-fields` -> removes named fields after
`FormPayloadParser.parseNameList` deduplication and validation.
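
The deduplication and validation step mentioned for `/delete-fields` could look roughly like the Python sketch below. This is only an illustration of the idea, not `FormPayloadParser.parseNameList` itself: the accepted input forms (a JSON array or a comma-separated string) and the rule that an empty list is rejected are assumptions.

```python
import json


def parse_name_list(raw: str) -> list[str]:
    """Parse field names, drop blanks and duplicates, keep first-seen order."""
    text = raw.strip()
    if text.startswith("["):
        # Assumed input form 1: a JSON array of strings.
        items = json.loads(text)
        if not all(isinstance(item, str) for item in items):
            raise ValueError("field names must be strings")
    else:
        # Assumed input form 2: a comma-separated string.
        items = text.split(",")

    seen: set[str] = set()
    names: list[str] = []
    for item in items:
        name = item.strip()
        if name and name not in seen:
            seen.add(name)
            names.append(name)

    if not names:
        # Assumption: deleting "nothing" is treated as a client error.
        raise ValueError("no field names supplied")
    return names
```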

<img width="1305" height="284" alt="image"
src="https://github.com/user-attachments/assets/ef6f3d76-4dc4-42c1-a779-0649610cbf9a"
/>

### Individual endpoints:

<img width="1318" height="493" alt="image"
src="https://github.com/user-attachments/assets/65abfef9-50a2-42e6-8830-f07a7854d3c2"
/>
<img width="1310" height="582" alt="image"
src="https://github.com/user-attachments/assets/dd903773-5513-42d9-ba5d-3d8f204d6a0d"
/>
<img width="1318" height="493" alt="image"
src="https://github.com/user-attachments/assets/c22f65a7-721a-45bb-bb99-4708c423e89e"
/>
<img width="1318" height="493" alt="image"
src="https://github.com/user-attachments/assets/a76852f5-d5d1-442a-8e5e-d0f29404542a"
/>


### Data Validation & Type Safety
- Field type inference (`detectFieldType`) and choice option resolution
ensure only supported values are written; checkbox mapping uses export
states and boolean heuristics.
- Choice inputs pass through `filterChoiceSelections` /
`filterSingleChoiceSelection` to reject invalid entries and provide
actionable logs.
- Text fills leverage `setTextValue` to merge inline formatting
resources and regenerate appearances when necessary.
- `applyFieldValues` supports strict mode (default) to raise when
unknown fields are supplied, preventing silent data loss.
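
To make the strict-mode behaviour concrete, here is a minimal Python sketch of the idea. It is not the Java `applyFieldValues`: the function signature is hypothetical, and the non-strict behaviour shown (silently skipping unknown names) is an assumption.

```python
from typing import Any


def apply_field_values(
    known_fields: set[str], values: dict[str, Any], strict: bool = True
) -> dict[str, Any]:
    """Return the values that would be written to the form."""
    unknown = [name for name in values if name not in known_fields]
    if unknown and strict:
        # Strict mode: fail loudly instead of dropping data silently.
        raise ValueError("unknown fields: " + ", ".join(unknown))
    # Assumed non-strict behaviour: unknown names are skipped.
    return {name: value for name, value in values.items() if name in known_fields}
```

With `strict=True`, a misspelled field name surfaces as an error at fill time rather than as a blank field in the output PDF.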


### Automation Workflow Support

The `/fill` and `/fields` endpoints are designed to work together for
automated form processing. The workflow is straightforward: extract the
form structure, modify the values, and submit for filling.


How It Works:
1. The `/fields` endpoint extracts all form field metadata from your PDF
2. You modify the returned JSON to set the desired values for each field
3. The `/fill` endpoint accepts this same JSON structure to populate the
form

Example Workflow:

```bash
# Step 1: Extract form structure and save to fields.json
curl -o fields.json \
     -F file=@Form.pdf \
     http://localhost:8080/api/v1/form/fields

# Step 2: Edit fields.json to update the "value" property for each field
# (Use your preferred text editor or script to modify the values)

# Step 3: Fill the form using the modified JSON
curl -o filled-form.pdf \
     -F file=@Form.pdf \
     -F data=@fields.json \
     http://localhost:8080/api/v1/form/fill
```
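
Step 2 can also be scripted. The Python sketch below updates the `value` property of matching entries in `fields.json`. It assumes the file is a JSON object with a `fields` array of records carrying `name` and `value`, optionally alongside a `template` object keyed by field name; the helper names `set_values` and `edit_file` are made up for this example.

```python
import json
from pathlib import Path
from typing import Any


def set_values(document: dict[str, Any], new_values: dict[str, Any]) -> list[str]:
    """Update matching fields in place; return requested names that were not found."""
    template = document.get("template")
    found: set[str] = set()
    for field in document.get("fields", []):
        name = field.get("name")
        if name in new_values:
            field["value"] = new_values[name]
            found.add(name)
            # Assumption: keep a sibling "template" object in sync when present.
            if isinstance(template, dict):
                template[name] = new_values[name]
    return [name for name in new_values if name not in found]


def edit_file(path: str, new_values: dict[str, Any]) -> list[str]:
    """Load the JSON file, apply the new values, and write it back."""
    file = Path(path)
    document = json.loads(file.read_text(encoding="utf-8"))
    missing = set_values(document, new_values)
    file.write_text(json.dumps(document, indent=2, ensure_ascii=False), encoding="utf-8")
    return missing
```

For example, `edit_file("fields.json", {"FullName": "Mary Jane"})` would rewrite the file in place and return the names it could not match, which is a cheap way to catch typos before calling `/fill`.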

#### How to Fill the `template` JSON

The `template` (your data) is filled by creating key-value pairs that
match the "rules" defined in the `fields` array (the schema).

1. Find the Field `name`: Look in the `fields` array for the `name` of
the field you want to fill.
    * *Example:* `{"name": "Agent of Dependent", "type": "text", ...}`

2. Use `name` as the Key: This `name` becomes the key (in quotes) in
your `template` object.
    * *Example:* `{"Agent of Dependent": ...}`

3. Find the `type`: Look at the `type` for that same field. This tells
you what *kind* of value to provide.
    * `"type": "text"` requires a string (e.g., `"John Smith"`).
    * `"type": "checkbox"` requires a boolean (e.g., `true` or `false`).
    * `"type": "combobox"` requires a string that *exactly matches* one of
      its `"options"` (e.g., `"Choice 1"`).

4. Add the Value: This matching value becomes the value for your key.

#### Correct Examples

* For a Textbox:
    * Schema: `{"name": "Agent of Dependent", "type": "text", ...}`
    * Template: `{"Agent of Dependent": "Mary Jane"}`

* For a Checkbox:
    * Schema: `{"name": "Option 2", "type": "checkbox", ...}`
    * Template: `{"Option 2": true}`

* For a Dropdown (Combobox):
    * Schema: `{"name": "Dropdown2", "type": "combobox", "options": ["Choice 1", "Choice 2", ...] ...}`
    * Template: `{"Dropdown2": "Choice 1"}`

#### Incorrect Examples (These Will Error)

* Wrong Type: `{"Option 2": "Checked"}`
    * Error: "Option 2" is a `checkbox` and expects `true` or `false`, not a
      string.
* Wrong Option: `{"Dropdown2": "Choice 99"}`
    * Error: `"Choice 99"` is not listed in the `options` for "Dropdown2".
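
The type rules above can be checked client-side before submitting. The following Python sketch mirrors them for the three field types mentioned here; it is an illustrative pre-check, not the server's validation, and the `check_entry` helper, the `SCHEMA` dictionary, and the error wording are assumptions.

```python
from typing import Any, Optional


def check_entry(field: dict[str, Any], value: Any) -> Optional[str]:
    """Return an error message for a bad value, or None when it fits the schema."""
    name, kind = field["name"], field["type"]
    if kind == "text" and not isinstance(value, str):
        return f'"{name}" is a text field and expects a string'
    if kind == "checkbox" and not isinstance(value, bool):
        return f'"{name}" is a checkbox and expects true or false'
    if kind == "combobox" and value not in field.get("options", []):
        return f'"{value}" is not listed in the options for "{name}"'
    return None


# Hypothetical schema entries, shaped like the examples in this section.
SCHEMA = {
    "Agent of Dependent": {"name": "Agent of Dependent", "type": "text"},
    "Option 2": {"name": "Option 2", "type": "checkbox"},
    "Dropdown2": {
        "name": "Dropdown2",
        "type": "combobox",
        "options": ["Choice 1", "Choice 2"],
    },
}

if __name__ == "__main__":
    samples = [("Option 2", True), ("Option 2", "Checked"), ("Dropdown2", "Choice 99")]
    for name, value in samples:
        print(name, "->", check_entry(SCHEMA[name], value) or "ok")
```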

#### Simplified Format for Manual Editing

For users filling forms manually, there's a simplified format that
focuses only on field names and values:

```json
{
  "FullName": "",
  "ID": "",
  "Gender": "Off",
  "Married": false,
  "City": "[]"
}
```

This format is easier to work with when you're manually editing the
JSON. You can skip the full metadata structure (type, label, required,
etc.) and just provide the field names with their values.

Important caveat: Even though the type information isn't visible in this
simplified format, type validation is still enforced by PDF viewers.
This simplified format just makes manual editing more convenient while
maintaining data integrity.
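
If you already have the full metadata structure, deriving the simplified format from it is a one-liner. The Python sketch below assumes each record in the `fields` array has a `name` and an optional `value`; the empty-string default and the `to_simplified` name are assumptions for illustration.

```python
from typing import Any


def to_simplified(fields: list[dict[str, Any]]) -> dict[str, Any]:
    """Keep only name -> value, dropping type, label, required, and so on."""
    # Assumption: a record without a "value" is represented as an empty string.
    return {field["name"]: field.get("value", "") for field in fields}
```

Field order is preserved, so the simplified map lists fields in the same order as the extracted metadata.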

Please note: this feature is affected by the upstream PDFBox issue
https://issues.apache.org/jira/browse/PDFBOX-5962

Closes https://github.com/Stirling-Tools/Stirling-PDF/issues/237
Closes https://github.com/Stirling-Tools/Stirling-PDF/issues/3569

---

## Checklist

### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [x] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [x] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.

---------

Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
2025-11-10 23:41:26 +00:00
.devcontainer chore: update development configs, formatting tools, and CI enhancements (#4130) 2025-08-08 12:52:51 +01:00
.github Update PR-Auto-Deploy-V2.yml 2025-11-05 23:33:53 +00:00
.vscode V2 Auto rename (#4244) 2025-09-05 17:12:52 +01:00
app [V2] feat(delete-form,modify-form,fill-form,extract-forms): add delete, modify, fill, and extract form functionality (#4830) 2025-11-10 23:41:26 +00:00
devGuide Feature/v2/filehistory (#4370) 2025-09-16 15:08:11 +01:00
devTools refactor: move modules under app/ directory and update file paths (#3938) 2025-07-14 20:53:11 +01:00
docker Add audit system, invite links, and usage analytics (#4749) 2025-11-06 17:29:34 +00:00
docs pipeline bug, doc bugs, auto split new URL and doc (#2906) 2025-02-07 13:17:35 +00:00
exampleYmlFiles JWT Auth into V2 (#4187) 2025-08-15 14:13:45 +01:00
frontend Viewer update and autozoom (#4800) 2025-11-10 13:52:13 +00:00
gradle/wrapper Upgrade Gradle to 8.14 in CI Workflows and Gradle Wrapper (#3425) 2025-04-27 16:17:07 +01:00
images Update screenshots (#2875) 2025-02-04 11:24:35 +00:00
scripts V2 Tauri integration (#3854) 2025-11-05 11:44:59 +00:00
testing test fixes 2025-11-04 10:24:00 +00:00
.editorconfig Add linting to frontend (#4341) 2025-09-04 14:08:28 +01:00
.git-blame-ignore-revs refactor: move modules under app/ directory and update file paths (#3938) 2025-07-14 20:53:11 +01:00
.gitattributes refactor: move modules under app/ directory and update file paths (#3938) 2025-07-14 20:53:11 +01:00
.gitignore Merge remote-tracking branch 'origin/V2' into mainToV2 2025-10-12 20:45:25 +01:00
.pre-commit-config.yaml 🤖 format everything with pre-commit by stirlingbot (#4075) 2025-08-02 23:18:48 +01:00
ADDING_TOOLS.md V2 flatten (#4358) 2025-09-05 11:25:30 +00:00
build.gradle Merge remote-tracking branch 'origin/V2' into mainToV2 2025-11-03 23:01:41 +00:00
CLAUDE.md Feature/viewer annotation toggle (#4557) 2025-10-02 10:40:18 +01:00
CONTRIBUTING.md exception handling and exception improvements (#3858) 2025-07-02 16:51:45 +01:00
DATABASE.md feat(database): make backup schedule configurable via system keys (#4251) 2025-09-04 15:02:31 +01:00
DeveloperGuide.md V2 Tauri integration (#3854) 2025-11-05 11:44:59 +00:00
gradle.properties build(local): simplify writeVersion task with WriteProperties plugin and enable build caching (#4139) 2025-08-08 10:36:30 +01:00
gradlew Upgrade Gradle to 8.14 in CI Workflows and Gradle Wrapper (#3425) 2025-04-27 16:17:07 +01:00
gradlew.bat Upgrade Gradle to 8.14 in CI Workflows and Gradle Wrapper (#3425) 2025-04-27 16:17:07 +01:00
HowToUseOCR.md
launch4jConfig.xml ci: enhance GitHub Actions workflows with Gradle setup, caching improvements, and Docker image testing (#3956) 2025-07-16 17:17:11 +01:00
LICENSE refactor: move modules under app/ directory and update file paths (#3938) 2025-07-14 20:53:11 +01:00
package-lock.json Feature/v2/improve sign (#4627) 2025-10-09 13:35:42 +01:00
README.md 🌐 [V2] Sync Translations + Update README Progress Table (#4683) 2025-10-27 13:06:11 +00:00
SECURITY.md
settings.gradle refactor: move modules under app/ directory and update file paths (#3938) 2025-07-14 20:53:11 +01:00
test_globalsign.pdf V2 Validate PDF Signature tool (#4679) 2025-10-16 13:45:59 +01:00
test_irs_signed.pdf V2 Validate PDF Signature tool (#4679) 2025-10-16 13:45:59 +01:00

Stirling-PDF


Stirling-PDF is a robust, locally hosted web-based PDF manipulation tool using Docker. It enables you to carry out various operations on PDF files, including splitting, merging, converting, reorganizing, adding images, rotating, compressing, and more. This locally hosted web application has evolved to encompass a comprehensive set of features, addressing all your PDF requirements.

All files and PDFs exist either exclusively on the client side, reside in server memory only during task execution, or temporarily reside in a file solely for the execution of the task. Any file downloaded by the user will have been deleted from the server by that point.

Homepage: https://stirlingpdf.com

All documentation available at https://docs.stirlingpdf.com/


Features

  • 50+ PDF Operations
  • Parallel file processing and downloads
  • Dark mode support
  • Custom download options
  • Custom 'Pipelines' to run multiple features in an automated queue
  • API for integration with external scripts
  • Optional Login and Authentication support (see here for documentation)
  • Database Backup and Import (see here for documentation)
  • Enterprise features like SSO (see here for documentation)

PDF Features

Page Operations

  • View and modify PDFs - View multi-page PDFs with custom viewing, sorting, and searching. Plus, on-page edit features like annotating, drawing, and adding text and images. (Using PDF.js with Joxit and Liberation fonts)
  • Full interactive GUI for merging/splitting/rotating/moving PDFs and their pages
  • Merge multiple PDFs into a single resultant file
  • Split PDFs into multiple files at specified page numbers or extract all pages as individual files
  • Reorganize PDF pages into different orders
  • Rotate PDFs in 90-degree increments
  • Remove pages
  • Multi-page layout (format PDFs into a multi-paged page)
  • Scale page contents size by set percentage
  • Adjust contrast
  • Crop PDF
  • Auto-split PDF (with physically scanned page dividers)
  • Extract page(s)
  • Convert PDF to a single page
  • Overlay PDFs on top of each other
  • Split PDF by sections

Conversion Operations

  • Convert PDFs to and from images
  • Convert any common file to PDF (using LibreOffice)
  • Convert PDF to Word/PowerPoint/others (using LibreOffice)
  • Convert HTML to PDF
  • Convert PDF to XML
  • Convert PDF to CSV
  • URL to PDF
  • Markdown to PDF

Security & Permissions

  • Add and remove passwords
  • Change/set PDF permissions
  • Add watermark(s)
  • Certify/sign PDFs
  • Sanitize PDFs
  • Auto-redact text

Other Operations

  • Add/generate/write signatures
  • Split by Size or PDF
  • Repair PDFs
  • Detect and remove blank pages
  • Compare two PDFs and show differences in text
  • Add images to PDFs
  • Compress PDFs to decrease their filesize (using qpdf)
  • Extract images from PDF
  • Remove images from PDF
  • Extract images from scans
  • Remove annotations
  • Add page numbers
  • Auto-rename files by detecting PDF header text
  • OCR on PDF (using Tesseract OCR)
  • PDF/A conversion (using LibreOffice)
  • Edit metadata
  • Flatten PDFs
  • Get all information on a PDF to view or export as JSON
  • Show/detect embedded JavaScript

📖 Get Started

Visit our comprehensive documentation at docs.stirlingpdf.com for:

  • Installation guides for all platforms
  • Configuration options
  • Feature documentation
  • API reference
  • Security setup
  • Enterprise features

Supported Languages

Stirling-PDF currently supports 41 languages!

Language Progress
Arabic (العربية) (ar_AR) 83%
Azerbaijani (Azərbaycan Dili) (az_AZ) 32%
Basque (Euskara) (eu_ES) 18%
Bulgarian (Български) (bg_BG) 35%
Catalan (Català) (ca_CA) 34%
Croatian (Hrvatski) (hr_HR) 31%
Czech (Česky) (cs_CZ) 34%
Danish (Dansk) (da_DK) 30%
Dutch (Nederlands) (nl_NL) 30%
English (English) (en_GB) 100%
English (US) (en_US) 100%
French (Français) (fr_FR) 82%
German (Deutsch) (de_DE) 84%
Greek (Ελληνικά) (el_GR) 34%
Hindi (हिंदी) (hi_IN) 34%
Hungarian (Magyar) (hu_HU) 38%
Indonesian (Bahasa Indonesia) (id_ID) 31%
Irish (Gaeilge) (ga_IE) 34%
Italian (Italiano) (it_IT) 84%
Japanese (日本語) (ja_JP) 62%
Korean (한국어) (ko_KR) 34%
Norwegian (Norsk) (no_NB) 32%
Persian (فارسی) (fa_IR) 34%
Polish (Polski) (pl_PL) 36%
Portuguese (Português) (pt_PT) 34%
Portuguese Brazilian (Português) (pt_BR) 83%
Romanian (Română) (ro_RO) 28%
Russian (Русский) (ru_RU) 83%
Serbian Latin alphabet (Srpski) (sr_LATN_RS) 37%
Simplified Chinese (简体中文) (zh_CN) 85%
Slovakian (Slovensky) (sk_SK) 26%
Slovenian (Slovenščina) (sl_SI) 36%
Spanish (Español) (es_ES) 84%
Swedish (Svenska) (sv_SE) 33%
Thai (ไทย) (th_TH) 31%
Tibetan (བོད་ཡིག་) (bo_CN) 65%
Traditional Chinese (繁體中文) (zh_TW) 38%
Turkish (Türkçe) (tr_TR) 37%
Ukrainian (Українська) (uk_UA) 36%
Vietnamese (Tiếng Việt) (vi_VN) 28%
Malayalam (മലയാളം) (ml_IN) 73%

Stirling PDF Enterprise

Stirling PDF offers an Enterprise edition of its software. This is the same great software but with added features, support, and comforts. Check out our Enterprise docs.

🤝 Looking to contribute?

Join our community: