Stirling-PDF/app
Balázs Szücs 4d349c047b
[V2] feat(delete-form,modify-form,fill-form,extract-forms): add delete, modify, fill, and extract form functionality (#4830)
# Description of Changes

TLDR
- Adds `/api/v1/form/fields`, `/fill`, `/modify-fields`, and
`/delete-fields` endpoints for end-to-end AcroForm workflows.
- Centralizes form field detection, filling, modification, and deletion
logic in `FormUtils` with strict type handling.
- Introduces `FormPayloadParser` for resilient JSON parsing across
legacy flat payloads and new structured payloads.
- Reuses and extends `FormCopyUtils` plus `FormFieldTypeSupport` to
create, clone, and normalize widget properties when transforming forms.

### Implementation Details
- `FormFillController` exposes the new multipart APIs and streams
updated documents or metadata responses.
- `FormUtils` now owns extraction, template building, value application
(including flattening strategies), and field CRUD helpers used by the
controller endpoints.
- `FormPayloadParser` normalizes request bodies: accepts flat key/value
maps, combined `fields` arrays, or nested templates, returning
deterministic LinkedHashMap ordering for repeatable fills.
- `FormFieldTypeSupport` encapsulates per-type creation, value copying,
default appearance, and option handling; utilized by both modification
flows and `FormCopyUtils` transformations.
- `FormCopyUtils` exposes shared routines for creating widgets across
documents.
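
The payload normalization described for `FormPayloadParser` can be illustrated with a short client-side sketch. This is not the Java implementation: the function name `normalize_payload` and the exact JSON shapes (a flat map, a `fields` array of `name`/`value` entries, and a `template` wrapper) are assumptions made for illustration only.

```python
def normalize_payload(payload):
    """Return an insertion-ordered {field name: value} dict.

    Hypothetical sketch: the three accepted shapes below are assumed,
    not taken from the actual FormPayloadParser source.
    """
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    # Assumed nested shape: {"template": {"Name": value, ...}}
    template = payload.get("template")
    if isinstance(template, dict):
        return dict(template)

    # Assumed combined shape: {"fields": [{"name": ..., "value": ...}, ...]}
    fields = payload.get("fields")
    if isinstance(fields, list):
        values = {}
        for entry in fields:
            name = entry.get("name") if isinstance(entry, dict) else None
            if name:
                values[name] = entry.get("value")
        return values

    # Legacy flat shape: {"Name": value, ...} is already in the target form.
    return dict(payload)
```

Python dicts preserve insertion order, which plays the role that `LinkedHashMap` plays on the Java side: the same input always yields the same fill order.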

### API Surface (Multipart Form Data)
- `POST /api/v1/form/fields` -> returns `FormUtils.FormFieldExtraction`
with ordered `FormFieldInfo` records plus a fill template.
- `POST /api/v1/form/fill` -> applies parsed values via
`FormUtils.applyFieldValues`; optional `flatten` renders appearances
while respecting strict validation.
- `POST /api/v1/form/modify-fields` -> updates existing fields in-place
using `FormUtils.modifyFormFields` with definitions parsed from
`updates` payloads.
- `POST /api/v1/form/delete-fields` -> removes named fields after
`FormPayloadParser.parseNameList` deduplication and validation.
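
As a rough illustration of the deduplication and validation step mentioned for `/delete-fields`, the sketch below removes duplicate names while keeping first-occurrence order. The Python function `parse_name_list` and its rules (trimming whitespace, skipping blanks, rejecting an empty result) are assumptions; the real `FormPayloadParser.parseNameList` may behave differently.

```python
def parse_name_list(names):
    """Deduplicate field names, preserving first-occurrence order.

    Hypothetical rules: names are trimmed, blank entries are skipped,
    and an empty result is treated as an error.
    """
    seen = set()
    result = []
    for raw in names:
        if not isinstance(raw, str):
            raise ValueError(f"field name must be a string, got {raw!r}")
        name = raw.strip()
        if not name:
            continue  # ignore blank entries
        if name not in seen:
            seen.add(name)
            result.append(name)
    if not result:
        raise ValueError("no field names supplied")
    return result
```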

<img width="1305" height="284" alt="image"
src="https://github.com/user-attachments/assets/ef6f3d76-4dc4-42c1-a779-0649610cbf9a"
/>

### Individual endpoints:

<img width="1318" height="493" alt="image"
src="https://github.com/user-attachments/assets/65abfef9-50a2-42e6-8830-f07a7854d3c2"
/>
<img width="1310" height="582" alt="image"
src="https://github.com/user-attachments/assets/dd903773-5513-42d9-ba5d-3d8f204d6a0d"
/>
<img width="1318" height="493" alt="image"
src="https://github.com/user-attachments/assets/c22f65a7-721a-45bb-bb99-4708c423e89e"
/>
<img width="1318" height="493" alt="image"
src="https://github.com/user-attachments/assets/a76852f5-d5d1-442a-8e5e-d0f29404542a"
/>


### Data Validation & Type Safety
- Field type inference (`detectFieldType`) and choice option resolution
ensure only supported values are written; checkbox mapping uses export
states and boolean heuristics.
- Choice inputs pass through `filterChoiceSelections` /
`filterSingleChoiceSelection` to reject invalid entries and provide
actionable logs.
- Text fills leverage `setTextValue` to merge inline formatting
resources and regenerate appearances when necessary.
- `applyFieldValues` supports strict mode (default) to raise when
unknown fields are supplied, preventing silent data loss.
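
The strict-mode behavior can be pictured with a minimal sketch. It models a form as a plain dict of field names to values, which is a simplification; the Python function `apply_field_values` here is hypothetical and only mirrors the described contract (raise on unknown fields by default, skip them otherwise).

```python
def apply_field_values(form, values, strict=True):
    """Return a copy of `form` with `values` applied.

    Hypothetical model: `form` maps existing field names to current values.
    In strict mode, any name not present in the form raises instead of
    being silently dropped.
    """
    unknown = [name for name in values if name not in form]
    if unknown and strict:
        raise KeyError(f"unknown form fields: {unknown}")

    updated = dict(form)
    for name, value in values.items():
        if name in form:  # unknown names are skipped in non-strict mode
            updated[name] = value
    return updated
```

Raising by default means a misspelled field name surfaces as an error rather than producing a PDF that is missing data.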


### Automation Workflow Support

The `/fill` and `/fields` endpoints are designed to work together for
automated form processing. The workflow is straightforward: extract the
form structure, modify the values, and submit for filling.


How It Works:
1. The `/fields` endpoint extracts all form field metadata from your PDF
2. You modify the returned JSON to set the desired values for each field
3. The `/fill` endpoint accepts this same JSON structure to populate the
form

Example Workflow:

```bash
# Step 1: Extract form structure and save to fields.json
curl -o fields.json \
     -F file=@Form.pdf \
     http://localhost:8080/api/v1/form/fields

# Step 2: Edit fields.json to update the "value" property for each field
# (Use your preferred text editor or script to modify the values)

# Step 3: Fill the form using the modified JSON
curl -o filled-form.pdf \
     -F file=@Form.pdf \
     -F data=@fields.json \
     http://localhost:8080/api/v1/form/fill
```
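
Step 2 can also be scripted. The sketch below assumes that `fields.json` contains a top-level `fields` array whose entries carry `name` and `value` properties; that layout is inferred from the description here and should be checked against the actual `/fields` response. The helper name `set_field_values` is hypothetical.

```python
import copy
import json


def set_field_values(extraction, new_values):
    """Return a copy of the extraction with "value" set for matching names.

    Assumes (unverified) a layout like {"fields": [{"name": ..., "value": ...}]}.
    """
    updated = copy.deepcopy(extraction)
    for field in updated.get("fields", []):
        name = field.get("name")
        if name in new_values:
            field["value"] = new_values[name]
    return updated


# Usage with an inline sample instead of a real fields.json
sample = json.loads('{"fields": [{"name": "FullName", "type": "text", "value": ""}]}')
result = set_field_values(sample, {"FullName": "Mary Jane"})
print(json.dumps(result, indent=2))
```

In a real run, the sample would be replaced by `json.load` on `fields.json`, and the result written back with `json.dump` before Step 3.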

#### How to Fill the `template` JSON

The `template` (your data) is filled by creating key-value pairs that
match the "rules" defined in the `fields` array (the schema).

1. Find the Field `name`: Look in the `fields` array for the `name` of
the field you want to fill.
    * *Example:* `{"name": "Agent of Dependent", "type": "text", ...}`

2. Use `name` as the Key: This `name` becomes the key (in quotes) in
your `template` object.
    * *Example:* `{"Agent of Dependent": ...}`

3. Find the `type`: Look at the `type` for that same field. This tells
you what *kind* of value to provide.
    * `"type": "text"` requires a string (e.g., `"John Smith"`).
    * `"type": "checkbox"` requires a boolean (e.g., `true` or `false`).
    * `"type": "combobox"` requires a string that *exactly matches* one of
      its `"options"` (e.g., `"Choice 1"`).

4. Add the Value: This matching value becomes the value for your key.

#### Correct Examples

* For a Textbox:
    * Schema: `{"name": "Agent of Dependent", "type": "text", ...}`
    * Template: `{"Agent of Dependent": "Mary Jane"}`

* For a Checkbox:
    * Schema: `{"name": "Option 2", "type": "checkbox", ...}`
    * Template: `{"Option 2": true}`

* For a Dropdown (Combobox):
    * Schema: `{"name": "Dropdown2", "type": "combobox", "options": ["Choice 1", "Choice 2", ...] ...}`
    * Template: `{"Dropdown2": "Choice 1"}`

#### Incorrect Examples (These Will Error)

* Wrong Type: `{"Option 2": "Checked"}`
    * Error: "Option 2" is a `checkbox` and expects `true` or `false`, not a string.
* Wrong Option: `{"Dropdown2": "Choice 99"}`
    * Error: `"Choice 99"` is not listed in the `options` for "Dropdown2".
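
The type rules above can be checked on the client before calling `/fill`. The following is a minimal pre-flight sketch, not the server's validation logic: `validate_template` is a hypothetical helper, and it only covers the three types named in this description (`text`, `checkbox`, `combobox`).

```python
def validate_template(fields, template):
    """Return a list of human-readable problems; empty means no issues found.

    `fields` is the schema array (entries with "name", "type", and, for
    combobox fields, "options"); `template` maps field names to values.
    """
    by_name = {field["name"]: field for field in fields}
    errors = []
    for name, value in template.items():
        field = by_name.get(name)
        if field is None:
            errors.append(f"{name!r}: unknown field")
            continue
        ftype = field.get("type")
        if ftype == "text" and not isinstance(value, str):
            errors.append(f"{name!r}: text fields expect a string")
        elif ftype == "checkbox" and not isinstance(value, bool):
            errors.append(f"{name!r}: checkbox fields expect true or false")
        elif ftype == "combobox" and value not in field.get("options", []):
            errors.append(f"{name!r}: {value!r} is not one of the listed options")
    return errors


schema = [
    {"name": "Agent of Dependent", "type": "text"},
    {"name": "Option 2", "type": "checkbox"},
    {"name": "Dropdown2", "type": "combobox", "options": ["Choice 1", "Choice 2"]},
]
good = {"Agent of Dependent": "Mary Jane", "Option 2": True, "Dropdown2": "Choice 1"}
bad = {"Option 2": "Checked", "Dropdown2": "Choice 99"}
```

Calling `validate_template(schema, good)` returns an empty list, while `validate_template(schema, bad)` reports one problem for each of the two incorrect examples.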

#### Simplified Format for Manual Editing

For users filling forms manually, there's a simplified format that
focuses only on field names and values:

```json
{
  "FullName": "",
  "ID": "",
  "Gender": "Off",
  "Married": false,
  "City": "[]"
}
```

This format is easier to work with when you're manually editing the
JSON. You can skip the full metadata structure (type, label, required,
etc.) and just provide the field names with their values.

Important caveat: Even though the type information isn't visible in this
simplified format, type validation is still enforced by PDF viewers.
This simplified format just makes manual editing more convenient while
maintaining data integrity.

Known limitation: this feature is affected by the upstream PDFBox issue
https://issues.apache.org/jira/browse/PDFBOX-5962

Closes https://github.com/Stirling-Tools/Stirling-PDF/issues/237
Closes https://github.com/Stirling-Tools/Stirling-PDF/issues/3569

---

## Checklist

### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### Translations (if applicable)

- [ ] I ran
[`scripts/counter_translation.py`](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/docs/counter_translation.md)

### UI Changes (if applicable)

- [x] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [x] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/devGuide/DeveloperGuide.md#6-testing)
for more details.

---------

Signed-off-by: Balázs Szücs <bszucs1209@gmail.com>
Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
2025-11-10 23:41:26 +00:00
common [V2] feat(delete-form,modify-form,fill-form,extract-forms): add delete, modify, fill, and extract form functionality (#4830) 2025-11-10 23:41:26 +00:00
core [V2] feat(delete-form,modify-form,fill-form,extract-forms): add delete, modify, fill, and extract form functionality (#4830) 2025-11-10 23:41:26 +00:00
proprietary [V2] feat(delete-form,modify-form,fill-form,extract-forms): add delete, modify, fill, and extract form functionality (#4830) 2025-11-10 23:41:26 +00:00
allowed-licenses.json feat(cbr-to-pdf,pdf-to-cbr): add PDF to/from CBR conversion with ebook optimization option (#4581) 2025-10-04 11:15:23 +01:00