Merge branch 'main' into id-translate

Anthony Stirling 2023-12-28 09:03:02 +00:00 committed by GitHub
commit 48158379ee
GPG Key ID: 4AEE18F83AFDEB23
5 changed files with 114 additions and 67 deletions

@@ -20,7 +20,7 @@ Install the following software, if not already installed:
- Git - Git
- Python 3 (with pip) - Python 3.8 (with pip)
- Make - Make
@@ -95,14 +95,14 @@ For Debian-based systems, you can use the following command:
```bash ```bash
sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
pip3 install uno opencv-python-headless unoconv pngquant pip3 install uno opencv-python-headless unoconv pngquant WeasyPrint
``` ```
For Fedora: For Fedora:
```bash ```bash
sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
pip3 install uno opencv-python-headless unoconv pngquant pip3 install uno opencv-python-headless unoconv pngquant WeasyPrint
``` ```
### Step 4: Clone and Build Stirling-PDF ### Step 4: Clone and Build Stirling-PDF
@@ -176,7 +176,7 @@ rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```bash ```bash
./gradlew bootRun ./gradlew bootRun
or or
java -jar build/libs/app.jar java -jar /opt/Stirling-PDF/Stirling-PDF-*.jar
``` ```
### Step 8: Adding a Desktop icon ### Step 8: Adding a Desktop icon
@@ -202,6 +202,64 @@ EOF
Note: Currently the app will run in the background until manually closed. Note: Currently the app will run in the background until manually closed.
### Optional: Run Stirling-PDF as a service
First create a .env file, where you can store environment variables:
```
touch /opt/Stirling-PDF/.env
```
In this file you can add all variables, one variable per line, as stated in the main readme (for example SYSTEM_DEFAULTLOCALE="de-DE").
Create a new file where we store our service settings and open it with nano editor:
```
nano /etc/systemd/system/stirlingpdf.service
```
Paste this content, make sure to update the filename of the jar-file. Press Ctrl+S and Ctrl+X to save and exit the nano editor:
```
[Unit]
Description=Stirling-PDF service
After=syslog.target network.target
[Service]
SuccessExitStatus=143
User=root
Group=root
Type=simple
EnvironmentFile=/opt/Stirling-PDF/.env
WorkingDirectory=/opt/Stirling-PDF
ExecStart=/usr/bin/java -jar Stirling-PDF-0.17.2.jar
ExecStop=/bin/kill -15 $MAINPID
[Install]
WantedBy=multi-user.target
```
Notify systemd that it has to rebuild its internal service database (you have to run this command every time you make a change in the service file):
```
sudo systemctl daemon-reload
```
Enable the service so that it starts automatically at boot:
```
sudo systemctl enable stirlingpdf.service
```
See the status of the service:
```
sudo systemctl status stirlingpdf.service
```
Manually start/stop/restart the service:
```
sudo systemctl start stirlingpdf.service
sudo systemctl stop stirlingpdf.service
sudo systemctl restart stirlingpdf.service
```
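
The instructions above ask you to update the jar filename in the unit file by hand. As a rough sketch only, the same unit text could be rendered from a template so that the filename is the single value that changes between releases. The names `UNIT_TEMPLATE` and `render_unit` are hypothetical and not part of Stirling-PDF; the unit content itself is copied from the service file above.

```python
from string import Template

# Hypothetical template mirroring the unit file shown above.
# "$$" is Template's escape for a literal "$", so "$$MAINPID" renders as "$MAINPID".
UNIT_TEMPLATE = Template("""\
[Unit]
Description=Stirling-PDF service
After=syslog.target network.target

[Service]
SuccessExitStatus=143
User=root
Group=root
Type=simple
EnvironmentFile=$install_dir/.env
WorkingDirectory=$install_dir
ExecStart=/usr/bin/java -jar $jar_name
ExecStop=/bin/kill -15 $$MAINPID

[Install]
WantedBy=multi-user.target
""")


def render_unit(jar_name, install_dir="/opt/Stirling-PDF"):
    """Return the unit file text for the given jar filename (illustrative helper)."""
    return UNIT_TEMPLATE.substitute(jar_name=jar_name, install_dir=install_dir)


if __name__ == "__main__":
    # Print the unit text; writing it to /etc/systemd/system/ still needs root.
    print(render_unit("Stirling-PDF-0.17.2.jar"))
```

The sketch only produces text. Saving it as `/etc/systemd/system/stirlingpdf.service` and running `sudo systemctl daemon-reload` remain manual steps, as described above.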
--- ---
Remember to set the necessary environment variables before running the project if you want to customize the application the list can be seen in the main readme. Remember to set the necessary environment variables before running the project if you want to customize the application the list can be seen in the main readme.

@@ -14,10 +14,9 @@ This is a powerful locally hosted web based PDF manipulation tool using docker t
Stirling PDF makes no outbound calls for any record keeping or tracking. Stirling PDF makes no outbound calls for any record keeping or tracking.
All files and PDFs are either purely client side, in server memory only during the execution of the task or within a temporay file only for execution of the task. All files and PDFs exist either exclusively on the client side, reside in server memory only during task execution, or temporarily reside in a file solely for the execution of the task. Any file downloaded by the user will have been deleted from the server by that point.
Any file which has been downloaded by the user will have already been deleted from the server by that time.
Feel free to request any features or bug fixes either in github issues or our [Discord](https://discord.gg/Cn8pWhQRxZ) Please feel free to submit feature requests or report bugs either through GitHub issues or on our [Discord](https://discord.gg/Cn8pWhQRxZ)
![stirling-home](images/stirling-home.png) ![stirling-home](images/stirling-home.png)
@@ -268,7 +267,7 @@ For API usage you must provide a header with 'X-API-Key' and the associated API
- Fill forms mannual and automatic - Fill forms mannual and automatic
### Q2: Why is my application downloading .htm files? ### Q2: Why is my application downloading .htm files?
This is a issue caused commonly by your NGINX congifuration. The default file upload size for NGINX is 1MB, you need to add the following in your Nginx sites-available file. ``client_max_body_size SIZE;`` Where "SIZE" is 50M for example for 50MB files. This is a issue caused commonly by your NGINX configuration. The default file upload size for NGINX is 1MB, you need to add the following in your Nginx sites-available file. ``client_max_body_size SIZE;`` Where "SIZE" is 50M for example for 50MB files.
### Q3: Why is my download timing out ### Q3: Why is my download timing out
NGINX has timeout values by default so if you are running Stirling-PDF behind NGINX you may need to set a timeout value such as adding the config ``proxy_read_timeout 3600;`` NGINX has timeout values by default so if you are running Stirling-PDF behind NGINX you may need to set a timeout value such as adding the config ``proxy_read_timeout 3600;``

@@ -1,10 +1,11 @@
import cv2 import cv2
import sys import sys
import argparse import argparse
import numpy as np
def is_blank_image(image_path, threshold=10, white_percent=99, white_value=255, blur_size=5): def is_blank_image(image_path, threshold=10, white_percent=99, white_value=255, blur_size=5):
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE) image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
if image is None: if image is None:
print(f"Error: Unable to read the image file: {image_path}") print(f"Error: Unable to read the image file: {image_path}")
return False return False
@@ -15,19 +16,11 @@ def is_blank_image(image_path, threshold=10, white_percent=99, white_value=255,
_, thresholded_image = cv2.threshold(blurred_image, white_value - threshold, white_value, cv2.THRESH_BINARY) _, thresholded_image = cv2.threshold(blurred_image, white_value - threshold, white_value, cv2.THRESH_BINARY)
# Calculate the percentage of white pixels in the thresholded image # Calculate the percentage of white pixels in the thresholded image
white_pixels = 0 white_pixels = np.sum(thresholded_image == white_value)
total_pixels = thresholded_image.size white_pixel_percentage = (white_pixels / thresholded_image.size) * 100
for i in range(0, thresholded_image.shape[0], 2):
for j in range(0, thresholded_image.shape[1], 2):
if thresholded_image[i, j] == white_value:
white_pixels += 1
white_pixel_percentage = (white_pixels / (i * thresholded_image.shape[1] + j + 1)) * 100
if white_pixel_percentage < white_percent:
return False
print(f"Page has white pixel percent of {white_pixel_percentage}") print(f"Page has white pixel percent of {white_pixel_percentage}")
return True return white_pixel_percentage >= white_percent
if __name__ == "__main__": if __name__ == "__main__":
@@ -39,9 +32,6 @@ if __name__ == "__main__":
blank = is_blank_image(args.image_path, args.threshold, args.white_percent) blank = is_blank_image(args.image_path, args.threshold, args.white_percent)
if blank: # Return code 1: The image is considered blank.
# Return code 1: The image is considered blank. # Return code 0: The image is not considered blank.
sys.exit(1) sys.exit(int(blank))
else:
# Return code 0: The image is not considered blank.
sys.exit(0)
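
The hunk above replaces a nested loop, which sampled every second pixel, divided by a running position index, and returned early, with a single whole-array count. The following is a minimal sketch of that counting step in isolation, assuming the input is an already thresholded grayscale array. It uses only NumPy, so the OpenCV read, blur, and threshold steps are left out, and the helper names `white_pixel_percentage` and `is_blank` are illustrative rather than taken from the script.

```python
import numpy as np


def white_pixel_percentage(thresholded, white_value=255):
    """Percentage of pixels equal to white_value, counted over the whole array."""
    white_pixels = np.sum(thresholded == white_value)
    return (white_pixels / thresholded.size) * 100


def is_blank(thresholded, white_percent=99, white_value=255):
    """True when at least white_percent of the pixels are white."""
    return bool(white_pixel_percentage(thresholded, white_value) >= white_percent)


if __name__ == "__main__":
    # Synthetic 100x100 "page": all white except 50 dark pixels in the first row.
    page = np.full((100, 100), 255, dtype=np.uint8)
    page[0, :50] = 0
    print(white_pixel_percentage(page))
    # As in the diff, the exit code would be int(blank): 1 for blank, 0 otherwise.
    print(int(is_blank(page)))
```

Counting over the full array makes the denominator the true pixel count, so the percentage no longer depends on where in the image the dark pixels happen to fall.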

@@ -336,23 +336,23 @@ home.autoRedact.title=Redazione automatica
home.autoRedact.desc=Redige automaticamente (oscura) il testo in un PDF in base al testo immesso home.autoRedact.desc=Redige automaticamente (oscura) il testo in un PDF in base al testo immesso
showJS.tags=JS showJS.tags=JS
home.tableExtraxt.title=PDF to CSV home.tableExtraxt.title=Da PDF a CSV
home.tableExtraxt.desc=Extracts Tables from a PDF converting it to CSV home.tableExtraxt.desc=Estrae tabelle da un PDF convertendolo in CSV
tableExtraxt.tags=CSV,Table Extraction,extract,convert tableExtraxt.tags=CSV,Estrazione tabella,estrai,converti
home.autoSizeSplitPDF.title=Auto Split by Size/Count home.autoSizeSplitPDF.title=Divisione automatica per dimensione/numero
home.autoSizeSplitPDF.desc=Split a single PDF into multiple documents based on size, page count, or document count home.autoSizeSplitPDF.desc=Dividi un singolo PDF in più documenti in base alle dimensioni, al numero di pagine o al numero di documenti
autoSizeSplitPDF.tags=pdf,split,document,organization autoSizeSplitPDF.tags=pdf,diviso,documento,organizzazione
home.overlay-pdfs.title=Overlay PDFs home.overlay-pdfs.title=Overlay PDFs
home.overlay-pdfs.desc=Overlays PDFs on-top of another PDF home.overlay-pdfs.desc=Overlays PDFs on-top of another PDF
overlay-pdfs.tags=Overlay overlay-pdfs.tags=Overlay
home.split-by-sections.title=Split PDF by Sections home.split-by-sections.title=Dividi PDF per sezioni
home.split-by-sections.desc=Divide each page of a PDF into smaller horizontal and vertical sections home.split-by-sections.desc=Dividi ciascuna pagina di un PDF in sezioni orizzontali e verticali più piccole
split-by-sections.tags=Section Split, Divide, Customize split-by-sections.tags=Dividi sezione, dividi, personalizza
########################### ###########################
# # # #
@@ -692,10 +692,10 @@ split.submit=Dividi
imageToPDF.title=Immagine a PDF imageToPDF.title=Immagine a PDF
imageToPDF.header=Immagine a PDF imageToPDF.header=Immagine a PDF
imageToPDF.submit=Converti imageToPDF.submit=Converti
imageToPDF.selectLabel=Image Fit Options imageToPDF.selectLabel=Opzioni di adattamento immagine
imageToPDF.fillPage=Fill Page imageToPDF.fillPage=Riempi la pagina
imageToPDF.fitDocumentToImage=Fit Page to Image imageToPDF.fitDocumentToImage=Adatta la pagina all'immagine
imageToPDF.maintainAspectRatio=Maintain Aspect Ratios imageToPDF.maintainAspectRatio=Mantieni le proporzioni
imageToPDF.selectText.2=Ruota automaticamente PDF imageToPDF.selectText.2=Ruota automaticamente PDF
imageToPDF.selectText.3=Logica multi-file (funziona solo se ci sono più immagini) imageToPDF.selectText.3=Logica multi-file (funziona solo se ci sono più immagini)
imageToPDF.selectText.4=Unisci in un unico PDF imageToPDF.selectText.4=Unisci in un unico PDF
@@ -845,41 +845,41 @@ PDFToXML.submit=Converti
#PDFToCSV #PDFToCSV
PDFToCSV.title=Da PDF a CSV PDFToCSV.title=Da PDF a CSV
PDFToCSV.header=Da PDF a CSV PDFToCSV.header=Da PDF a CSV
PDFToCSV.prompt=Choose page to extract table PDFToCSV.prompt=Scegli la pagina per estrarre la tabella
PDFToCSV.submit=Estratto PDFToCSV.submit=Estratto
#split-by-size-or-count #split-by-size-or-count
split-by-size-or-count.header=Split PDF by Size or Count split-by-size-or-count.header=Dividi il PDF per dimensione o numero
split-by-size-or-count.type.label=Select Split Type split-by-size-or-count.type.label=Seleziona il tipo di divisione
split-by-size-or-count.type.size=By Size split-by-size-or-count.type.size=Per dimensione
split-by-size-or-count.type.pageCount=By Page Count split-by-size-or-count.type.pageCount=Per numero di pagine
split-by-size-or-count.type.docCount=By Document Count split-by-size-or-count.type.docCount=Per numero di documento
split-by-size-or-count.value.label=Enter Value split-by-size-or-count.value.label=Inserire il valore
split-by-size-or-count.value.placeholder=Enter size (e.g., 2MB or 3KB) or count (e.g., 5) split-by-size-or-count.value.placeholder=Inserisci la dimensione (ad esempio, 2 MB o 3 KB) o il numero (ad esempio, 5)
split-by-size-or-count.submit=Submit split-by-size-or-count.submit=Separa
#overlay-pdfs #overlay-pdfs
overlay-pdfs.header=Overlay PDF Files overlay-pdfs.header=Invia file PDF in sovrapposizione
overlay-pdfs.baseFile.label=Select Base PDF File overlay-pdfs.baseFile.label=Seleziona File PDF di base
overlay-pdfs.overlayFiles.label=Select Overlay PDF Files overlay-pdfs.overlayFiles.label=Seleziona sovrapposizione file PDF
overlay-pdfs.mode.label=Select Overlay Mode overlay-pdfs.mode.label=Seleziona la modalità di sovrapposizione
overlay-pdfs.mode.sequential=Sequential Overlay overlay-pdfs.mode.sequential=Sovrapposizione sequenziale
overlay-pdfs.mode.interleaved=Interleaved Overlay overlay-pdfs.mode.interleaved=Interleaved Overlay
overlay-pdfs.mode.fixedRepeat=Fixed Repeat Overlay overlay-pdfs.mode.fixedRepeat=Fixed Repeat Overlay
overlay-pdfs.counts.label=Overlay Counts (for Fixed Repeat Mode) overlay-pdfs.counts.label=Overlay Counts (for Fixed Repeat Mode)
overlay-pdfs.counts.placeholder=Enter comma-separated counts (e.g., 2,3,1) overlay-pdfs.counts.placeholder=Inserisci i numeri separati da virgole (ad esempio, 2,3,1)
overlay-pdfs.position.label=Select Overlay Position overlay-pdfs.position.label=Seleziona posizione di sovrapposizione
overlay-pdfs.position.foreground=Foreground overlay-pdfs.position.foreground=Primo piano
overlay-pdfs.position.background=Background overlay-pdfs.position.background=Sfondo
overlay-pdfs.submit=Submit overlay-pdfs.submit=Sovrapponi
#split-by-sections #split-by-sections
split-by-sections.title=Split PDF by Sections split-by-sections.title=Dividi PDF per sezioni
split-by-sections.header=Split PDF into Sections split-by-sections.header=Dividi il PDF in sezioni
split-by-sections.horizontal.label=Horizontal Divisions split-by-sections.horizontal.label=Divisioni orizzontali
split-by-sections.vertical.label=Vertical Divisions split-by-sections.vertical.label=Divisioni verticali
split-by-sections.horizontal.placeholder=Enter number of horizontal divisions split-by-sections.horizontal.placeholder=Inserire il numero di divisioni orizzontali
split-by-sections.vertical.placeholder=Enter number of vertical divisions split-by-sections.vertical.placeholder=Inserire il numero di divisioni verticlai
split-by-sections.submit=Split PDF split-by-sections.submit=Dividi PDF

@@ -87,13 +87,13 @@
<div class="pageNumber" id="5" style="top: 50%; left: 50%;">5</div> <div class="pageNumber" id="5" style="top: 50%; left: 50%;">5</div>
<div class="pageNumber" id="6" style="top: 50%; left: 90%;">6</div> <div class="pageNumber" id="6" style="top: 50%; left: 90%;">6</div>
<div class="pageNumber" id="7" style="top: 90%; left: 10%;">7</div> <div class="pageNumber" id="7" style="top: 90%; left: 10%;">7</div>
<div class="pageNumber" id="8" style="top: 90%; left: 50%;">8</div> <div class="pageNumber selectedPosition" id="8" style="top: 90%; left: 50%;">8</div>
<div class="pageNumber" id="9" style="top: 90%; left: 90%;">9</div> <div class="pageNumber" id="9" style="top: 90%; left: 90%;">9</div>
</div> </div>
</div> </div>
<input type="hidden" id="numberInput" name="position" min="1" <input type="hidden" id="numberInput" name="position" min="1"
max="9" required> max="9" value="8" required />
<div class="mb-3"> <div class="mb-3">
<label for="startingNumber" th:text="#{addPageNumbers.selectText.4}"></label> <input <label for="startingNumber" th:text="#{addPageNumbers.selectText.4}"></label> <input
type="number" class="form-control" id="startingNumber" type="number" class="form-control" id="startingNumber"