Merge pull request #155 from trytomakeyouprivate/patch-3

Update LocalRunGuide.md
This commit is contained in:
Anthony Stirling 2023-05-14 19:55:26 +01:00 committed by GitHub
commit 802ae3643c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -52,11 +52,11 @@ sudo dnf install -y git automake autoconf libtool leptonica-devel pkg-config zli
### Step 2: Clone and Build jbig2enc (Only required for certain OCR functionality) ### Step 2: Clone and Build jbig2enc (Only required for certain OCR functionality)
```bash ```bash
git clone https:github.com/agl/jbig2enc git clone https://github.com/agl/jbig2enc.git &&\
cd jbig2enc cd jbig2enc &&\
./autogen.sh ./autogen.sh &&\
./configure ./configure &&\
make make &&\
sudo make install sudo make install
``` ```
@ -97,15 +97,16 @@ pip3 install opencv-python-headless
For Fedora: For Fedora:
```bash ```bash
sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf tesseract-osd sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
pip3 install uno opencv-python-headless unoconv pngquant pip3 install uno opencv-python-headless unoconv pngquant
``` ```
### Step 4: Clone and Build Stirling-PDF ### Step 4: Clone and Build Stirling-PDF
```bash ```bash
git clone https://github.com/Frooodle/Stirling-PDF.git git clone https://github.com/Frooodle/Stirling-PDF.git &&\
cd Stirling-PDF cd Stirling-PDF &&\
chmod +x ./gradlew &&\
./gradlew build ./gradlew build
``` ```
@ -117,18 +118,49 @@ You can move this file to a desired location, for example, `/opt/Stirling-PDF/`.
You must also move the Script folder within the Stirling-PDF repo that you have downloaded to this directory. You must also move the Script folder within the Stirling-PDF repo that you have downloaded to this directory.
This folder is required for the python scripts using OpenCV This folder is required for the python scripts using OpenCV
```bash
sudo mkdir /opt/Stirling-PDF &&\
sudo mv /build/libs/S-PDF-*.jar /opt/Stirling-PDF/ &&\
sudo mv scripts /opt/Stirling-PDF/ &&\
echo "Scripts installed."
```
### Step 6: Other files ### Step 6: Other files
#### OCR #### OCR
If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running none english scanning. If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running non-english scanning.
##### Installing Language Packs ##### Installing Language Packs
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need. 1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need. You can also use your repositories provided langpacks.
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` 2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info. Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.
**IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED. **IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.
Debian based systems, install languages with this command:
```bash
sudo apt update &&\
# All languages
# sudo apt install -y 'tesseract-ocr-*'
# Find languages:
apt search tesseract-ocr-
# View installed languages:
dpkg-query -W tesseract-ocr- | sed 's/tesseract-ocr-//g'
```
Fedora:
```bash
# All languages
# sudo dnf install -y tesseract-langpack-*
# Find languages:
dnf search -C tesseract-langpack-
# View installed languages:
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```
### Step 7: Run Stirling-PDF ### Step 7: Run Stirling-PDF