update README (#79)
GreatV authored Sep 13, 2024
1 parent f9a6190 commit 9c69d7e
Showing 2 changed files with 15 additions and 21 deletions.
18 changes: 8 additions & 10 deletions README.md
@@ -3,9 +3,6 @@ English | [简体中文](README_ch.md)
# PPOCRLabelv2

[![PyPI - Version](https://img.shields.io/pypi/v/PPOCRLabel)](https://pypi.org/project/PPOCRLabel/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/PPOCRLabel)](https://pypi.org/project/PPOCRLabel/)
[![PyPI - Downloads](https://img.shields.io/pypi/dd/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)
[![PyPI - Downloads](https://img.shields.io/pypi/dw/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)
[![Downloads](https://static.pepy.tech/badge/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)

@@ -18,9 +15,10 @@ PPOCRLabelv2 is a semi-automatic graphic annotation tool suitable for OCR field,
| <img src="./data/gif/multi-point.gif" width="80%"/> | <img src="./data/gif/kie.gif" width="100%"/> |

### Recent Update
- 2022.09: Added `Re-recognition` and `Auto Save Unsaved changes` features. For usage details, please refer to the "11. Additional Feature Description" in the "2.1 Operational Steps" section below.
- 2022.05: Add table annotations, follow `2.2 Table Annotations` for more information (by [whjdark](https://github.com/peterh0323); [Evezerest](https://github.com/Evezerest))
- 2022.02:(by [PeterH0323](https://github.com/peterh0323)

- 2024.09: Added `Re-recognition` and `Auto Save Unsaved changes` features. For usage details, please refer to the "11. Additional Feature Description" in the "2.1 Operational Steps" section below.
- 2022.05: Add table annotations, follow `2.2 Table Annotations` for more information (by [whjdark](https://github.com/peterh0323); [Evezerest](https://github.com/Evezerest))
- 2022.02: (by [PeterH0323](https://github.com/peterh0323))
- Add KIE Mode by using `--kie`, for [detection + identification + keyword extraction] labeling.
- Improve user experience: prompt for the number of files and labels, optimize interaction, and fix bugs such as only use CPU when inference
- New functions: Support using `C` or `X` to rotate box.
@@ -40,8 +38,6 @@ PPOCRLabelv2 is a semi-automatic graphic annotation tool suitable for OCR field,
- Click to modify the recognition result.(If you can't change the result, please switch to the system default input method, or switch back to the original input method again)
- 2020.12.18: Support re-recognition of a single label box (by [ninetailskim](https://github.com/ninetailskim) ), perfect shortcut keys.



## 1. Installation and Run

### 1.1 Install PaddlePaddle
@@ -86,6 +82,7 @@ PPOCRLabel --kie True # [KIE mode] for [detection + recognition + keyword extrac
```

#### MacOS

```bash
pip3 install PPOCRLabel
pip3 install opencv-contrib-python-headless==4.2.0.32
@@ -96,6 +93,7 @@ PPOCRLabel --kie True # [KIE mode] for [detection + recognition + keyword extrac
```

#### 1.2.2 Run PPOCRLabel by Python Script

If you modify the PPOCRLabel file (for example, specifying a new built-in model), it will be more convenient to see the results by running the Python script. If you still want to start with the whl package, you need to uninstall the whl package in the current environment and then recompile it according to the next section.

```bash
@@ -114,6 +112,7 @@ pip3 install -e .
```

#### 1.2.4 Pyinstaller build

```bash
cd ./PPOCRLabel
# install pyinstaller
@@ -162,6 +161,7 @@ PPOCRLabel.exe --lang ch
- `File` -> `Auto Save Unsaved changes`: By default, you need to press the `Check` button to complete the marking confirmation for the current box, which can be cumbersome. After checking, when switching to the next image (by pressing the shortcut key `D`), a prompt box asking to confirm whether to save unconfirmed markings will no longer appear. The current markings will be automatically saved and the next image will be switched, making it convenient for quick marking.

### 2.2 Table Annotation

The table annotation is aimed at extracting the structure of the table in a picture and converting it to Excel format,
so the annotation needs to be done simultaneously with external software to edit Excel.
In PPOCRLabel, complete the text information labeling (text and position), complete the table structure information
@@ -296,8 +296,6 @@ PPOCRLabel supports three ways to export Label.txt
pip install opencv-contrib-python-headless==4.2.0.32
```



### 4. Related

1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
18 changes: 7 additions & 11 deletions README_ch.md
@@ -3,9 +3,6 @@
# PPOCRLabelv2

[![PyPI - Version](https://img.shields.io/pypi/v/PPOCRLabel)](https://pypi.org/project/PPOCRLabel/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/PPOCRLabel)](https://pypi.org/project/PPOCRLabel/)
[![PyPI - Downloads](https://img.shields.io/pypi/dd/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)
[![PyPI - Downloads](https://img.shields.io/pypi/dw/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)
[![Downloads](https://static.pepy.tech/badge/PPOCRLabel)](https://github.com/PFCCLab/PPOCRLabel)

@@ -18,8 +15,9 @@ PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具,内置P
| <img src="./data/gif/multi-point.gif" width="80%"/> | <img src="./data/gif/kie.gif" width="100%"/> |

#### 近期更新
- 2022.09: 新增`自动重新识别``自动保存未提交变更`功能,使用方法详见下方`2.1 操作步骤``11. 补充功能说明`
- 2022.05:**新增表格标注**,使用方法见下方`2.2 表格标注`(by [whjdark](https://github.com/peterh0323); [Evezerest](https://github.com/Evezerest))

- 2024.09: 新增`自动重新识别``自动保存未提交变更`功能,使用方法详见下方`2.1 操作步骤``11. 补充功能说明`
- 2022.05:**新增表格标注**,使用方法见下方`2.2 表格标注`(by [whjdark](https://github.com/peterh0323); [Evezerest](https://github.com/Evezerest)
- 2022.02:**新增关键信息标注**、优化标注体验(by [PeterH0323](https://github.com/peterh0323)
- 新增:使用 `--kie` 进入 KIE 功能,用于打【检测+识别+关键字提取】的标签
- 提升用户体验:新增文件与标记数目提示、优化交互、修复gpu使用等问题。
@@ -30,7 +28,7 @@ PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具,内置P
- 2021.8.11:
- 新增功能:打开数据所在文件夹、右键图像旋转90度(注意:旋转前的图片上不能存在标记框,by [Wei-JL](https://github.com/Wei-JL)
- 新增快捷键说明(帮助-快捷键)、修复批处理下的方向快捷键移动功能(by [d2623587501](https://github.com/d2623587501)
- 2021.2.5:新增批处理与撤销功能(by [Evezerest](https://github.com/Evezerest))
- 2021.2.5:新增批处理与撤销功能(by [Evezerest](https://github.com/Evezerest)
- **批处理功能**:按住Ctrl键选择标记框后可批量移动、复制、删除、重新识别。
- **撤销功能**:在绘制四点标注框过程中或对框进行编辑操作后,按下Ctrl+Z可撤销上一部操作。
- 修复图像旋转和尺寸问题、优化编辑标记框过程(by [ninetailskim](https://github.com/ninetailskim)[edencfc](https://github.com/edencfc)
@@ -47,6 +45,7 @@ PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具,内置P
## 1. 安装与运行

### 1.1 安装PaddlePaddle

```bash
pip3 install --upgrade pip

@@ -130,6 +129,7 @@ PPOCRLabel.exe --lang ch
## 2. 使用

### 2.1 操作步骤

> 如果您只需要标注文字信息和位置,推荐按照以下步骤展开:
1. 安装与运行:使用上述命令安装与运行程序。
@@ -147,6 +147,7 @@ PPOCRLabel.exe --lang ch
- `文件` -> `自动保存未提交变更` : 默认是按`确认`按钮完成当前框的标记确认,有点繁琐,勾选后,切换下一张图(按快捷键`D`)的时候,不再弹出提示框确认是否保存未确认的标记,自动保存当前标记并切换下一张图,方便快速标记

### 2.2 表格标注([视频演示](https://www.bilibili.com/video/BV1wR4y1v7JE/?share_source=copy_web&vd_source=cf1f9d24648d49636e3d109c9f9a377d&t=1998)

表格标注针对表格的结构化提取,将图片中的表格转换为Excel格式,因此标注时需要配合外部软件打开Excel同时完成。在PPOCRLabel软件中完成表格中的文字信息标注(文字与位置)、在Excel文件中完成表格结构信息标注,推荐的步骤为:
1. 表格识别:打开表格图片后,点击软件右上角 `表格识别` 按钮,软件调用PP-Structure中的表格识别模型,自动为表格打标签,同时弹出Excel

@@ -179,8 +180,6 @@ PPOCRLabel.exe --lang ch
| rec_gt.txt | 识别标签。可直接用于PPOCR识别模型训练。需用户手动点击菜单栏“文件” - "导出识别结果"后产生。 |
| crop_img | 识别数据。按照检测框切割后的图片。与rec_gt.txt同时产生。 |



## 3. 说明

### 3.1 快捷键
@@ -206,7 +205,6 @@ PPOCRLabel.exe --lang ch
| Ctrl-- | 放大 |
| ↑→↓← | 移动标记框 |


### 3.2 内置模型

- 默认模型:PPOCRLabel默认使用PaddleOCR中的中英文超轻量OCR模型,支持中英文与数字识别,多种语言检测。
@@ -277,8 +275,6 @@ python gen_ocr_train_val_test.py --trainValTestRatio 6:2:2 --datasetRootPath ../
pip install opencv-contrib-python-headless==4.2.0.32
```



### 4. 参考资料

1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
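
The hunk headers in this diff can be cross-checked against the per-file summaries (8 additions & 10 deletions for README.md, 7 additions & 11 deletions for README_ch.md). The following is a minimal Python sketch of that check, assuming the standard unified-diff header form `@@ -old_start,old_len +new_start,new_len @@`. The helper name `net_line_delta` and the regular expression are illustrative and not part of the PPOCRLabel repository; the header strings are copied from the hunks of this commit.

```python
import re

# Unified-diff hunk header: @@ -old_start,old_len +new_start,new_len @@
# (illustrative pattern; a length defaults to 1 when it is omitted)
HUNK_RE = re.compile(r"@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def net_line_delta(headers):
    """Return the sum of (new_len - old_len) over the given hunk headers."""
    delta = 0
    for header in headers:
        match = HUNK_RE.search(header)
        if match is None:
            raise ValueError(f"not a hunk header: {header!r}")
        old_len = int(match.group(2) or 1)
        new_len = int(match.group(4) or 1)
        delta += new_len - old_len
    return delta


# Hunk headers of README.md as they appear in this commit.
readme_hunks = [
    "@@ -3,9 +3,6 @@", "@@ -18,9 +15,10 @@", "@@ -40,8 +38,6 @@",
    "@@ -86,6 +82,7 @@", "@@ -96,6 +93,7 @@", "@@ -114,6 +112,7 @@",
    "@@ -162,6 +161,7 @@", "@@ -296,8 +296,6 @@",
]

# Hunk headers of README_ch.md as they appear in this commit.
readme_ch_hunks = [
    "@@ -3,9 +3,6 @@", "@@ -18,8 +15,9 @@", "@@ -30,7 +28,7 @@",
    "@@ -47,6 +45,7 @@", "@@ -130,6 +129,7 @@", "@@ -147,6 +147,7 @@",
    "@@ -179,8 +180,6 @@", "@@ -206,7 +205,6 @@", "@@ -277,8 +275,6 @@",
]

# The net change per file should equal additions minus deletions.
print(net_line_delta(readme_hunks), 8 - 10)
print(net_line_delta(readme_ch_hunks), 7 - 11)
```

The check only compares net line counts, so it confirms that the hunk headers are consistent with the summary but cannot tell which individual lines were added or removed.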
