Merge pull request #2 from fuksja/wersja_robocza #3

Open
wants to merge 3 commits into
base: wersja_robocza
25 changes: 20 additions & 5 deletions README.md
@@ -20,35 +20,50 @@ Stakeholder requirements: there is a need for a tool for fast and automated conv

General description of functionality: the user goes to the upload page and uploads a pdf file, the conversion takes place, and the user receives a doc file as output.

EDIT: added new functionality: conversion to .pptx format. The user uploads a file, chooses whether to convert it to .doc or .pptx, and receives the converted file.
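
The format choice described above can be sketched as a small helper that validates the upload and derives the output file name. This is only an illustration, not the project's actual code: `output_name` and `OUTPUT_EXTENSIONS` are hypothetical names, and mapping the "doc" choice to a `.docx` extension is an assumption based on the output format of the pdf2docx library.

```python
from pathlib import Path

# Hypothetical mapping from the user's choice to an output extension.
# Assumption: the "doc" option yields a .docx file, as pdf2docx writes .docx.
OUTPUT_EXTENSIONS = {"doc": ".docx", "pptx": ".pptx"}


def output_name(upload_name: str, target: str) -> str:
    """Return the converted file's name for an uploaded pdf and a chosen format."""
    if Path(upload_name).suffix.lower() != ".pdf":
        raise ValueError("only .pdf uploads are accepted")
    if target not in OUTPUT_EXTENSIONS:
        raise ValueError(f"unsupported target format: {target}")
    # Keep the original base name and swap the extension.
    return Path(upload_name).stem + OUTPUT_EXTENSIONS[target]
```

For example, `output_name("report.pdf", "pptx")` gives `"report.pptx"`, while a non-pdf upload or an unknown format raises `ValueError`.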

### Assumptions:
- the project uses only open source software and will be released under an open source license
- no conversion of encrypted files for now
- all pages are converted by default
- custom maximum file size limit
- no special security features
- simple conversion from pdf to .pptx: pages are placed in slides as images, with no OCR of the text

### Limitations:

- English language version only for now
- no security features, user profiles, login option, or session control; simple file input and output only for now
- limitations derived from conversion method and library [pdf2doc](https://pypi.org/project/pdf2docx/):
- for conversion to doc format:
limitations derived from the conversion method and the [pdf2docx](https://pypi.org/project/pdf2docx/) library:
- text-based files only
- left-to-right languages only
- no rotation possible
- no 1:1 layout conversion achievable
- for conversion to .pptx format:
limitations derived from the conversion method and the [pdf2pptx](https://pypi.org/project/pdf2pptx/) library:
- each page of the original file is rendered as a PNG image and placed in a PowerPoint slide
- slides are not editable and there is no OCR, but the result can be presented as slides

## Getting started

- chosen language/method: Python 3 and Flask
- chosen method of file conversion: pdf2docx 0.5.3 library: https://pypi.org/project/pdf2docx/
- chosen method of file conversion:
- pdf2docx 0.5.3 library: https://pypi.org/project/pdf2docx/
- pdf2pptx 1.0.5 library: https://pypi.org/project/pdf2pptx/
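
A minimal sketch of how the two libraries listed above could be called, assuming both are installed. The function names, the `CONVERTERS` table and the default resolution are hypothetical and not taken from this repository. The `Converter` usage follows the pdf2docx documentation; the positional arguments passed to `convert_pdf2pptx` (resolution, start page, page count) are an assumption about pdf2pptx 1.0.5 and should be checked against the installed version.

```python
from typing import Callable, Dict


def convert_to_docx(pdf_path: str, out_path: str) -> None:
    """Convert all pages of a pdf to .docx with pdf2docx."""
    from pdf2docx import Converter  # third-party, imported only when needed

    cv = Converter(pdf_path)
    try:
        cv.convert(out_path)  # all pages are converted by default
    finally:
        cv.close()


def convert_to_pptx(pdf_path: str, out_path: str, resolution: int = 300) -> None:
    """Render each pdf page as an image on its own slide with pdf2pptx."""
    from pdf2pptx import convert_pdf2pptx  # third-party, imported only when needed

    # Assumed argument order: pdf, output, resolution, start page, page count.
    # A page count of None is assumed to mean "all pages".
    convert_pdf2pptx(pdf_path, out_path, resolution, 0, None, quiet=True)


# Hypothetical dispatch table keyed by the format the user picked.
CONVERTERS: Dict[str, Callable[[str, str], None]] = {
    "doc": convert_to_docx,
    "pptx": convert_to_pptx,
}


def convert(pdf_path: str, out_path: str, target: str,
            converters: Dict[str, Callable[[str, str], None]] = CONVERTERS) -> None:
    """Run the converter registered for the chosen target format."""
    try:
        converter = converters[target]
    except KeyError:
        raise ValueError(f"unsupported target format: {target}") from None
    converter(pdf_path, out_path)
```

Keeping the library imports inside the functions means only the library for the chosen format has to be loaded, and the dispatch table makes it easy to add another output format later.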

## Time frame
Project completed June 2022. May be continued and improved upon in the future.
The first part of the project was completed in June 2022. The second part, which added the .pptx feature, was completed in July. The project will be updated in the future.

## Documentation

This GitHub repository serves as the project's documentation.

## License and copyright notice
This project uses GPLv3 license. Part of this project is derived from other software, created by other programmers, community or made in different way also under the GNU General Public License v3.0:
[Source of pdf2docx library used for file conversion](https://github.com/dothinking/pdf2docx)
This project uses the GPLv3 and MIT licenses. Part of this project is derived from other software, created by other programmers or the community, or otherwise made available, also under the GNU General Public License v3.0:

[Source of pdf2docx library used for file conversion to .doc](https://github.com/dothinking/pdf2docx)
[License](https://github.com/dothinking/pdf2docx/blob/master/LICENSE)

[Source of pdf2pptx library used for file conversion to .pptx](https://github.com/kevinmcguinness/pdf2pptx)
[License](https://github.com/kevinmcguinness/pdf2pptx/blob/master/LICENSE)