Too slow loading #353

Open
foxjaw opened this issue Jun 10, 2024 · 7 comments

@foxjaw

foxjaw commented Jun 10, 2024

Takes like a minute to even open a document.

[Screenshot: Screenshot_20240610-192939_Trebuchet]

@andiwand
Member

Hi @foxjaw, I fear the loading screen is not telling us much. Do you have a file to share?

I suspect that it is a PDF. Those tend to take quite some time to open if they are large. In your experience, was this better in the past?

@foxjaw
Author

foxjaw commented Jun 11, 2024

Yes. All PDFs larger than 1 MB load slowly. It's like a nightmare. Meanwhile, when I use MuPDF separately to open PDFs, it's very fast. Why does Open Document take so long to load big documents?

@andiwand
Member

I think this is an issue with pdf2htmlEX. Potentially we have a regression there related to build flags or JPEG vs. PNG? cc @TomTasche @ViliusSutkus89

@foxjaw
Author

foxjaw commented Jun 11, 2024

Oh okay. So Open Document does not have a native PDF viewing library and instead relies on HTML conversion, which is why it's inefficient?

@andiwand
Member

I think this does not necessarily mean that it is inefficient. pdf2htmlEX will render the PDF to images and put those into HTML. This is very much like what the browser does, but with one more step of indirection. I remember that there was a problem in the past which caused the rendering to be quite slow. A simple configuration change resolved that.

Apart from that, we could also render the pages in parallel and display what is already rendered instead of waiting for the whole document to finish. But I fear we lack the personpower to achieve this in the near future.
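
As a rough illustration of "render in parallel and display what is rendered already", here is a minimal Python sketch. It is not code from this project: `render_page` is a hypothetical stand-in for whatever converts a single page, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, Tuple


def render_page(page_number: int) -> str:
    """Hypothetical stand-in for converting one PDF page to HTML."""
    return f"<div class='page'>page {page_number}</div>"


def render_incrementally(
    page_count: int,
    render: Callable[[int], str] = render_page,
    workers: int = 4,  # arbitrary assumption, not a tuned value
) -> Iterator[Tuple[int, str]]:
    """Render pages in parallel and yield them in page order as each becomes ready."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit every page up front so the workers stay busy.
        futures = [pool.submit(render, n) for n in range(1, page_count + 1)]
        for number, future in enumerate(futures, start=1):
            # Blocks only until this page is done; later pages keep rendering.
            yield number, future.result()
```

A viewer could consume this with `for number, html in render_incrementally(page_count): ...` and show page 1 as soon as it arrives, while the remaining pages are still being rendered in the background.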

@foxjaw
Author

foxjaw commented Jun 11, 2024

I don't think browsers do this. If the pages were converted into images, it would be impossible to select the text or Ctrl+F the document, which I can do in browsers.
I'm not very technical, but at least this is how I see it.

@ViliusSutkus89
Contributor

Hello y'all

So I don't think it's a regression. It was always slow, and there are a few reasons why.

The first conversion is extra slow because we have to extract asset files from pdf2htmlEX, Poppler and FontForge. Ideally we should use them without extracting, but this requires some work (opendocument-app/pdf2htmlEX-Android#9, opendocument-app/pdf2htmlEX-Android#10). Currently these libraries expect assets to be found as regular files on disk.
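
To illustrate why only the first conversion pays the extraction cost, here is a minimal Python sketch of an "extract once, then reuse" check. This is an assumption about the general pattern, not how pdf2htmlEX-Android actually tracks extracted assets; the marker file name and the `version` parameter are hypothetical.

```python
import shutil
from pathlib import Path


def ensure_assets_extracted(bundled: Path, target: Path, version: str) -> bool:
    """Copy bundled assets to `target` unless this version is already there.

    Returns True if a copy happened, False if it was skipped.
    """
    marker = target / ".extracted-version"  # hypothetical marker file name
    if marker.is_file() and marker.read_text() == version:
        return False  # assets are already regular files on disk
    if target.exists():
        shutil.rmtree(target)  # drop stale assets from an older version
    shutil.copytree(bundled, target)  # the slow step, paid on first use
    marker.write_text(version)
    return True
```

Only the first call for a given version does the copy; later calls return immediately, which matches the "first conversion is extra slow" behaviour described above.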

The second reason is the thing @andiwand mentioned: we convert the whole document and only then render it. Other viewers do conversion and rendering at the same time. Page-by-page conversion might be the lowest-hanging fruit here. But the thing is, once opendocument-app/pdf2htmlEX-Android#93 is implemented, we can interface with pdf2htmlEX from odr.core through C++ instead of from odr.droid through Java. This means that whatever improvement we code up now would probably need to be reimplemented.
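
For reference, a minimal Python sketch of what page-by-page conversion could look like at the command-line level, using pdf2htmlEX's `--first-page` and `--last-page` options. It only builds the argument lists and does not run anything. The output naming scheme is hypothetical, and the Android wrapper is driven through its Java API rather than the CLI, so treat this as an illustration of the idea only.

```python
from typing import Iterator, List


def page_commands(pdf_path: str, page_count: int) -> Iterator[List[str]]:
    """Yield one pdf2htmlEX command line per page, in page order."""
    for page in range(1, page_count + 1):
        yield [
            "pdf2htmlEX",
            "--first-page", str(page),
            "--last-page", str(page),
            pdf_path,
            f"page-{page}.html",  # hypothetical output naming scheme
        ]
```

Each command could be run in turn and its output shown as soon as it finishes. A real implementation would probably want to avoid reloading the document for every page, which this sketch does not address.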

Also, when pdf2htmlEX converts to HTML, the result is not just images. Normally, PDF text ends up as selectable HTML text elements.
