PDF slow load #2784

Open
alpha-mipl opened this issue Feb 5, 2025 · 6 comments
Assignees
Labels
nuisance a bug that only shows in the logs; or: a bug with an obvious work-around every user finds quickly

Comments

@alpha-mipl

alpha-mipl commented Feb 5, 2025

I need your help!
When I load a PDF with version 20.5.0-alpha.1, it loads in under 1.02 seconds, but with the latest version (a release after version 20), it takes up to 3 seconds. Even the basic PDF.js viewer loads this PDF in under 1 second.

Demo PDF file
Here you can check the PDF

demo.pdf

@alpha-mipl
Author

I found that it works fine up to version 21.0.0-alpha.6, but after that, PDFs load slowly.

@alpha-mipl
Author

I found the issue: in viewer-4.7.728.mjs, there are two setDocument methods:

One at line 35,004
Another at line 30,450
Both setDocument methods are being called when loading a PDF.

If I add a return at the top of the second setDocument method, the PDF loads much faster and works perfectly.
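
A minimal sketch of the kind of experiment described above, assuming a simplified stand-in class. `ExampleComponent`, `documentAttached`, and `SKIP_FOR_EXPERIMENT` are hypothetical names used for illustration; this is not the actual code from viewer-4.7.728.mjs. It only shows that an early return skips all of the method's normal work:

```javascript
// Hypothetical stand-in for whichever component owns the patched setDocument().
class ExampleComponent {
  constructor() {
    this.documentAttached = false;
  }

  setDocument(pdfDocument) {
    // Experimental early exit: skips all of the method's normal work.
    // Any feature that depends on this method stops working as a result.
    const SKIP_FOR_EXPERIMENT = true;
    if (SKIP_FOR_EXPERIMENT) {
      return;
    }
    this.documentAttached = pdfDocument !== null;
  }
}

const component = new ExampleComponent();
component.setDocument({ numPages: 3 });
console.log(component.documentAttached); // false
```

Because the method does nothing after the early return, whatever it normally sets up is silently left uninitialized, so this is useful as a diagnostic rather than as a fix.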

@stephanrauh stephanrauh self-assigned this Feb 5, 2025
@stephanrauh stephanrauh added the nuisance a bug that only shows in the logs; or: a bug with an obvious work-around every user finds quickly label Feb 5, 2025
stephanrauh added a commit to stephanrauh/pdf.js that referenced this issue Feb 5, 2025
…attribute is set before using it (fixed a bug that caused some documents to crash in single-page mode)
stephanrauh added a commit to stephanrauh/pdf.js that referenced this issue Feb 5, 2025
…attribute is set before using it (fixed a bug that caused some documents to crash in single-page mode)
@stephanrauh
Owner

stephanrauh commented Feb 5, 2025

Hm. As far as I can tell, you've disabled the setDocument method of the PdfScriptingManager:
https://github.com/stephanrauh/pdf.js/blob/15ec795de8ad7172ef6a815a24203b2336ca00f6/web/pdf_scripting_manager.js#L83

Can you confirm that?

I've added the return statement, but I don't see the difference you're describing. Loading the document still takes 3-4 seconds. Subjectively, I suspect it's faster with the return statement, but it's a far cry from the single second you're expecting. In other words: at the moment, I can't reproduce your solution, so I'm a bit puzzled.

BTW: I guess it's obvious that deactivating the scripting manager isn't the solution. It may be a good performance optimization for those who don't need JavaScript in PDF files, so adding an option to disable scripting makes a lot of sense to me. However, you describe it as a regression, so I'd like to find the root cause.
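
A rough sketch of what such an opt-out could look like, assuming a simplified stand-in for the scripting manager. The class `ScriptingManagerSketch`, the `ready` field, and the way the `enableScripting` flag is passed in are assumptions for illustration only, not the library's actual implementation:

```javascript
// Hypothetical, simplified stand-in for a scripting manager.
class ScriptingManagerSketch {
  constructor({ enableScripting = true } = {}) {
    // Assumed option: when false, PDF JavaScript support is never set up.
    this.enableScripting = enableScripting;
    this.ready = false;
  }

  setDocument(pdfDocument) {
    // Skip all scripting setup when the feature is switched off
    // or when the document is being cleared.
    if (!this.enableScripting || !pdfDocument) {
      this.ready = false;
      return;
    }
    // Placeholder for the real (potentially expensive) scripting setup.
    this.ready = true;
  }
}

const withScripting = new ScriptingManagerSketch();
withScripting.setDocument({ numPages: 3 });
console.log(withScripting.ready); // true

const withoutScripting = new ScriptingManagerSketch({ enableScripting: false });
withoutScripting.setDocument({ numPages: 3 });
console.log(withoutScripting.ready); // false
```

Unlike a hard-coded return, a flag like this keeps scripting on by default and makes the trade-off an explicit choice.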

@alpha-mipl
Author

Image

I noticed that when I added a return statement, the PDF loading became much faster. However, when I remove this return statement, I face slow PDF loading issues again.
The issue seems to be that during PDF loading, page 1 is being called twice, which appears to be blocking the main thread.
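
One way to check an observation like this is to wrap the suspected method and record every call. The sketch below assumes a hypothetical `fakeViewer` object with a `renderPage` method; both names are illustrative and not part of pdf.js. Only the synchronous part of each call is timed:

```javascript
// Wraps target[methodName] so that every call is recorded with its arguments
// and the time spent in the synchronous part of the call.
function recordCalls(target, methodName) {
  const calls = [];
  const original = target[methodName];
  target[methodName] = function (...args) {
    const start = performance.now();
    try {
      return original.apply(this, args);
    } finally {
      calls.push({ args, ms: performance.now() - start });
    }
  };
  return calls;
}

// Hypothetical viewer; renderPage is an illustrative name, not a pdf.js API.
const fakeViewer = {
  renderPage(pageNumber) {
    return `rendered page ${pageNumber}`;
  },
};

const renderCalls = recordCalls(fakeViewer, "renderPage");
fakeViewer.renderPage(1);
fakeViewer.renderPage(1);
fakeViewer.renderPage(2);

const page1Calls = renderCalls.filter((call) => call.args[0] === 1);
console.log(`page 1 was requested ${page1Calls.length} times`);
```

Adding `console.trace()` inside the wrapper would additionally show which caller triggers each duplicate request.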

@alpha-mipl
Author

However, after adding the return statement, the PDFThumbnailViewer stopped working. I think this is because the setDocument function is being used for the thumbnails as well.

@alpha-mipl
Author

I tested this with different PDFs and noticed a significant performance difference of about 10 seconds. For example, I have another PDF that's 350 MB in size: it usually takes 20.59 seconds to load, but with the return statement added, it takes only 10 seconds.

I think the slow performance is happening because setDocument is being called simultaneously for both the thumbnail and the main PDF.
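
One way to test this hypothesis without breaking the thumbnails would be to postpone the thumbnail setup until the thumbnails are actually shown. The sketch below is an illustration under assumptions, not the viewer's real code: `LazyThumbnailViewer`, `show`, and the `inner` object are hypothetical names.

```javascript
// Hypothetical wrapper that delays the thumbnail viewer's setDocument()
// until the thumbnails become visible.
class LazyThumbnailViewer {
  constructor(inner) {
    this.inner = inner; // object with a setDocument(pdfDocument) method
    this.pending = null; // document waiting to be attached
    this.visible = false;
  }

  setDocument(pdfDocument) {
    if (this.visible) {
      this.inner.setDocument(pdfDocument);
    } else {
      // Remember the document, but do no work during the initial load.
      this.pending = pdfDocument;
    }
  }

  show() {
    this.visible = true;
    if (this.pending) {
      this.inner.setDocument(this.pending);
      this.pending = null;
    }
  }
}

// Stand-in for the real thumbnail viewer: it only records what it receives.
const attachedDocuments = [];
const lazyThumbnails = new LazyThumbnailViewer({
  setDocument: (pdfDocument) => attachedDocuments.push(pdfDocument),
});

lazyThumbnails.setDocument({ numPages: 3 });
console.log(attachedDocuments.length); // 0

lazyThumbnails.show();
console.log(attachedDocuments.length); // 1
```

If the main document loads noticeably faster with the thumbnail work deferred, that would support the idea that the two setDocument calls compete during the initial load.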
