PDF text results from a site search

I’m looking for a way to make the text of OCR’d PDFs show up in a Grav site search. Both SimpleSearch and TNT Search search the HTML content of the site well. But when PDFs are embedded (Pdf-JS), neither plugin searches the text inside the PDF, no matter which embed technique is used.

I appreciate they don’t have this functionality.

So I’m seeking an approach that will allow one of these plugins, or another search tool, to cover both the site content and the PDF text in a single search.

Any thoughts or guidance are appreciated.

You would have to edit your preferred search plugin for this functionality to exist. This is documented fairly well here with regard to pdf-js.

I would fork one of the search plugins and merge this functionality, and see if you could get it pulled in via a pull request.
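
To illustrate the general idea, here is a rough sketch in Python. It is not Grav or plugin code, and every name in it (`Document`, `build_index`, `search`) is made up for the example. It assumes the PDF text has already been extracted to plain strings, and it shows only the core point: page text and PDF text go into one index, so one query covers both.

```python
from dataclasses import dataclass


@dataclass
class Document:
    url: str    # where the search result should link to
    kind: str   # "page" or "pdf"
    text: str   # plain text to search


def build_index(pages, pdf_texts):
    """Combine page content and extracted PDF text into one list.

    Both arguments are hypothetical dicts mapping a URL to plain text.
    """
    docs = [Document(url, "page", text) for url, text in pages.items()]
    docs += [Document(url, "pdf", text) for url, text in pdf_texts.items()]
    return docs


def search(docs, query):
    """Return documents containing every query word (case-insensitive)."""
    words = query.lower().split()
    if not words:
        return []
    return [d for d in docs if all(w in d.text.lower() for w in words)]


if __name__ == "__main__":
    # Example data only; a real site would supply its own content.
    pages = {"/about": "About our archive of scanned reports"}
    pdf_texts = {"/files/report.pdf": "Annual report scanned with OCR"}
    for hit in search(build_index(pages, pdf_texts), "scanned"):
        print(hit.kind, hit.url)
```

A real plugin would do the same thing at indexing time: add the extracted PDF text as extra entries alongside the pages, so the existing search code needs little or no change.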

Thanks for the tip, I’ll take a look.