Get collection of pages without media information

I’m using Grav to build a media library. I’ve written an image cache system, and Grav is really fast despite serving hundreds of heavy media pages (hundreds of images per page, plus a lot of video, audio, etc.).

Now I’m trying to implement search, and I’m running into the same problem that made me write the image cache: media initialization crashes the site. When I use the simplesearch plugin, a new Collection is created somewhere in the plugin, and with such a huge amount of media it simply crashes. (On normal page requests all media is served from cache.)

My question is: would it be possible to create a collection of pages with only text info and no media info? I guess this would require some kind of modification of the page object too? I don’t know enough of Grav core yet to solve this. Searching that kind of collection would be a lot faster than searching the one we have now.

Any ideas on how to solve this are welcome.

The problem is that search calls content() on each page. This gets the content from the cache if available; otherwise it processes the content as required and caches the result for next time. This method also processes any media images (as they may be referenced in the page content).

It’s a bit troublesome to break this out because the content may reference that media, and therefore that media needs to be processed. If you had a version or a flag that caused the media not to be processed, the content would be cached with invalid/incorrect media references.

The better thing to do would be to create a plugin (not simplesearch) that indexes offline, perhaps from a CLI command run by a cron job. It could do the indexing in smaller batches so things don’t crash.
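To make the batching idea concrete, here is a rough sketch of an offline indexer. It is written in Python as a standalone script rather than as a Grav plugin command, and it rests on assumptions: that pages live as Markdown files with optional `---` delimited frontmatter under a pages directory, and that a JSON-lines file is an acceptable index format. The names `index_batch`, `split_frontmatter`, and the `folder`/`text` keys are made up for illustration. It only reads the raw text files, so no media is ever initialized.

```python
import json
from pathlib import Path
from typing import Iterator, Tuple


def iter_page_files(pages_dir: Path) -> Iterator[Path]:
    """Yield Markdown page files in a stable order; media files are never opened."""
    yield from sorted(pages_dir.rglob("*.md"))


def split_frontmatter(raw: str) -> Tuple[str, str]:
    """Split '---' delimited frontmatter from the body (no YAML parsing)."""
    if raw.startswith("---"):
        parts = raw.split("---", 2)
        if len(parts) == 3:
            return parts[1].strip(), parts[2].strip()
    return "", raw.strip()


def index_batch(pages_dir: Path, index_file: Path, offset: int, batch_size: int) -> int:
    """Append one batch of pages to the index and return the next offset."""
    files = list(iter_page_files(pages_dir))
    batch = files[offset:offset + batch_size]
    with index_file.open("a", encoding="utf-8") as out:
        for path in batch:
            _, body = split_frontmatter(path.read_text(encoding="utf-8"))
            record = {
                # Hypothetical key: the page folder relative to the pages directory.
                "folder": path.parent.relative_to(pages_dir).as_posix(),
                "text": body,
            }
            out.write(json.dumps(record) + "\n")
    return offset + len(batch)
```

A cron job could call `index_batch` repeatedly, storing the returned offset between runs, until the offset stops increasing. Note that the indexed text is raw Markdown, not the rendered output of content().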

Yep, I solved it by writing my own search plugin. Instead of storing the results in a collection, I just store them in my own array and pass that on to Twig, which is already quite fast.
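For anyone reading later, the general shape of that approach can be sketched as follows. This is not the actual plugin code (which would be PHP and is not shown in this thread); it is a Python illustration that assumes each page has already been reduced to a plain record with hypothetical `folder` and `text` keys. The results are plain dictionaries holding only what a template needs, with no page objects or media attached.

```python
from typing import Dict, List


def search_records(records: List[Dict[str, str]], query: str,
                   snippet_len: int = 80) -> List[Dict[str, str]]:
    """Case-insensitive substring search that returns lightweight result rows."""
    needle = query.lower()
    results: List[Dict[str, str]] = []
    if not needle:
        return results
    for record in records:
        pos = record["text"].lower().find(needle)
        if pos == -1:
            continue
        # Take a short window of text around the first match as a snippet.
        start = max(0, pos - snippet_len // 2)
        results.append({
            "folder": record["folder"],
            "snippet": record["text"][start:start + snippet_len],
        })
    return results
```

The returned list is the kind of plain array that can be handed straight to a template for rendering.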

Thanks for the feedback.

Cool 🙂