Miva’s new Crawler Product Cache Module offers the ability to cache product pages, beyond the Redis caching already included in Miva 10.01 and later.
The module pre-caches all of the store's products in Redis and serves those cached versions of the product pages to any user agent you specify. This is especially useful when aggressive bots attempt to crawl hundreds of pages at the same time; a high number of concurrent crawlers can degrade site performance or even amount to a denial-of-service (DDoS) attack.
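Conceptually, the request path looks something like the sketch below. The Redis key scheme, the substring list, and the renderer are illustrative assumptions made up for this example, not Miva internals:

```python
# Hypothetical sketch of the serve-from-cache decision. The key names,
# substrings, and renderer below are assumptions for illustration only.

CRAWLER_SUBSTRINGS = ["googlebot", "bingbot"]  # configured per store

def is_crawler(user_agent: str) -> bool:
    """True if any configured substring appears in the user agent."""
    ua = user_agent.lower()
    return any(s in ua for s in CRAWLER_SUBSTRINGS)

def serve_product_page(cache: dict, product_code: str, user_agent: str) -> str:
    """Serve pre-rendered HTML to crawlers; render dynamically otherwise."""
    if is_crawler(user_agent):
        cached = cache.get(f"crawler_cache:{product_code}")
        if cached is not None:
            return cached  # cheap cache read, no page render needed
    return f"<html>rendered {product_code}</html>"  # stand-in for a full render

# A crawler gets the cached copy; a regular browser gets a fresh render.
cache = {"crawler_cache:SHIRT-01": "<html>cached SHIRT-01</html>"}
serve_product_page(cache, "SHIRT-01", "Googlebot/2.1")  # cached HTML
serve_product_page(cache, "SHIRT-01", "Mozilla/5.0")    # dynamic render
```

The point of the design is that crawler traffic never triggers a full page render, so a burst of concurrent bot requests costs only fast Redis reads.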
You need to first have the standard Redis cache set up in Miva to use this module. For more information on Redis, click here.
If the module does not show up on the Modules page, you will need to install the .mvc file first.
To do so, first download the module file here.
Click the Add button, and upload the .mvc file. When done, click Add to save.
Following that, you need to assign the module to your store. To do that, click the Settings header and then Modules.
On the Modules page, find the Crawler Product Cache module, and click Install.
The crawler runs as part of a store’s Scheduled Tasks.
To configure the crawler, click Settings in the bottom left corner, then Store Settings and then click the Scheduled Tasks tab. Find the Populate Crawler Product Cache entry in the batch list, and click on it.
As products are updated in Miva (via the admin, import, or the JSON API), each update also forces a new cache entry for that product, keeping the site's cache up to date.
In the box that opens, you can set the schedule for how frequently the crawler runs, choose the level of log detail, and see when the next run is scheduled and when the last run took place. You can also set the Max Number of Products to Cache Per Execution.
The crawler notes which products have already been cached and picks up where it left off. For example, if you set it to cache 1,000 products per execution and your store has 2,000, it will cache products 1 to 1,000 on the first run and products 1,001 to 2,000 on the next.
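The resume behavior described above can be sketched as follows. The product list and the "already cached" set are stand-ins for illustration; the real module tracks this state itself:

```python
# Illustrative sketch of batched caching that resumes where it left off.
# `products` is the full ordered product list; `cached` is the set of
# product codes already cached (a stand-in for rendering + Redis SET).

def run_crawl_batch(products, cached, max_per_run=1000):
    """Cache up to max_per_run not-yet-cached products; return that batch."""
    todo = [p for p in products if p not in cached]
    batch = todo[:max_per_run]
    for code in batch:
        cached.add(code)  # stand-in for rendering the page and storing it
    return batch

# A 2,000-product store with a 1,000-product limit finishes in two runs.
products = [f"P{i}" for i in range(2000)]
cached = set()
first = run_crawl_batch(products, cached)   # P0 .. P999
second = run_crawl_batch(products, cached)  # P1000 .. P1999
```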
Finally, you will want to set the User Agent Substrings that determine which bots or processes are allowed to load pages out of the product cache.
To do that, go to Store Settings, make sure you are on the Store Details tab, scroll down to the Crawler Product Cache section, and enter any substrings you want to use.
The User Agent Substring setting is case-insensitive and matches anywhere within the user-agent string, so it does not have to be an exact match.
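A minimal illustration of how a case-insensitive substring match behaves (the substrings and user agent here are examples, not store defaults):

```python
# Case-insensitive substring matching, as the setting behaves.
def matches_substring(user_agent, substrings):
    ua = user_agent.lower()
    return any(s.lower() in ua for s in substrings)

# "googlebot" matches the real Googlebot UA even though the cases differ.
ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
matches_substring(ua, ["googlebot"])  # True
matches_substring(ua, ["bingbot"])    # False
```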
You can also elect to Compress the HTML of the cached pages by clicking the box next to Compress HTML under the User Agent Substring field.
If compression is enabled, caching takes longer but the storage size is reduced. In Miva's testing, for a store with 10,000 products, storage space dropped from 628 MB to 130 MB.
However, the time to cache all 10,000 products (which happens in the background via scheduled tasks) increased from 6 minutes 40 seconds to 16 minutes 40 seconds, and the page load time for serving a single product out of the cache increased from 0.6 milliseconds to 3.6 milliseconds.
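The trade-off can be demonstrated with any general-purpose compressor. This sketch uses Python's zlib on a made-up sample page, so the exact ratios are illustrative only; real savings depend on the page HTML:

```python
import zlib

# Compressing cached HTML shrinks storage at rest, at the cost of extra
# CPU on every cache write (compress) and cache hit (decompress).
# The sample page below is made up; repetitive HTML compresses well.

html = ("<html><body>"
        + "<div class='product-row'>Sample product</div>" * 500
        + "</body></html>").encode()

compressed = zlib.compress(html)
assert len(compressed) < len(html)      # smaller at rest

restored = zlib.decompress(compressed)  # extra work on each cache hit
assert restored == html                 # content is unchanged
```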
This is an important trade-off that should be weighed against the customer's needs, such as how much the customer values saving storage space.