Coveo for Sitecore 4.0 vs Google Site Search for "site search" functionality
Recently a client requested some enhancements to their "site search" feature, which is currently implemented using Solr and the Sitecore Solr "Content Search" libraries - the requirements were:
1) Idioms/synonyms: The ability to manually configure idioms/synonyms (so that words which have the same meaning return the same results, e.g. a search for “trip” might return the same results as a search for “journey”.)
2) Spelling correction: The ability to return results as if the correct term was searched for (when an incorrectly spelled search term is entered.)
3) Autocomplete: Suggested search queries appear as you begin typing in the search box.
4) The chosen solution should work unhindered for consumers in China
Having established that Coveo for Sitecore 4.0 and Google Site Search seemed to be the two main "tried and trusted" alternatives for a Sitecore installation, I set about investigating the pros and cons of each in relation to these specific requirements. The requirements listed above seem fairly basic for modern search functionality; however, they are not trivial to implement using any of the "out-of-the-box" Apache Lucene-based technologies. The UI components for the "site search" feature were already developed and in place, so it was expected that there would be no changes to the look and feel of the current user interface.
Google Site Search overview:
The data for this search method comes from pages crawled by Google (i.e. “live” data). You create your own “Custom Search Engine” (CSE), which is like a mini-Google, restricted just to a particular site (and which is tweakable to your requirements). This CSE can then be accessed via a REST API from the web application. In order to meet requirement 4), it's essential that we only interact with the CSE using server-side requests, as client-side requests to Google resources from within China will be blocked.
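As a rough illustration of the server-side call described above, here is a minimal Python sketch (the real application would be .NET; Python is used purely for brevity). It assumes the Custom Search JSON API endpoint and its `key`, `cx`, `q` and `start` query parameters; the function names and the credential values are placeholders of my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Google Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_url(api_key: str, engine_id: str, query: str, start: int = 1) -> str:
    """Build the request URL for one page of CSE results.

    api_key and engine_id are placeholders for your own credentials.
    """
    params = {
        "key": api_key,    # API key for the Google account that owns the CSE
        "cx": engine_id,   # identifier of the Custom Search Engine
        "q": query,        # the visitor's search term
        "start": start,    # 1-based index of the first result to return
    }
    return f"{CSE_ENDPOINT}?{urlencode(params)}"


def fetch_results(url: str, timeout: float = 5.0) -> dict:
    """Call the API from the web server (never from the browser) and decode the JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Because the browser only ever talks to your own site, nothing in the visitor's page load depends on a Google-hosted resource.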
Coveo for Sitecore 4.0 overview:
The data for this search method is the application’s Sitecore items (just like Solr/Lucene). Dedicated server(s) are used to host Coveo, and various tools are available to tweak the Coveo setup to your requirements. The index data on the Coveo server is then accessed via a REST API from the web application. Coveo comes with a lot of OOTB search UI components (and makes a big deal about how much you can achieve without writing any code); however, this wasn't relevant for the above requirements, as we already have the UI and just need to change the search behaviour.
Coveo for Sitecore 4.0 cloud overview:
A variant of the Coveo approach is to use their cloud offering instead of an on-premises setup. This means that rather than set up dedicated Coveo infrastructure, Coveo manage this as part of a PaaS approach. The feature set is largely the same (there are some advanced features only available on the cloud solution, but these don’t seem relevant for the above requirements - I've written a little more about these features further down the page), so the main benefit here is the delegation of the Coveo infrastructure management/maintenance.
Suitability of each product for selected requirements:
Requirement 1 - Idioms/synonyms
The ability to manually configure idioms/synonyms (so that words which have the same meaning return the same results, e.g. a search for “trip” might return the same results as a search for “journey”.)
Google Site Search: Possible, simple to configure by a non-technical user
Coveo for Sitecore 4.0: Possible, simple to configure by a non-technical user
Requirement 2 - Spelling correction
The ability to return results as if the correct term was searched for (when an incorrectly spelled search term is entered.)
Google Site Search: Seems to work out-of-the-box
Coveo for Sitecore 4.0: Available, but requires a small amount of development to use via the REST API (your application would need to make specific REST calls to get corrections)
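For the Google option, if you also want to show a "did you mean" hint in the existing UI, the JSON response can carry a suggested correction. The sketch below assumes a decoded Custom Search JSON API response with a `spelling` object holding a `correctedQuery` field; the helper name is hypothetical, and Python is used purely for illustration.

```python
from typing import Optional


def corrected_query(response: dict) -> Optional[str]:
    """Return the suggested spelling correction, or None if there is none.

    Assumes `response` is the decoded JSON body of a Custom Search JSON API
    call, where a correction (if any) appears under spelling.correctedQuery.
    """
    spelling = response.get("spelling") or {}
    return spelling.get("correctedQuery")
```

A caller could display the returned value as a hint, or re-run the search with it when the original query returns no items.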
Requirement 3 - Autocomplete
Suggested search queries appear as you begin typing in the search box.
Google Site Search: Doesn’t currently seem to be supported (without using their JS/front-end components, which won’t run in China). Potentially you might opt for a 'hybrid' approach, using the provided front-end integration for non-China autocomplete and omitting the feature for Chinese users; however, I've not investigated the feasibility of this approach. An alternative would be a simple "roll your own" autocomplete, with a dictionary maintained e.g. as Sitecore content.
Coveo for Sitecore 4.0: Requires either the On Premises Usage Analytics Module (https://onlinehelp.coveo.com/en/ces/7.0/administrator/on-premises_usage_analytics_module.htm) or, if using the Cloud-based version of Coveo, Coveo Cloud Usage Analytics (https://onlinehelp.coveo.com/en/cloud/coveo_cloud_usage_analytics.htm). I’ve not tested either of the Coveo search suggest options as a lot of extra setup is required.
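The "roll your own" autocomplete idea mentioned for the Google option could be as simple as a prefix lookup over a maintained list of phrases. Here is a minimal Python sketch, assuming the phrases have already been loaded from wherever they are maintained (e.g. Sitecore content); the class and method names are hypothetical.

```python
from bisect import bisect_left


class SuggestionDictionary:
    """Prefix lookup over a fixed list of suggested search phrases."""

    def __init__(self, phrases):
        # Normalise, de-duplicate and sort once so each lookup can binary-search.
        self._phrases = sorted({p.strip().lower() for p in phrases if p.strip()})

    def suggest(self, prefix, limit=5):
        """Return up to `limit` phrases that start with `prefix`."""
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        # All phrases sharing the prefix sit together from this index onwards.
        start = bisect_left(self._phrases, prefix)
        matches = []
        for phrase in self._phrases[start:]:
            if len(matches) >= limit or not phrase.startswith(prefix):
                break
            matches.append(phrase)
        return matches
```

The web application would expose this behind its own endpoint, so the search box only ever calls your site and the feature behaves the same for visitors in China.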
Requirement 4 - The chosen solution should work unhindered for consumers in China
This has been addressed whilst discussing the previous three requirements. The main restriction it imposes applies to the Google product: only server-side calls can be made to the search provider's REST APIs, because client-side requests to Google resources from within China will be blocked.
Below are a few other pros/cons (in addition to the consideration of the client’s specific requirements above):
Google additional pros/cons:
· Potentially easy to demonstrate the abilities of this approach to the client without having to write any code - you can create a demo CSE which can be played around with by your client
· Various configuration options to tweak the behaviour of the CSE by logging in to the admin console with the relevant Google account credentials
· No additional infrastructure to add/maintain
· Seems a lot cheaper than Coveo
· Only indexes “online” data (how would you implement a QA/UAT setup, for instance? I'm hoping that you don't allow client QA/UAT sites to be indexed by Google :) )
Coveo additional pros/cons:
· Includes some Sitecore renderings OOTB (although this wasn't relevant for my client's needs)
· Enables personalised searches, and personalisation *based on* searches - using xDB (again, this wasn't relevant for my client's needs)
· Gives the potential for the client to use more xDB features in the future, as it indexes Sitecore items, and integrates with the xDB stuff.
· Seems a lot more expensive than Google
· Somewhat complicated setup, needs new infrastructure. Failover/HA/DR considerations
· Documentation a bit patchy/vague
Coveo Cloud additional pros/cons:
· No need for dedicated servers to host Coveo Enterprise Search (CES) and the Coveo Search API (https://onlinehelp.coveo.com/en/ces/7.0/administrator/coveo_platform_hardware_and_software_requirements.htm)
· No need to manage any Coveo infrastructure
· At time of writing, license costs are 50% more than the on-premises equivalent
· At time of writing, you are limited to 100K queries per month. Additional queries need to be paid for (the price is not specified online...)
It was easy to recommend Google Custom Search to the client in this specific case, as it seemed to provide the most benefit for the least pain and cost. Coveo seems more complicated to set up, requires more ongoing maintenance (unless you use the cloud version), and is more expensive than the Google Custom Search solution. The only deficit of the Google solution (with reference to the client’s specific requirements) is the lack of support for autocomplete via their REST APIs; however, there are potential alternatives. Additionally, the Google Custom Search approach gives the opportunity to demo some of the requirements to your client fairly easily (before committing to writing any code which integrates this approach into your application, i.e. it makes the process of matching the solution to the requirements a lot more ‘agile’). The Google product seems like it would be easier to "switch out" for another provider at a later date, whereas the Coveo approach integrates more fully with Sitecore, so there's a bit more potential for "vendor lock-in".
That said, it seems like the two products are pitched at slightly different needs, so Coveo might make a lot more sense if:
- You want to use the out-of-the-box UI components, which means the prospect of being able to set up a search feature without having to write any code. This approach might work well for UI-driven/faceted search, as Coveo looks strong in these areas.
- You want to leverage any of the more advanced Coveo features, such as Coveo Reveal, which has the ability to "learn" how to improve search results based on what queries users are making - see https://www.youtube.com/watch?v=H_TB3upFiO0 for a demo of this (starts around the 40 minute mark!)