Faceting using Elasticsearch Aggregations

Posted by Hariharan Vadivelu on

Facets are probably one of the most compelling reasons to use a search engine for an ecommerce site. They can be used to render pages such as search results and category browsing. We will look at a real-world example of how ES aggregations (the successor to facets from older versions) can be used to build category navigation for an ecommerce site.
Similar results can be achieved using ES facets, but in this blog we will look at aggregations, which seem to be the way forward and offer more flexibility than ES facets.

ES aggregations will eventually replace ES facets. The nice thing is that the two have a lot in common, so learning aggregations or migrating to them from facets should not be complicated.
You can read more about the new feature in this GitHub issue: https://github.com/elasticsearch/elasticsearch/issues/3300

Mock User Interface

The mock screen in this example is a typical category or search-term navigation page on an ecommerce site. At a very basic level, the search results page has 4 components.

Facets - Facets help users narrow down or filter a search result. The facets are built from the search context.

Sort Order - The sort order defines the order in which results are listed on the page. For instance, a user may sort by price from lowest to highest, or by product rating.

Pagination of Results - The pagination component allows a user to navigate back and forth through the search results. It also determines the number of records the ES query should return.

Search Result - The result list is restricted to the number of records that should be displayed on the landing page. This will probably be configurable based on your application's needs.



Sample Data


We will begin with a schemaless version of our products index. Schemaless support is a nice feature of ES for getting started quickly with design and testing; you can always add a schema later for a production-quality index and for better control over its behavior. For the purpose of this demo we will go with the following sample for our products index.


Sort Results

Users can sort the search or category navigation results by price from lowest to highest, or by popular products. We can combine the "sort" element with aggregations to achieve this.
In our example we sort by lowest to highest price as follows.

"sort" : [{"offerprice" : {"order" : "asc", "mode" : "avg", "ignore_unmapped":true, "missing":"_last"}},"_score"]

Sorting within Facets

The buckets within a facet can be sorted using the "order" option in the terms definition. For instance, in our example we sort the Brand facet by the count of each brand in descending order.
"order": { "_count" : "desc" }
Similarly, we sort the size facet in ascending order using "order": { "_count" : "asc" }
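
To make the ordering concrete, here is a minimal sketch in Python that builds the two terms aggregations described above as plain dictionaries. The field names "Brand" and "size" are taken from the filter example later in this post; the aggregation names "brands" and "sizes", the helper name, and the bucket size of 10 are assumptions made for illustration.

```python
import json


def terms_facet(field, direction="desc", size=10):
    """Build a terms aggregation for one facet, ordered by bucket count.

    `size` (the number of buckets returned) is an assumed default.
    """
    return {
        "terms": {
            "field": field,
            "size": size,
            "order": {"_count": direction},
        }
    }


# Aggregation names "brands" and "sizes" are hypothetical labels.
aggs = {
    "brands": terms_facet("Brand", direction="desc"),
    "sizes": terms_facet("size", direction="asc"),
}

# This dictionary would be sent as the "aggs" part of the search body.
print(json.dumps({"aggs": aggs}, indent=2))
```

Keeping the facet definition in one small helper means every facet on the page can share the same ordering logic while choosing its own direction.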

Pagination Component

Pagination of results can be achieved using the "from" and "size" fields. These can be passed either in the query body or as URL params. In our sample we pass them in the JSON body as follows.
"from" : 0, "size" : 5 or "from" : 5, "size" : 5 for the next page

They can also be passed as URL params as follows.
curl -XGET 'http://localhost:9200/products/_search?pretty=true&from=5&size=5'
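
In application code, the page number the user clicks usually has to be translated into these two values. A minimal sketch in Python, assuming 1-based page numbers and a page size of 5 as in the sample above (the helper name is hypothetical):

```python
def page_params(page, page_size=5):
    """Translate a 1-based page number into "from"/"size" values."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # "from" is the offset of the first record on the requested page.
    return {"from": (page - 1) * page_size, "size": page_size}


first_page = page_params(1)
second_page = page_params(2)
print(first_page, second_page)
```

The returned dictionary can be merged into the JSON body of the search request, or its two values appended to the URL as shown in the curl call.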


Facet Selection

Search results are also influenced by the facet selection. For instance, suppose a user wants to see all products in the men's category that are from brand "diesel" and of size "small". This can be achieved using an "and" filter as follows.

....
"and": [
    {
        "term": {
            "Brand": "diesel"
        }
    },
    {
        "term": {
            "size": "small"
        }
    }
]
}
...
...
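
Since the set of selected facets changes with every click, the filter is normally generated rather than written by hand. Here is a minimal Python sketch that turns the user's selections into the "and" filter shown above. The helper name and the shape of the `selected` dictionary are assumptions; the output follows the filter syntax used in this post, which newer Elasticsearch releases replace with a bool filter.

```python
import json


def facet_filter(selections):
    """Turn {field: value} facet selections into an "and" filter of term clauses."""
    return {
        "and": [{"term": {field: value}} for field, value in selections.items()]
    }


# Hypothetical selection state, e.g. parsed from the request's query string.
selected = {"Brand": "diesel", "size": "small"}
print(json.dumps(facet_filter(selected), indent=2))
```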

What is missing

I could be completely wrong, but I have not been able to achieve the following within the ES query. Of course there are alternative ways of doing this in application code, but I would love to see it added to ES aggregations in the future.

#1 As you can see in our example, we define the price ranges in the ES query. In a typical ecommerce model, however, the price range varies dynamically with the browse category. So instead of defining fixed price ranges like 1 to 5, 6 to 10, etc., there should be a way to get an even spread by defining only the number of buckets.
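
One possible application-side workaround is sketched below in Python. It assumes the minimum and maximum price for the current category are already known (for example from a stats aggregation run in an earlier request) and splits that interval into equal-width ranges for a range aggregation. Note that this gives buckets of equal width, not buckets holding an equal number of products. The helper name and the sample values 0 and 100 are made up for illustration; "offerprice" is the price field from the sort example.

```python
def even_price_ranges(min_price, max_price, buckets):
    """Split [min_price, max_price] into equal-width range entries."""
    if buckets < 1 or max_price <= min_price:
        raise ValueError("need buckets >= 1 and max_price > min_price")
    width = (max_price - min_price) / buckets
    edges = [min_price + i * width for i in range(buckets)] + [max_price]
    ranges = [{"from": lo, "to": hi} for lo, hi in zip(edges, edges[1:])]
    # "to" is exclusive in a range aggregation, so the last bucket is left
    # open-ended to keep the most expensive product inside it.
    del ranges[-1]["to"]
    return ranges


# Sample min/max values are assumptions, not data from the products index.
ranges = even_price_ranges(0, 100, 4)
price_agg = {"range": {"field": "offerprice", "ranges": ranges}}
print(price_agg)
```

The cost of this approach is an extra round trip to fetch the minimum and maximum before the real query can be built.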



6 comments:

  1. Thanks for this great overview. Is it right that you need two queries / cURL calls -- one for the available facets, one for the actual data?

    1. I will reply to my own question:
      It seems that one query will be enough when using the post filter:
      http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-post-filter.html
