elasticsearch - Elastic Search document modeling for history -
i want store products in elastic search each product has fields (description, quantity, price, name). every day price , quantity change.
how can store in elastic search able search product past prices?
should have document current value fields , document have product document parent, , there daily task add date , changed value in array ?
unfortunately, there's no built in way deal versioning in elasticsearch. built-in versioning isn't designed retrieval of previous versions. need control versioning @ application layer.
what we've elected store old copies of documents this:
{ "unversioned_prop1": "prop1", "unversioned_prop2": "prop2", ... "versions": [ { "version": "version_x", "version_metadata": { ... } "document": { "versioned_prop3": "prop3", "versioned_prop4": "prop4" ... } }, { "version": "version_y", "document": { ... versioned props ... } }, ... ] "current": { ... current versioned props ... } } unversioned properties
having unversioned properties outside of array useful because may want update properties versions of document. additionally, ensures search weights behave predictably.
it has downside of requiring seam of information in application layer.
current version
breaking out current version separate property allows use search filtering return recent version of document.
version metadata
this includes versioning information might want search on, such dates.
search
you can search versioned properties can subproperties. search ends looking this:
... { "match": {"versions.document.versioned_prop": "query string" } this search across versions of document, , return combined document if there's match.
updates
when need create new version, can use partial update insert new document , update current document.
alternative
the major downside approach can't filter down of search results based on things inside of versions - want filter them on application side.
if need documents behave independently, need index them independently. achieve can include "collection id" on versions. collection id unique document, , shared across versions.
the collection id approach ended having many issues, , moved approach outlined above, , have had higher level of success.
as side note, personally wouldn't recommend use elasticsearch primary storage of important records. if can live occasional data loss.
Comments
Post a Comment