elasticsearch - Elasticsearch document modeling for history


I want to store products in Elasticsearch. Each product has the fields description, quantity, price and name. Every day the price and quantity change.

How can I store them in Elasticsearch so that I am able to search a product's past prices?

Should I have one document with the current values of the fields, and another document that has the product document as its parent, and then a daily task that adds the date and the changed value to an array?

Unfortunately, there's no built-in way to deal with versioning in Elasticsearch. The built-in versioning isn't designed for retrieval of previous versions. You need to control versioning at the application layer.

What we've elected to do is store old copies of documents like this:

{
  "unversioned_prop1": "prop1",
  "unversioned_prop2": "prop2",
  ...
  "versions": [
    {
      "version": "version_x",
      "version_metadata": { ... },
      "document": {
        "versioned_prop3": "prop3",
        "versioned_prop4": "prop4",
        ...
      }
    },
    { "version": "version_y", "document": { ... versioned props ... } },
    ...
  ],
  "current": { ... current versioned props ... }
}
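Applied to the product from the question, the layout could look roughly like the sketch below. It is only an illustration: the helper name `make_product_document`, the use of the date as the version label, and the `date` key inside `version_metadata` are assumptions, not something Elasticsearch prescribes.

```python
import json


def make_product_document(name, description, history):
    """Build one versioned product document (hypothetical helper).

    history is a list of (date, price, quantity) tuples, oldest first.
    """
    if not history:
        raise ValueError("history must contain at least one entry")
    versions = [
        {
            "version": date,  # assumption: the date doubles as the version label
            "version_metadata": {"date": date},
            "document": {"price": price, "quantity": quantity},
        }
        for date, price, quantity in history
    ]
    return {
        # unversioned properties live at the top level
        "name": name,
        "description": description,
        "versions": versions,
        # copy, so later edits to "current" do not alter the stored version
        "current": dict(versions[-1]["document"]),
    }


doc = make_product_document(
    "Widget",
    "A sample widget",
    [("2024-01-01", 9.99, 120), ("2024-01-02", 10.49, 115)],
)
print(json.dumps(doc, indent=2))
```

Here name and description are treated as unversioned, while price and quantity change daily and therefore sit inside each version.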

Unversioned properties

Having the unversioned properties outside of the array is useful because you may want to update properties for all versions of the document at once. Additionally, it ensures that search weights behave predictably.

It has the downside of requiring a seam of information in your application layer.

Current version

Breaking out the current version into a separate property allows you to use search filtering to return only the most recent version of the document.
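As a rough sketch of that kind of filter, the function below builds a request body that only looks at the current version. The field name `current.price` assumes that price is one of the versioned properties, as in the question; the function name is made up for illustration.

```python
def current_price_query(min_price, max_price):
    """Request body matching products whose *current* price is in a range.

    Assumes price is stored under "current", as in the layout above.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"current.price": {"gte": min_price, "lte": max_price}}}
                ]
            }
        }
    }


body = current_price_query(5, 15)
print(body)
```

The body would be sent to the index's search endpoint with whatever client you already use. Because the condition is on `current`, older prices in the `versions` array do not affect the match.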

Version metadata

This includes any versioning information you might want to search on, such as dates.

Search

You can search versioned properties just as you can any subproperties. The search ends up looking like this:

...
{
  "match": { "versions.document.versioned_prop": "query string" }
}

This will search across all versions of the document, and return the combined document if there's a match.
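For the past-prices case from the question, a sketch might look like the following. It assumes price is stored under `versions.document.price`; both function names are hypothetical. Since the hit is the whole combined document, the second function picks out the matching versions on the application side.

```python
def past_price_query(max_price):
    """Request body matching products that had a price <= max_price in any version."""
    return {"query": {"range": {"versions.document.price": {"lte": max_price}}}}


def matching_versions(document, max_price):
    """Return the labels of the versions in a returned document that satisfy the price bound.

    Elasticsearch hands back the combined document, so this step runs in the application.
    """
    return [
        v["version"]
        for v in document.get("versions", [])
        if v["document"].get("price") is not None
        and v["document"]["price"] <= max_price
    ]


# A document shaped like a search hit's _source (sample data, not real results).
hit_source = {
    "name": "Widget",
    "versions": [
        {"version": "2024-01-01", "document": {"price": 9.99, "quantity": 120}},
        {"version": "2024-01-02", "document": {"price": 10.49, "quantity": 115}},
    ],
    "current": {"price": 10.49, "quantity": 115},
}
print(past_price_query(10.0))
print(matching_versions(hit_source, 10.0))
```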

Updates

When you need to create a new version, you can use a partial update to insert the new document and update the current document.
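One way to do this in a single request is a scripted update. The sketch below only builds the request body for the Update API; it is an assumption about how you might do it, not the only option. The function name and the version label are hypothetical, and the `source` key follows recent Elasticsearch versions (older releases named it differently).

```python
def new_version_update(version, document, metadata=None):
    """Build a scripted-update body that appends a version and replaces "current".

    Assumes the target document already has a "versions" array.
    """
    entry = {
        "version": version,
        "version_metadata": metadata or {},
        "document": document,
    }
    return {
        "script": {
            "lang": "painless",
            # append the new version, then make it the current one
            "source": (
                "ctx._source.versions.add(params.entry); "
                "ctx._source.current = params.entry.document"
            ),
            "params": {"entry": entry},
        }
    }


body = new_version_update(
    "2024-01-03",
    {"price": 10.99, "quantity": 110},
    {"date": "2024-01-03"},
)
print(body)
```

Passing the new values through `params` rather than splicing them into the script text keeps the script itself constant between calls.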

Alternative

The major downside of this approach is that you can't filter down the search results based on things inside the versions; you'll have to filter them on the application side.

If you need the documents to behave independently, you'll need to index them independently. To achieve this you can include a "collection id" on all versions. The collection id is unique to the document, and shared across all of its versions.
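A minimal sketch of that alternative: every version becomes its own document, tagged with the shared collection id. The field name `collection_id` and both function names are assumptions for illustration.

```python
def split_into_version_documents(collection_id, unversioned, versions):
    """Turn one versioned record into independent documents, one per version."""
    docs = []
    for v in versions:
        doc = dict(unversioned)          # repeat the unversioned properties
        doc.update(v["document"])        # add this version's properties
        doc["collection_id"] = collection_id
        doc["version"] = v["version"]
        docs.append(doc)
    return docs


def collection_query(collection_id):
    """Request body fetching every version that belongs to one collection.

    Assumes collection_id is mapped as an exact-match (keyword) field.
    """
    return {"query": {"term": {"collection_id": collection_id}}}


docs = split_into_version_documents(
    "product-42",
    {"name": "Widget", "description": "A sample widget"},
    [
        {"version": "2024-01-01", "document": {"price": 9.99, "quantity": 120}},
        {"version": "2024-01-02", "document": {"price": 10.49, "quantity": 115}},
    ],
)
for d in docs:
    print(d)
```

Note that the unversioned properties are duplicated into every document here, so changing one of them means updating every version.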

The collection id approach ended up having many issues, so we moved to the approach outlined above and have had a higher level of success with it.


As a side note, I personally wouldn't recommend using Elasticsearch as the primary storage for important records, unless you can live with occasional data loss.

