Query-time Nonparametric Regression with Temporally Bounded Models

Discussion and demonstration of an architecture that knits together several pieces of Solr’s infrastructure, with a closer look at Solr’s new Time Routed Aliases (TRAs). The system is a machine learning system based on a nonparametric regression methodology taken from habitat ecology. The model is partially precalculated and stored in Solr so that it can be assembled on the fly to recommend documents a user may be interested in based on recent data. What counts as “recent” is defined by a Solr filter query. Solr TRAs are used to help the system scale and to sunset old data. Technologies discussed in this talk include predictive modeling, Solr streaming expressions, indexing with JesterJ, and Solr Time Routed Aliases (TRAs). The latter half of this presentation goes into some depth regarding TRAs. TRAs are useful for avoiding performance degradation due to index growth in systems based on continuously acquired timestamped data (similar to the system presented). Both presenters helped build Solr’s TRA capability.

Speakers

Patrick Heck

Owner, Needham Software LLC

Patrick (Gus) Heck is the Owner of Needham Software LLC. He has been solving search problems since 2010, has been an independent Solr consultant since 2012, and has been a frequent contributor to the Apache Solr project since 2013.

David Smiley

D W Smiley LLC

I'm a Lucene/Solr committer/PMC member. I do search consulting/development work. My particular interests in search are geospatial/spatial.

Wednesday October 17, 2018 4:15pm - 4:55pm EDT
Drummond West

Solr Developer, Advanced

Activate 2018

