Better machine learning logistics with the rendezvous architecture

March 07, 2018

Most of the effort in machine learning goes into everything except the learning. Handling these overhead tasks well makes a large difference in results, if only because it increases the amount of time you can spend thinking about the real problems.

Ted Dunning offers an overview of the rendezvous architecture, developed to be the “continuous integration” system for machine learning models. He describes its motivation and design and gives a user’s-eye view of how it feels to roll new services into production. The rendezvous architecture allows always-hot, zero-latency rollout and rollback of new models and supports extensive metrics and diagnostics so models can be compared as they process production data. It can even hot-swap the framework itself with no downtime. Best of all, the rendezvous architecture is simple and understandable.
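
To make the rollout and rollback idea concrete, here is a toy, in-process sketch in Python. It is not taken from the talk: the `Rendezvous` class, its `preference` list, and the `v1`/`v2` models are hypothetical names chosen for illustration, and a real deployment would presumably run the models as separate services rather than as local functions. The sketch assumes that every deployed model scores every request and that a selector decides which answer to return.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Rendezvous:
    """Toy selector (hypothetical): all models score each request, one answer is returned."""
    models: Dict[str, Callable[[float], float]]
    preference: List[str]  # model names, most preferred first
    log: List[dict] = field(default_factory=list)

    def handle(self, request: float) -> float:
        # Every model runs on the same input, so candidates stay "hot"
        # and their outputs can be compared on production data.
        scores = {name: model(request) for name, model in self.models.items()}
        self.log.append({"request": request, "scores": scores})
        # Return the answer from the most preferred model that produced a score.
        for name in self.preference:
            if name in scores:
                return scores[name]
        raise RuntimeError("no preferred model produced a score")


r = Rendezvous(
    models={"v1": lambda x: 2 * x, "v2": lambda x: 2 * x + 1},
    preference=["v1"],
)
before = r.handle(10)        # answered by v1
r.preference = ["v2", "v1"]  # rollout: only the preference list changes
after = r.handle(10)         # answered by v2
r.preference = ["v1"]        # rollback is the same operation in reverse
rolled_back = r.handle(10)   # answered by v1 again
```

In this sketch the new model is already running before it is promoted, so switching to it, or away from it, is only a change to the preference list, and the log keeps every model's score for each request so the models can be compared side by side.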
