Below is a collection of conference talks and other presentations that I’ve given, typically related to distributed systems or other software engineering topics:
- (2022) Improving Cassandra Client Load Balancing (slides): A talk Ammar Khaku and I from Netflix gave at ApacheCon 2022 on how we cut database latency by 30% or more using a novel weighting technique in coordinator selection. The talk is about Cassandra but the algorithm is generically useful for clients of stateful systems.
- (2021) How Netflix Provisions Optimal Cloud Deployments of Cassandra (slides): A talk I gave at ApacheCon 2021 on how Netflix uses our service-capacity-modeling system to mathematically model and plan capacity for petabyte-scale database systems. The talk is about Cassandra but the approach (and library) supports any stateful system.
- (2020) Towards Practical Self-Healing Distributed Databases: A talk I gave at ApacheCon 2020 on the self-healing database architecture and how to apply it to Cassandra. If you are trying to maintain a large-scale database infrastructure, this talk might have some useful tips.
- (2019) How Netflix Debugs and Fixes Apache Cassandra When it Breaks (slides): A talk at ApacheCon 2019 I gave about how to debug and scientifically analyze performance bottlenecks in Apache Cassandra. This is a very good “help I’m going oncall for Cassandra” introduction to basic tools and techniques.
- (2019) How Netflix manages petabyte scale Apache Cassandra: A talk I gave at ApacheCon 2019 with Vinay Chella about how we design declarative control planes to orchestrate thousands of independently scaling Cassandra clusters. Essentially this talk is “how to make your own self-driving database”.
- (2018) Iterating on Stateful Services in the Cloud: A talk at re:Invent 2018 about how Netflix manages stateful services (such as datastores or databases) in the AWS cloud. Contains a lot of concrete advice for managing state in AWS.
- (2018) Repair Service and Cassandra: Part of Netflix’s OSS Meetup about Polyglot Persistence. Vinay Chella and I talk about how we built a repair scheduler for OSS Cassandra.
- (2016) Building a Powerful Data Tier from Open Source Databases: A talk I gave at OSCON London 2016 about how to build a polyglot datastore data tier using open source datastores.
- (2016) The Human Side of Service-Oriented Architectures: A talk that an old mentor, John Billings, and I gave at the first Microservice summit about how you have to scale the human side of microservices (as opposed to the technical side).
- (2016) Automating Datastore Fleets with Puppet: A more detailed talk on how we automated datastores at Yelp, specifically with Puppet and other common industry tools. Given at PuppetConf 2016.
- (2015) Writing a Polyglot Datastore Story: A talk I did with Josh Snyder at Velocity 2015 about how Yelp allowed developers to use polyglot datastores. Has architectural advice as well as practical advice.
- (2015) The Evolution of Elastic(Search) at Yelp: A talk I did with Chris Tidder at ElasticON 2015 about how Yelp built self-service Elasticsearch with sufficient abstractions to allow datastores to be maintained and scaled to large engineering organizations.