Data Eng Weekly #333
Hey all! It's been a while, and hopefully everyone had a nice new year. I took the last ~6 weeks off, so there's quite a bit to catch up on. This week's issue has sixteen of the best articles from that time, covering topics like Apache Kafka producers, distributed storage engines for Prometheus, Presto+Pinot, and a Jepsen analysis of etcd. Also, Yelp writes about their Kafka architecture, and Teads writes about optimizing Spark applications using User Defined Aggregate Functions. Lots to read up on, whether you're looking for tips to apply to your own system, new tools to try out, or a closer look at how systems work under the hood.