Interesting Stuff - Week 41, 2020

Posted by nielsb on Sunday, October 11, 2020

Throughout the week, I read a lot of blog-posts, articles, and so forth, that have to do with things that interest me:

  • data science
  • data in general
  • distributed computing
  • SQL Server
  • transactions (both db as well as non db)
  • and other “stuff”

This blog-post is the “roundup” of the things that have been most interesting to me, for the week just ending.

SQL Server 2019 Big Data Cluster

  • Stop and Start your AKS Big Data Cluster and Save $$$. In this post, my good friend Mohammad looks at new functionality in Azure Kubernetes Service (AKS), whereby you can start and stop Kubernetes clusters at will. Mohammad looks at it from the perspective of SQL Server 2019 Big Data Clusters running in AKS.

Streaming

  • Kafka record tracing. Do you know how long it takes for messages to flow through your data pipeline? Do you know where the bottlenecks are? I ask because, in a distributed system, the answers are not always easy to find. The post linked to here looks at solutions to these questions, specifically distributed tracing for Kafka.
  • Introducing Cluster Linking in Confluent Platform 6.0. As the title says, this post is about linking Kafka clusters to each other. From the post: “Replicating topics between Kafka clusters has been a long-standing problem that’s seen a number of solutions, including MirrorMaker and Confluent Replicator. Although the utility of these projects has come a long way, they’re not without their respective issues.” Those issues are what the new cluster linking functionality is supposed to fix. It looks very promising; the one question I have right now is whether this functionality is Confluent Platform Enterprise only.
  • Managing Kafka Connectors at scale using Kafka Connect Manager. A while back we (Derivco) looked at using CDC and Debezium to get data out of the database and into Kafka. One of the problems we faced was how to manage the various Kafka Connectors and ensure they were functioning correctly. The post I link to here describes what would have been the perfect solution for us: a framework for managing Kafka Connectors. It would be interesting to know if it is open-sourced.

~ Finally

That’s all for this week. I hope you enjoyed what I put together. If you have ideas for what to cover, please comment on this post or ping me.

