Silhouette: Birth of a new Apache Spark ML algorithm
Published:
How do you implement a new algorithm in Apache Spark? As a case study, the talk presented a new scalable, distributed clustering evaluation algorithm that I contributed to Spark.
Published:
Apache Spark introduced code generation in 2.0 to make queries up to 100x faster. Of course, code generation also has a dark side: what problems does it bring? How can they be addressed? Is it production ready? Let’s take a journey through the evolution of Spark code generation, from its birth to its current status, with a glance at future improvements.