Talks and presentations

Deep dive in Spark code generation

March 12, 2019

Talk, AgileLab, Torino, Torino, Italy

Apache Spark introduced code generation in 2.0 to make queries up to 100x faster. Of course, code generation also has a dark side: what problems does it bring? How can they be addressed? Is it production ready? Let’s take a journey through the evolution of Spark code generation, from its birth to its current status, with a glance at future improvements.

Silhouette: Birth of a new Apache Spark ML algorithm

October 17, 2017

Talk, Hortonworks Budapest, Budapest, Hungary

How do you implement a new algorithm with Apache Spark? As a case study, the talk presented a new, scalable, distributed clustering evaluation algorithm that I contributed to Spark.