Vadim Semenov

Yeah, I have started. So initially it was hard to bring Spark support to Datadog. Overall, we used to use Pig as our processing framework, and bringing Spark support was pretty challenging. We have an internal platform that we use to launch jobs, kind of like Qubole or Databricks, and I had to write lots of code there: figure out optimal settings for all the Spark clusters, write a bunch of tutorials, figure out how we were going to compile code, how code was going to be delivered, how it was going to signal our workflow management framework that the work is done, and lots of other challenging aspects. And then, once it was built, we started moving some historical processing towards it.

The challenge was how we used Spark back then. It was Spark 1.6, Spark 2.0 was still in development and we were pretty hesitant to use it, and Spark 1.6, yeah, it's great, it's reliable and stable, but it turns out it's not: there are lots of places where it crashed. Over the past four years we've grown some expertise around Spark, but it still throws some interesting problems on our plate. And while we were working on different systems, we had to dig deep into how the JVM works, how memory is allocated, how alignment works, how much space our data structures take, how garbage collection works, how to put everything off-heap, and so on. We actually found some bugs in Scala itself and in Spark. We didn't help to fix them, we just noticed that, yeah, actually arrays can be bigger than this number, and so on. So one of the things I learned is that, at the scale that we use Spark, not a lot of companies actually use it, and lots of the problems we've run into, nobody has seen before.
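The off-heap technique mentioned above can be sketched roughly like this (a minimal, hypothetical example, not Datadog's actual code): a direct `ByteBuffer` keeps the payload outside the GC-managed heap, so the garbage collector only tracks the small wrapper object, not the data itself.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of storing numeric data off-heap so the garbage
// collector never has to scan it. Not the speaker's actual code.
public class OffHeapSketch {
    // Write count squared values into a direct (off-heap) buffer,
    // then read them back and sum them.
    static long sumSquaresOffHeap(int count) {
        // Direct buffers are allocated outside the JVM heap; the GC only
        // sees the small ByteBuffer wrapper, not the count * 8 bytes of data.
        ByteBuffer buf = ByteBuffer.allocateDirect(count * Long.BYTES);
        for (int i = 0; i < count; i++) {
            buf.putLong((long) i * i);
        }
        buf.flip();
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.getLong();
        }
        return sum;
    }

    public static void main(String[] args) {
        // On-heap, each boxed Long would add roughly 16-24 bytes of object
        // header and padding per 8-byte value; off-heap it is 8 bytes flat.
        System.out.println("sum=" + sumSquaresOffHeap(1000));
    }
}
```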
And I had to figure out how we were going to do this and that. On top of that, we run most of our jobs on spot instances, which means that your cloud provider can take them away at any time. This creates such a violent environment. When you have clusters of 5,000 cores, your instances are constantly dying, and Spark runs on them, you realize that probably not everyone is doing what we're doing. And whenever you have a problem, you're like, okay, I'm on my own, I need to get all the stack traces and all the logs and figure out what's up. That's just a small portion of the problems you run into.
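For context, the kind of Spark settings that matter in a spot-heavy environment look something like the following (illustrative values, not the speaker's actual configuration); the external shuffle service in particular lets shuffle data outlive an executor whose instance was reclaimed:

```properties
# Illustrative spark-defaults.conf entries for spot-heavy clusters (hypothetical values).
# Tolerate more task failures, since executors die whenever instances are reclaimed.
spark.task.maxFailures            8
# Run the external shuffle service so shuffle files survive executor loss.
spark.shuffle.service.enabled     true
# Let Spark replace executors after a wave of spot terminations.
spark.dynamicAllocation.enabled   true
```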