
#142 Bayesian Trees & Deep Learning for Optimization & Big Data, with Gabriel Stechschulte

In this episode of Learning Bayesian Statistics, host Alex Andorra speaks with software engineer Gabriel Stechschulte about his work on Bayesian additive regression trees (BART) and its applications in ...


Interactive Transcript

spk_0 My guest today is Gabriel Stechschulte, a software engineer passionate about probabilistic programming and optimization. Gabriel recently re-implemented BART, Bayesian additive regression trees, in Rust, making the algorithm faster, more flexible, and more suitable for real-world applications. So if you are a PyMC-BART user, I definitely recommend checking out his implementation; that is in the show notes. In our conversation, we dive deep into what makes BART special: its ability to quantify uncertainty, handle different likelihoods, and serve as a strong baseline in settings like optimization and time series. We also explain how BART compares with Gaussian processes and other tree-based methods, and talk about practical challenges like handling missing data, integrating BART into PyMC, and embedding machine learning models into decision-making frameworks. Beyond the code, Gabriel reflects on open-source collaboration, the importance of community support, and where probabilistic programming is headed next. This is Learning Bayesian Statistics, episode 142, recorded September 18, 2025.
spk_0 Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country, for any info about the show. LearnBayesStats.com is the place to be: show notes, becoming a corporate sponsor, looking at Bayesian merch, supporting the show on Patreon, everything is in there. That's LearnBayesStats.com. If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra. See you around, folks, and best Bayesian wishes to you all. And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can help bring them to life. Check us out at pymc-labs.com.
spk_0 Gabriel Stechschulte, welcome to Learning Bayesian Statistics. And I think I butchered your name, didn't I?

spk_0 No, no, it was quite good, yes.

spk_0 Okay, okay. I rehearsed it, and it still didn't work. But yeah, thanks a lot for being on the show. I've been meaning to have you here for a while, because you do a lot of very interesting things. We know each other from the PyMC world, but this is the first time we actually meet, almost in person. So that's great, I'm very happy you're here. Thanks a lot for taking the time. As usual, let's start with your background and your origin story. Can you tell us what you're doing nowadays and how you ended up doing that?
spk_0 Yeah, yeah, for sure. And thanks for having me on. And maybe next time we can be in person, do some hiking in the mountains. So my background: currently I'm in an Internet of Things lab, an IoT lab, at the Lucerne University of Applied Sciences and Arts. Within the lab, I'm doing various modeling of engineering processes, but I wasn't always doing that. I originally studied economics back in the US, and there we were primarily doing econometrics, so frequentist-based statistics. From there, I moved to Switzerland to do my master's, and to be with my girlfriend. In my master's I continued in data science, and that's really when I started getting involved in probabilistic programming, and Bayesian statistics in particular. And after graduating, I immediately started working in the lab at the university.

spk_0 Okay, so a pretty random road, right? It's rare that people end up in Switzerland, especially coming from the US. Or where is my prior wrong?
spk_0 No, no, yeah, it's a bit odd, yeah.

spk_0 And so what do you mean by an Internet of Things lab? I absolutely don't know what that is.

spk_0 Yeah, so it's pretty much anything to do with connectivity of hardware. When that hardware is connected to the internet, to provide some sort of connectivity, that's what you can think of as the Internet of Things. All of your things labeled as smart devices fall under that umbrella term, IoT. And today everything is being IoT-ified, so you have your dishwashers connected to the internet, coffee machines, and so forth. That's really what IoT generally means.

spk_0 Okay, okay. And that sounds very algorithmic and deep learning heavy, doesn't it?

spk_0 So with our group, we have a very wide spread of knowledge. Within our group you have people that are real specialists in networking and hardware, then data storage and processing, and then maybe, like me, more on the machine learning or data analysis side. For me, it's about how you analyze the data coming from various sorts of machines, whether that's manufacturing machines and so forth.
spk_0 Okay, okay. And how did you end up doing that? Because that's not what you studied, right? So how did that happen?

spk_0 Yeah, I think it came from my bachelor's, in econometrics, doing a lot of time series stuff. And in IoT it's also a lot of time series as well, because when you have these sensors hooked up to the machines, you're also logging time series. Depending on the frequency, say 10 hertz or 50 hertz, you're logging a measurement every second, or 60 measurements every second. So with that, you get a really nice stream of time series data. I don't know exactly how I got brought into the IoT field specifically, but it kind of stemmed from, oh hey, do you know various time series methods, like seasonal or moving-average models and state space models and so forth? And it's like, okay, yeah, some of that can be translated from econometrics over into this IoT realm. So it was a bit of a gradual shift.

spk_0 Okay, okay, that makes sense. So a lot of time series, state space models, Gaussian processes I'm guessing, or at least I'm hoping.

spk_0 Yeah, yeah.
spk_0 How did you end up working on Bayes stats in particular? Do you remember when you were first introduced to them, and how often do you use them in your current work?

spk_0 Yeah, so it goes back to my bachelor's, when I was doing the first couple of statistics courses. I just remember, when doing some very basic regression models, in that course they're like, oh yeah, you reject the null hypothesis because of the p-value. And I'm just sitting there like, yeah, but who, and why? Why is it 0.05, and who came up with that kind of arbitrary metric? It was always unsatisfying to me: you follow this kind of strict rule and diagram, and, oh, okay, you do this or you don't do that. That was always kind of unsettling to me. So from there, it was more of a self-discovery, because they never taught Bayes in my undergrad, Bayesian statistics. So it was like, okay, what else is out there? That's where I came across Richard McElreath's Statistical Rethinking and then Andrew Gelman's Bayesian Data Analysis, and that's how I got more introduced to Bayesian statistics. In regards to how often I use it, it's pretty much every day. Almost every project that I've done in this lab has to do with probabilistic modeling in some form or another.

spk_0 And why? Why is that? How come Bayesian stats seem very interesting and important to your work, and what do they bring that you can't have with the frequentist framework?

spk_0 Yeah. So the big thing I see with sensors, and IoT in general, particularly in the problems that I'm solving, is, first off, you have a lot of sensor noise. These sensors, and the processes they're measuring, aren't perfect. For example, if you attach a sensor to a manufacturing machine, the speeds that these sensors are logging aren't necessarily going to be exact; they could fluctuate a little bit. And not only that, the process it's measuring isn't always perfect either. So I look at that and I'm like, okay, actually probabilistic programming is a really good fit here, because we can begin to model the uncertainty: some of the sensor noise, and then the manufacturing process itself. Being able to quantify the uncertainty there is very powerful, because it lets you account for some of the noise in the process and in the measurements. But at the same time, it's also really difficult, because you can imagine that some of these settings are logging a lot of data, and traditionally Bayesian computational methods aren't very good with big data. So I often see, in my day to day, this friction between big data and Bayes. Maybe we can talk about that in a little bit, but you have that kind of friction.

spk_0 Yeah, yeah, exactly. That's where I was going. And that's where my
astonishment comes from. Yeah, so I actually do want to talk about that. How do you manage to combine this need for uncertainty quantification, and intuitive uncertainty interpretation, with the need to actually run the models? You still need to run the inference, but you have a lot of data, and that can be a bottleneck. So how do you thread that needle?

spk_0 I'd say there are three general, not really approaches, but techniques that we use. The first one is probably what everyone thinks of: you have your raw data and then you perform some aggregation on top, some sort of resampling, to reduce the size of the data, and then you just continue to apply your usual MCMC on that. The second one is other inference methods, like variational inference. These have proven to be a very good fit, because with variational inference methods you need some sort of approximating strategy, and since we have a lot of data, we can come up with a nice subsampling scheme to use within the variational inference method. And then the last one is, luckily we do have some hardware at our lab, so we can just throw GPUs at the problem. We can use JAX, so NumPyro, Pyro, and use these more traditional deep learning frameworks for GPU acceleration.

spk_0 Yeah, yeah. So that last approach is
quite nice also because you don't have to think too much, right? A NumPyro or PyMC model can just run out of the box on JAX, and you get GPU acceleration without having to do anything else. So I would say, yeah, if you have the compute available, I would do that, especially since having to come up with a customized variational inference scheme is much more intricate. And I'm curious what your experience with the different VI algorithms is. Maybe you can give a lay of the land to the listeners: where do you see these methods and the different algorithms being useful or not, and what are your practical recommendations?

spk_0 Yeah, so in regards to that, I've mainly used the standard implementations that NumPyro offers, so their autoguides, their mean field, and so forth. We were using those primarily for hierarchical models, and we found, or I found, that out of the box they worked quite well. So I haven't had to go off on tangents to figure out which ones work or not.
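As a rough illustration of the minibatch variational inference idea discussed here, this is a toy sketch of my own, not code from the episode or from NumPyro: mean-field VI for a normal-mean model in plain NumPy, with made-up settings (learning rate, batch size). The point is the N/B rescaling, where each minibatch stands in for the full likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "big data" problem: y_i ~ Normal(mu, 1), with prior mu ~ Normal(0, 10)
mu_true, N, B = 3.0, 100_000, 256      # true mean, dataset size, minibatch size
y = rng.normal(mu_true, 1.0, size=N)

# Mean-field variational family: q(mu) = Normal(m, exp(log_s))
m, log_s = 0.0, -1.0
lr = 0.05                              # step size (hypothetical, untuned)

for step in range(4000):
    batch = y[rng.integers(0, N, size=B)]
    eps = rng.normal()
    mu = m + np.exp(log_s) * eps       # reparameterization trick
    # Minibatch estimate of d(log joint)/d(mu):
    # likelihood gradient rescaled by N/B, plus the prior gradient
    dlogp = (N / B) * np.sum(batch - mu) - mu / 10.0**2
    # Stochastic ELBO gradients (the entropy gradient w.r.t. log_s is +1);
    # dividing by N keeps the step size comparable across dataset sizes
    m += lr * dlogp / N
    log_s += lr * (dlogp * np.exp(log_s) * eps + 1.0) / N

# m should now sit near the data mean, and exp(log_s) should have shrunk,
# since the exact posterior is roughly Normal(ybar, 1/sqrt(N))
```

Libraries like NumPyro automate exactly this pattern (autoguides build the variational family, SVI does the gradient loop), which is why the subsampling scheme mentioned above matters: it is what makes the per-step cost independent of the dataset size.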
spk_0 Yeah, yeah, for sure. I know we are also making a lot of effort in PyMC to have more VI. There has been a lot of work with the Google Summer of Code on improving out-of-the-box VI, so ADVI. Also, Jesse Grabowski did a lot of work on the Laplace approximation in pymc-extras, which you can now use in conjunction with find_MAP, where you can use the MAP estimate to initialize the Laplace approximation. There is also the pathfinder algorithm, which is already available in pymc-extras. We will have a dedicated episode about that with Michael Cao, who developed the pathfinder module in pymc-extras, so stay tuned for that. And I think I'm forgetting even one VI method that we are adding right now; maybe that will come back to me. But yeah, there is a lot of activity on the PyMC side too, and I feel this is really great, because I think we've been collectively trying, in the last few years, to improve that as a community, or in any case to make people more aware of these different algorithms, and they can come in very handy.

spk_0 Yeah, yeah. Who was kind of the pioneer there? At least from an outsider, it seems a bit like Pyro. Did Stan also do some?
spk_0 Yeah, yeah, Stan has a lot of that. Pathfinder was developed by Bob Carpenter and his team. Actually, at the beginning it was developed as an initialization method for NUTS, but they realized that the results in themselves were really good, and so they also released it as a separate algorithm. But something you can definitely do is initialize NUTS with the pathfinder results, which can be very useful. So yeah, there is that. Stan has a lot, NumPyro has a lot. We've had the ADVI module in PyMC for a long time now, and now it's getting a bit more love. We have the Laplace approximation, we have pathfinder, as I just said, and I am sure I'm forgetting one method. But you know, that will come back to me. Yeah,
and actually, if you folks want an introduction to these different methods, there is a talk I co-wrote with Chris Fonnesbeck, and actually Michael Cao, who I was talking about a few minutes ago, for PyData Virginia. So I will put the YouTube video in the show notes, and also the GitHub repo. I was also supposed to fly over to Virginia, but I didn't find any affordable flights, so I had to leave it to Chris, and he was the sole presenter. I think he doesn't hate me; he loves presenting and he's so good at it. So yeah, the video is awesome, he obviously did a great job presenting the material. I will put that in the show notes, folks, because it's a very good lay of the land, basically, of what you can do with VI and what the different algorithms are. And then keep an eye out.

spk_0 I think Juan also recently gave a talk at PyData Berlin.

spk_0 Yeah, exactly, that's what I was going to say. The videos are not released yet at the time of recording, and I don't think they will be anytime soon; they usually take about two months to release them. So folks, keep an eye out on the PyData Berlin YouTube channel, and then watch Juan Orduz's talk there, where he basically builds on our presentation at PyData Virginia and shows practical implementations of VI, especially SVI with NumPyro, with really good practical advice. So yeah, I really recommend it. And actually, I'm curious if there is anything you
do in particular, Gabriel, when you use VI, to try and make sure that the results you are getting back are reasonably close to the posterior? Because we have these guarantees with NUTS, with MCMC, but we don't with VI algorithms. Usually, something I do is try the model on fake data and make sure it can recover the parameters of interest in a reasonably close range, or, for the parameters it can't recover, try and see if there is a pattern in the bias. Then at least we know there is a bias in the model, and that's really very helpful, because if you can at least get a model running with VI, even though you know there is a small bias, I would argue this is already better than having no model because you can't run MCMC at that scale.
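The fake-data check described here can be sketched in a few lines. This toy example is mine, not from the episode: made-up parameter values, and plain least squares standing in for the posterior mean of a linear regression.

```python
import numpy as np

rng = np.random.default_rng(42)

# 1. Simulate fake data from known parameters
a_true, b_true, sigma_true = 1.5, -2.0, 0.5
n = 500
x = rng.uniform(-1, 1, size=n)
y = a_true + b_true * x + rng.normal(0, sigma_true, size=n)

# 2. Fit the model; with a flat prior, the posterior mean of the
#    coefficients coincides with ordinary least squares
X = np.column_stack([np.ones(n), x])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
sigma_hat = (y - X @ np.array([a_hat, b_hat])).std(ddof=2)

# 3. Recovery check: estimates should land close to the known values;
#    a systematic gap here would reveal a bias in the model or inference
```

With an approximate method like VI in place of least squares, the same workflow applies: repeat over several simulated datasets and look for a consistent pattern in the recovery error, which is exactly the bias diagnostic described above.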
spk_0 So yeah, basically I think this is something useful, but I'm sure you're doing much better things than that, because you have more experience than me on that front.

spk_0 No, not really, because I take the same approach there. Before scaling out to a more complex model on a problem in industry, I try to simulate what that data or engineering process looks like, on simulated data, and then, pretty much exactly what you said, see how well the algorithm is recovering the parameters, or the posterior, and whether it's able to actually model the problem at hand. So yeah, it's funny, that's pretty much what I do as well.

spk_0 Okay, cool. So I'm not doing something obviously stupid, that's good. That is rare. So anything you want to add about VI, things you've noticed in the wild that worked particularly well or particularly badly, before we move on to another
topic?

spk_0 So yeah, something I really want to talk to you about is something you've worked on for a long time; this is really a masterpiece. So thank you, first, for doing that: your re-implementation of BART, Bayesian additive regression trees, in Rust. People probably know you can do BART models with PyMC through a package called PyMC-BART. It's an awesome package, I use it whenever I can, but it has the drawback of regression trees, which is, if I remember correctly, that you have as many parameters as you have rows in your dataset. That means it grows pretty fast in computational demands, so when you start passing 200k observations it starts to be really slow to infer. So what you did is re-implement the sampling algorithm, which is Metropolis-Gibbs, I think, or something like that. What algorithm is that, a particle Gibbs?

spk_0 Yes, that's particle Gibbs.

spk_0 Yeah, so particle Gibbs, and you re-implemented that in Rust. So yeah, can you talk about that, basically why you started doing that, and give us the elevator pitch for the project, before we dive deeper?

spk_0 Yeah, so the PyMC-BART project really comes from Osvaldo, and about a year ago he reached out and said, hey, we need to make this thing faster, are you interested? I'm like, hey, I'm all for it, let's do it. Before that, I really hadn't used BART, or wasn't too familiar with the method. I mean, I was familiar with more like gradient boosting techniques, which are somewhat similar, but I did have the experience with Rust, and so that was a good complement. It's like, okay, I see what Adrian Seyboldt is doing with nutpie and Rust; maybe we can share some of the code that he's been writing and then use that within PyMC-BART, to help at least with the log-probability evaluations and so forth. So yeah, this really stemmed from Osvaldo wanting to make it more performant, and then me stepping on board and saying, okay, let's re-implement this in Rust and then share some of the code base from nutpie.
spk_0 Hmm, okay. And so how was the experience? Was Rust all new to you? How do you even start on such a huge project?

spk_0 Yeah, so I did have prior experience with Rust within some data processing pipelines in the IoT lab, so the Rust part wasn't entirely new to me. What was new was interoperating with Python, having Python bindings, so that when the Python user calls the BART code, it executes the Rust implementation. In regards to the implementation process, the approach I took was essentially: okay, let's implement this one-for-one from the Python implementation into the Rust implementation, and from there we can start to optimize the different functions or methods. That way we get a nice performance improvement, instead of immediately rewriting something and then not knowing, okay, now this isn't working right, where did it go wrong, and so forth. And yeah, I don't know if you want to talk about some of the Rust specifics or the algorithm specifics, or...
spk_0 Yeah, maybe. So, it's been a while since we talked about BART and regression trees on the show, so maybe you can introduce the methods, the tree methods in general. You mentioned gradient boosting, we obviously mentioned BART. So maybe give us the elevator pitch for BART and tree methods in general, and then I think it will be useful to dive a bit more into the technical details of the algorithm, to understand how the methods really work and why people could be interested in using BART, and in which cases.

spk_0 Yeah, so at a high level you have these tree-based methods, and at the simplest level you have your decision tree. That's your logic: if this variable is greater than some value, you go down the tree, and you finally get to a leaf node, and that's your prediction for a target, or response, variable. Building up off of the decision tree, you can have a random forest, which is a bunch of those decision trees together in a forest. But then, stacking on top of that, you have gradient boosting methods, and these methods are really where you learn the residual: the difference between the trees' predictions and the target. When you start to do that, it's kind of like a meta-learner: you're learning where each tree is doing better, to come up with a better-predicting ensemble. And this is really where BART is more aligned with the gradient boosted methods, rather than a random forest, because BART is doing a similar thing to these gradient boosted methods; it's just that the way it assembles these trees is different. The way BART assembles these trees is by taking random perturbations and then assessing the log-likelihood of that tree.

spk_0 Okay, yeah. So that's closer to the gradient boosting way of doing things, right?

spk_0 Yeah.

spk_0 Okay, so that's the elevator pitch. Now, when are these models particularly useful in your experience, and what are their strengths and drawbacks?
spk_0 So one of the strengths, I think, if you want to compare BART to a traditional XGBoost or LightGBM model, is that one of the big benefits of BART is that you get uncertainty quantification: you have a posterior over decision trees. With that, you can actually stick that model into other things where you want to use uncertainty, for example Bayesian optimization. That traditionally uses Gaussian processes, but you can actually stick a BART model into the Bayesian optimization routine as well, because you also have the uncertainty there. One of the big drawbacks with BART is that it's famously slow compared to XGBoost or LightGBM, and that's one of the big drawbacks I see with the method. But another nice thing about BART is that with XGBoost and LightGBM it's very easy to overfit on your data, so you need to look at a lot of loss curves and figure out, okay, when do I stop training, how many trees do I use, how many learners, and so forth, to stop the training and stop the trees from overfitting. With BART it's really nice, because we have regularizing techniques, so we avoid overfitting inherently within the method. That's one really nice pro I see with BART over the others. But yeah, I'd say the big con is that it's significantly slower than the other ones, and that's for multiple reasons.

spk_0 Yeah, thanks, this is much clearer to me now, and I hope it is for listeners too. So now I think it's a good time to dive into why that would be, where the bottlenecks are, and what the algorithm per se is and how it works under the hood, so that people really understand the models when they use them.
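The residual-learning idea Gabriel described, where each new tree is fit to the difference between the ensemble's predictions and the target, can be sketched with one-split stumps. This is my toy illustration with made-up data, not PyMC-BART or XGBoost code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression problem
x = rng.uniform(0, 10, size=300)
y = np.sin(x) + rng.normal(0, 0.1, size=300)

def fit_stump(x, y):
    """Best single-split tree (stump): (threshold, left mean, right mean)."""
    best = None
    for t in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]

def predict_stump(stump, x):
    t, left_mean, right_mean = stump
    return np.where(x <= t, left_mean, right_mean)

# Boosting: every new stump is fit to the residuals of the current
# ensemble, i.e. to what the ensemble still gets wrong
pred = np.zeros_like(y)
shrinkage = 0.3
mse = []
for _ in range(100):
    stump = fit_stump(x, y - pred)
    pred += shrinkage * predict_stump(stump, x)
    mse.append(np.mean((y - pred) ** 2))
```

The `mse` list falls steadily: early stumps capture the coarse shape of sin(x), later ones clean up residual structure. BART keeps this sum-of-trees form but, as the next answer explains, replaces the greedy fit with random perturbations weighted by likelihood.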
spk_0 Yep. So in regards to PyMC-BART, we implement, as I think we stated before, particle Gibbs, whereas other implementations might implement a Metropolis-Hastings approach. With the particle Gibbs steps, the way the algorithm works is that we generate a set of trees, and in PyMC-BART you define the number of trees and the number of particles. So for example, you might say, okay, we want 50 trees and then 10 particles. Now we're going to perform a series of particle Gibbs steps. At the first step, we loop through all 50 trees. For the first tree, we initialize, let's say, 10 particles, or however many you define. Those 10 particles, which are just decision trees, we perturb each one: maybe we sample for one variable a certain split value, for another one another split value. Then we assess the log-likelihood, and at the end we say, okay, this particle, maybe particle 5 out of the 10, is going to replace the current tree, which is tree 1. Then we proceed to the next tree, tree number 2, and go through that same process: initialize 10 particles, perturb each one, weight them according to the log-likelihood, and then replace that tree. We continue until all 50 trees are essentially replaced. So yeah, at a high level those are the main algorithmic steps. It's really quite a simple process, which is quite surprising if you read a lot of these papers.

spk_0 Yeah. And were you already versed that much in BART and tree methods before working on this, or did you get that knowledge by working on the project?

spk_0 Not really. It was mainly knowledge from working on the project and reading
the code base that Osvaldo and others wrote, which was quite readable. It's really nice, procedural, line-by-line, oh-this-is-what-it's-doing code, and that really helped with the intuitive understanding of, oh hey, this is what the particle Gibbs is doing.

spk_0 Yeah, and I second that. The code base is really well done and well written, and it's quite easy to start contributing to the package. And this is really awesome, because I've dabbled a bit with BART for modeling baseball, and I haven't yet tried your
Rust implementation. That would be very useful for baseball, because there are a lot of use cases for methods like BART, but there is so much data that you often need an acceleration somewhere. So yeah, whether it's using classic PyMC-BART on a GPU, or actually using your Rust implementation, and adding a GPU on top of that, it should probably be a really good boost to sampling speed.
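The particle Gibbs pass Gabriel walked through above can be caricatured in a few dozen lines. This toy sketch is mine, not the PyMC-BART implementation: the "trees" are single-split stumps, the proposals are fresh random stumps rather than grow/prune moves, and there are no regularizing tree priors. But the core loop, holding the rest of the ensemble fixed, proposing particles, and resampling one in proportion to its likelihood, has the same shape:

```python
import numpy as np

rng = np.random.default_rng(7)

x = rng.uniform(0, 10, size=200)
y = np.sin(x) + rng.normal(0, 0.2, size=200)

M, P, sigma = 20, 10, 0.2   # trees in the ensemble, particles, assumed noise sd

def random_stump():
    """A 'tree' here is just a stump: (threshold, left value, right value)."""
    return (rng.uniform(1, 9), rng.normal(0, 0.5), rng.normal(0, 0.5))

def predict(stump, x):
    t, left, right = stump
    return np.where(x <= t, left, right)

def loglik(resid):
    return -0.5 * np.sum((resid / sigma) ** 2)

# The model's fit is the SUM of all trees' predictions
trees = [random_stump() for _ in range(M)]
preds = np.stack([predict(t, x) for t in trees])
mse_start = np.mean((y - preds.sum(axis=0)) ** 2)

for _ in range(200):
    for j in range(M):                  # one Gibbs pass: update each tree in turn
        partial = y - (preds.sum(axis=0) - preds[j])   # residual without tree j
        # Particles: keep the current tree as reference, propose P-1 new ones
        particles = [trees[j]] + [random_stump() for _ in range(P - 1)]
        lls = np.array([loglik(partial - predict(p, x)) for p in particles])
        w = np.exp(lls - lls.max())
        k = rng.choice(P, p=w / w.sum())  # resample in proportion to likelihood
        trees[j] = particles[k]
        preds[j] = predict(particles[k], x)

mse_end = np.mean((y - preds.sum(axis=0)) ** 2)
```

Each tree only ever competes against alternatives of itself, given what the rest of the ensemble already explains, which is why the full sampler can fairly be described as "quite a simple process" despite what the papers suggest.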
spk_0 Yeah, yeah. And I must say, one of the things that's really nice with PyMC-BART is that there are several really nice enhancements. If you go look around online, a lot of the other packages are specifically for Gaussian likelihoods; that's the first one. So with those you can't really model, say, a Poisson process or anything else. The second one is that we also offer various split rules: if you have numerical features and categorical features in your design matrix, you can pass split rules specific to each data type, whereas in other packages it's common to just assume everything is a numerical value. Those are the two really nice things I think differentiate our package. But then, lastly, there's the BART random variable and how it is embedded in PyMC: you can model the linear predictor, you can model sigma, the noise parameter, and that's really nice because you can build essentially arbitrary probabilistic programs with BART, whereas with other packages, that's it: you use that method, and that is the method that you use.
spk_0 Yeah, exactly. This is actually a very good point: you can add that as a module in the PyMC model. So you could model your linear predictor with a classic linear regression, and then your sigma, your standard deviation, you could model with a BART random variable. This is very useful. And I must say that recently, in the current existing Python implementation, support has been added for more than one BART random variable, which is really great and has been something people requested. So you could do two different BARTs on two different parameters, and this is really awesome. In a way, that's starting to look a lot like the GP submodule: GPs, you can add them to PyMC models as you want, and you can have different GPs for any number of parameters in your model. You really cannot do that with GP-focused packages; with most of them, you just use the GP as is, and that's all you're going to do, nothing else. And often there are also likelihood limits: in PyMC you can use a GP with any likelihood distribution, while in most of the packages it's often a normal likelihood, because anything else is often hard. So how is it, I know with the pure-Python BART we can use any likelihood we want; how is it on the Rust side now? I remember at the very beginning you had not yet included the categorical, multinomial kind of likelihoods. Of course, I always use those likelihoods, so I was like, damn, can't use that yet. But yeah, how easy is it right now when it comes to likelihoods, especially the multi-dimensional ones, which I know are always much more of a pain to develop?
spk_0 Yeah, so regarding the current state of the Rust implementation, there are still some things that aren't implemented one-to-one, and I'm still working on that. But regarding the likelihoods, that has been resolved, so you can model multiple different likelihoods. I think what you were specifically asking about, though, was the different split rules, like the categorical and the continuous split rules, and those are also now implemented in the Rust implementation. The one thing that's not there yet is multiple BART random variables; I'm still working out some bugs, so that's still being implemented on our end.
spk_0 Okay, yeah. So concretely, we can do anything with the Rust version that we can with PyMC-BART, except for having more than one BART random variable in the model; otherwise everything is on par right now. Amazing, that's cool. Thanks, Gabriel.
spk_0 That means we'll now be able to use it much, much more on baseball data, which is going to be super fun. And how do you squeeze that in, actually? Is it part of your job, or is it something you do on the side? Maybe you have some advice for people who want to do some open-source work like you do and would like some practical advice on how to fit it into work and free time. Because in the end, this is really what research is about, right? Trying to push the envelope on frontier topics, which is not only going to pay off for your project, but for your company as a whole, and for a lot of other people too.
spk_0 Yeah, so luckily a lot of the stuff I do at work uses these tools. So if our team and I see, hey, it would be really nice if we could speed up BART because it would help our problems at work, then doing the open source at work aligns quite nicely. If the problems aren't really related, then that's in my own time. But regarding contributing more generally, I honestly think the PyMC and Bambi community is one of the best in the scientific open-source world; everyone is very inviting and willing to help. My advice to people starting out is: don't bite off more than you can chew. Pick maybe the low-hanging fruit and work your way up from there. I've found that to be a fairly safer approach, and it goes better with the maintainers that way.
spk_0 Yeah, that sounds right, and that's what I recommend to people who reach out to me too. Maybe one last question on BART: since you use it a lot in your work, what's your experience with these models? What do you find they are very useful for, and where do you see their limitations?
spk_0 Yeah, so I've used them in two scenarios. One of them is embedding BART in a Bayesian optimization routine, which you just talked about with Max. The other one is specifically for a time-series process that, I'm going to use my hands here, exhibits a kind of partitioned, blocky structure. That's not quite right, the time series is continuous, but it has this block structure: from point one to point B it's at a constant value, then in the next time interval it shoots up to another value and is constant for a little bit. This is quite nice because tree methods are essentially piecewise constant functions, so they're able to model that inherently, quite nicely.
spk_0 Yeah, so that's a very raw, weird time series. I mean, no time series is really continuous, but here you don't even have enough points for it to look continuous, so at bottom the discreteness of the tree structure is a feature and an asset.
spk_0 Yeah, exactly, and you see this come up in sensor-based time series quite often. If you look at certain sensor profiles over time, you see that block-looking structure as well, and then you can think, oh, maybe these tree-based methods might work here.
spk_0 Okay, this is very interesting, I love it, thanks. Actually, two other questions on that.
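Since a tree's prediction function is piecewise constant, the blocky series described here is something trees represent natively. A small sketch with made-up numbers:

```python
import numpy as np

# a hand-built depth-2 regression tree over time t:
#   t <= 3      -> 1.0   (first plateau)
#   3 < t <= 7  -> 4.0   (second plateau)
#   t > 7       -> 2.5   (third plateau)
def tree_predict(t):
    return np.where(t <= 3, 1.0, np.where(t <= 7, 4.0, 2.5))

t = np.arange(11)
y = tree_predict(t)                 # the blocky series itself

# a straight-line fit cannot represent the jumps
slope, intercept = np.polyfit(t, y, 1)
line = slope * t + intercept

print(np.abs(tree_predict(t) - y).max())  # 0.0: the tree matches every block
print(np.abs(line - y).max() > 1.0)       # True: the line smears the jumps
```

A linear model has to smear the jumps across the whole range, while the tree reproduces each plateau exactly; that is the "discreteness as an asset" point.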
spk_0 Well, what about the time intervals? A lot of the time, having fixed time intervals is much easier to deal with. What's your experience here with BART models? Are they at all sensitive to the fact that sometimes the time intervals are not regular, which I guess might be the case with sensor data? Related to that, what about missing data, and what about out-of-sample extrapolation? I know it's a big question, but it's all related.
spk_0 So regarding the time intervals, you're saying when the time intervals are unequal, like the time between measurements? Okay. So BART, or tree methods in general, I think are very good for interpolating missing values, because you can impute or interpolate that inherently within the tree. If you have a sensor that didn't log a measurement over a certain time period but then comes back online and continues logging, with tree methods you do get a nice interpolation there. And you don't really need to do any kind of feature processing beforehand, which is nice because it's handled inherently within the model.
spk_0 Okay, yeah, that's great, and that's what I thought: basically, when there is no fixed time interval, it's like a missing-data problem, and they are very good at internal interpolation. How good are they at extrapolation, so really doing out-of-sample predictions? How does that work here?
spk_0 Yeah, so luckily I haven't really had to use it for out-of-sample predictions.
spk_0 Interesting. I mean, obviously I'm asking because I know tree methods are not good at out-of-sample predictions.
spk_0 I'm glad I haven't had to use it for that. And I think that's one of the reasons I chose it, because for a couple of the problems we were modeling, for example, if you have the actuator limits of a robot, you have pretty clear upper and lower bounds from the engineering process. So you know you're not going to be extrapolating past that, and with BART you then get nice interpolation within these actuator limits.
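Both behaviors can be sketched with a one-split tree on made-up sensor data: inside the observed range, a gap is imputed with a plateau value and no feature preprocessing, while outside the training range the prediction simply stays flat at a leaf mean, which is harmless when hard engineering limits guarantee you never query far beyond the data.

```python
import numpy as np

def fit_stump(x, y):
    """One-split regression tree fit by exhaustive search over thresholds."""
    best = None
    for thr in np.unique(x)[:-1]:
        left, right = y[x <= thr], y[x > thr]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    return best[1:]

def predict(stump, x):
    thr, lo, hi = stump
    return np.where(x <= thr, lo, hi)

# sensor readings with a logging gap between t=5 and t=8
t_obs = np.array([0.0, 1, 2, 3, 4, 9, 10, 11])
y_obs = np.array([2.0, 2, 2, 2, 2, 5, 5, 5])
stump = fit_stump(t_obs, y_obs)

print(predict(stump, np.array([6.0])))    # inside the gap: imputed plateau value
print(predict(stump, np.array([100.0])))  # far outside: flat at a leaf mean
```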
spk_0 Yeah, exactly. And that's actually why I haven't been able to use BART models in production yet, other than for exploring and teaching: most of the time I work on actual out-of-sample data. Let's say I work on players: the age of players is not really out of sample, since all players are human, so you'll never have a player who is 120 years old. But if you were looking at seasons, for instance, well, the years really are out of sample, and then it's a problem. Or the players themselves are out of sample: what about a player you never saw in your training set? That's often why I couldn't choose tree methods or BART methods, because they don't extrapolate, in comparison to Gaussian processes, which are really good at that in general, and state-space models.
spk_0 Okay, awesome. So, one last question, I swear, on BART: what do you mean by using them in optimization routines? I find that super interesting.
spk_0 Yep, so regarding the optimization routine, I was specifically talking about Bayesian optimization. Essentially, Bayesian optimization is a sequential optimization process where you typically have some sort of surrogate model. Typically it's a Gaussian process, but it can really be any other method that provides a posterior. So I'm essentially swapping out this GP, putting in the BART model there, and using that to optimize some industrial process. With this iterative method, essentially what we're doing is training the model on the historical data, then using, I don't have to get into the details, some sort of function or generator to generate a new set of feature values, or design points, then evaluating those with BART, and then running the loop again: retrain the model, generate some new values, evaluate with BART, and so forth. That's what I mean generally by Bayesian optimization, and it's Bayesian because we're using probabilistic methods.
spk_0 Okay, so BART, do you include it in your loss function?
spk_0 It's the surrogate model. So, for example, if you think about wanting to optimize a machine for the scrap rate, how much scrap an industrial machine is producing, you probably don't know the physical equations that govern or produce the scrap. So what's the next best thing we can do? We can turn to data-driven methods. We collect data about the process: maybe you have sensor measurements on how fast the robot arm is moving, how fast material is being fed into the machine, and then you also have measurements like, okay, this much scrap was produced, no scrap was produced, and so forth. We then use BART, or the GP, to learn the association between the parameters governing the process and whatever metric you're tracking. Now that you have that, that's your function, your mapping from inputs to outputs. Then, with the Bayesian optimization framework, or loop, you're deciding: hey, we want to produce as little scrap as possible, so we're going to use this model we just trained to propose, or select, the values that produce the least amount of scrap. Would that make sense?
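The loop described here (train on history, propose candidates, evaluate, retrain) can be sketched end to end. Everything below is a deliberately simplified stand-in: the "posterior" is a toy nearest-neighbor surrogate (mean = value at the nearest evaluated point, uncertainty = distance to it), where BART or a GP would supply a real posterior, and the scrap-producing process is an invented quadratic.

```python
import numpy as np

def scrap(speed):
    """Stand-in for the unknown industrial process we want to minimize."""
    return (speed - 6.0) ** 2

def surrogate(x_hist, y_hist, grid):
    """Toy 'posterior': mean from the nearest evaluated point,
    uncertainty growing with distance to it."""
    d = np.abs(grid[:, None] - x_hist[None, :])
    nearest = d.argmin(axis=1)
    return y_hist[nearest], d.min(axis=1)

grid = np.linspace(0.0, 10.0, 101)   # candidate design points
x_hist = np.array([1.0, 9.0])        # historical evaluations
y_hist = scrap(x_hist)

for _ in range(10):                  # the BO loop: fit -> propose -> evaluate
    mu, sd = surrogate(x_hist, y_hist, grid)
    pick = grid[np.argmin(mu - sd)]  # lower-confidence-bound acquisition
    x_hist = np.append(x_hist, pick)
    y_hist = np.append(y_hist, scrap(pick))

best = x_hist[np.argmin(y_hist)]
print(best)   # lands near the true optimum at speed = 6
```

The acquisition rule (a lower confidence bound) trades off exploiting low predicted scrap against exploring regions far from any evaluation; swapping in a BART surrogate only changes where `mu` and `sd` come from.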
spk_0 Okay, yeah, I think it does. So it's not really that you are using BART inside a loss function when doing the optimization; it's the surrogate model, which is something different.
spk_0 Not exactly a loss function, no.
spk_0 Okay, nice. Do you have any public writing about this, for people to look at if they are interested in these kinds of methods?
spk_0 We are writing a paper, but it's not published yet, so unfortunately not right now.
spk_0 Okay, well, let me know when it is, because then we'll publish it in the LBS sphere, which, as you know, is extremely powerful. So, I think that's a good summary of everything. Do you have anything to add, something I forgot to ask you, or do you think we did a good job of giving people an idea of how they can use this?
spk_0 No, so our goal is essentially to provide backwards compatibility with the Rust implementation, so that it's just a drop-in replacement. But one thing we maybe didn't touch on too much, maybe for some of the Rust people out there, is what some of the interesting Rust bits were that resulted in nice performance gains. I think that could be fun to talk about.
spk_0 Yeah, definitely!
spk_0 So one of the areas that was especially nice to implement in Rust is the tree proposals. What we do in PyMC-BART is put a prior probability over the depth of the tree. If you think of a binary tree, as you add nodes to it, the depth of the tree increases, and we have a prior probability over how deep a tree can be, which you can actually set as a user through the two parameters alpha and beta. In the tree proposals, we propose a variable to split on and a value to split at, and based on that and the prior probability over the depth of the tree, we can say how likely a tree is to be grown, essentially how likely the depth is to increase. Traditionally, in the original Python implementation, we would always perform a tree proposal and a systematic resampling to propose the particle to replace the tree. With the Rust implementation, we take a lazy approach: we use a smart pointer, reference counting, to defer, to wait to materialize the growing of the tree until we know we will accept that tree to grow. So beforehand we compute the proposal and say, hey, this is what it would do if it were selected, and then, only if it is selected, do we materialize it and actually compute the results. It's a bit of a lazy way of doing it, and it's really nice because in the systematic resampling we resample according to the weights, the log-likelihoods of the trees. If you have 20 particles and, after systematic resampling, say 10 of the new particles all come from the same one because it had a high weight, we would essentially have to perform a deep copy, a clone, of that tree structure, which can be very expensive. But since we're now using these smart pointers and only copy when we know we're going to accept the tree proposal, we get a really nice performance boost there. That's just a lower-level detail that I think is quite cool.
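The copy-on-accept idea can be mimicked in Python (a conceptual sketch, not the actual Rust or pymc-bart code; the depth prior follows the standard BART form alpha * (1 + depth)^(-beta)). Systematic resampling returns indices, duplicated picks share the same underlying tree (the Rc role), and an expensive clone happens only when a tree is actually about to grow:

```python
import numpy as np

rng = np.random.default_rng(42)
deep_copies = 0

def grow_prob(depth, alpha=0.95, beta=2.0):
    """Prior probability that a node at this depth splits, tunable via alpha/beta."""
    return alpha * (1.0 + depth) ** (-beta)

def clone(tree):
    """Stand-in for the expensive deep copy of a tree structure."""
    global deep_copies
    deep_copies += 1
    return dict(tree)

def systematic_resample(weights):
    """Systematic resampling: one uniform offset, evenly spaced positions."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    idx = np.searchsorted(np.cumsum(weights), positions)
    return np.clip(idx, 0, n - 1)  # guard against float round-off at the top end

# 20 particle trees with weights derived from their log-likelihoods
particles = [dict(id=i, depth=i % 3) for i in range(20)]
logw = rng.normal(size=20)
w = np.exp(logw - logw.max())
w /= w.sum()

# duplicated picks just share the same tree object: no copy yet
selected = [particles[i] for i in systematic_resample(w)]

# copy-on-accept: materialize a private copy only for trees that will grow
new_gen = [clone(t) if rng.random() < grow_prob(t["depth"]) else t
           for t in selected]

print(deep_copies, "deep copies instead of", len(new_gen))
```

In the Rust version the sharing is done with a reference-counted pointer, so a particle selected ten times costs ten pointer bumps, and a deep clone is paid only for proposals that are actually accepted.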
spk_0 Yeah, that is definitely super cool. So if people want to get started on PyMC-BART, and especially the Rust implementation, what should they do? What should they download?
spk_0 I think if people want to start helping, they can just read the code base. Currently the code base is under my repository, and I think we can link that in the show notes. I have several issues there for things that need to be implemented or cleaned up, so a very good place to start is just going to the repo and looking at some of the issues; I have them all tagged as "good first issue" and various other tags.
spk_0 Yeah, I've already put that in the show notes, so look at that, folks, if you want to start getting involved. I was asking about contributing, but if people want to start using it, what I would advise, and that's going to be in the show notes too, is the PyMC-BART website: look at the tutorial notebooks, then just install bart-rs and run those notebooks with the Rust implementation. You'll see it's literally a drop-in replacement, except when you need to use two BART random variables, as we were saying; otherwise it's literally the same, and I think that's amazing, because it makes it so easy for people to start. BART models are really good because they are super flexible, they are very easy to understand, and they are usually a very good baseline if you are in a case where tree methods are applicable. So if you are, my practical advice would be: definitely try PyMC-BART, because the model is going to be super easy to write. BART just figures out the functional form, so it's just one BART variable, you feed that into your likelihood, and you're done, and you see how it works. If you're in the cases we talked about before, where it doesn't work, well, that's going to be for next time; but if you're not in those kinds of cases, I think it's a very good shot.
spk_0 Yep, absolutely.
spk_0 And so, to close things out,
spk_0 Gabriel, I saw that you recently worked on another optimization problem, reproducing Uber's marketplace optimization. You have a really good blog post about it that I've put in the show notes, and you also put the code in a GitHub repo, which is in the show notes too; folks, take a look. Do you want to touch on that briefly: basically what it is, what it does, and why people would be interested in it?
spk_0 Uber has a system in place that performs resource allocation. Their problem is that they are a ride-hailing service with a bunch of different programs, like Uber Eats and your normal driving scenarios, and what Uber can do is influence the marketplace by allocating money to different programs to stimulate supply and demand. This comes from a business problem: as a company, we have a finite amount of money; how much should we allocate to each program within each city such that we maximize some business metric, such as gross bookings, which then presumably influences the profit of the company? I was interested in how this even works: how do you perform resource allocation with optimization methods? But then, what I found quite interesting was that they were embedding a neural network into the optimization algorithm to model the forecasting problem. So you have these two interesting components: the optimization algorithm, and the fact that they're embedding a neural network into the system to help learn the association between how much money they're allocating and how much this influences a business outcome such as gross bookings.
spk_0 And is that the same idea as what you talked about
spk_0 before? And then you can go back to the optimization.
spk_0 Yeah, it's kind of the same. I don't think a lot of people know this, but you can really embed any machine-learning model into an optimization program and then optimize over the features you're using in the model. Essentially, that's what they wanted to do here: embed a machine-learning model into an algorithm with an optimization component to produce an optimal allocation scenario. What's really interesting about the optimization algorithm used here, the alternating direction method of multipliers, is that it's a distributed optimization algorithm. It happens in three steps. In the first step, you use the neural network to predict essentially how much gross bookings each city is going to have given a certain allocation to a program, and you select the values that optimize that objective. In the next step, you perform a consensus step, where you're trying to get the cities to agree with each other in order to satisfy the constraint; typically you have a constraint like, Uber can only allocate a million dollars, and we need to divvy up that million across the cities such that the sum does not exceed a million. And the last step is just a dual update step, and then you iterate over this. It was a really nice exploration, but I think what could be even more interesting, and something I talked about with Warren, is: what if we embedded more of a probabilistic model in there? Then we could have the entire posterior over our decision space, and we could say, hey, you should allocate between 150,000 and 200,000 to city A and program B. That's where I see this going, in a way: replacing the neural network with more of a probabilistic model, to have uncertainty over our decisions.
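The three steps can be sketched on a toy version of the allocation problem, with hypothetical numbers: in the real system a neural network supplies each city's response curve, but here each city simply wants to be near its own ideal spend, subject to the spends summing to the budget.

```python
import numpy as np

ideal = np.array([4.0, 1.0, 3.0])   # hypothetical per-city ideal spends
budget = 6.0                        # total money available
rho = 1.0                           # ADMM penalty parameter

x = np.zeros(3)   # local per-city proposals
z = np.zeros(3)   # consensus (budget-feasible) allocation
u = np.zeros(3)   # scaled dual variables

for _ in range(200):
    # 1) local step: each city minimizes (x - ideal)^2 + (rho/2)(x - z + u)^2
    x = (2 * ideal + rho * (z - u)) / (2 + rho)
    # 2) consensus step: project x + u onto the budget plane sum(z) = budget
    v = x + u
    z = v + (budget - v.sum()) / len(v)
    # 3) dual update: accumulate the disagreement between local and consensus
    u = u + x - z

print(z, z.sum())   # feasible allocation: each city shifted equally, sum == budget
```

Replacing the per-city objective with a posterior predictive, as suggested above, would turn the point allocation `z` into a distribution over allocations, that is, uncertainty over the decision itself.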
spk_0 Really cool, this is really amazing. Is there actually any public writing we can refer people to if they are interested in this idea? You were explaining before that you also embed a BART model into an optimization algorithm, and I think this is very close, at least; this one uses a neural network, but it's also very, very cool. And I definitely see some applications in the baseball world, and in the sports world in general. So yeah, this is super cool, thanks, Gabriel. All the links are in the show notes. Any other current or upcoming projects you want to talk about before we close up the show,
spk_0 something you're excited about?
spk_0 Not really any current projects; maybe there are a couple of previous projects where I see how more probabilistic programming could be in play, but nothing upcoming at the moment.
spk_0 Okay. And I'm curious: what are you curious to see in the coming months and years? Is there something you would really like to see in the Bayesian world, or in the data-science world in general, that would have a huge impact and potential for your work, or for the things you are able to do?
spk_0 I think, because a recurring theme of a lot of the problems I work on is optimization, better tooling around embedding or using machine-learning models, probabilistic models, within an optimization framework, whether that's Bayesian optimization, traditional convex optimization, or sequential decision making. Typically now, especially at work, I need to hand-roll all of that myself, and I think it would be really nice to have a package or framework that really helps with that process.
spk_0 Yeah, I agree, it would be something very interesting and very useful. Amazing. Well, thanks a lot, Gabriel, I am very excited to try these new things, both the BART Rust part and the optimization work. So folks, if you want to contribute to bart-rs, the links are in the show notes; we're always looking for people who want to make this better for themselves and for everybody else at the same time, and I'm sure Gabriel will welcome any help on that. Anything to add, Gabriel, before I ask you the last questions?
spk_0 No, nothing from my side.
spk_0 Good, I'll take it as a sign that I did a good job. So I'm going to ask you the last two questions I ask every guest at the end of the show. First, if you had unlimited time and resources, which problem would you try to solve?
spk_0 Mm-hmm. I think I'm going to defer back to what I was just saying about tooling, and in particular to the specific problem space of sequential decision making. The big idea, the big pitch there, is: what decision should you take now such that your immediate reward is maximized, but that also takes into account the expectation of future contributions? This problem space of sequential decision making, sequential optimization, is really quite formalized in the control-theory world, but for business applications I think it's quite lacking, especially in the open-source world. So developing a library or framework for that would, I think, be a great step forward for modeling sequential decision problems. That's something I would really like to work on.
spk_0 Hmm, yeah, that definitely sounds like it would be very useful. And second question: if you could have dinner with any great scientific mind, dead, alive, or fictional, who would it be?
spk_0 Well, I already had dinner with Tomi Capretto in Buenos Aires, and I'd recommend the restaurant to friends.
spk_0 So I feel like I was at that dinner too, you know!
spk_0 But no, probably I would say Richard Feynman, because I've read some of the biographies of him, and I think it would just be a fun dinner, right? A lot of technical people can be quite boring or socially awkward, but Feynman being both technical and fun, I think it would be a very good dinner experience.
spk_0 Yeah, definitely a great choice on all these accounts: Feynman sounded very interesting and cool, and, well, technical people can be what you said. So this is a great choice, even if that dinner is getting crowded, I can tell you; this is a popular pick, so we're going to have to scooch over at the dinner table. But, you know, we should go to Buenos Aires, to the same restaurant; I'm sure Feynman would have a lot of things to say about it.
spk_0 I think so too. I forget the name, otherwise I would recommend it right now.
spk_0 Yeah, me too, actually, I'm blanking on the name. Tomi, come to our rescue! Awesome. Well, Gabriel, that was a great show, thank you so much for taking the time. The show notes are going to be full for this one, folks, so make sure to take a look at them. And Gabriel, next time you have a fun and useful project like that, you are welcome anytime on the show; otherwise, I'm really looking forward to meeting you in person in Switzerland at some point. I'm definitely going to come and do some hiking over there, which my wife and I love. Gabriel, thank you again for taking the time and being on the show.
spk_0 Yeah, thank you so much, it's been a lot of fun.
spk_0 This has been another episode of Learning Bayesian Statistics. Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach a true Bayesian state of mind. That's learnbayesstats.com. Our theme music is "Good Bayesian" by Baba Brinkman, feat. MC Lars and Mega Ran; check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/learnbayesstats. Thank you so much for listening and for your support. You're truly a good Bayesian; change your predictions after taking information in. And if you're thinking I'll be less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian: change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.