Technology
#142 Bayesian Trees & Deep Learning for Optimization & Big Data, with Gabriel Stechschulte
In this episode of Learning Bayesian Statistics, host Alex Andorra speaks with software engineer Gabriel Stechschulte about his work on Bayesian additive regression trees (BART) and its applications in ...

Transcript
My guest today is Gabriel Stechschulte, a software engineer passionate about probabilistic programming and optimization. Gabriel recently re-implemented BART, Bayesian additive regression trees, in Rust, making the algorithm faster, more flexible, and more suitable for real-world applications. So if you are a PyMC-BART user, I definitely recommend checking out his implementation; it's in the show notes. In our conversation, we dive deep into what makes BART special: its ability to quantify uncertainty, handle different likelihoods, and serve as a strong baseline in settings like optimization and time series. We also explore how BART compares with Gaussian processes and other tree-based methods, and talk about practical challenges like handling missing data, integrating BART into PyMC, and embedding machine learning models into decision-making frameworks. Beyond the code, Gabriel reflects on open-source collaboration, the importance of community support, and where probabilistic programming is headed next. This is Learning Bayesian Statistics, episode 142, recorded September 18, 2025.

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference: the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country, for any info about the show. LearnBayesStats.com is the place to be: show notes, becoming a corporate sponsor, looking for Bayesian merch, supporting the show on Patreon, everything is in there. That's LearnBayesStats.com. If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra. See you around, folks, and best Bayesian wishes to you all. And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can help bring them to life. Check us out at pymc-labs.com.
Alex: Gabriel Stechschulte, welcome to Learning Bayesian Statistics. And I think I butchered your name there.

Gabriel: No, no, it was quite good, yes.

Alex: Okay, okay. I rehearsed it, and that still didn't work. But yeah, thanks a lot for being on the show. I've been meaning to have you here for a while, because you do a lot of very interesting things. We know each other from the PyMC world, but this is the first time we actually meet, almost in person. So that's great; I'm very happy you're here. Thanks a lot for taking the time. As usual, let's start with your background and your origin story. Can you tell us what you're doing nowadays, and how you ended up doing that?
Gabriel: Yeah, for sure. And thanks for having me on. Maybe next time we can meet in person and do some hiking in the mountains. So, my background: currently I'm in an Internet of Things lab, an IoT lab, at the Lucerne University of Applied Sciences and Arts. Within the lab I'm doing various modeling of engineering processes, but I wasn't always doing that. I originally studied economics back in the US, where we were primarily doing econometrics, so frequentist-based statistics. From there I moved to Switzerland to do my master's, and to be with my girlfriend. In my master's I continued in data science, and that's really when I started getting involved in probabilistic programming, and Bayesian statistics in particular. After graduating, I immediately started working in the lab at the university.

Alex: Okay, so a pretty random road, right? It's rare that people end up in Switzerland, especially coming from the US. Or where is my prior wrong?

Gabriel: No, no, yeah, it's a bit odd.

Alex: And so, what do you mean by an Internet of Things lab? I absolutely don't know what that is.
Gabriel: Pretty much, a lot of it has to do with connectivity of hardware. When hardware is connected to the internet, to provide some sort of connectivity, that's what you can think of as the Internet of Things. All the things labeled as smart devices fall under that umbrella term, IoT. And nowadays everything is being IoT-ified: you have your dishwashers connected to the internet, coffee machines, and so forth. That's really what IoT generally means.
Alex: Okay, okay. That sounds very algorithmic and deep-learning-heavy, doesn't it?

Gabriel: Within our group we have a very wide spread of knowledge: you have people who are real specialists in networking and hardware, then data storage and processing, and then, like me, people more on the machine learning and data analysis side. For me, it's: how do you analyze the data coming from various sorts of machines, whether that's manufacturing machines and so forth?
Alex: Okay. And how did you end up doing that? Because that's not what you studied, right? So how did that happen?

Gabriel: I think it came from my bachelor's, from econometrics, doing a lot of time series work. And you can think of IoT as also being a lot of time series work, because when you have these sensors hooked up to the machines, you're logging time series: depending on the frequency, 10 hertz, 50 hertz, you're logging anywhere from one measurement every second to dozens of measurements every second. With that, you get a really nice stream of time series data. I don't know exactly how I got brought into the IoT field specifically, but it stemmed from: oh hey, do you know various time series methods, like seasonal moving averages and state space models? And okay, yeah, some of that can be translated from econometrics over into this IoT role. So it was a bit of a gradual shift.
Alex: Okay, that makes sense. So a lot of time series, state space models, Gaussian processes, I'm guessing, or at least I'm hoping.

Gabriel: Yeah, yeah.

Alex: How did you end up working on Bayes stats in particular? Do you remember when you were first introduced to them, and how often do you use them in your current work?
Gabriel: It goes back to my bachelor's, to the first couple of statistics courses. I remember doing some very basic regression models, and in that course they said: you reject the null hypothesis because of the p-value. And I'm just sitting there thinking: but who, and why? Why is it 0.05, and who came up with that kind of arbitrary threshold? It was always unsatisfying to me: you follow this strict rule and flow diagram, and it's "you do this, or you don't do that." That was always unsettling. From there it was more of a self-discovery, because they never taught Bayesian statistics in my undergrad. So it was: okay, what else is out there? And that's where I came across Richard McElreath's Statistical Rethinking and then Andrew Gelman's Bayesian Data Analysis. That's how I got more introduced to Bayesian statistics. As for how often I use it: pretty much every day. Almost every project I've done in this lab has had to do with probabilistic modeling in some form or another.
Alex: And why is that? How come Bayesian stats seem so interesting and important to your work, and what do they bring that you can't get in the frequentist framework?
Gabriel: The big thing I see with sensors, and IoT in general, particularly in the problems I'm solving, is that, first off, you have a lot of sensor noise. These sensors, and the processes they're measuring, aren't perfect. For example, if you're attaching a sensor to a manufacturing machine, the speeds that sensor is logging aren't necessarily exact; they can fluctuate a little bit. And not only that, the process it's measuring isn't always perfect either. So I look at that and think: probabilistic programming is actually a really good fit here, because we can begin to model the uncertainty, some of the sensor noise, and then the manufacturing process itself. Being able to quantify the uncertainty there is very powerful, because it lets you account for some of the noise in the process and in the measurements. But at the same time, it's really difficult, because you can imagine that some of these settings are logging a lot of data, and traditionally, Bayesian computational methods aren't very good with big data. So in my day-to-day I often see this friction between big data and Bayes. Maybe we can talk about that in a little bit, but you have that kind of friction.
Alex: Yeah, exactly, that's where I was going, and that's where my astonishment comes from. How do you manage to combine this need for uncertainty quantification and intuitive uncertainty interpretation with the need to actually run the models? Unlike in the frequentist world, you need to run the inference, and you have a lot of data; that can be a bottleneck. So how do you thread that needle?
Gabriel: I'd say there are three general techniques that we use. The first one is probably what everyone thinks of: you take your raw data and perform some aggregation on top of it, some sort of resampling, to reduce the size of the data, and then you just continue to apply your usual MCMC on that. The second one is other inference methods, like variational inference. That has proven to be a very good fit, because variational inference methods often need some sort of approximating strategy, and since we have a lot of data, we can come up with a nice subsampling scheme to use within the variational inference method. And the last one is that, luckily, we do have some hardware at our lab, so we can just throw GPUs at the problem. We can use JAX-based tools, so NumPyro, Pyro, and these more traditional deep learning frameworks, for GPU acceleration.

Alex: Yeah, that last approach is quite nice, because you don't have to think too much, right? A NumPyro or PyMC model can just run out of the box on JAX, and you get GPU acceleration without having to do anything else.
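To make the first of those techniques concrete, here is a minimal sketch (my own illustration, not code from Gabriel's lab) of aggregating a raw sensor stream before inference: block-averaging a simulated 10 Hz signal down to 1 Hz shrinks the dataset tenfold while preserving the trend that MCMC will see. All names and numbers are hypothetical.

```python
import numpy as np

def block_average(stream: np.ndarray, factor: int) -> np.ndarray:
    """Downsample a 1-D sensor stream by averaging non-overlapping blocks.

    A 10 Hz stream with factor=10 becomes a 1 Hz stream, shrinking the
    dataset tenfold before any MCMC is run. Trailing samples that do not
    fill a complete block are dropped.
    """
    n = (len(stream) // factor) * factor          # usable length
    return stream[:n].reshape(-1, factor).mean(axis=1)

# Simulated noisy 10 Hz sensor log: one hour of data = 36_000 points.
rng = np.random.default_rng(0)
raw = 5.0 + 0.3 * rng.standard_normal(36_000)

coarse = block_average(raw, factor=10)            # -> 3_600 points at 1 Hz
print(coarse.shape)                               # (3600,)
```

The same idea extends to other summaries (medians, max-pooling for spike detection); the point is simply to shrink the data volume before handing it to a sampler.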
Alex: So I would say, if you have the compute available, I would do that, especially since having to come up with a customized variational inference scheme for big data is much more intricate. And I'm curious about your experience with the different VI algorithms. Maybe you can give a lay of the land to the listeners: where do you see these methods and the different algorithms being useful or not, and what are your practical recommendations?

Gabriel: In that regard, I've mainly used the standard implementations that NumPyro offers, so their autoguides, their mean field, and so forth. We were using those primarily for hierarchical models, and I found that, out of the box, they worked quite well, so I never really had to go off on tangents to figure out which ones work or not.
Alex: Yeah, for sure. I know we're making a lot of effort on the PyMC side too to have more VI. There's a lot of work being done through Google Summer of Code on improving out-of-the-box VI, ADVI included; Jessica Bowski did a lot of work on that. There's the Laplace approximation in pymc-extras, which you can now use in conjunction with the MAP fit, using the MAP estimate to initialize the Laplace approximation. There's also the Pathfinder algorithm, already available in pymc-extras; we'll have a dedicated episode about it with Michael Cao, who developed the Pathfinder module in pymc-extras, so folks, stay tuned for that. And I think I'm even forgetting one VI method we're adding right now, but maybe it will come back to me. So yes, there's a lot of activity on the PyMC side too, and I find that really great, because I think we've been collectively trying, in the last few years, to improve this as a community, or at least to make people more aware of these different algorithms; they can come in very handy.

Gabriel: Who was kind of the pioneer there? At least from an outsider's perspective, it seems a bit like Pyro, and Stan also did some.
Alex: Yeah, Stan has a lot of that. Pathfinder was developed by Bob Carpenter and his team. At the beginning it was developed as an initialization method for NUTS, but they realized the results in themselves were really good, so they also released it as a standalone algorithm. Something you can definitely do as well is initialize NUTS with the Pathfinder results, which can be very useful. So Stan has a lot, NumPyro has a lot. We've had the ADVI module in PyMC for a long time now; now it's getting a bit more love, and we have Pathfinder, as I just said. And I'm sure I'm forgetting one method, but it will come back to me.
Alex: And actually, if you folks want an introduction to these different methods, there's a talk I co-wrote with Chris Fonnesbeck and Michael Cao, whom I was talking about a few minutes ago, for PyData Virginia. I'll put the YouTube video in the show notes, along with the GitHub repo. I was also supposed to fly over to Virginia, but I didn't find any affordable flights, so Chris ended up being the sole presenter. I think he doesn't hate me; he loves presenting, and he's so good at it. The video is awesome; he obviously did a great job presenting the material. So I'll put that in the show notes, folks, because it's a very good lay of the land of what you can do with VI and what the different algorithms are.
Gabriel: And keep an eye out: I think Juan also recently gave a talk at PyData Berlin.

Alex: Yeah, exactly, that's what I was going to say. The videos are not released yet at the time of recording, and I don't think they will be anytime soon; they usually take about two months to release them. So folks, keep an eye on the PyData Berlin YouTube channel and watch Juan Orduz's talk there, where he basically builds on top of our PyData Virginia presentation and shows practical implementations of VI, especially SVI with NumPyro, with really good practical advice. So when the video comes out, it's a very good one.
Alex: And actually, I'm curious whether there's anything you do in particular, Gabriel, when you use VI, to try to make sure the results you're getting back are reasonably close to the posterior. We have those guarantees with NUTS, with MCMC, but we don't with VI algorithms. Usually, something I do is try the model on fake data and make sure it can recover the parameters of interest within a reasonably close range, or, for the parameters it can't recover, try to see if there's a pattern in the bias. Then at least we know there is a bias in the model, and that's really very helpful, because if you can at least get a model running with VI, even knowing there's a small bias, I would argue that's already better than having no model at all because you insist on MCMC. So I think this is useful, but I'm sure you're doing much better things than that, because you have more experience than me on that front.
Gabriel: No, not really, because I take pretty much the same approach. Before scaling out to a more complex model on a problem in industry, I try to simulate what that data or engineering process looks like, and then check, pretty much exactly as you said, how well the algorithm is recovering the parameters, or the posterior, and whether it's able to actually model the problem at hand. It's funny, that's pretty much what I do as well.
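The workflow Alex and Gabriel describe can be sketched in a few lines. This is a hedged toy example: a linear model with known parameters stands in for the real engineering process, and ordinary least squares stands in for whatever approximate inference (VI, MCMC) you would actually run; the point is the simulate, fit, compare-to-truth loop.

```python
import numpy as np

rng = np.random.default_rng(42)

# 1. Choose "true" parameters for the simulated process.
true_intercept, true_slope, noise_sd = 2.0, -0.7, 0.5

# 2. Simulate data from the model we intend to fit.
x = rng.uniform(0.0, 10.0, size=500)
y = true_intercept + true_slope * x + noise_sd * rng.standard_normal(500)

# 3. Fit. Plain least squares stands in here for whatever approximate
#    inference (VI, MCMC, ...) you would actually run on the real model.
X = np.column_stack([np.ones_like(x), x])
est_intercept, est_slope = np.linalg.lstsq(X, y, rcond=None)[0]

# 4. Check recovery: estimates should land close to the truth. Systematic
#    misses here point at a bias in the inference, or a bug in the model.
print(est_intercept, est_slope)
```

Repeating this over many simulated datasets turns the spot check into a calibration study: a consistent pattern in the misses is exactly the kind of bias Alex mentions.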
Alex: Okay, cool. So I'm not doing something obviously stupid; that's good, that is rare. So, anything you want to add about VI, things you've noticed in the wild that worked particularly well or particularly badly, before we move on to another topic?
Alex: So, something I really want to talk to you about, because it's something you've worked on for a long time, and it's really a masterpiece, so thank you first for doing it: your re-implementation of BART, Bayesian additive regression trees, in Rust. People probably know you can do BART models with PyMC, in a package called PyMC-BART. It's an awesome package; I use it whenever I can. But it has the drawback of regression trees, which, if I remember correctly, is that you have as many parameters as you have rows in your dataset. That means the computational demands grow pretty fast: when you start passing 200k observations, it becomes really slow to infer. So what you did is re-implement the sampling algorithm, which is Metropolis-within-Gibbs, I think, or something like that. What algorithm is it, particle Gibbs?

Gabriel: Yes, that's particle Gibbs.

Alex: Particle Gibbs, and you re-implemented that in Rust. So can you talk about that? Basically, why did you start doing it? Give us the elevator pitch for the project before we dive deeper.
Gabriel: The PyMC-BART project, this really comes from Osvaldo. About a year ago he reached out and said: hey, we need to make this thing faster, are you interested? And I'm like: I'm all for it, let's do it. Before that I really hadn't used BART and wasn't too familiar with the method; I was familiar with gradient boosting techniques, which are somewhat similar. But I did have experience with Rust, so that was a good complement. I saw what Adrian Seyboldt is doing with nutpie and Rust, and thought maybe we can share some of the code he's been writing and use it within PyMC-BART, to help at least with the log-probability evaluations and so forth. So this really stemmed from Osvaldo wanting to make it more performant, and me stepping on board and saying: okay, let's re-implement this in Rust and share some of the code base from nutpie.
Alex: Okay. And how was the experience? Was Rust all new to you? How do you even start on such a huge project?

Gabriel: I did have prior experience with Rust from some data processing pipelines in the IoT lab, so the Rust part wasn't entirely new to me. What was new was interoperating with Python, having Python bindings, so that when the Python user calls the BART code, it executes the Rust implementation. As for the implementation process, the approach I took was essentially: let's implement this one to one from the Python implementation into Rust, and from there we can start to optimize the different functions or methods. That way we get a nice performance improvement, instead of immediately rewriting something and then not knowing: okay, now this isn't working right, where did it go wrong, and so forth. I don't know if you want to talk about some of the Rust specifics, or the algorithm specifics?
Alex: Yeah, maybe. It's been a while since we talked about BART and regression trees on the show, so maybe you can introduce tree methods in general; you mentioned gradient boosting, and we obviously mentioned BART. Give us the elevator pitch for BART and tree methods in general, and then I think it will be useful to dive into a bit more of the technical details of the algorithm, to understand how the method really works, why people could be interested in using BART, and in which cases.
Gabriel: At a high level, you have these tree-based methods, and at the simplest level you have your decision tree. That's your logic: if this variable is greater than some value, you go down the tree, and you finally get to a leaf node, and that's your prediction for the target, your response variable. Building up off the decision tree, you can have a random forest, which is a bunch of those decision trees put together into a forest. But then, building on top of that, you have gradient boosting methods, and these are really methods where you learn the residual, the difference between the trees' predictions and the target. When you do that, it's kind of like a meta-learner: you're learning where each tree does better, to come up with a better-predicting ensemble. And this is really where BART is aligned: with the gradient boosted methods rather than with random forests, because BART is doing much the same thing as these gradient boosted methods. But the way it assembles the trees is different: BART assembles these trees by taking random perturbations and then assessing the log-likelihood of that tree.
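A toy sketch of the residual-fitting idea Gabriel describes (my own illustration, not BART or any library's code): each new depth-1 "stump" is fit to what the current ensemble still gets wrong, and its contribution is shrunk before being added.

```python
import numpy as np

def best_stump(x, r):
    """Fit a depth-1 'tree': find the threshold split minimising squared
    error against the residual r, returning (threshold, left_mean, right_mean)."""
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lm, rm = best
    return t, lm, rm

def stump_predict(x, stump):
    t, lm, rm = stump
    return np.where(x <= t, lm, rm)

# Toy regression target.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(200)

# Boosting: each new stump is fit to the residual the ensemble leaves behind.
pred = np.zeros_like(y)
for _ in range(50):
    stump = best_stump(x, y - pred)          # learn what is still unexplained
    pred += 0.3 * stump_predict(x, stump)    # shrink each tree's contribution

print(np.mean((y - pred) ** 2))
```

BART keeps this sum-of-trees structure but, as Gabriel says next, replaces the greedy fit with random perturbations scored by log-likelihood, which is what yields a posterior over ensembles.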
Alex: Okay, yeah, so that's closer to the gradient boosting way of doing things, right? Okay, so that's the elevator pitch. Now, when are these models particularly useful, in your experience, and what are their strengths and drawbacks?
Gabriel: One of the strengths: if you want to compare BART with a traditional XGBoost or LightGBM model, one of the big benefits of BART is that you get uncertainty quantification; you have a posterior over decision trees. With that, you can actually stick the model into other things where you want to use the uncertainty. For example, Bayesian optimization traditionally uses Gaussian processes, but you can stick a BART model into the Bayesian optimization routine as well, because you also have the uncertainty there. One of the big drawbacks of BART is that it's famously slow compared to XGBoost or LightGBM. But another nice thing about BART: with XGBoost and LightGBM it's very easy to overfit your data, so you need to look at a lot of loss curves and figure out: okay, when do I stop training, how many trees do I use, how many learners, and so forth, to stop the trees from overfitting. With BART it's really nice, because we have regularizing techniques built in, so we inherently avoid overfitting within the method. That's one really nice pro I see with BART over the others. But yeah, I'd say the big con is that it's significantly slower than the other ones, and that's for multiple reasons.
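The plug-a-surrogate-into-Bayesian-optimization idea Gabriel mentions only needs a posterior mean and standard deviation at candidate points. Here is a hedged sketch of the expected-improvement acquisition (for minimization) computed from a dummy surrogate posterior; in practice `mu` and `sd` would be read off a fitted BART or GP posterior, and the numbers below are entirely made up.

```python
import math
import numpy as np

def expected_improvement(mu, sd, f_best):
    """Expected improvement (for minimisation) of candidate points, given
    the surrogate's posterior mean `mu` and standard deviation `sd`.
    Any model with a posterior over functions (GP, BART, ...) can supply them."""
    mu, sd = np.asarray(mu, float), np.asarray(sd, float)
    z = (f_best - mu) / sd
    Phi = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))  # normal CDF
    phi = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)            # normal PDF
    return (f_best - mu) * Phi + sd * phi

# Dummy surrogate posterior over five candidate points (stand-in for the
# mean/sd you would read off a fitted BART or GP posterior).
mu = np.array([1.0, 0.8, 1.2, 0.9, 0.5])
sd = np.array([0.1, 0.3, 0.05, 0.6, 0.2])
f_best = 0.9                                   # best observation so far

ei = expected_improvement(mu, sd, f_best)
print(ei.argmax())   # next point to evaluate: highest expected improvement
```

Note how the acquisition trades off low predicted mean against high uncertainty, which is exactly why the surrogate's uncertainty quantification matters.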
Alex: Yeah, thanks. This is much clearer to me now, and I hope it is for the listeners too. So now I think it's a good time to dive into why that would be: where the bottlenecks are, what the algorithm is per se, and how it works under the hood, so that people really understand the models when they use them.
Gabriel: So, in PyMC-BART we implement, as I think we stated before, particle Gibbs, whereas other implementations might implement a Metropolis-Hastings approach. With the particle Gibbs steps, the way the algorithm works is that we generate a set of trees, maybe 50. In PyMC-BART you define the number of trees and the number of particles; so, for example, you might say: okay, we want 50 trees and 10 particles. Now we're going to perform a series of particle Gibbs steps. At the first step, we loop through all 50 trees. For the first tree, we initialize the 10 particles, or however many you defined, and those 10 particles are just decision trees. We perturb each one: maybe for one we sample a variable and a certain split value, and for another one, another split value. Then we assess the log-likelihood, and at the end we say: okay, this particle, maybe particle 5 out of the 10, is going to replace the current tree, which is tree 1. Then we proceed to the next tree, tree number 2, and go through that same process: initialize 10 particles, perturb each one, weight them according to the log-likelihood, and replace that tree. And we continue until all 50 trees have essentially been replaced. At a high level, those are the main algorithmic steps. It's really quite a simple process, which is quite surprising if you read a lot of these papers.
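Gabriel's description can be sketched as a loop. This is a deliberately simplified toy (stumps instead of real trees, a Gaussian log-likelihood, naive perturbations), meant only to show the shape of one sweep: for each tree, spawn particles, perturb them, weight by the log-likelihood of the full ensemble, and sample a replacement. It is not PyMC-BART's actual code.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy data and "trees": each tree is a stump (threshold, left value, right value).
x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(100)

n_trees, n_particles, sigma = 50, 10, 0.1

def predict(tree, x):
    t, lo, hi = tree
    return np.where(x <= t, lo, hi)

def perturb(tree):
    # Random perturbation: jitter the split point and the leaf values.
    t, lo, hi = tree
    return (np.clip(t + 0.1 * rng.standard_normal(), 0, 1),
            lo + 0.1 * rng.standard_normal(),
            hi + 0.1 * rng.standard_normal())

def log_lik(pred):
    return -0.5 * np.sum((y - pred) ** 2) / sigma**2

# Initialise the ensemble; each tree contributes 1/n_trees of the mean of y.
trees = [(0.5, y.mean() / n_trees, y.mean() / n_trees) for _ in range(n_trees)]
ensemble = sum(predict(t, x) for t in trees)

for i in range(n_trees):                       # one particle-Gibbs sweep
    partial = ensemble - predict(trees[i], x)  # ensemble without tree i
    particles = [perturb(trees[i]) for _ in range(n_particles)]
    logw = np.array([log_lik(partial + predict(p, x)) for p in particles])
    w = np.exp(logw - logw.max())
    w /= w.sum()                               # weights from the log-likelihoods
    pick = rng.choice(n_particles, p=w)        # sample a particle to keep
    trees[i] = particles[pick]
    ensemble = partial + predict(trees[i], x)

print(np.mean((y - ensemble) ** 2))
```

Running many such sweeps, with proper tree-growing moves and priors in place of these jitters, is what produces the posterior over sums of trees.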
Alex: And were you already that versed in BART and tree methods before working on this, or did you get that knowledge by working on the project?
Gabriel: Not really. It was mainly knowledge from working on the project and reading the code base that Osvaldo and others wrote, which was quite readable: really nice, procedural, line by line, "oh, this is what it's doing." That really helped with the intuitive understanding of what the particle Gibbs is doing.
Alex: And I second that: the code base is really well done and well written, and it's quite easy to start contributing to the package. This is really awesome, because I've dabbled a bit with BART for my baseball work, and I haven't tried your Rust implementation yet. It would be very useful for baseball, because there are a lot of use cases for methods like BART, but there is so much data that you often need acceleration somewhere. So whether it's using classic PyMC-BART on a GPU, or using your Rust implementation and adding a GPU on top of that, it should be a really good boost to sampling speed.
Gabriel: Yeah. And I must say, one of the really nice things about PyMC-BART is the several enhancements it has. If you look around online, a lot of the other packages are specifically for Gaussian likelihoods; that's the first one, so you can't really model, say, a Poisson process or anything else. The second one is that we also offer various split rules: if your design matrix has numerical features and categorical features, you can pass split rules specific to each data type, and this is uncommon in other packages, which often just assume everything is a numerical value. Those are the two things I think really differentiate our package. But lastly, we have the BART random variable and the way it's embedded in PyMC, so you can model the linear predictor, or you can model sigma, the noise parameter. That's really nice, because you can build essentially arbitrary probabilistic programs with BART, whereas with other packages it's more: you use that method, and that is the only way you use it.
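The per-data-type split rules Gabriel mentions can be illustrated with a toy proposal function (my own sketch, not PyMC-BART's actual API): continuous features split on a sampled threshold, while categorical features split on a sampled subset of the observed categories.

```python
import numpy as np

rng = np.random.default_rng(3)

def propose_split(column, kind):
    """Propose a split rule for one feature column.

    Continuous features split on a sampled threshold ("x <= t goes left");
    categorical features split on a sampled subset of observed categories
    ("x in S goes left"). Mirrors the idea of per-dtype split rules."""
    if kind == "continuous":
        t = rng.choice(column)                      # threshold from observed values
        return lambda x: x <= t
    if kind == "categorical":
        cats = np.unique(column)
        size = rng.integers(1, len(cats))           # non-empty proper subset
        left = rng.choice(cats, size=size, replace=False)
        return lambda x: np.isin(x, left)
    raise ValueError(f"unknown feature kind: {kind}")

speed = rng.uniform(0, 100, size=8)                 # continuous sensor reading
machine = np.array(["a", "b", "c", "a", "b", "c", "a", "b"])  # categorical id

go_left_speed = propose_split(speed, "continuous")(speed)
go_left_machine = propose_split(machine, "categorical")(machine)
print(go_left_speed, go_left_machine)
```

Treating a machine ID as a number would impose an arbitrary ordering on the categories; a subset-based split avoids that, which is the point of offering rules per data type.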
Alex: Yeah, exactly, this is actually a very good point: it's modular, you can add it as a component in a PyMC model. So you could model your linear predictor with a classic linear regression, and then model your sigma, your standard deviation, with a BART random variable. This is very useful. And I must say that recently, in the current Python implementation, support has been added for more than one BART random variable, which is really great and something that had been requested. So you could do two different BARTs on two different parameters; this is really awesome. In a way, that's starting to look a lot like the GP submodule: GPs, you can add them to PyMC models as you want, and you can have different GPs for any number of parameters in your model. You really cannot do that with GP-focused packages; with most of them, you just use the GP as-is and can't do anything else. And often there are also likelihood limits: in PyMC you can use a GP with any likelihood distribution, while in most packages it's often a normal likelihood at best.
normal normal likelihood biggest that's often hard good yeah how is it so I know on the bar
spk_0
Python pure Python bar we can use any likelihood we want how is it on the rest side now is
spk_0
I remember at the very beginning you had not included yet categorical multinomial ability to
spk_0
to use that kind of likelihood of course I always use that likelihood so I was like damn gonna
spk_0
choose that yet but yeah how how easy it right now when it comes to the likelihood especially the
spk_0
most multi-dimensional one which always I know much more of a pain to develop
spk_0
Yeah, so in regards to the current state of the Rust implementation, there are still some things that aren't implemented one-to-one yet, and I'm still working on that. But the likelihoods — that's been resolved, so you can model multiple different likelihoods. I think the one you were specifically asking about was the different split rules, the categorical and continuous split rules, and those are also now implemented in the Rust implementation. The one thing that's not there yet is multiple BART random variables — I'm still working out some bugs there, so that's still being implemented on our end.
spk_0
Okay, so concretely we can do anything with the Rust implementation that we can with PyMC-BART, except for having more than one BART random variable in the model — otherwise everything is on par right now? Amazing, that's cool. Thanks, Gabriel.
spk_0
That means we'll now be able to use that much, much more on baseball data — this is going to be super fun. And how do you squeeze that in, actually? Is it part of your job, or is it something you do on the side? And maybe you have some advice for people who are interested in doing open-source work like you do — some practical advice on how to squeeze it into work and free time. Because in the end, this is really what research is about, right? Trying to push the envelope on frontier topics that are not only going to pay off for your project, but for your company as a whole, and for a lot of other people.
spk_0
Yeah — luckily, a lot of the stuff I do at work uses these tools, so if our team and I see, hey, it would be really nice if we could speed up BART because it would help our problem, then doing the open source at work aligns quite nicely. But if the problems aren't really related, then that's in my own time. In regards to contributing more generally — honestly, the PyMC and Bambi community is, I think, one of the best in the scientific open-source world. Everyone is very inviting and willing to help. My advice to people starting out is: don't bite off more than you can chew. Pick the low-hanging fruit and then work your way up from there. I've found that to be a fairly safe approach, and it goes over better with the maintainers that way.
spk_0
Yeah, that sounds right, and that's what I recommend to people who reach out to me, too. Maybe one last question on BART: since you use it a lot in your work, what's your experience with these models? What do you find they are very useful for, and where do you see their limitations?
spk_0
Yeah, so I've used them in two scenarios. One of them is embedding BART in a Bayesian optimization routine, which you talked about with Max. The other one is for a time series process that exhibits a kind of partitioned, blocky structure. The time series isn't really smooth — it has this block structure: from point A to point B it's a constant value, then in the next time interval it shoots up to another value and is constant for a little bit. This is quite nice because tree methods are essentially piecewise constant functions, so they're able to model that inherently, quite nicely. It's a very raw, weird time series — I mean, no time series is really continuous, but here you don't even have enough points for it to look continuous.
spk_0
And so, at bottom, the discreteness of the tree structure here is a feature, an asset.
spk_0
Yeah, exactly. And you see this come up in sensor-based time series quite often. If you look at certain profiles over time, you see that block-looking structure as well, and then you think: maybe these tree-based methods might work here.
spk_0
Okay, this is very interesting — I love it. Thanks. Actually, two other questions on that. What about the time intervals? A lot of the time, having fixed time intervals is much easier to deal with. What's your experience here with BART models — are they at all sensitive to the fact that sometimes the time intervals are not equal, which I guess might be the case with sensor data? And related to that, what about missing data, and what about out-of-sample predictions? I know it's a big question, but it's all related.
spk_0
So, in regards to the time intervals — you're saying when the time intervals are unequal, like the time between measurements? Okay. So in regards to BART, or just tree methods in general, I think they're very good for interpolating missing values, because you can impute or interpolate that inherently within the tree. If you have a sensor that didn't log a measurement over a certain time period, but then comes back online and continues logging, with tree methods you do get a nice interpolation there. So you don't really need to do any feature processing beforehand, which is nice because it's handled inherently within the model.
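The blocky, piecewise-constant behavior Gabriel describes is easy to see with a toy one-split regression tree (an illustrative sketch, not the BART sampler itself): trained on a sensor series with an outage, the single best split lands between the two plateaus, and predictions inside the gap are interpolated for free.

```python
import numpy as np

def fit_stump(t, y):
    """Find the single split on t that minimizes squared error."""
    best = (np.inf, None)
    for thr in (t[:-1] + t[1:]) / 2:  # candidate thresholds between points
        left, right = y[t <= thr], y[t > thr]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, (thr, left.mean(), right.mean()))
    return best[1]

def predict_stump(stump, t):
    thr, left_mean, right_mean = stump
    return np.where(t <= thr, left_mean, right_mean)

# Blocky sensor series: constant at 1.0, then jumps to 5.0; the sensor
# is offline between t=3 and t=8, so there is a gap in the log.
t = np.array([0., 1., 2., 3., 8., 9., 10.])
y = np.array([1., 1., 1., 1., 5., 5., 5.])
stump = fit_stump(t, y)

# The split lands at t=5.5, inside the outage; queries in the gap are
# interpolated from the surrounding leaves with no preprocessing.
print(predict_stump(stump, np.array([5.0, 6.0])))  # [1. 5.]
```

A real BART posterior averages many such trees, so the jump location itself gets uncertainty attached to it, but the piecewise-constant mechanics are the same.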
spk_0
Okay, yeah — that's great, and that's what I thought: basically, when there is no fixed time interval, it's like a missing-data problem. So they are very good at interpolation — how good are they at extrapolation, at really doing out-of-sample predictions? How does that work here?
spk_0
Yeah — luckily, I haven't really had to use it for out-of-sample predictions.
spk_0
Interesting. I mean, obviously I'm asking because I know tree methods are not good at out-of-sample predictions.
spk_0
And I'm glad I haven't had to. I think that's actually one of the reasons I chose to use it: for a couple of the problems we were modeling — for example, if you have the actuator limits of a robot, you have pretty clear upper and lower bounds from the engineering process, so you know you're not going to be extrapolating past them. And so with BART you have nice interpolation within these actuator limits.
spk_0
Yeah, exactly. And that's actually why I haven't been able to use BART models in production yet, other than for exploring and teaching: most of the time, I work on actual out-of-sample data. Say I work on players — the age of players is not really out of sample, in that all players are human, so you'll never have a player who is 120 years old. But if you're looking at seasons, for instance, well, the years really are out of sample, and so there it's a problem. Or the players themselves are out of sample: what about a player you never saw in your training set? That's often why I couldn't choose tree methods or BART methods — they don't extrapolate, in comparison to Gaussian processes, which are really good at prediction in general, or state-space models.
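The extrapolation limitation is just as easy to demonstrate with a toy piecewise-constant fit (an illustrative sketch, not any particular package): inside the training range the leaves track the trend, but any query beyond the last split falls into the rightmost leaf and gets a flat prediction, no matter how strong the trend is.

```python
import numpy as np

# A linearly increasing series...
t = np.arange(10, dtype=float)
y = 2.0 * t

# ...fit by a "tree" with one leaf per training point: within the
# training range this is a perfect piecewise-constant approximation.
leaf_edges = t
leaf_values = y

def tree_predict(query):
    # Each query falls into the leaf of the nearest training point on its
    # left; anything past the last split lands in the rightmost leaf.
    idx = np.searchsorted(leaf_edges, query, side="right") - 1
    return leaf_values[np.clip(idx, 0, len(leaf_values) - 1)]

print(tree_predict(np.array([4.5])))   # in-range: leaf value 8.0, near the trend
print(tree_predict(np.array([50.0])))  # out-of-range: stuck at the last leaf, 18.0
```

The true value at t=50 is 100, but the tree can never predict anything outside the range of leaf values it learned — which is exactly why bounded domains like actuator limits are the comfortable setting for BART, while a GP with a trend-aware mean or kernel degrades more gracefully out of range.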
spk_0
Okay, awesome. And one last question on BART, I swear: what do you mean by using them in optimization routines? I find that super interesting.
spk_0
Yep — so in regards to the optimization routine, I was specifically talking about Bayesian optimization. Essentially, Bayesian optimization is a sequential optimization process where you typically have some sort of surrogate model — typically a Gaussian process, but it can really be any other method that provides a posterior. So I'm essentially swapping out that GP, putting in the BART model, and using that to optimize some industrial process. With this iterative method, what we're doing is training the model on the historical data, then using — I don't have to get into the details — some sort of function or generator to produce a new set of feature values, or design points, then evaluating those with BART, and then running the loop again: retrain the model, generate some new values, evaluate with BART, and so forth. That's what I mean, generally, by Bayesian optimization — and it's Bayesian because we're using probabilistic methods, from what I can tell.
spk_0
Okay — so is it that BART is included in your loss function?
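The loop just outlined can be sketched end to end in Python. This is a toy stand-in, not Gabriel's system: a bootstrap ensemble of quadratic fits plays the role of the BART (or GP) surrogate's posterior, and a lower-confidence-bound rule plays the acquisition function.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Stand-in for the real industrial process (e.g., scrap rate);
    # unknown in practice, only observable through noisy evaluations
    return (x - 2.0) ** 2 + rng.normal(0, 0.1, np.shape(x))

def surrogate_posterior(X, y, candidates, n_boot=30):
    # Bootstrap ensemble of quadratic fits as a cheap surrogate posterior
    # (Gabriel swaps in BART here; a GP is the classic choice)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), len(X))
        coeffs = np.polyfit(X[idx], y[idx], 2)
        preds.append(np.polyval(coeffs, candidates))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# 1) start from historical data
X = rng.uniform(-5, 5, 8)
y = objective(X)

for _ in range(10):
    # 2) generate candidate design points
    candidates = rng.uniform(-5, 5, 64)
    mu, sigma = surrogate_posterior(X, y, candidates)
    # 3) acquisition: lower confidence bound (favor low mean, high uncertainty)
    x_next = candidates[np.argmin(mu - 1.0 * sigma)]
    # 4) evaluate the real process, augment the data, retrain next iteration
    X, y = np.append(X, x_next), np.append(y, objective(x_next))

print(X[np.argmin(y)])  # best input found; the true optimum is x = 2.0
```

Swapping the surrogate for one with a full posterior (BART, a GP) is what makes the "Bayesian" part more than a label: the acquisition rule can then trade off exploitation against genuine posterior uncertainty.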
spk_0
It's the surrogate model. So, for example: if you want to optimize a machine for the scrap rate — how much scrap an industrial machine is producing — you probably don't know the physical equations that govern or produce that scrap. So what's the next best thing we can do? We can turn to data-driven methods. We collect data about the process — maybe you have sensor measurements of how fast the robot arm is moving, how fast material is being fed into the machine — and then you also have measurements like "this much scrap was produced," "no scrap was produced," and so forth. We then use BART, or the GP, to learn the association between the parameters governing the process and whatever metric you're tracking. Now that you have that, that's your function — your mapping from inputs to outputs. And then, within the Bayesian optimization framework, or loop, you're deciding: hey, we want to optimize, we want to produce as little scrap as possible, so we're going to use this model we just trained to propose, or select, the values that produce the least amount of scrap. Does that make sense?
spk_0
Okay, yeah, I think it does. So this is not really that you're using BART inside a loss function when doing optimization — this is something different.
spk_0
Not necessarily in the loss function, no.
spk_0
Okay, nice. Do you have any public writing about that, for people to look at if they're interested in these kinds of methods?
spk_0
We are writing a paper, but it's not published yet, so unfortunately, no.
spk_0
Okay — well, let me know when it is, because then we'll publish that in the LBS sphere, which, as you know, is extremely powerful. So I think that's a good summary of everything. Do you have anything to add, anything I forgot to ask you, or do you think we already did a good job of giving people an idea of how they can use this?
spk_0
No — I mean, our goal is essentially to provide backwards compatibility with the Rust implementation, so it's just a drop-in replacement. But the thing we maybe didn't touch on too much, for some of the Rust people out there, is what some of the interesting Rust-y bits were that resulted in some nice performance gains. I think that could be fun to talk about.
spk_0
Yeah, for sure!
spk_0
One of the areas that was nice to implement with Rust is the tree proposals. What we do with PyMC-BART is we have a prior probability over the depth of the tree: if you think of a binary tree, as you add nodes to it, the depth of the tree increases, and we have a prior probability over how deep a tree can be, which you can actually set as a user with the two parameters alpha and beta. In the tree proposals, we propose a variable to split on and a value to split at, and based on that and the prior probability over the depth of the tree, we can say how likely a tree is to be grown — essentially, how likely the depth is to increase. Traditionally, in the original Python implementation, we would always perform a tree proposal and a systematic resampling to propose the particle to replace the tree. But in the Rust implementation we take a lazy approach: we use a smart pointer, reference counting, to defer — to wait to materialize the grown tree until we know we will accept that tree to grow. So beforehand we compute the proposal and say: this is what it would do if it were chosen or selected. Then, if it is selected — okay, materialize, actually compute the result. It's a bit of a lazy way of doing it, and it's really nice because in the systematic resampling we resample according to the weights, or the log-likelihoods, of the trees. So if you have 20 particles, and after systematic resampling you select, say, 10, and those new particles all come from the same one because it had a high weight, we essentially have to perform a deep copy, or a clone, of that tree structure, which can be very expensive. But since we're now using these smart pointers, and only copy if we know we're going to accept the tree proposal, we get a really nice performance boost there.
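The two ingredients Gabriel describes — a depth prior controlled by alpha and beta, and deferring the expensive clone until a particle is actually selected — can be sketched as follows (illustrative Python; the real implementation does this in Rust with the `Rc` reference-counted smart pointer):

```python
import copy

def p_grow(depth, alpha=0.95, beta=2.0):
    # Depth prior in the Chipman et al. style that BART uses:
    # probability that a node at this depth is split further
    return alpha * (1.0 + depth) ** (-beta)

class LazyParticle:
    """Holds a *reference* to a shared tree plus a pending proposal;
    the deep copy happens only if the particle is actually selected."""
    clones = 0

    def __init__(self, tree, proposal):
        self.tree = tree          # shared reference (Rc<Tree> in Rust)
        self.proposal = proposal  # (feature, split value), computed up front

    def materialize(self):
        LazyParticle.clones += 1  # the expensive part, paid only on acceptance
        new_tree = copy.deepcopy(self.tree)
        new_tree.append(self.proposal)
        return new_tree

tree = [("x0", 0.5)]  # a tiny shared "tree": a list of splits
particles = [LazyParticle(tree, ("x1", i)) for i in range(20)]
selected = particles[:3]  # pretend systematic resampling kept these three
grown = [p.materialize() for p in selected]

print(p_grow(0), p_grow(3))   # growing is likely at the root, unlikely deep down
print(LazyParticle.clones)    # 3 deep copies instead of 20
```

The eager version would clone all 20 proposed trees before resampling; the lazy version pays for exactly as many clones as survive selection, which is where the speedup comes from when many particles share one high-weight ancestor.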
spk_0
That's just a lower-level detail that I think is quite cool.
spk_0
Yeah, that is definitely super cool. And so if people want to get started with PyMC-BART, and especially the Rust implementation, what should they do? What should they download?
spk_0
If people want to help, I'd just read the code base to start. Currently the code base is under my repository — I think we can link that in the show notes — and I have several issues there for things that need to be implemented or cleaned up. So I think that would be a very good place to start: just going to the repo and looking at some of the issues. I have them all tagged as "good first issue" and various other tags.
spk_0
Yeah — so I already put that in the show notes; look at that, folks, if you want to start getting involved. I was also asking because, if people want to start using it, what I would advise — and that's going to be in the show notes, too — is the PyMC-BART website: look at the tutorial notebooks, then just install the Rust implementation and run those notebooks with it. You'll see it's literally a drop-in replacement, except when you need two BART random variables, as we were saying — otherwise it's literally the same. I think that's amazing, because it makes it so easy for people to start. And BART models are really good because they are super flexible, they are very easy to understand, and they are usually a very good baseline if you're in a case where tree methods are applicable. So if you are, my practical advice would be: definitely try PyMC-BART, because the model is going to be super easy to write — it just figures out the functional form. It's just one BART variable, you feed that into your likelihood, and you're done, and you see how that works. If you're in the cases we talked about before, where it doesn't work, well, that's going to be for next time. But if you're not in those kinds of cases, I think it's a very good shot.
spk_0
Yep, absolutely.
spk_0
And so, to round things out, Gabriel: I saw that you recently worked on another optimization problem, which is reproducing Uber's marketplace optimization. You have a really good blog post about that, which I put in the show notes, and you also put the code into a GitHub repo that is in the show notes too, folks, if you want to look at it. Do you want to touch on that briefly — basically what it is, what it does, and why people would be interested in it?
spk_0
Uber has a system in place that performs resource allocation. Their problem is this: they're a ride-hailing service with a bunch of different programs, like Uber Eats and your normal driving scenarios, and what Uber can do is influence the marketplace by allocating money to different programs to stimulate supply and demand. This comes from a business problem: as a company, we have a finite amount of money — how much should we allocate to each program within each city, such that we maximize some business metric like gross bookings, which then influences the profit of the company? I was interested in how this even works — how do you perform resource allocation with optimization methods? And what I found quite interesting was that they were embedding a neural network into the optimization algorithm to model the forecasting problem. So you have these two interesting components: the optimization algorithm, and then the fact that they're embedding a neural network into the system to help learn the association between how much money they're allocating and how much that influences a business outcome such as gross bookings.
spk_0
So is that the same idea as what you talked about before? And then you can go back to the optimization.
spk_0
Yeah, it's kind of the same. I don't think a lot of people know this — maybe many do — but you can really embed any machine learning model into an optimization program and then optimize over the features you're using in the model. Essentially, that's what I wanted to do here: embed a machine learning model into an algorithm with an optimization component to produce an optimal allocation. What's really interesting about the optimization algorithm used here, the alternating direction method of multipliers (ADMM), is that it's a distributed optimization algorithm, and it happens in three steps. In the first step, you use the neural network to predict essentially how much gross bookings each city is going to have, given a certain allocation to each program, and you select the values that optimize that objective. In the next step, you perform a consensus step, where you try to get the cities to agree with each other so as to satisfy the constraint — typically something like: Uber can only allocate a million dollars, and we need to divvy up that million across the cities such that the sum does not exceed a million. And then the last step is just a dual update step, and then you iterate over this.
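The three steps can be sketched on a toy allocation problem (all numbers made up; a concave square-root curve stands in for the neural-network forecaster): a per-city maximization, a consensus projection onto the budget constraint, then the dual update.

```python
import numpy as np

# Toy stand-in for the neural-network forecaster: predicted gross bookings
# for city i as a concave function of its allocated budget (made-up gains)
gains = np.array([3.0, 2.0, 1.0])
budget, rho, n = 100.0, 0.1, 3

x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
grid = np.linspace(0.0, budget, 401)  # candidate per-city allocations

for _ in range(200):
    # 1) per-city step: maximize predicted bookings minus the ADMM penalty
    #    (grid search stands in for querying the forecaster)
    for i in range(n):
        obj = gains[i] * np.sqrt(grid) - (rho / 2) * (grid - z[i] + u[i]) ** 2
        x[i] = grid[np.argmax(obj)]
    # 2) consensus step: shift allocations so they exactly respect the budget
    z = x + u - (np.sum(x + u) - budget) / n
    # 3) dual update: remember how far each city sits from consensus
    u = u + x - z

print(np.round(z, 1))  # final allocation; sums to the total budget
```

Because each city's step only needs its own forecaster query, step 1 parallelizes across cities — that distributability is the point of ADMM here. Replacing the point forecaster with a posterior-producing model, as discussed below, would let each allocation carry uncertainty.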
spk_0
It was a really nice exploration, but I think what could be even more interesting — and something I talked about with Warren — is: what if we embedded more of a probabilistic model in there? Then we'd have the entire posterior over our decision space, and we could say: hey, you should allocate between 150,000 and 200,000 to city A for program B. That's where I see this going, in a way: replacing the neural network with more of a probabilistic model, to have uncertainty over our decisions.
spk_0
Really cool — this is really amazing. And that's actually some public writing we can refer people to if they're interested in this idea. You were explaining before about embedding a BART model into an optimization algorithm — I think this is very close, at least; this one uses a neural network, but it's also very, very cool. And I definitely see some applications in baseball — in the sports world in general. So yeah, this is super cool. Thanks, Gabriel. All the links to that are in the show notes. Any other current or upcoming projects you want to talk about before we close up the show — something you're excited about?
spk_0
Not really any current projects in play. Maybe there are a couple of previous projects where I see more probabilistic programming could come into play, but nothing upcoming at the moment.
spk_0
Okay. And I'm curious: what are you curious to see in the coming months and years? Is there something you would really like to see — in the Bayesian world, maybe, but in the data science world in general — that would have a huge impact and potential on your work, on the things you're able to do?
spk_0
I think a recurring theme of a lot of the problems I work on is optimization, and I'd like to see better tooling around embedding or using machine learning models — probabilistic models — within an optimization framework, whether that's Bayesian optimization, traditional convex optimization, or sequential decision making. Typically now, especially at work, I need to hand-roll all of that together myself, and I think it would be really nice to have a package or framework that really helps with that process.
spk_0
Yeah, I agree — it would be something very interesting and very useful. Amazing. Well, thanks a lot, Gabriel. I am very, very excited to try these new things — the BART Rust part, and also the optimization work. So folks, if you want to contribute to the Rust implementation, the links are in the show notes; we're always looking for people who want to make this better for themselves and everybody at the same time, and I'm sure Gabriel will welcome any help on that. Anything to add, Gabriel, before I ask you the last two questions?
spk_0
No, nothing from my side.
spk_0
Good — I'll take that as a sign I did a good job. So I'm going to ask you the last two questions I ask every guest at the end of the show. First: if you had unlimited time and resources, which problem would you try to solve?
spk_0
Mm-hmm. I think I'm going to defer back to what I was just saying about tooling, and in particular, as a specific problem space, sequential decision making. The big idea — the big pitch there — is: what decision should you take now such that your immediate reward is maximized, but that also takes into account the expectation of the future contribution? This problem space of sequential decision making, sequential optimization, is really quite formal in the control theory world, but in regards to business applications I think it's quite lacking, especially in the open-source world. So developing a library or framework for modeling sequential decision problems would be a great step forward — that's something I would really like to work on.
spk_0
Hmm, yeah, that definitely sounds like it would be very useful.
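The "maximize the immediate reward plus the expected future contribution" idea Gabriel pitches is the Bellman recursion; a minimal value-iteration sketch on a made-up two-state, two-action problem (all rewards and transition probabilities invented for illustration):

```python
import numpy as np

# Tiny sequential decision problem: 2 states, 2 actions.
R = np.array([[1.0, 0.0],     # R[state, action]: immediate reward
              [0.0, 2.0]])
P = np.array([[[0.9, 0.1],    # P[state, action, next_state]: transition probs
               [0.1, 0.9]],
              [[0.8, 0.2],
               [0.2, 0.8]]])
gamma = 0.9                   # discount on future contributions

V = np.zeros(2)
for _ in range(500):
    # Bellman update: Q(s, a) = r(s, a) + gamma * E[V(s')]
    Q = R + gamma * (P @ V)
    V = Q.max(axis=1)

policy = Q.argmax(axis=1)
print(policy)  # best action to take now, in each state
```

Here the recursion learns that state 0 should forgo its small immediate reward to reach state 1, where the larger reward lives — exactly the "act now while accounting for the future" trade-off; real business problems swap the known `R` and `P` for learned (ideally probabilistic) models.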
spk_0
And the second question: if you could have dinner with any great scientific mind — dead, alive, or fictional — who would it be?
spk_0
So, I already had dinner with Tomás Capretta in Buenos Aires, at a restaurant you recommended —
spk_0
So I feel like I was at that dinner, too, you know.
spk_0
Yeah. But I would probably say Richard Feynman, because I've read some of his biographies, and I think it would just be a fun dinner, right? A lot of technical people can be quite boring or socially awkward, but Feynman being both technical and fun, I think it would be a very good dinner experience.
spk_0
Yeah, definitely great on all those counts — Feynman sounded very interesting and cool, and, well, technical people can be what you said. So this is a great choice. Your Feynman dinner is getting crowded, though, I can tell you — this is a popular choice, so we're going to have to scooch over at the dinner table. But you know, we should go to Buenos Aires, to that same restaurant — I'm sure Feynman would have a lot of things to say about it.
spk_0
I think so, too. I forget the name, otherwise I would recommend it right now.
spk_0
Yeah, me too, actually — I'm blanking on the name. Tomás, come to our rescue! Awesome. Well, Gabriel, that was a great show. Thank you so much for taking the time. The show notes are going to be full for this one, folks, so make sure to take a look at them. And, Gabriel, next time you have a fun and useful project like that, you're welcome anytime on the show. Otherwise, I'm really looking forward to meeting you in person in Switzerland at some point — I'm definitely going to come and do some hiking over there, which my wife and I love. Gabriel, thank you again for taking the time and being on this show.
spk_0
Yeah, thank you so much — it's been a lot of fun.
spk_0
This has been another episode of Learning Bayesian Statistics. Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach a true Bayesian state of mind. That's learnbayesstats.com. Our theme music is "Good Bayesian" by Baba Brinkman, feat. MC Lars and Mega Ran — check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/learnbayesstats. Thank you so much for listening and for your support. You're truly a good Bayesian — change your predictions after taking information in. And if you're thinking of me less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian: change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.
Topics Covered
Bayesian additive regression trees
BART implementation
probabilistic programming
optimization techniques
time series analysis
Gaussian processes
missing data handling
open source collaboration
community support in data science
IoT data modeling
sensor noise quantification
Bayesian statistics
variational inference methods
machine learning integration
PyMC Labs