#220 - Gemini 2.5 Flash Image, Claude for Chrome, DeepConf

spk_0 Hello and welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI. In this case it's a bit more like the last month in AI. Unfortunately we've had to skip a few weeks: Jeremy is busy as always with exciting reporting and, I don't know, his national security work, and I've been traveling. So, as I always say, sorry for the missed weeks. We'll try to be back on a regular schedule going forward.

spk_0 In this episode we will summarize and discuss some of last week's most interesting news, and a bit of the week before as well. You can go to lastweekin.ai for our text newsletter, which does go out every week (well, most weeks), for other stuff we are not covering in this episode.

spk_0 I'm one of your regular co-hosts, Andrey Kurenkov. Jeremy once again could not make it, so we have one of our regular guest co-hosts, Daniel Bashir.

spk_0 Hey, yeah, I'm Daniel. You may have heard me on this podcast before. If you have explored the Last Week in AI Substack world, you might have also listened to The Gradient. If you haven't, we have lots of interview episodes, which are, I think, pretty cool. We'd love for you to check those out. Yeah, great to be here.

spk_0 Yeah, and Daniel and I were just chatting: there's this idea being floated of reviving The Gradient podcast, which has been on ice for a little while now. So, Last Week in AI listeners, you might hear some news on that in a few weeks. We'll see.

spk_0 But in this episode we'll be covering primarily some exciting news regarding new tools and apps, with some releases from Google and Anthropic; not any major applications and business stories; some pretty cool open source stuff; and just a couple of notable policy stories. It's not been a super busy month so far, luckily, and we haven't missed too much.

spk_0 So, starting out in tools and apps, the first story has got to be the new image editing model by Google. They have released Gemini 2.5 Flash Image, which is by far the most impressive model for editing images that has been available so far. It was kind of hyped up for a while; it was being used under the pseudonym of nano-banana.

spk_0 And yeah, after a little bit of that, it was revealed to be in fact from Gemini. Sadly this is an audio format, so I'll have to describe what you can do with it. The gist is that you can very accurately take a subject, like a person, and then change the clothing of this person, change their posture, change the setting, and it very convincingly retains the features of this person and very successfully follows your instructions about what you want to do.

spk_0 You can combine different images as well. It's far beyond anything else we've seen, to the point that some people are saying Photoshop is in trouble now. So to me, you know, we've had very powerful models for image gen for a while, and this one is still next level.

spk_0 Yeah, and this is also coming right on the heels of Genie 3, which was released earlier this month, also by Google DeepMind, and which is really quite impressive as well. It's got this ability for you to look into a world, and it's actually quite stable and maintains some of the physical properties. If you make some adaptation to a part of the environment, like painting a wall, and you turn away and turn back, it is still there. And this interacts with the notion of a world model, which is pretty hotly debated in AI circles: whether models have them, what a world model is, things like this. So I'm pretty excited to see more work like this that forces us to think about these notions.

spk_0 Yeah, there have been many fun examples of things you can do with this. It's a multi-turn, conversational model, of course, so you can take an empty room and decorate it: paint the walls, add a couch, add a table, and the room will be successfully populated without all the other details changing, just having the very specific thing you wanted executed.

spk_0 And on the note of a world model, there was a fun example I saw online where someone pointed out that they gave the model an image of a road in Dallas or something and asked it to show the opposite view, what's behind the viewer. The model apparently was able to show the view from the other side of the same area, which definitely speaks to this being sort of world-model-like, being able to understand physical properties, locations, things like that.

spk_0 The next big piece of news on this front is that Anthropic has launched a Claude AI agent that lives in Chrome. This is pretty much what you'd expect: the AI exists in a sidecar window, it maintains context of browser activity, and the agent can perform tasks on behalf of the user, which is pretty exciting. There are a lot of AI companies out there developing similar AI-powered browser solutions. I feel like this is an interesting direction, and maybe there is a question out there of what an AI-native browser would look like. How does the interaction design for that look different from how we use browsers today? Is it like this agent in a sidecar, the way that Anthropic is doing it at the moment, or is there a world where browsers actually look pretty different in a more fundamental way? That's pretty unclear, but I think we're in the beginning stages of something that could be really interesting.

spk_0 Right, yeah, this is launched via Claude for Chrome, so it's an extension. It's coming pretty quickly: I think OpenAI launched their agent model maybe a month or two ago, and that is very similar. You give it an instruction and it does web stuff for you. OpenAI's thing has its own little dedicated environment; it creates its own browser and does it all in the ChatGPT interface. Here you have this plugin for Chrome, and you actually use it within the Chrome browser, which is a little different. And to your point, this is also coming pretty soon after Perplexity launched their browser, I think called Comet or something like that, which is also pitching this sort of agentic browsing stuff. So it's yet another competitive area. We've seen this with search, with deep research, with every single kind of use case of AI: OpenAI and Anthropic and a few others going head to head. And yeah, I think these are going to be pretty powerful.

spk_0 I have one fun example where I had a spreadsheet with some links, where you've got to open each link and check the website for some quality assurance. In the past I would have to do this myself: click, go look, and do this very manual labor. I was able to use this agent and tell it: go to this Google Doc, open it, click on these links, look at the site, check for these things. It took like half an hour, it took a long time to get through it, but I was able to do it. So I think, similar to just the chatbots, these kinds of agentic web browsing agents are going to be used in a million different ways to speed up all sorts of boring stuff.

spk_0 And speaking of Anthropic, they have another slightly notable update: the Claude chatbot can now remember your past conversations. That's something that's been available on ChatGPT for a long time. You can activate it by going to settings and enabling "Search and reference chats". It's interesting to see Anthropic adding this a long, long time after OpenAI. With OpenAI, as far as I remember, it must have been last year when it started remembering details from your conversations to sort of personalize it to each user. That speaks, I think, to OpenAI having had a more consumer-oriented focus, and Anthropic targeting much more of a code, enterprise, and business audience. But yeah, I'm sure this makes it a more compelling offering.

spk_0 Yeah, there's a coupling with another story here: Google's Gemini will also be more personalized by remembering details automatically. As with other chatbot offerings, you have options like temporary chats, where you can have private conversations that won't be saved or used for personalization or AI training. And you can kind of see, if you look at the pieces we're referencing here, that Anthropic has taken a slightly different pattern, where they will only use chat memories if you prompt them to. My take is that we are in the pretty early innings of what memory and personalization are going to look like for these systems. There are a lot of different contentious issues that come up here. I think that memory in its current form is not perfect in any of the simple implementations, and there's going to have to be a lot of consideration and hard work on the interaction pattern it affords right now, on how that makes a difference to the model behavior in a way that's relevant to the different things you care about, and on what sorts of principles we should have about how that evolves. Again, it feels like a very early discussion, as these things are just beginning to be rolled out, but it's something to pay attention to.

spk_0 Yeah, and this is interesting to me as a topic because I've started realizing, first of all, the magnitude of ChatGPT usage. We've covered how they have 700 million active users. And in my mind I was sort of assuming, or not even imagining, how people are using it. I use it for work, I use it to help brainstorm and write some code and whatever, but many people use it in many different ways. Some people use it as a therapist or a life coach; some people use it just to talk to and think through problems. So for people who do talk to ChatGPT a lot, that kind of memory feature probably matters a lot more. And so Gemini launching it in particular, where Gemini is becoming, I think, the main competitor to ChatGPT as far as chatbots people actively use, could matter quite a bit.

spk_0 And speaking of Gemini, there is another launch from Google. They launched Guided Learning, which is available in Gemini and is designed to teach rather than simply answer questions. It's meant to have you learn things, build a deep understanding, work through problems step by step, all that sort of stuff.

spk_0 Again, we keep saying this, but I find it interesting that this is happening very, very soon after ChatGPT launched study mode. We know that all these services are used heavily by students. I don't know what percent of high school and college students aren't using ChatGPT at this point; it must be in the low digits. So it makes a lot of sense for this to launch, and hopefully these study-oriented things will make it so students actually try to learn, as opposed to just having the AI do the work for them.

spk_0 Yeah, I hope so too. I think there's a really interesting set of questions here. Some of them are around how we ask people to still do the hard and effortful work that is learning and developing a deep understanding of things. For the sorts of changes in your brain, and the time you need to mull over something to really get it and have deep intuition and understanding, there just isn't a shortcut. In the way our education systems work, there are different forms of legibility that indicate what it looks like for a student to have attained mastery, to have a deep understanding of something. I don't think it's news to anybody that these forms of legibility are pretty imperfect and don't always indicate that, and increasingly they can be gamed. When you do that as a student, you lose out on something: not just the generalization of understanding that might come to matter later on, but a sort of satisfaction you might get personally from deeply understanding something, in a way that might intellectually stimulate you and make you want to consider different paths later on in your life. So that deep and effortful work feels quite important, also just for the development of a person. I'm going on too long, though.

spk_0 I mean, it's a deep topic, I think, and it's very interesting to consider how, if you're on the younger side, starting to grow up, you wouldn't remember a time before AI, before chatbots. Your experience will be very different from ours, where we had the internet at least, but it was very different.

spk_0 Yeah. Moving on to a couple of stories about OpenAI (we've had Anthropic and Google as the main guys of this section so far), next we have the news that Apple Intelligence will be integrating with GPT-5 starting with iOS 26. Siri already integrates with OpenAI's GPT-4o, as far as I know: you ask a question, and then it decides to pass that topic forward to ChatGPT. So it's perhaps not surprising that they're going to be upgrading it to GPT-5 relatively soon, but it speaks to the continued partnership between Apple and OpenAI.

spk_0 One more piece of news on OpenAI: they are adding new features to Codex, their coding assistant. They are introducing an IDE extension, an extension to the standard coding tool, which is also something that Claude Code has, and they are introducing GitHub code reviews. So they're generally expanding the feature set of their Claude Code competitor. I guess for non-programmers this might not be very exciting, but Claude Code is seemingly having a major, huge impact in the programming world, and these kinds of agentic coding tools are pretty rapidly being adopted and making a big shift. So OpenAI managing to compete, managing to get some user share with Codex, on a rare occasion of entering this space later than Anthropic, is a pretty significant struggle.

spk_0 Yeah, lots of stuff going on in the coding world right now, as you've seen from the many startups involved in this. Our first applications and business story for today is also about a startup sort of in this space. It's about a company called Lovable, which TechCrunch refers to as a vibe coding startup. If you haven't seen Lovable before, basically it's used to create full-stack web applications and websites, so that's the specific area they're in. They are projecting some pretty big numbers: they're aiming to achieve one billion dollars in annual recurring revenue within the next 12 months, which is quite soon. It's currently growing that ARR by at least eight million dollars each month, and it has already surpassed a hundred million in ARR just eight months after reaching its first one million dollars. That again goes to show: obviously many of these companies have lots and lots of spend, but the kind of user and revenue growth they can experience is on quite a different level from what we've been seeing before.
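As a rough back-of-the-envelope check on the Lovable figures quoted above, the sketch below compares two scenarios. It treats the quoted numbers as exact and the "at least eight million per month" as a flat floor, which are simplifying assumptions; the variable names are illustrative only.

```python
# Figures quoted in the discussion, in millions of dollars of ARR.
current_arr = 100.0   # "already surpassed a hundred million"
target_arr = 1000.0   # "one billion dollars ... within the next 12 months"
months = 12
monthly_add = 8.0     # "at least eight million dollars each month"

# Scenario 1: ARR keeps growing by a flat $8M every month.
linear_arr_after_year = current_arr + monthly_add * months

# Scenario 2: the compound monthly growth rate needed to hit the target.
required_monthly_growth = (target_arr / current_arr) ** (1 / months) - 1

# Average monthly addition needed if growth were linear instead.
required_average_add = (target_arr - current_arr) / months

print(f"Flat $8M/month for a year: ${linear_arr_after_year:.0f}M ARR")
print(f"Compound monthly growth needed: {required_monthly_growth:.1%}")
print(f"Average monthly addition needed: ${required_average_add:.0f}M")
```

Under these assumptions the current floor alone would not reach the target, so the projection implies growth accelerating well beyond eight million dollars a month.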

spk_0 Yeah, Lovable has been kind of the clear winner so far in this entire space. They did launch quite a while ago, and they pretty much took off this year as AI got good enough to be usable basically without knowing code, without reading code. Lovable is one of these very user-friendly tools; I don't know if they even allow you to see the code. There are some competitors, like Replit, that are more friendly to technical users and expose the data and much more of the techy stuff. And it's a very busy space, as you said. So there's Replit as one competitor, there's also Bolt, there's v0 from Vercel, there's Base44. There are at least 10 significant players at this point, I think. And yeah, it's probably going to be a major market, assuming the economics of it start working, because I think the speculation is that these companies are acquiring all the revenue by burning through cash and not even trying to be profitable at this point.

spk_0 And speaking of big numbers, the next one is about a raise at Decart, the company that we recently covered as having launched this real-time sort of filter, a real-time video-to-video model that was very powerful. You can give it a normal stream of regular, real-world video, and it can turn it into GTA or, I don't know, Simpsons, or any sort of art style, with real-time streaming. That would mean that if you're playing a game it can completely change the art style, for instance, or you can even have a very low-res game and then make the graphics whatever you want. So they have raised a hundred million, and they have now hit a 3.1 billion dollar valuation. That's pretty significant: there's no large set of users of the service yet, and this entire idea of streaming video-to-video with their model, MirageLSD, is again still at sort of a preview stage. So investors seem to be pretty optimistic on this having a lot of potential.

spk_0 Yeah, it's one of those where it feels quite early to say anything substantive. We have another story here that's also about a pretty big raise, and a company you've surely heard about before that is not too new. Cohere has raised 500 million dollars from investors, with a new valuation of 5.5 billion dollars. Lots of different players are involved here. Cohere is hoping to use those funds for accelerated growth.

spk_0 They plan to expand their technical teams and develop enterprise AI solutions. Again, unlike many AI startups, Cohere is less focused on consumer applications and much, much more on customizing AI models for enterprise clients like Oracle and Notion.

spk_0 So this is again a pretty different approach that some of these labs are taking with their technical talent, where they are trying to look at different enterprises and businesses and think about how AI can be useful for your sort of vertical. There are both general versions of that, like Cohere, and also ones that want to develop deep expertise in a very specific area.

spk_0 Next up we have a story about Pony.ai, which is not active in the US but is aiming to roll out to the European market. This report from Bloomberg is saying that this is their aim. So far, apparently, they've already rolled out 200 Gen 7 robotaxi vehicles just over the past two months, and they're aiming to get to a total of 1,000 vehicles. This is notable because in the US we've definitely seen a speed-up of competition and deployment of robotaxis this year in particular, with Waymo entering new markets. Tesla's robotaxi service just launched and is also at least aiming to expand rapidly. It's very clearly going to be a huge deal. This problem is starting to be at the point where it's solved, where robotaxis are quite reliable, and people seem to prefer them to Ubers in general, from what I've seen in discussions. So Pony.ai, being another significant player coming from China, has the potential to really break into the European market, and if that's the case, that's going to be a big deal.

spk_0 Right. The last story on applications is about another big lab and a bit of a changing of the guard. Igor Babuschkin, who is a co-founder of Elon Musk's xAI, and who I recognize as having some kind of bird as his X profile photo (I don't know if it's a bird, I kind of look at its wings, but it's a memorable profile; anyway, besides the point), has announced his departure from xAI to start a venture capital firm, Babuschkin Ventures, which will focus on supporting AI safety research and backing startups that aim to advance humanity and explore the universe. This was inspired by a discussion with Max Tegmark about building AI systems safely for future generations. It's also following several scandals at xAI involving its chatbot Grok, which included controversial responses and inappropriate content generation. Many of you, if you are extremely online and spend basically any time on X, probably remember the Grok 4 release and what happened around it.

spk_0 Yeah, it's been a tumultuous few months for xAI, to be sure. There were a lot of impressive results with Grok 4; they launched a very impressive LLM. xAI in general, since launching, I think towards the end of 2023, since the team came together, has caught up incredibly rapidly. It would be fun to speculate on whether this means that xAI is not doing so well. Typically you don't see people departing from startups they co-founded in less than two years. But here, obviously, it's hard to say whether Babuschkin just wanted to go off and start this venture initiative or whether it indicates anything about xAI internally. Still, it's significant to have a shake-up in leadership in general, and xAI is at an interesting time in its life.

spk_0 So, moving on to projects and open source. First we have an open source release from Meta AI. This was, I think, a couple of weeks ago. The release is DINOv3, a state-of-the-art vision model trained with self-supervised learning, which is able to generate high-resolution image features. Basically it allows you to process any given image and output a representation of it that's useful for all sorts of stuff, and that you can use for things like object detection, semantic segmentation, video tracking, etc., without any fine-tuning. And this is a pretty large model: it has seven billion parameters, which is unusually large for pure image models, and it was trained on 1.7 billion images. This is very much taking the image processing model to the biggest scale it's been. We don't talk too much about pure image models for things like semantic segmentation, object detection, and video tracking. These are semi-solved problems at this point; a decade ago these were significant tasks in computer vision. But it's pretty important to remember that, as far as using AI and applying it, object detection, segmentation, and general video and image understanding tasks are pretty significant. So having a really cutting-edge image model that is free for academic use, that has a commercial license as well, and that comes with a lot of code could be very useful for certain people.

spk_0 Yeah, these sorts of models clearly have pretty important impacts out there in the world. For this specific model, a few orgs like the World Resources Institute and NASA's Jet Propulsion Laboratory have been using it. It has also improved the accuracy of some pretty specific tasks, like forestry monitoring and supporting vision for Mars exploration robots. And the fact that you can do this with minimal compute overhead, and that you don't have to rely too much on web captions or curation, so that you're able to apply this universal feature learning when you're bottlenecked by annotation, is a really good advance, I think.

spk_0 Next up we have a specific set of foundation models, GLM-4.5. This is an LLM with 355 billion parameters, designed to excel in agentic, reasoning, and coding tasks. It employs a mixture-of-experts architecture, which is pretty familiar to a lot of people who've spent some time in ML research, but basically it lets the model select different subsets of its parameters for different inputs, which is quite good for efficiency and performance. What this also means is that when you hear the number of parameters in the model, that's not quite the same as the effective number of parameters, that is, the number of parameters that are actually being used when the model makes an inference about something.

spk_0 And the training is sort of multi-stage here: it pre-trains on a diverse dataset, and this is followed by fine-tuning on specific tasks and capabilities. Nothing too crazy here. There's RL thrown into the training process, especially when it's working on decision-making and problem-solving sorts of tasks. Just a pretty interesting model.
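To make the point about total versus effective parameters concrete, here is a minimal sketch of how a mixture-of-experts layer only touches a few experts per token. The sizes and scores below are made-up illustrative numbers, not GLM-4.5's actual configuration, and `route` and `moe_param_counts` are hypothetical helper names.

```python
import heapq


def route(scores, k):
    """Return the indices of the k highest-scoring experts for one token."""
    top = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return sorted(top)


def moe_param_counts(shared_params, params_per_expert, num_experts, experts_per_token):
    """Return (total, active) parameter counts for a simple MoE model.

    total  counts every expert, since all of them must be stored.
    active counts only the experts actually used for a single token.
    """
    total = shared_params + params_per_expert * num_experts
    active = shared_params + params_per_expert * experts_per_token
    return total, active


# Hypothetical router scores for one token over four experts: pick the top 2.
chosen = route([0.10, 0.70, 0.05, 0.15], k=2)

# Hypothetical model: 10B shared parameters, 64 experts of 2B each, 4 used per token.
total, active = moe_param_counts(
    shared_params=10_000_000_000,
    params_per_expert=2_000_000_000,
    num_experts=64,
    experts_per_token=4,
)

print("experts chosen:", chosen)
print(f"total parameters:  {total / 1e9:.0f}B")
print(f"active per token:  {active / 1e9:.0f}B")
```

In this toy setup the headline count is 138B parameters, but only 18B are used for any given token, which is the gap between stored and effective parameters described above.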

spk_0 Yeah, it's kind of interesting. We have a figure here, figure three: there's pre-training on a general corpus, then pre-training on a code and reasoning corpus, then there's mid-training, which has three steps (repo-level code data, synthetic reasoning data, and long-context and agentic data), and then there's RL and stuff. So there's a lot going on, and this is very much following in the footsteps of R1. R1 sort of introduced, I think, this approach, at least in terms of published research, of having these multiple stages for training agentic and reasoning models. The notable thing about this model, aside from being big, is that they are doing quite well. They're claiming on the benchmarks to be beating Opus 4, to be up there with o3 and Grok 4 almost, and to be quite parameter-efficient: this model gets that performance at a smaller number of parameters. So, you know, 355 billion parameters is a lot, but it's less than DeepSeek R1, it's less than Kimi K2, and on coding tasks they are similar on the benchmark front. So it's very much a continuation of a trend you've seen all throughout this year of open source models coming out of China, starting with R1 and really proceeding ever since, that are getting better and better, that are getting really on par with the closed source offerings from Anthropic and OpenAI for many things. That is new, right? Until this year you could not get an open source LLM that was anywhere near competitive with Claude or ChatGPT. Now that's different.

spk_0 And speaking of open source releases from China, the next story is about DeepSeek releasing its V3.1 model. So this is, you know, a bump in the version, as the title implies. It has a longer context window, and it's not any sort of substantial jump in any sense, but I think it's notable to see DeepSeek continuing to release and continuing to update their V3 and R1 models incrementally while still being competitive. Although apparently DeepSeek fans are waiting for the release of R2, which would be the successor to R1, so this is kind of leading up to that.

spk_0 And speaking of open-weight LLMs, we have kind of an interesting story about the overall market. Artificial Analysis did a benchmark evaluating the performance of gpt-oss-120b, the recent open source release from OpenAI, and they evaluated the performance of this model across different providers on the cloud. You can run these open source models through various companies like Cerebras, Fireworks, DeepInfra, Together.ai, Groq, Amazon, Azure, a bunch of them. The funny thing that they found is that on a particular benchmark, AIME, they have very different outcomes across the different providers. On some of them, Cerebras, maybe also DeepInfra, they get a high score, 95%. Then you go to Groq, Amazon, Azure, and they go down by 10%, maybe even more than 10%. It's hard to say what these providers are doing: are they making smaller versions, or quantizing, or using different hardware? But it's definitely a surprising result. You would think that if it's the same model, and all these people are serving it and letting you use it via their hardware, you'd get roughly the same performance, but apparently that's not the case.

spk_0 Our last story on this front is an open source text-to-speech model from Microsoft called VibeVoice-1.5B. This is capable of generating up to 90 minutes of speech with four distinct speakers, supporting cross-lingual synthesis and singing. It's primarily trained on English and Chinese and is available under an MIT license. There's a decent amount of work going on right now in audio synthesis, and I think that this is a pretty exciting advancement. 90 minutes of speech is quite a long time. I think there are still questions about the general coherence of the audio over that stretched period of time, but it does seem as though, again, we're making pretty quick advancements.

spk_0 Yeah, and this is one of these notable things where audio in general has historically lagged behind on the open source front, in terms of datasets and in terms of models. It's just this kind of area where you don't have as many options as for, for instance, image generation. So having powerful text-to-speech means that, on the one hand, as a company you can use it and fine-tune it for various applications. On the other hand, we know now that people use these kinds of things for scams and so on, and that just means that you have to really be on the lookout whenever you hear someone in audio these days. It's at the point where you cannot tell the difference between AI generation and actual recorded audio. And on to research and advancements, with just a couple of stories for this episode.

spk_0 The first one is Deep Think with Confidence, which is a new approach that basically makes test-time scaling more efficient and more effective. They are looking at the type of test-time scaling where you want to do several parallel reasoning paths: you have the model try to solve the problem multiple times and get to different results, and then you take a majority output or a combined output of your various reasoning traces. This paper, titled Deep Think with Confidence, introduces a fairly straightforward idea. As you are doing your rollouts of different reasoning paths towards getting to an answer, you can evaluate, roughly speaking, the confidence of the model in its predictions. There's what they call token confidence, which looks at the probabilities of the tokens it's actually outputting, and they also define an average trace confidence that they call self-certainty. Basically they evaluate this as they roll out the model, and if a run has low confidence they kill it, they stop it. So you end up being able to do many parallel runs and cut off the ones that seem unpromising, and then, for the ones that get to high confidence, you're able to combine the results from multiple runs and get to a combined, confident output.

spk_0 In benchmarks they show that with this method they're able to improve performance pretty substantially, by 10% on some of these benchmarks like AIME, with a couple percent boost for GPT-OSS and a 5% boost for DeepSeek. Basically, for things where you're not necessarily getting the right output reliably, you're now going to get to the right answer a more significant share of the time. And yeah, it speaks, I think, to where we are with reasoning and test-time scaling. There's probably a lot of low-hanging fruit in this whole area of test-time scaling in terms of ways to do it more reliably and efficiently. This one is a fairly straightforward algorithmic method that can be applied widely.
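The idea described above (score each reasoning trace by how confident the model is in its own tokens, stop low-confidence traces early, and vote among the survivors) can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: it assumes confidence is the mean log-probability of the sampled tokens over a sliding window, and the function names, threshold, window size, and example traces are all hypothetical.

```python
import math
from collections import defaultdict


def mean_logprob(token_probs):
    """Average log-probability of the sampled tokens (higher means more confident)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def survives(token_probs, threshold, window=4):
    """Simulate early stopping: reject a trace as soon as any sliding window
    of tokens has mean log-probability below the threshold."""
    for end in range(window, len(token_probs) + 1):
        if mean_logprob(token_probs[end - window:end]) < threshold:
            return False
    return True


def confident_vote(traces, threshold, window=4):
    """Confidence-weighted majority vote over the traces that were not stopped.

    traces is a list of (token_probs, answer) pairs.
    """
    votes = defaultdict(float)
    for token_probs, answer in traces:
        if not survives(token_probs, threshold, window):
            continue  # low-confidence trace: dropped before it can vote
        # Weight each surviving vote by the geometric-mean token probability.
        votes[answer] += math.exp(mean_logprob(token_probs))
    if not votes:
        return None
    return max(votes, key=votes.get)


# Made-up example: four parallel traces with per-token probabilities and final answers.
traces = [
    ([0.90, 0.80, 0.95, 0.90, 0.85], "42"),
    ([0.90, 0.20, 0.10, 0.30, 0.20], "17"),  # confidence collapses, so it is cut off
    ([0.70, 0.80, 0.75, 0.90, 0.80], "42"),
    ([0.95, 0.90, 0.90, 0.95, 0.90], "41"),
]

answer = confident_vote(traces, threshold=math.log(0.5))
print("selected answer:", answer)
```

In a real system the check would run while tokens are still being generated, so a rejected trace stops consuming compute; here the traces are precomputed only to keep the sketch self-contained.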

spk_0 while we're thinking of test times scaling and a lot of these improvements maybe a natural question is to ask what happens to jobs and as it happens a couple of days ago a Stanford study found that the adoption of generative AI is significantly affecting job prospects for young US workers particularly those age 22 to 25

spk_0 and this cannot quite recently and there's been a lot of commentary I would actually recommend taking a look at Noah Smith's recent blog on this specific paper and also of course reading the paper itself because it's worth trying to understand and contextualize those claims but just to get up a bit on a soap box about this paper I feel like despite the fact that the people who wrote this paper are pretty careful economists and like very dessert

spk_0 and serving respect it does feel like this finding is a bit of a specification search as job markets rise and fall there's always some group of people who are doing worse than the rest and it's a little bit unclear that it is always justified to tie this to there's a new technology on the block like AI what's we're saying is that sure it's possible that AI is impacting job prospects to some effect but it's a little bit hard to disentire the data

spk_0 that's not going to be able to hang out this entirely from other economic factors one really great thing that Noah Smith does in this post where he takes a look is he looks at the data about how AI exposure relates to job prospects for people at different ages and this is specifically for people who reach 22 to 25 but the workers who are in their 30s 40s 50s who were judged to be most heavily exposed to AI actually have seen robust employment growth since last year

spk_0 And you can maybe square this fact with the story about AI destroying jobs, but again it's kind of unclear: why would companies be rushing to hire new 40-year-old workers in AI-exposed occupations? Again, just a lot of question marks here.

spk_0 "Six Facts about the Recent Employment Effects of Artificial Intelligence." So they're examining the effects of AI on the labor market, on employment, on people being able to get jobs. The first fact is they uncover substantial declines in employment for early-career workers aged 22 to 25, as we said, in

spk_0 occupations most exposed to AI, such as software developers and customer service representatives. The second key fact is that overall employment continues to grow, but employment growth for young workers has been stagnant since late 2022.

spk_0 Third fact is that not all uses of AI are associated with declines in employment. Fourth, they find that employment declines for these workers remained after conditioning on firm-time effects. So they do try to be careful. As you said, this is analysis from labor data; we're not doing experiments here, we're just looking at various statistics and trying to conclude what effect AI may have had.

spk_0 So they try to account for these other factors that could explain the statistics. Fifth, they say that labor market adjustments are visible in employment more than compensation, and sixth, the above facts are largely consistent across various alternative sample constructions. So as you said, economics research is tricky; there's no careful experimentation going on here. They are working with data that can have various

spk_0 interpretations. Like in the case of software development, for instance, which is one of the major areas where employment has been much harder for early-career professionals, obviously there are many factors going on. During COVID there was arguably over-employment; many of the big tech companies really hired like crazy, and then there was a large amount of layoffs going on in software development over the last couple of years.

spk_0 There's economic conditions, all sorts of stuff. So this is a very early piece of research, and they do, to be fair, kind of position it as such. They call this "canaries in the coal mine" to indicate that this might be a sign of what's happening, but it's still early and it's hard to tell. But as far as analysis, as far as actual research that is able to

spk_0 tell us anything about employment with AI, to my knowledge this would be the first sort of major work. Maybe there's been some prior research on this, but this is coming from a Stanford group that is pretty oriented on this. One of the lead authors is Erik Brynjolfsson, who has done previous research on AI economics. So as you said, Daniel, if you find this interesting,

spk_0 it's probably worth following up to see some deeper analysis and possible interpretations of this.

spk_0 Yeah our next story is in the policy and safety space and this one's actually really interesting about an unpublished report on AI safety from the US government back in October a red teaming exercise was conducted at a computer security conference in Arlington, Virginia

spk_0 where AI researchers stress-tested some AI systems. They identified 139 novel ways these systems could misbehave, like generating misinformation, leaking personal data, things like this. The key upshot of the exercise was that it revealed significant shortcomings in a new US government standard designed to help companies test AI systems, but the National Institute of Standards and Technology didn't publish a report on those findings.

spk_0 The reason for that, according to some sources, was that the report, along with other AI documents from NIST, was withheld because there were some concerns about conflicting with the incoming administration's policies. Wired now has this unpublished report,

spk_0 and I guess one of the key takeaways here is that this is an area that feels like it should be nonpartisan and ideally not too influenced by politics, but it seems like there have been challenges faced in publishing AI research under the Biden administration.

spk_0 So just an interesting story in terms of the confluence of politics and AI safety.

spk_0 Right, this is a report from NIST, the National Institute of Standards and Technology, which was tasked with this kind of thing, with creating standards and technology for AI. They created this NIST AI 600-1 framework to assess AI tools; this is the Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile.

spk_0 So this red teaming exercise basically was to evaluate this framework that they published like a year ago, I think May 2024. So probably, yeah, not too surprising. We know the Trump administration reversed Biden's actions on AI and recently published their own agenda on AI, and it's very likely that these kinds of AI security initiatives are going to see less interest and less promotion with the current administration.

spk_0 And another story about the US government, and a kind of surprising one: the US government is going to take a cut of Nvidia and AMD AI chip sales to China. So we have talked quite a lot about export controls, about restrictions on Nvidia being able to sell GPUs to China. It's been a very evolving area, and under the Trump administration there was a time when the H20 chip, which for a long time Nvidia had been able to sell to China, was restricted.

spk_0 And so this is kind of reversing that. Now Nvidia apparently is able to sell the H20 again but will have to pay a cut to the US government. Jeremy unfortunately would be the guy who would give the most insight on this development, but it seems a bit surprising as far as the

spk_0 approach to export restrictions. Moving on to something unrelated to the government, going to another topic we've talked about quite a lot: lawsuits ongoing about copyright for the major LLM providers. Anthropic has settled a high-profile AI copyright lawsuit brought by book authors. This was initiated by authors Andrea Bartz, Charles Graeber, and

spk_0 Kirk Wallace Johnson, who accused Anthropic of using their books without permission. There were some, let's say, conflicting developments here: a California district judge ruled that Anthropic could use the books under fair use but found that the acquisition method, via shadow libraries, constituted

spk_0 piracy. And this is just one of multiple lawsuits ongoing basically for years now that would have major implications for how you can use, how you can acquire data for training AI models. OpenAI, Anthropic, and others kind of took the

spk_0 maximally permissive approach of using a bunch of data without asking any permission. And so with this settlement, it's hard to say, as non-lawyers, how significant an effect it has on other ongoing legal developments, but it does mark one kind of

spk_0 piece of progress in this long ongoing story where, you know, at least this lawsuit has reached an end. Our last story is about AI companion apps, which are on track to pull in 120 million dollars in 2025.

spk_0 In the first half of the year these apps already generated 82 million dollars, with downloads up by 88% year over year, reaching 60 million. The top 10% of these apps account for 89% of the revenue, with 33 apps surpassing 1 million dollars in lifetime consumer spending.

spk_0 The popular ones in the space include Replika, Character.AI, PolyBuzz, and Chai, with a significant portion of users seeking AI girlfriends. You may have also seen commentary on Twitter about AI boyfriends being very popular. This is a really interesting, hairy space to me because I think it represents or sort of portrays something pretty fundamental about the kind of companionship that people seek

spk_0 and are willing to accept, and the different ways in which it can be met and not met. Personally I find AI companions a bit troubling for numerous reasons that I won't get up on the soapbox about here.

spk_0 Yeah, I did include it in the policy and safety section very much because it has pretty, let's say, concerning or significant implications for society, for people's psychology. We know in the modern age there's been very much a degradation in the amount of socializing and the number of close connections people have; it's arguably one of the major health crises of the modern age, like people.

spk_0 And so this market is growing significantly, getting a lot of revenue. According to this report there have been 112 apps published just in the first half of 2025, with the names of those apps having "girlfriend" in 56 of them, and also "fantasy," "boyfriend," "anime," "soul," "soulmate," "lover," "waifu."

spk_0 Yeah, clearly romantic interaction apps, and it's coming also in the current paradigm of dating apps, where I think the general consensus is it's a hard and unenjoyable process to try and find a human girlfriend or soulmate. So yeah, I mean, it's a little concerning, I think it's fair to say. On the one hand you can treat it as a video game, as like a role-playing exercise.

spk_0 This is a fun thing. By the way, Character.AI, one player in this space for a while, which isn't focused on girlfriends, which is general role play, still has millions of monthly active users. Like, this is a very big space.

spk_0 So it's likely to keep growing. I mean, xAI recently launched Ani and, like, their own Grok-based companions.

spk_0 I don't know, it's an interesting phenomenon for sure. And fun fact: the movie Her, which was all about this thing, directed by Spike Jonze, where the main character, played by Joaquin Phoenix, falls in love with an AI character, is set in 2025.

spk_0 Lots of people are saying that this movie was incredibly prescient, and I think it's very... if you haven't seen Her, I highly recommend it.

spk_0 Well that is it for this episode as I've said hopefully we are going to get back to the weekly schedule.

spk_0 Thank you, Daniel, for fulfilling the guest co-host duties. Always fun to have you on here.

spk_0 Thanks for having me I always really loved doing this and thank you to listeners as always we appreciate you tuning in and bearing with us as we skip some weeks at an unpredictable rate.

spk_0 Always appreciate it if you leave reviews if you share it with friends and more than anything if you just keep tuning in.

spk_0 From neural nets to robots, the headlines pop, data-driven dreams, they just don't stop. Every breakthrough, every code unwritten, on the edge of change we're

spk_0 From machine learning marvels to coding kings, futures unfolding, see what it brings.
