Check out the latest episode of AI Daily where Conner, Ethan, and Farb discuss the most exciting updates in the AI world. In this episode, they cover Meta's groundbreaking AI model called I-JEPA, Hugging Face's collaboration with AMD, and Paul McCartney's creation of a final Beatles song using AI technology.
Meta's I-JEPA
Meta's I-JEPA is the first AI model based on Yann LeCun's vision for more human-like AI, with its own internal abstraction of how the real world works.
The model achieves state-of-the-art performance on ImageNet and is significantly more efficient, requiring only a tenth of the GPU hours of similar models.
The model aims to address core problems in generative models by focusing on understanding common sense and abstract reasoning instead of pixel-perfect generation.
This innovative approach has the potential to improve AI's ability to understand the world and tackle complex problems with more intricate details.
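To make the contrast with pixel-perfect generation concrete, here is a rough toy sketch of the two training objectives discussed in the episode. It is not Meta's implementation: the random linear maps stand in for trained networks, and every name (`patchify`, `W_encoder`, `W_predictor`, `W_decoder`, `pixel_loss`, `embedding_loss`) is hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
PATCH, DIM = 4, 8  # toy patch size and embedding width (arbitrary choices)

def patchify(image, patch=PATCH):
    """Split an (H, W) image into flattened, non-overlapping patches."""
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    blocks = image[:rows * patch, :cols * patch].reshape(rows, patch, cols, patch)
    return blocks.transpose(0, 2, 1, 3).reshape(rows * cols, patch * patch)

# Hypothetical stand-ins: random linear maps in place of trained networks.
W_encoder = rng.normal(size=(PATCH * PATCH, DIM))   # patch pixels -> embedding
W_predictor = rng.normal(size=(DIM, DIM))           # context embedding -> target embedding
W_decoder = rng.normal(size=(DIM, PATCH * PATCH))   # embedding -> patch pixels

def pixel_loss(image, context_idx, target_idx):
    """Generative-style objective: reconstruct the hidden patches pixel by pixel."""
    patches = patchify(image)
    context = (patches[context_idx] @ W_encoder).mean(axis=0)
    reconstructed = context @ W_decoder  # one guess, compared with each hidden patch
    return float(np.mean((reconstructed - patches[target_idx]) ** 2))

def embedding_loss(image, context_idx, target_idx):
    """JEPA-style objective: predict the hidden patches' embeddings, not their pixels."""
    patches = patchify(image)
    context = (patches[context_idx] @ W_encoder).mean(axis=0)
    predicted = context @ W_predictor
    targets = patches[target_idx] @ W_encoder  # target representations
    return float(np.mean((predicted - targets) ** 2))

image = rng.random((16, 16))
print(pixel_loss(image, [0, 1, 4], [10, 11]), embedding_loss(image, [0, 1, 4], [10, 11]))
```

The only difference between the two functions is where the error is measured: against raw pixels in the first, against encoded representations in the second. That is the sense in which a joint embedding predictive approach can ignore pixel-level detail.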
Hugging Face + AMD
Hugging Face has partnered with AMD to integrate AMD GPUs into the Hugging Face platform, an unusual collaboration given that most AI companies work with Nvidia because of its performance advantage.
The partnership aims to bring popular transformer architectures like BERT and Stable Diffusion to work efficiently on AMD GPUs, bridging the gap between AMD and the AI community.
This collaboration highlights the potential of AMD in the AI space, dispelling any misconceptions that AMD may not be competitive, and may lead to rapid advancements in AI solutions with the support of the open-source community.
The partnership also benefits Hugging Face: working with a prominent player like AMD shows that it is a serious company and expands its capabilities.
New Beatles Song
Paul McCartney is creating a final Beatles song by using AI to extract John Lennon's voice from a cassette that Lennon gave him.
There seems to be a recurring trend of Paul McCartney working on songs using John Lennon's voice, with new compositions or additions to previous recordings, which may suggest that there will be more "last" Beatles songs in the future.
The prospect of new Beatles songs is thrilling for fans, and the longevity of their music speaks to its enduring popularity.
The conversation also references the TV show "Black Mirror" and speculates about the upcoming season, adding an element of excitement and anticipation.
Conner: Good morning and welcome to another episode of AI Daily. We're back again today. I'm your host Conner, joined by Ethan and Farb, with another three great stories. First, we have Meta's I-JEPA, then we have Hugging Face and AMD, and then we have a new Beatles song. You heard it here first, guys.
First up, Meta's I-JEPA is the first AI model based on Yann LeCun's vision for more human-like AI. It's called I-JEPA because it's a joint embedding predictive architecture. And the really interesting part here is that instead of how a normal transformer would go over images pixel by pixel, this model's completely different because it has its own internal abstraction of how the real world works, kind of like how the human brain works.
Farb, you've read into this some. What do you think about this?
Farb: You know, I think we talked about this before. This is another example of trying to do more with the same amount of computation, and applying this sort of human model to create an efficiency that wasn't there before.
I think they trained it on, you know, a handful or a dozen GPUs in like 72 hours. Kind of wacky numbers compared to what you would've seen a year ago or even just three months ago. So I think a big takeaway there is that it's a much more efficient model.
It allows you to do a lot more with the GPUs than you would've otherwise been able to do, and it does so by sort of trying to copy what humans do. It's a really interesting approach, and it's cool to see that the human mind is not totally inefficient yet.
Conner: It achieves state-of-the-art performance on ImageNet, which is a classification benchmark, with only a tenth of the GPU hours that other similar models would need. So it's a pretty big jump. Ethan, what does it mean to have a jump like this in a model?
Ethan: Yeah, I think it's clear that this model is better.
It's cheaper to train, it's more efficient, it's getting state-of-the-art representations. But like you said, their entire vision with this model is to really fix some of the core problems in generative models. So in a generative model, you're saying, hey, I want you to generate this picture. And as you're training it, you're hiding a piece of the picture and it's trying to figure out what could be in that section.
It has a lot of problems. Most definitely, people have always seen the problem with hands, right? You have Stable Diffusion, you're generating hands, and it pops out six, seven fingers. So this, kind of at the meta layer, is saying, hey, these models don't understand common sense. At the end of the day, they don't have an abstract representation of what the rest of the image is.
So instead of doing that kind of generative approach, what I-JEPA is doing is saying, hey, here's this piece of the picture. Guess what is in the rest of the picture. So in this type of architecture, in this type of model, you're trying to improve what it means for these things to actually understand the world.
And that's what they mean when they say, hey, we're trying to embed this world model into these models, right? We're trying to let it understand common sense and abstract reasoning. So instead of this pixel-perfect "hey, fix this section of an image, fix this section of an image," it's like, no, what else is in the image?
So a really cool model. This is their first implementation of it, but I think we're gonna see these types of models go across video and possibly even go across text. It's a really interesting way of trying to tackle this common sense problem.
Conner: A person's brain, if they're painting a portrait, doesn't see someone's wrist and then go, okay, pixels, pixels, pixels.
They have the abstraction of wrist connects to hand, five fingers. So again, hopefully this type of model, like you said, will help with more intricate problems and intricate details like that. Very cool. Next up, we have the Hugging Face and AMD partnership. Hugging Face is working with AMD to bring AMD GPUs directly into the Hugging Face platform.
This is a pretty new type of deal. Usually all these AI companies are working with Nvidia because CUDA has such a big performance advantage over AMD. It's way easier to work with and the GPUs are a lot better. Ethan, you've talked before on the podcast about how AMD GPUs are really difficult to work with.
What do you think this means for AMD?
Ethan: Yeah. Well, we have two pieces going on here, right? We have the chip shortage that every startup and AI company is dealing with, and then you have AMD over here, who has the factories and kind of has the chips, but is not really connected to the AI community as much.
They've done a lot of gaming work. Their frameworks aren't ready. So I think it's a really interesting partnership. Hugging Face's goal is to get a lot of these main transformer architectures, you know, like BERT and Stable Diffusion and Wav2Vec, some of the audio models, working on AMD. They're saying, hey, we want to get these working on AMD.
AMD's gonna provide us a bunch of GPUs. So it's really a match made in heaven. I think Hugging Face is such a critical part of the AI community, and AMD needs to get going in AI. So, exciting.
Farb: Yeah, I think we predicted this when we spoke about AMD last time. Not that it's a bold prediction, but, you know, if you'd written off AMD, I think you'd be sort of mistaken.
There's too much opportunity here. They have too much already set up. Will this be AMD plus the open source community versus Nvidia and the closed source community? Probably not, but it's fun to imagine some drama in the AI world. I would bet that AMD is going to come a long way here in probably a pretty short time.
And for some reason it just made me think of the time when people had sort of written off Apple for Microsoft. They were like, well, Apple will never catch Microsoft. Apple is a small company compared to Microsoft. And if you look at it now, Apple's, I think, probably twice the value of Microsoft these days.
Or something around that. So don't write off AMD. They have a ton of potential. You could see the open source community come flying in and flock around this to solve the problem and accelerate these solutions in a very short amount of time.
Conner: It's very nice for Hugging Face too. Some people have, not written them off, but they see them as like, oh, they're the open source guys.
They do some open source models. They might host your model code. A lot of people don't see them as super serious because they're not making their own huge models or making their own huge MLOps platforms. But working with AMD in a very serious partnership is also good for Hugging Face here. Okay, well, lastly, we have the Beatles song.
We have Paul McCartney saying he's creating one final Beatles song, recreating John Lennon's voice from his last songs. Apparently he used AI to extricate Lennon's voice from a cassette player that Lennon gave him. Ethan, what do you think about this? What are your thoughts here?
Ethan: I think it's absolutely amazing and I'm super excited. You know, I wish we could play the song.
It's sitting here with me. We can't play it yet, but it's amazing.
Farb: You know, it is amazing and I'm glad they're doing it. And hopefully there'll just be Beatles songs forever, to be quite frank. I think maybe they have to say this is the last one to get people excited or something.
But the thing is, I think they've done this before. It seems like every 10 years Paul McCartney grabs John Lennon's voice and gets somebody to help him, and they make a new song, or they sort of add Paul to some recording that John had done some time ago. Not that it's a bad thing. It's an amazing thing.
It's just that when I heard it, I'm like, wait, I thought this happened 10 years ago, and they said that was the last Beatles song that was ever gonna be published. So I think there will be many last Beatles songs published, which is great for all of us, if you like the Beatles, which a lot of people do, obviously.
I grew up listening to them even though they were long gone by the time I was growing up, but obviously their music has lasted a half a century or more. And it seems that somebody taught Paul McCartney how to use ElevenLabs, and that's about all you really need to do, because Paul McCartney can do everything when it comes to music.
He can play literally every instrument. He can produce the whole track. He can do whatever he wants. So help him get an ElevenLabs account, and the Beatles are back.
Conner: It's exciting to see another Black Mirror episode coming true. We finally have our last Ashley O soundtrack.
Farb: I'm excited. New Black Mirror season coming on the 15th.
I think you've heard it here first, folks. We're breaking the news.
Conner: Cool. We'll cover that in what we're seeing tomorrow. Speaking of what we're seeing, what have you guys been seeing?
Farb: Well, I saw that they're applying Adobe Firefly to video now. I've almost not even looked into it too much because I'm filled with nervous, excited energy about it.
Film, I think, is the most visceral art form. It is the combination of every other art form, from choreography to music, to language, to pictures. It's everything put together. So the impact on entertainment, and just what you can do to influence other people through this medium of film and video, can't be overstated.
And to create a technology that democratizes people's ability to create compelling video content, again, can't be overstated. I think this is gonna be the obsession of a lot of people for many, many years, for sure.
Conner: Ethan, what have you been seeing?
Ethan: Yeah, Mistral AI out of France has raised a 113 million seed round at a 260 million valuation.
Congrats to them. I think some people were a little like, oh, they're pre-product and X, Y, Z. These are huge seed rounds. A lot of this money's gonna go to GPUs. I think we're seeing a lot of these kind of individual companies at a country level. You know, we have Stability over in Britain, we have Mistral now in France, we have China making different ones.
A lot of this money's gonna go to GPUs. I think it's a fantastic team. So yeah, just wanted to say congratulations to Mistral, and we'll see some more competition at the foundation model level.
Farb: Cool. Mistral.
Conner: Yeah, I saw that Vercel is launching a six-week AI accelerator. Vercel, of course, is probably the best way to build applications on the web.
And now they're working with Anthropic, Banana, Chroma, Cohere, ElevenLabs, Hugging Face, Modal, Pinecone, Replicate, Stability, LangChain, and OpenAI, offering between the 12 of those $850,000 in credits to, I think, 40 people they're gonna let in. So not a huge amount of money, but it's exciting to have a mix of dynamic platforms for whoever they let in to choose from.
So very exciting to see what comes out of it. Startup weekend.
Farb: AI, here it comes.
Conner: Very true. Okay, guys, well, thanks for tuning in. Have a great day today, and we'll see you guys tomorrow.