Welcome to AI Daily! In this episode, we bring you three exciting stories: MusicGen, Meta’s plans for AI, and the impressive ClipDrop by Stability AI.
Key Points:
MusicGen
MusicGen from Meta is a big improvement over MusicLM, allowing users to generate music from text or from a combination of text and audio.
MusicGen does not require a self-supervised representation and generates in one pass, producing high-quality results efficiently.
MusicGen is open source, unlike Google's closed-source MusicLM, and Meta provides the code and resources for users to try and enjoy.
The examples of generated music showcase its potential for creating base-level, lo-fi sounds and even replicating jazz songs, making it an exciting tool for music creation.
Meta's AI Plans
Meta's plans for AI involve integrating AI capabilities into all of their flagship products, including Facebook, Instagram, and possibly WhatsApp and more.
Mark Zuckerberg is particularly interested in generative AI and envisions features like photo modification and AI assistants in messaging apps.
Meta is committed to open-source development, allowing other developers to build on their models and shaping the future of AI development.
There seems to be a shift in Zuckerberg's attitude towards openness and embracing chaos, possibly driven by the anticipation of upcoming AI wars and Meta's accumulation of talented AI researchers.
ClipDrop
ClipDrop is a well-designed product with a range of AI-powered features reminiscent of Firefly.
It offers an API for integration into other startups' products, indicating a strategic business plan for Stability AI.
Stability AI emphasizes open-sourcing their models while also building user-friendly products and APIs.
The API angle is considered smart, following Stability AI's pattern of open-sourcing products and then building upon them.
Transcript:
Farb: Good morning and welcome to AI Daily. I apologize for not having my iPhone camera here and forcing you to look at me in FaceTime HD. Uh, we have three great stories for you today, starting with MusicGen from Meta, uh, and then another story about Meta, including Mr. Zuckerberg. Uh, and then some cool, uh, stuff from Stability AI, a very, you know, tightly put-together little product.
We'll take a look here in a second. So the first story we have is MusicGen from Meta. It is a seemingly big improvement from MusicLM. Uh, they show off some cool examples where you can generate music with text or generate music with a combination of text and audio. I tried it out, uh, had some issues.
Ethan, you said you tried it out. What do you think about this? Is this important news?
Ethan: Uh, definitely is. I couldn't get past the Hugging Face, so my demo is still, you know, it's working on it. We're generating, but at the end of the day, this is a lot different than MusicLM, which we saw from Google a bit ago.
Um, so MusicGen does not require this, like, self-supervised representation, so it does it all in one pass. So a much more effective model, right? The examples I got to see were efficient. They are high quality and I think music is always a space, you know, we're interested in. We're not to vocals yet, but just a better way to generate these base-level, lo-fi sounds and, you know, copying some of the jazz songs I think is a fantastic application of it.
So really excited for MusicGen. If you can get the Hugging Face demo to work, do try it.
Farb: Conner, what are your thoughts? And bass level is always important in music, so yeah.
Conner: Yeah, this is another exciting audio transformer to have.
Uh, MusicLM, of course, is closed source from Google. They keep it proprietary and it's just like a playground thing from them. But MusicGen is open source. This is, again, Meta full-sending the open source. They put up a Hugging Face Space. This isn't just a research paper. This is: here's our paper, here's our Hugging Face Space, here's our code.
Use it, try it, enjoy it. And it looks like a pretty solid model. The melodies, all the soundtracks it can create, are pretty solid.
Farb: Yeah. Let me see if I can share my window here. This is, uh, I think the song that, uh, I got. I gave it a clip of some audio. I told it to make me a hip hop song. We're hearing it live for the first time here on AI Daily.
Ethan: I'm inspired.
Conner: You know, I think that's the start of your media career, Farb. It's pretty good.
Farb: And I love the name Tim. 60 Ones of Friction. Yes. Uh, is a great song name too. Uh, So that's pretty cool. I, I gotta say that's, that's definitely pretty awesome.
Conner: Exciting to see what people do with it.
Farb: You guys don't care.
You guys don't care. Don't care about my music. That's fine. I'll find somebody. It's the start of something great. The start of something new. Okay, well, great work, uh, to the folks on MusicGen. Pretty, pretty impressive. Uh, this sort of thing is obviously just gonna get even more wild, like Ethan said, with lyrics and voices and stuff like that.
Onto our second story. Mr. Zuckerberg did a, uh, podcast with Mr. Fridman and, uh, they talked about a lot of things including AI, and they talked about open source and they talked about Meta's plans for AI. Conner, can you tell us what Meta's plans for AI look like?
Conner: Yeah, apparently he had an all-hands with employees just yesterday and they said generative AI, text, images, videos, is going to go into every product, all the flagship products from Facebook to Instagram, probably WhatsApp and more.
Um, everything from, hey, let's modify my photo before I upload it, not just a filter, but completely regenerate the photo. Or, hey, WhatsApp or Facebook Messenger, I can have an assistant, I can have a little agent that I can talk to and help build with. Zuckerberg really is all in on generative AI, is what it looks like here.
Farb: An agent to go along with whatever FBI agent's watching you through your, through your camera.
Conner: You're, you're DMing the undercover FBI agents and you're DMing the Facebook AI agents, right?
Farb: Maybe they're DMing each other. Keep me out of the loop. What do you think about this, Ethan?
Ethan: Uh, yeah. I'd say the most important thing I got to see in it, similar to our first story, is Zuck's just, he's recommitting his focus, or maybe not focus, but their advocacy and care for open source.
So, you know, much of what we talked about for the past two weeks, a lot of open source developers are building on LLaMA. They're building possibly soon on MusicGen. All these applications are coming out of Meta, and I think him reconfirming that this is something they continually want to do is much different positioning than, say, OpenAI or Google.
So there's a lot of things he announced on a product side, but I think just from a culture standpoint, Meta's research team, which has unbelievable AI researchers, really talented team over there, the fact that they are allowed to release, and are releasing, these models that other people can build on, and will continue to do so, is really gonna shape the next 3, 4, 5, 10 years of AI development.
So I was really happy to hear that.
Farb: It's interesting, I have this weird sense that there is some shift in Zuck's worldview, or Facebook's, or Meta's view. It almost feels as though, okay, the metaverse kind of like we were thinking, uh, and thinking about dominating, is not coming to fruition.
Um, let me go get jacked, uh, kick some people in the head in an MMA ring, and then just be a total chaos agent and just be like, you know what? We're going full based over here at Meta. We're opening everything up. Open source AI to the world. I'm gonna be on the mat choking people out. Um, there's this weird sense that, like, there's a shift in his attitude.
Uh, and, you know, I mean it positively. He seems, like, leaned in and, you know, toned up and ready for what's coming. I don't know, maybe this is just me, but that's the sense. I guess a wartime CEO. Yeah. Maybe he is. He is ready for the coming AI wars and he feels like he can make some moves and make some waves here.
You know, one of the things that Meta has been pulling together for a long time, and I think you're sort of seeing it here, is just unbelievable talent when it comes to people who are doing AI. And, you know, one way that you can sort of commandeer the space and eat up the air in the room, if you don't have all the products there yet, is to sort of come out and be, hey, here's what we do have.
Show off what you do have, because it's pretty impressive what they have in terms of talent and the research and things they've done there.
Conner: Yeah, open source is really helping their own talent and pushes like that, like you said, exactly like open-sourcing LLaMA and then now ggml.ai with llama.cpp.
And then, like he said in the Lex Fridman podcast, um, they're now using llama.cpp internally, so yeah.
Farb: Yeah. And they get that, hey, if they make it seem as though this is where you come to as a researcher to get your stuff turned into, you know, get attention, uh, for the things that you're doing, that's a good way to recruit as well.
Conner: I believe they're hosting an internal hackathon this summer too, so yeah, I'm sure things will come outta that.
Farb: Will be cool to see. Uh, nice work, Mr. Zuckerberg. And then our final story, we have ClipDrop, which, I don't even, when I saw it today, I didn't know if it had been around for a long time because there just didn't seem like there was a lot of noise about it.
But it seems like a beautifully put-together product. Uh, it's a whole host of, you know, Firefly-style, uh, things you can do, by Stability AI. We'll show off the, uh, webpage here and, uh, I don't know, what do you think of this, Ethan? Did you play around with it at all?
Ethan: Um, a little bit. Yeah. They've put in a lot of what you could do with Stable Diffusion, or what other people are building, into ClipDrop.
And there's also an API as well. So if you're, you know, another startup or building a product, you can put ClipDrop into your product. I think, you know, the story here is really, Stability, of course, releases all their models open source. So in terms of a business plan and where they're gonna sit in the space in the future, we see them putting these products out, like real products people can use, and much easier APIs people can use.
So they're utilizing their position in the space to, you know, create clean products of stuff people want, and I'm happy to see it.
Farb: Conner, what do you think? Have you had a chance to play with it yet? What do you think of the API angle?
Conner: The API angle is pretty smart. Uh, like Ethan said, I think they're really always going open source and then they build a little more product around that.
And then so far what we've seen, they open source the product and then build more product. So they had StableStudio a bit back and then they open-sourced that as DreamStudio, I believe, right? It was DreamStudio, then they open-sourced it as StableStudio, and I wouldn't be surprised if we saw something like that with ClipDrop, if we saw some parts of it open-sourced as they continue building better models and better open source, um, or better closed source after this.
Farb: So it's a clean product. It's a very clean product page. I was impressed. Uh, alright, well, that's it for our three stories. I hope you enjoyed them. Uh, what are you guys seeing out there, Ethan?
Ethan: Uh, yeah, I saw a really good article. Um, it's called "What will GPT-2030 look like?" Um, you know, some of it, you could read it as common sense, but I think it's really well written, of like what the world could look like in 2030, both from a technical standpoint of, hey, we're gonna be able to do inferencing much faster and cheaper on device.
We've talked about some of this before, but also what does it really mean when you have, you know, full human capabilities across coding and hacking and reasoning and finances for a model? You know, 2030 sounds far away, but we're almost six and a half years from it. Um, so it was a really well-written piece on what the future could look like.
So we'll link it below, but I liked it.
Farb: Very cool. Conner?
Conner: Yeah, apparently, uh, Adobe, this is a very similar move to what GitHub did with GitHub Copilot, but Adobe is confident in Firefly's generative AI, so if you're sued for breaching copyright when using Firefly, then Adobe will pay your legal bills.
They'll back you. As long as you're using Firefly correctly and by their standards, then they'll cover you. This is the exact same move we covered a bit back with GitHub and GitHub Copilot Enterprise. If you're using Copilot in the right way and you're sued for copyright infringement, GitHub will protect you.
So it's nice to see these big companies standing by their models and not just throwing 'em out in the world and hiding behind laws.
Farb: So yeah, it seems like a great example of the market regulating itself. Mm-hmm. You know? Sure. Okay. Well, you know what I saw that was cool? Um, our friend, um, Joseph Jacks, as he goes on, uh, Twitter, has, uh, started talking about an open source framework for AI, since the open source framework for operating systems, for example, may not be the exact fit for what you need for AI, and specifically is focused on creating open source weights for AIs, and, uh, creating a framework for a manifesto, for a foundation to, you know, push the work of open source AIs forward. What do you guys think about that?
Conner: Yeah, I liked it. Open source code is very different from open source models, because open source code, you can study, you can run, you can really analyze and learn a lot from it. But these models, and like the weights of them, of course, specifically, you can't really do that. We're starting to see a little bit of that, but it's still very expensive and very impractical for the average person.
So we kind of do need a different licensing and a different way of thinking about, um, open source weights.
Ethan: Do you think it's more of like a branding and ethical play? Because I mean, at the end of the day, this is the, the model weights are just unreadable code. Do you, like, maybe from a branding perspective, it's important to touch on like law or privacy or something.
Farb: But whose branding? Say again? Whose branding?
Ethan: Just the branding of open source, maybe for, you know, this new organization trying to lead this, or just people wanting to release, you know, models, get them popular and say, we're under this new licensing that, you know, provides open source. But I just couldn't fully understand how, you know, it's just unreadable code.
Does that warrant something completely new?
Conner: I mean, I see it kind of as like a, like just a transparency and knowing-what-you're-doing play. Instead of just throwing an MIT license on whatever weights you have up there, you're like, no, I mean to say that this is open weights, this isn't just like code, this is different.
And I'm being clear about that.
Farb: It seems like they feel there is a value or some win in the specificity. Yes. Uh, And I'm not even sure if they're a hundred percent certain they're even right about that. It seems that, that they're saying, Hey, this might be the case. Maybe we can provide some framework and structure to make sure that it goes in the direction that it should instead of kind of leaving it open ended.
Conner: I think they're just trying to have a dialogue about specificity, as you've said.
Yeah. Uh, if you've read these licenses before, if you looked at them, they're very specific licenses. So having something more specific on neural networks and what you mean when they're open, I think it's important.
Farb: I mean, my, I imagine the foundation's ideal would be that these things are included in maybe existing licenses as opposed to some whole, whole new type of license, which could also be the case. Fascinating. Okay, well, we'll end it on that. I was waiting for something to just kind of wrap it all up there and that was perfect.
It is fascinating and thank you all for joining us today. We'll see you not tomorrow, but Monday. Have a great weekend everyone. See you guys. Peace.
Meta's Future AI Plans, MusicGen, and ClipDrop