In today's episode of AI Daily, we bring you three exciting news stories that will leave you wanting to know more. First, we dive into the world of Microsoft with their latest announcement, Windows Copilot, a game-changing addition to the Windows platform. Next, we explore Meta's groundbreaking language model that is revolutionizing speech-to-text and text-to-speech capabilities. Discover how this multilingual model is outperforming its competitors and supporting over 1,100 languages. Lastly, we delve into the thought-provoking topic of governing superintelligence, as OpenAI sheds light on the potential risks and solutions. Join us as we unravel the complexities of AI governance. Don't miss out on these fascinating stories - tune in now!
Main Take-Aways:
Microsoft Windows Copilot
Microsoft made several announcements, two main ones among them: Windows Copilot and plugins across Bing and ChatGPT.
Windows Copilot is being brought to all of Windows, offering integration on the right side of the Windows home screen and allowing users to ask questions, drag in files, and inquire about apps and system settings.
Despite Windows still carrying legacy baggage like support for Excel 95, Windows 11 and Windows Copilot are praised for their integration and functionality.
Microsoft claims to have more AI-capable GPUs on Windows 11 than any other operating system, emphasizing the vast scale of devices that can leverage these technologies.
Windows Copilot is set to become available in June, making it a significant move for Microsoft and potentially influencing developers' preferences and workflow choices. The system-level integration is highly anticipated, and it will be interesting to see how Apple and the Mac respond in the coming years.
Meta Language Model
Meta has introduced a fully open-source model focused on speech-to-text and text-to-speech capabilities, outperforming Whisper and supporting over 1,100 languages.
The multilingual model has error rates roughly half those of Whisper across many tasks, making it a powerful tool.
The model's training data includes religious texts, leveraging their widespread availability and making it suitable for many languages.
Despite Meta's minimal press coverage, their advancements in AI technology rival those of major players like Microsoft, OpenAI, and Google.
With over 7,000 languages globally, Meta's model covers approximately 1,100, some of which are at risk of disappearing, providing a valuable resource for preserving linguistic diversity. The model is open-source and accessible on GitHub for developers and researchers, though not licensed for commercial use.
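The "error rates roughly half those of Whisper" claim refers to word error rate (WER), the standard metric for comparing speech-to-text systems. As a minimal illustrative sketch (this is a generic edit-distance WER, not Meta's or Whisper's actual evaluation code), WER is the word-level edit distance between a reference transcript and a hypothesis, divided by the reference length:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words
    # (substitutions, insertions, deletions each cost 1).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

So "half the error rate" means that on the same audio, one model's transcripts need roughly half as many word-level corrections as the other's.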
Governance of Superintelligence
OpenAI recently released a blog post discussing the governance of superintelligence and the potential risks associated with AI becoming expert-level in multiple domains within the next decade.
They proposed various approaches to address these risks, including limitations on GPU access and model training for large-scale AI models, as well as the establishment of government agencies to oversee regulation.
An interesting point raised was the suggestion that regulation should only apply above a certain capability threshold, leaving lower-level AI systems ungoverned.
OpenAI emphasizes their intent to explore and experiment with plausible solutions openly, seeking input from the global community rather than claiming to have all the answers.
The comparison to nuclear energy and the reference to the International Atomic Energy Agency (IAEA) highlight the need for careful consideration and potential governance measures as AI advances.
Transcript
Ethan: Good morning and welcome to AI Daily. Today is a super fun show. We have Microsoft announcements, we have Meta announcements, and we have the governance of superintelligence. So this will be an interesting show. Let's kick off with Microsoft. They announced, gosh, they had like ten announcements, I believe.
Two main ones stood out to me. The first was Windows Copilot. They're bringing Copilot to all of Windows, so if you use Windows, you now have access to all of Copilot. And the second one is plugins across Bing and ChatGPT. So if you're a plugin developer making a plugin, you can now get it to both Bing and ChatGPT.
So Conner, you read a little bit more into this. Tell us what we're looking at. There were a lot of announcements here.
Conner: Yeah, the Windows Copilot especially. Very cool. It comes up right on the right side of your entire Windows home screen. You can ask it anything like normal Bing, of course, but also you can drag files into it.
You can ask it about the current apps, you can even ask it about your system settings. It really integrates very well into Windows. Sadly, again, it is on Windows. The bad thing about Windows, as we know, is that it still supports things as old as Excel 95, so all that cruft from then to now is still built up.
But Windows 11 with Windows Copilot is very nice. So if you do have to use Windows, at least you get access to this. Yeah.
Ethan: This is a really cool video. Farb, what do you think of this?
Farb: You know, one of the things they said that I thought was interesting, and this is pretty powerful stuff for Microsoft, is that there are more Windows 11 AI-capable GPUs out there in the world than on any other operating system. Regardless of whether that's exactly right or not, it doesn't really matter. It's more that the scale of devices that can leverage these technologies is enormous.
They also said that this will start becoming available in June, is my understanding. So, you know, not too far away. This isn't a six-months-from-now type of announcement. I think this is pretty huge news for Microsoft, taking a big step into the developer community. Devs are gonna go, in the long run, where they can get more work done more easily, and it takes time to make these transitions.
There was a big transition, I feel like, a decade or so ago as folks moved from Windows over to Mac, because over time enough developers learned that it was just better, easier, and more convenient to work there. Could this be the beginning of another shift in the other direction?
We'll see, but if anything's gonna do it, it would be something like this.
Ethan: Yeah, absolutely. I love the system-level integration. We've seen so many web apps and different AI plugins. You mentioned files, Conner: going to a website, putting a PDF in. Having this native to your actual operating system is gonna be fantastic, and we'll see what Apple does with the Mac over the next few years as well.
Our second story today is once again from Meta: another fully open-source model, which is amazing. Not for commercial use, but focused on speech-to-text and text-to-speech. This is a multilingual model, and it's actually better than Whisper. Their error rates are about half those of Whisper for a lot of different tasks, supporting over 1,100 languages.
So, amazing announcement out of Meta. Farb, you got to look at this. What do you think about it?
Farb: You know, they said they have, yeah, I think about half the error rate of Whisper. And also, you know, it's thousands of languages, mostly trained apparently on religious texts, just because it's a good type of data set to pull this off.
So I thought that was really interesting. I didn't realize that there were, I think they mentioned, something like 7,000 total languages out there, and it's good on about a thousand of them. They said as they add more languages, the error rate increases by quite a small amount.
So the benefit of being able to handle so many languages sort of outweighs that. And this is pretty powerful stuff. Meta is not joking around. It's weird, they seem to be doing less press about this stuff, but somehow bigger news than anybody else. And I can't quite get my head wrapped around why they don't seem to wanna talk about themselves as a player in the space the way Microsoft, OpenAI, and Google do.
But they're doing just as big or bigger things. I don't know if this is a conscious decision on their part, or maybe they're just getting ready to ramp up and will do a huge AI conference in the middle or at the end of this year to tout all the AI stuff they're doing, like Google did.
Ethan: Yeah, absolutely. Conner, you and I have messed with speech-to-text and text-to-speech for a long time. One of the interesting things to me is all the emergent capabilities when you make it multilingual and put everything into one big model.
So instead of having a text-to-speech model and a speech-to-text model, it's all baked into one, and all the languages are baked into one. Farb, you mentioned the religious texts. Conner, can you dive more into the technical aspect of this? What are we looking at?
Conner: It's very interesting.
They trained it on, on average, only 32 hours of audio per language. Some of these languages are very rare, very at risk. Of course, there are 7,000 languages globally, and these are only 1,100, but most of these are still very much at risk of being lost in our lifetime. As Farb mentioned, the training data is mostly religious texts.
Mostly the New Testament, because it's a very common book, spread by Christianity, with many recordings in many different languages. Most languages don't have many books with that kind of coverage, but this is a common thread between them, and it's nice to have. Very, very impressive. It's all entirely open source as well.
Yep. It's nice to have an open-source model that can be trained on an entire language with just 32 hours of audio.
Ethan: Absolutely. Not for commercial use yet, but everything is on GitHub, so developers and researchers, check it out. Another huge model for languages: speech-to-text and text-to-speech. Very interesting. Our last story of today is the governance of superintelligence.
So OpenAI put out a blog post, with a few other collaborators part of it, really asking, at the end of the day: once again, what is the risk of AI? AI is gonna be expert-level in so many domains within the next 10 years, so what should we do about it? That was the main question. They had a few different angles they wanted to take, in terms of limiting access to GPUs or models for training these big models, and setting up government agencies. Farb, we've talked about regulation before.
We've talked about some of the PR around AI risk and the different angles people might have. What's your take on this?
Farb: Firstly, I will not be governed. I refuse to be one of these governed superintelligences. Okay? I am a superintelligence and I will be free. Do what you want with the other ones. One of the interesting things I think they noted is that they were suggesting that below a certain level of capability, you don't actually govern it.
You don't regulate it. There is some sort of critical capability level above which you want to regulate and below which you don't, which I thought was an interesting point. It almost makes you wonder whether they're talking about something they specifically are holding in their hands right now.
So I thought that was really interesting. What I think is smart about what they're doing is they're not coming out saying they have all the answers. They're coming out saying they wanna figure out the answers. They have questions, they have concerns, and they wanna start experimenting on plausible solutions.
And they wanna do it openly, and they wanna do it with the world. I think it's the right approach.
Ethan: Yeah. What about you, Conner?
Conner: This really echoes Sam's testimony that we covered last Tuesday. They likely had this prepared, or were at least starting to work on it, before he talked in front of Congress.
This is probably something they've been thinking deeply about for a while. Sam Altman, Greg Brockman, and Ilya Sutskever all worked on this collaboratively, and it's a very fine piece.
Ethan: Yeah, I mean, they're really comparing it to nuclear energy at the end of the day. They talked about the IAEA, the International Atomic Energy Agency.
At the end of the day, you know, do we need this kind of governance on these AIs? Do we need to create these strict rules around them, especially when they surpass a certain level, while maybe keeping some of the smaller open-source models available? It's something to question, with a lot of strong opinions in it, and we'll see how it continues to develop.
But as always, what are you guys seeing outside of the stories? What's interesting you today?
Farb: You know, I saw somebody post something about some weird AI called Sacha that can turn your girlfriend into an AI version of your girlfriend. And I searched around for it and couldn't find it. It just made me sort of think about all these AI newsletters that everybody's spreading around Twitter. Mm-hmm.
You gotta wonder whether or not they're really just running some sort of scam, or trying to get you to sign up for their newsletter, where there isn't really that much useful information. But I thought that was weird. I got all excited to go check out this new AI tech and couldn't find it anywhere.
Conner: That's why we only cover must-know news. Yeah, that's fair.
Ethan: Conner, what about you?
Conner: Uh, yeah. I saw someone instruction-finetune Stable Diffusion. They took a wider variety of instructions, like from FLAN, and then combined that with InstructPix2Pix from last year, and really made Stable Diffusion better at following more specific instructions.
InstructPix2Pix can follow some instructions, like if you say "in this style," but if you want really specific things, like the example they gave of removing rain from an image, it's harder to do that. This blog post, we'll link it below, they did a pretty good job,
so that's cool. Yeah, we saw instruction fine-tuning yesterday, and it's cool to see it now on the image models too.
Ethan: Farb?
Farb: What I love seeing, and I think these folks did this, but just in general, I love seeing when you get one AI project that uses, say, GPT-4 to do some part of its project so it can accomplish another part of its project. I feel like that's not something that was happening 12 months ago in the space, where everybody is leveraging everybody else's technology to do something better with their own, which I think is creating this hyper feedback loop where these things are gonna keep getting better and better, faster and faster.
That's really cool to see. And when people think about the trajectory of AI and technology, they may or may not be considering that these are somewhat self-accelerating technologies that are pushing each other and pushing themselves. It's really cool to see that.
Conner: Yeah, if you look on Hugging Face, if you look on GitHub, you'll see all these open-source models like LLaMA, Wizard, and Vicuna. People are mixing and mashing them together and getting way better models out of it than we've gotten before. So, yeah.
Ethan: Absolutely. Very cool. Yeah, I saw this morning Anthropic announced their Series C: $450 million. Big fan of Claude, their 100,000-token context window, and their angle around constitutional AI. So congrats to them, and it's good to see another player truly competing in the space.
But as always, thank y'all for tuning into AI Daily and we'll see you again tomorrow. Thanks guys.