Apple discreetly released an open-source multimodal LLM in October.

Summary

Apple, in collaboration with Columbia University, has quietly released an open-source multimodal LLM named Ferret, signaling a shift towards open-source AI and potential integration into consumer devices.

Abstract

In October 2023, Apple, alongside researchers from Columbia University, introduced an open-source multimodal language model known as Ferret, marking an unexpected move into the realm of open-source AI. This development has sparked interest in the AI community, particularly in the context of running local LLMs on small devices and the potential for more immersive visual experiences powered by AI. Apple's foray into open-source LLMs, despite its traditionally closed ecosystem, has been met with enthusiasm, with experts like Bart de Witte and Tristan Behrens highlighting the significance of Apple's contribution to the open-source AI landscape. The move is seen as strategic, with Apple possibly aiming to compete with dominant models like ChatGPT by either becoming a hyperscaler, forming partnerships, or avoiding dependency on cloud providers like Microsoft or Google. The release of Ferret under a non-commercial license showcases Apple's commitment to significant AI research and positions the company as a leader in the multimodal AI sector.

Opinions

Bart de Witte, a European non-profit leader, welcomes Apple's entry into the open-source AI community as a significant event.
Tristan Behrens, a German AI music artist and advisor, expresses excitement about the prospect of Local Large Language Models (LLLMs) running on iPhones.
Ben Dickson, a tech blogger, views Apple's move as a logical step in the company's AI development, suggesting that to compete with models like ChatGPT, Apple needs to scale up its AI efforts or form strategic partnerships.
The AI community recognizes the importance of Apple's open-source LLM release as a way to avoid dependency on external cloud providers and to innovate in the multimodal AI space.

First, some background: a large language model (LLM) is a type of artificial intelligence (AI) program that can recognize and generate text, among other tasks.

Apple and Columbia University researchers released an open-source multimodal large language model (LLM) called Ferret in October 2023, which initially received little attention. However, with the recent release of open-source models from Mistral, and with Google’s Gemini model coming to the Pixel Pro and Android, interest in the potential for local LLMs to power small devices has increased. Apple also announced two new research papers introducing techniques for 3D avatars and efficient language model inference, potentially enabling more immersive visual experiences and allowing complex AI systems to run on consumer devices like iPhones and iPads.

The AI community celebrated Apple’s unexpected entry into the open-source LLM space, since Apple has always been known as a “walled garden.” Bart de Witte, who leads a European non-profit focused on open-source AI in medicine, posted on X that Apple had joined the open-source AI community in October. Tristan Behrens, a German AI music artist and advisor, also weighed in, expressing excitement for the day when Local Large Language Models (LLLMs) run on the iPhone as an integrated service of a redesigned iOS.

Source: Zhe Gan

Tech blogger and VentureBeat contributor Ben Dickson described this as the AI development of 2023 that he least expected. In retrospect, it makes sense for Apple to enter the LLM market with open-source models. To compete with models like ChatGPT, Apple needs to be a hyperscaler or partner with one. The alternative would be to become dependent on a cloud provider like Microsoft or Google, or to release its own open-source models, à la Meta.

When choosing between local and cloud deployment for large language models (LLMs), the main factors to consider include scalability needs, data privacy, security requirements, cost constraints, ease of use, the need for the latest models, and predictability. Other considerations include vendor lock-in issues, network latency tolerance, and team expertise.

Scalability depends on the number of users and models needed to meet requirements, while data privacy and security requirements determine whether a cloud-based solution is acceptable or the model has to run locally.

On cost, running models locally may be more cost-effective for those with limited budgets who already have access to hardware. On ease of use, cloud platforms often offer plug-and-play tools that simplify deployment, making it more accessible and manageable.

Predictability favors on-premise infrastructure, whose costs are easier to plan and manage, while avoiding vendor lock-in means taking on more self-maintenance.

Network latency tolerance matters most for applications that need real-time responses, where lower latency is crucial.

Team expertise is also essential when choosing an option, as implementing a new solution incurs costs in time, money, and human resources.
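
One way to make these trade-offs concrete is a simple weighted decision matrix. The Python sketch below is only an illustration: the factor names are taken from the list above, but every score and weight is a hypothetical placeholder rather than a measurement, and the names Option, weighted_score, and rank are invented for this example. Anyone using it would substitute their own numbers.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Option:
    """A deployment option with a 1 (poor) to 5 (strong) score per factor."""
    name: str
    scores: Mapping[str, int]


def weighted_score(option: Option, weights: Mapping[str, float]) -> float:
    """Return the weight-normalized average score for one option."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(w * option.scores[f] for f, w in weights.items()) / total_weight


def rank(options: List[Option], weights: Mapping[str, float]) -> List[Option]:
    """Sort options from best to worst under the given weights."""
    return sorted(options, key=lambda o: weighted_score(o, weights), reverse=True)


# Hypothetical scores, chosen only to illustrate the mechanics.
LOCAL = Option("local", {
    "scalability": 2, "data_privacy": 5, "cost": 3, "ease_of_use": 2,
    "latest_models": 2, "predictability": 4, "vendor_independence": 5,
    "latency": 5, "team_fit": 2,
})
CLOUD = Option("cloud", {
    "scalability": 5, "data_privacy": 2, "cost": 3, "ease_of_use": 5,
    "latest_models": 5, "predictability": 2, "vendor_independence": 2,
    "latency": 3, "team_fit": 4,
})

# Hypothetical weights for a team that cares most about privacy and latency.
PRIVACY_FIRST = {
    "scalability": 1, "data_privacy": 5, "cost": 1, "ease_of_use": 1,
    "latest_models": 1, "predictability": 1, "vendor_independence": 3,
    "latency": 4, "team_fit": 1,
}

if __name__ == "__main__":
    for option in rank([LOCAL, CLOUD], PRIVACY_FIRST):
        print(f"{option.name}: {weighted_score(option, PRIVACY_FIRST):.2f}")
```

With these made-up numbers, the privacy-focused weighting puts the local option first, while equal weights would narrowly favor the cloud option. The point is that the ranking depends on which factors a team weights most heavily, not that one option is better in general.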

Apple’s recent release of an open-source LLM, albeit under a non-commercial license, demonstrates its dedication to significant AI research and cements its position as a pioneer in the multimodal AI sector. To compete with models like ChatGPT, Apple needs to be a hyperscaler or partner with one. The alternative would be to become dependent on a cloud provider like Microsoft or Google, or to release its own open-source models, similar to Meta.

Anthropic and OpenAI are reportedly negotiating massive new funding rounds for their proprietary LLM development efforts. Anthropic is in discussions to raise $750 million from Menlo Ventures, while OpenAI is in talks to raise a fresh round of funding at a valuation at or above $100 billion.
