New LLM by Mistral AI
A Quiet Revolution? Mistral AI Releases Sensational New AI Model
GPT-4’s “secret weapon” is available as open source
At the same time that Google announced their new “Gemini” model with great fanfare, including a press tour and a spectacular (but possibly not entirely honest) demo video, a quiet revolution may have begun.
No, we’re not talking about Gemini (which isn’t even fully available yet, and where some details about the model’s capabilities remain unclear, to say the least).

Instead, we will look at the startup Mistral AI, which posted a download link to their latest language model on X.
Just the link.
No comment.
Casually dropping what may be the first open source language model built on the architecture that OpenAI’s flagship GPT-4 is widely reported to run on.

Let’s find out why the open source community is so excited about this.
An Open-Source Alternative to GPT-4?
Let’s first look at that download link that has been posted on X and the files it leads to.
I mean, just reading the file name of that model must have been a true delight to developers: Mixtral-8x7B-32kseqlen
Nope, not the name of Elon Musk’s next child but actually a preview of the new language model’s capabilities.
And these are impressive:
- the context size is 32k tokens (equivalent to ChatGPT standards)
- the model’s architecture is the one GPT-4 reportedly uses as well: the so-called “Mixture of Experts,” where several specialized sub-models (“experts”) are combined (in the case of Mixtral, the name points to 8 experts with 7 billion parameters each: “8x7B”)
Why Is “Mixture of Experts” So Important?
The concept originates from the early 90s and is widely believed to underpin the success of GPT-4: “Mixture of Experts” (MoE) is an approach to building AI systems in which, instead of a single model learning everything, a combination of sub-models is used.
Imagine it as a team of experts working together to solve a complex problem.
To coordinate this panel of experts efficiently, an additional so-called gating network comes into play, which can be thought of as a kind of team leader assigning tasks to the experts.

Interestingly, the gating network does not have to rely on a single expert but can combine the insights of several experts, with multiple nuanced viewpoints contributing to solving a problem.
In the case of Mixtral, the gating network decides which experts contribute to each text prediction. Mixtral’s current metadata reveals that the model consults 2 of the 8 available experts for each prediction (i.e., each individual token is calculated through the interplay of two specialized sub-models).
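To make the "2 of 8 experts" idea more tangible, here is a minimal sketch of top-2 routing in plain Python. It is a toy illustration, not Mistral's actual implementation: the experts are stand-in functions, the gate scores are random numbers rather than the output of a trained gating network, and the names route_token and moe_output are made up for this example. Renormalising the weights over only the selected experts is also an assumption of this sketch.

```python
import math
import random

NUM_EXPERTS = 8  # the "8x" in the model name
TOP_K = 2        # experts consulted per token, as described in the article


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_scores, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and renormalise their weights.

    Returns a list of (expert_index, weight) pairs.
    """
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_output(token_value, gate_scores, experts):
    """Weighted sum of the outputs of the selected experts only.

    The experts that were not selected are never called.
    """
    return sum(weight * experts[i](token_value)
               for i, weight in route_token(gate_scores))


# Toy experts (assumption): expert k simply multiplies its input by k + 1.
experts = [lambda x, k=k: (k + 1) * x for k in range(NUM_EXPERTS)]

# Hypothetical gate scores; a real gating network would compute these
# from the token's hidden state.
random.seed(0)
gate_scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_token(gate_scores)
print("selected experts and weights:", selected)
print("combined output:", moe_output(1.0, gate_scores, experts))
```

The point of the design is that only the two chosen experts do any work for a given token, so the compute per token is much lower than if all eight experts ran every time.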
Thus, Mixtral utilizes one of the most advanced model architectures and has the potential to become a game-changer for the open-source community. We saw something similar with the release of the AI image model Stable Diffusion, when countless developers worldwide were able to build new AI models and workflows that are now part of the standard repertoire of generative AI for image processing.
How To Use Mixtral 8x7B?
At the time of writing, there’s only one platform offering free testing of Mixtral: Poe.com
(this section will be updated as soon as we get more inference endpoints)
A Quiet Revolution?
In contrast to Google’s glitz and glamour show, Mistral AI’s release strategy seems to specifically target those who actually work with AI: developers looking for a publicly accessible and extremely powerful AI model to adapt to their field of work.
And with MoE entering the open source space, Mistral may indeed have started a quiet revolution, not just for developers but for anybody who is experimenting with AI and looking for new creative possibilities.