Tristan Wolff

GROK: Did Elon Just Drop An AI Game-Changer?

The Open-Source Era Of GROK Begins

How a new open-source model is offering unparalleled scale

“The cover image was generated using Midjourney based on the following prompt proposed by Grok: A 3D illustration of a neural network, with transparent nodes and glowing connections, showcasing the varying weights as different thicknesses and colors of the connecting lines.” (from: https://x.ai/blog/grok-os)

The large language model (LLM) “Grok” caused quite a buzz during its initial development and testing phase back in 2023.

Elon Musk’s X.ai has now released the model’s weights and code, providing us with possibly the largest open-source model EVER, boasting 314 billion parameters (for comparison, GPT-3 had 175 billion).

This is so large that it no longer fits on a single GPU. That is nothing unusual for LLMs, but rare for an open-source model. However, there’s no need to worry: we will still be able to utilize Grok without needing to set up a GPU farm in our bedrooms. Here’s how:

What We Know About Grok

It may take another few days before we can actually use Grok. But it is worth the wait, considering:

  • the immense size (314 billion parameters)
  • the fact that it is a non-fine-tuned base model without any specialization so far (GPT-4, for example, has been fine-tuned to follow instructions and thus works in a dialogue-style chatbot interface)

Grok is particularly intriguing to keep an eye on, given the potential developments that will arise from further fine-tuning and quantization (we’ll clarify what that means in the next section).

So, what do we know about Grok? Here’s what X.ai tells us:

We are releasing the base model weights and network architecture of Grok-1, our large language model. Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.

This is the raw base model checkpoint from the Grok-1 pre-training phase, which concluded in October 2023. This means that the model is not fine-tuned for any specific application, such as dialogue.

Besides the size and the fact it’s a “raw base model” without any fine-tuning, there’s another interesting detail: Grok uses the successful Mixture of Experts (MoE) architecture, the same architecture that reportedly powers GPT-4. In an MoE model, only a subset of the network (the “experts” selected for each input) is active at a time. This seems to significantly increase the capabilities of models, especially when they are as large as Grok with its 314 billion parameters.
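To make the Mixture-of-Experts idea more concrete, here is a minimal sketch of top-k routing in plain Python. It is an illustration only, not X.ai’s implementation: the toy sizes, the function names (`softmax`, `moe_forward`), the four made-up experts and the choice of two active experts per input are all assumptions for this example. A small router scores every expert, only the best-scoring experts actually run, and their outputs are blended using the renormalized router weights.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k experts and blend their outputs.

    x: list of floats (the features of one token)
    router_weights: one weight vector per expert (hypothetical toy router)
    experts: list of callables, each mapping x to a list of floats
    """
    # Router: one score per expert (dot product of x with that expert's vector).
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    probs = softmax(scores)

    # Keep only the top_k experts; the others are never evaluated.
    ranked = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)

    output = [0.0] * len(x)
    for i in chosen:
        gate = probs[i] / norm  # renormalize so the gates sum to 1
        expert_out = experts[i](x)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen

# Toy setup: four "experts" that simply scale the input by a different factor.
experts = [lambda x, k=k: [k * xi for xi in x] for k in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]

output, chosen = moe_forward([0.5, 2.0], router_weights, experts, top_k=2)
print("active experts:", chosen)
print("blended output:", output)
```

The point of the design is that the total parameter count can grow with the number of experts, while the work done per input only grows with `top_k`.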

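You can read more about how MoE supports language models here: “Wait, Did They Just Leak The Secret Behind GPT-4?” (https://generativeai.pub/wait-did-they-just-leak-the-secret-behind-gpt-4-845d3db751cd)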

We do know how big Grok is, but unfortunately we don’t know anything about the training data yet. However, if you look at the benchmarks that X.ai published, Grok appears to perform within the capability range of GPT-3.5 or Mistral.

[Figure: benchmark comparison published by X.ai. Source: https://x.ai/blog/grok]

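In a few days, I think, we can expect Grok (and its derivatives) to show up on the usual LLM leaderboards, such as https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard or https://llm-leaderboard.streamlit.app/.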

What’s Next?

It usually only takes a few days or weeks for the open-source community to publish quantized versions of major open-source models.

In what is known as “quantization”, the original model is simplified to reduce its size and shorten its inference time. This allows it to be used on devices with limited computing power or memory.
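A quick back-of-the-envelope calculation shows why size matters here. The sketch below only counts the memory needed to store 314 billion weights at different precisions. The 80 GB figure for a single high-end GPU is an assumption for illustration, and real inference needs additional memory on top of the weights.

```python
import math

PARAMS = 314e9        # Grok-1 parameter count from the X.ai announcement
GPU_MEMORY_GB = 80    # assumed memory of one high-end GPU (illustrative)

def weights_size_gb(num_params, bits_per_param):
    """Memory needed to store the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

for bits in (32, 16, 8, 4):
    size = weights_size_gb(PARAMS, bits)
    gpus = math.ceil(size / GPU_MEMORY_GB)
    print(f"{bits:>2}-bit: {size:7.1f} GB of weights, at least {gpus} GPU(s)")
```

At 16 bits per parameter the weights alone come to roughly 628 GB; at 4 bits they shrink to roughly 157 GB.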

How does this simplification work?

Remember the parameters, of which Grok has 314 billion? Each individual parameter is a learned number (a “weight”), if you will, a dial the model uses when it calculates the numerical values that add up to the final prediction at the end: which is the most likely word to follow an input sequence of words.
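As a rough illustration of how numerical values turn into a next-word prediction, here is a toy sketch with just six made-up weights instead of hundreds of billions. The vocabulary, the weights and the input features are all hypothetical; a real LLM stacks many such steps, but the final move (score every candidate word, normalize the scores, pick the most likely one) follows the same idea.

```python
import math

# Hypothetical toy "model": one weight per (input feature, candidate word).
vocabulary = ["mat", "moon", "car"]
weights = [
    [2.0, 0.1, -1.0],  # feature 0: contribution to the score of each word
    [0.5, 1.5, 0.3],   # feature 1
]

def next_word_probabilities(features):
    """Multiply the input by the weights, then normalize the scores with softmax."""
    scores = [
        sum(features[i] * weights[i][j] for i in range(len(features)))
        for j in range(len(vocabulary))
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

features = [1.0, 0.2]  # stand-in for an encoded input such as "the cat sat on the"
probs = next_word_probabilities(features)
best = vocabulary[probs.index(max(probs))]
print(dict(zip(vocabulary, probs)), "->", best)
```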

Quantization reduces the precision of these numerical values (usually by converting floating-point numbers to lower-precision formats, such as 8-bit or 4-bit integers).
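The following sketch shows one simple way this can work: symmetric 8-bit quantization with a single shared scale factor. It is a minimal example under stated assumptions, not the method any particular Grok release uses; the function names and the example weights are made up, and real quantization schemes are usually more elaborate (for example, using separate scales for small groups of weights).

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.8131, -0.2954, 0.0417, -1.2702, 0.5569]  # made-up example weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

for original, q, approx in zip(weights, quantized, restored):
    print(f"{original:+.4f} -> {q:+4d} -> {approx:+.4f}")
```

Each weight now needs only one byte plus a share of the scale factor, at the cost of a small rounding error of at most half a quantization step per value.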

Once we have quantized versions of Grok, we’ll see more people experimenting with it. In the meantime, we can expect the first online demos for Grok to show up on your favorite machine learning platform, such as Hugging Face (https://huggingface.co/) or Replicate (https://replicate.com/).

Links

Grok Announcement (2023): https://x.ai/blog/grok

Grok Open Source Announcement (2024): https://x.ai/blog/grok-os

Grok on Hugging Face: https://huggingface.co/xai-org/grok-1

Grok on GitHub: https://github.com/xai-org/grok-1

This story is published on Generative AI. Connect with us on LinkedIn and follow Zeniteq to stay in the loop with the latest AI stories. Let’s shape the future of AI together!
