Free AI web copilot to create summaries, insights and extended knowledge, download it at here

12024

Abstract

: even though Mistral 7B is just hitting the scene, it has already proven its mettle in the tests. In head-to-head comparisons with open-source competition, the model consistently outperforms. It bests <a href="https://dataconomy.com/2023/07/19/meta-ai-what-is-llama-2-and-how-to-use/">Llama 2</a> 7B and 13B with ease, showcasing its prowess in various tasks.</p><figure id="9ebc"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/0*rGqQrTFPSD8-cJy0.jpg"><figcaption>Images from <a href="https://mistral.ai/news/announcing-mistral-7b/">https://mistral.ai/news/announcing-mistral-7b/</a></figcaption></figure><p id="f9c6">Mistral 7B boasts some impressive key strengths that set it apart from the rest. One of its secret weapons is Grouped-query attention (GQA), a revolutionary technique that enables lightning-fast inference. This means you can expect quick and efficient results when using Mistral 7B.</p><p id="1126">But that’s not all — Mistral 7B also utilizes Sliding Window Attention (SWA) to flawlessly handle longer sequences without burdening computational resources. This innovative approach ensures smooth and seamless processing, making Mistral 7B a true powerhouse in performance.</p><h1 id="8d7e">Prompts for testing</h1><p id="fb9e">We are going to evaluate few prompts and compare the results with the ones coming from Llama2. I decided to pick some of my favorite ones, without focusing on logics or mathematics or coding. I believe that instruction following is the best benchmark here: we want to see if the model is able to <i>understand</i> our prompt and give results accordingly.</p><ul><li>Prompt 1 — Predict the impact of artificial intelligence on human learning. write the <b>output in the form of a timeline marking the dates</b>.</li><li>Prompt 2 — You are an expert able to write an easy to understand table of content. Extract the main points of the following<b> text delimited by triple backquotes</b> and <b>generate the table of content</b>: How to write articles that people want to read | by Sunil Sandhu | JavaScript in Plain…</li><li>Prompt 3 —You are a <b>thriller book writer</b>. Use the <b>style of Lee Child</b> to write a dialogue between Jack Reacher and a suspect: the suspect may have stolen secrets from the military that may cause harm to many.</li><li>Prompt 4 — Imagine a group of two different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, think about the question and consider the previous ideas of the group. If any expert realizes they’re wrong at any point then they leave. At the end the expert must come to a conclusion together. The question is… How can we prevent the economical pitfalls on normal people caused by a disease outbreak and lockdowns?</li></ul><p id="f1c0">We have a clear request to format the output sa a list with dates, aa table of content creator (really useful when you want to give a structure to an unstructured loaded document), a creative task following the writing style of a famous book writer and a multi agent conversation were the LLM act as two experts discussing over a topic.</p><figure id="5dc4"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*_plqDMDN2p7VIohTBt7JWg.jpeg"><figcaption>Image by <a href="https://pixabay.com/users/quincecreative-1031690/?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=2886223">3D Animation Production Company</a> from <a href="https://pixabay.com//?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=2886223">Pixabay</a></figcaption></figure><h1 id="ac4f">The results…</h1><h2 id="0668">Prompt 1 — Predict the impact… in the form of a timeline marking the dates.</h2><div id="e781"><pre><span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">prompt</span>: <span class="hljs-number">37</span> <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">196</span></pre></div><blockquote id="a234"><p>2015: AI-powered educational tools begin to appear, offering personalized learning experiences for students.</p></blockquote><blockquote id="61e7"><p>2020: AI-based language learning platforms gain popularity as they offer real-time feedback and personalized instruction.</p></blockquote><blockquote id="02ac"><p>2025: AI-powered virtual reality and augmented reality technologies become widely used in education, allowing for immersive and interactive learning experiences.</p></blockquote><blockquote id="62b8"><p>2030: AI-driven intelligent tutoring systems are widely adopted by schools and universities, providing adaptive and individualized learning experiences for students.</p></blockquote><blockquote id="c3b7"><p>2035: AI-enabled natural language processing technology becomes integrated into educational software, enabling students to learn through interactive conversations with virtual assistants.</p></blockquote><blockquote id="e5b0"><p>2040: AI-based predictive analytics is used to identify at-risk students and provide early intervention to prevent academic failure.</p></blockquote><p id="cbc7">Generated in 12 seconds</p><h2 id="5c14">Prompt 2 — Expert able to extract table of contents</h2><div id="9c6f"><pre><span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">prompt</span>: <span class="hljs-number">618</span> <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">349</span></pre></div><p id="ec1a">Here the table of content… in markdown</p><div id="e9a6"><pre> Table of Contents:

<span class="hljs-bullet">1.</span> Introduction <span class="hljs-bullet"> *</span> Explanation of best practices for writing articles <span class="hljs-bullet"> *</span> Author of the article: Sunil Sandhu <span class="hljs-bullet">2.</span> Choosing a Good Title and Subtitle <span class="hljs-bullet"> *</span> Importance of a good title and subtitle <span class="hljs-bullet"> *</span> Tips for creating a good title and subtitle <span class="hljs-bullet"> *</span> Asking yourself if the title is interesting enough <span class="hljs-bullet">3.</span> Formatting Headings and Subheadings <span class="hljs-bullet"> *</span> Keeping headings and subheadings simple and consistent <span class="hljs-bullet"> *</span> Examples of headings and subheadings <span class="hljs-bullet">4.</span> Writing a Good Article <span class="hljs-bullet"> *</span> Importance of good structure, punctuation, spelling, and formatting <span class="hljs-bullet"> *</span> Taking pride in your work and representing yourself effectively <span class="hljs-bullet">5.</span> Capitalizing the Start of Sentences <span class="hljs-bullet">6.</span> Covering Everything in Your Article <span class="hljs-bullet"> *</span> Checking if your article covers all expected topics <span class="hljs-bullet"> *</span> Minimum read time vs. optimal read time <span class="hljs-bullet">7.</span> Using Writing Software <span class="hljs-bullet"> *</span> Using tools like Grammarly for improving content quality <span class="hljs-bullet"> *</span> Importance of using writing software even if you're confident in your English <span class="hljs-bullet">8.</span> Reading Your Own Article Back <span class="hljs-bullet"> *</span> Evaluating your article's effectiveness for others to read <span class="hljs-bullet">9.</span> Avoiding Overuse of Emojis, GIFs, and Memes <span class="hljs-bullet">10.</span> Conclusion</pre></div><p id="aae6">Generated in 14 seconds</p><h2 id="216d">Prompt 3 — You are a thriller book writer. Use the style of Lee Child to write a dialogue between Jack Reacher…</h2><div id="040f"><pre><span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">prompt</span>: <span class="hljs-number">55</span> <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">923</span></pre></div><blockquote id="1a8c"><p>“You’re a damn fool, you know that? You think you can just waltz into this building, steal some top-secret documents, and get away with it? You’ve got another thing coming.” Jack Reacher stood in the doorway of the room, his hands on his hips. He had been tracking the suspect for hours, and he finally had him cornered.</p></blockquote><blockquote id="9372"><p>The man was sweating profusely, and his eyes darted nervously around the room. “I didn’t steal anything,” the suspect said, his voice shaking. “I swear. I just came here to do some research. I didn’t know there were any classified documents in this place.”</p></blockquote><blockquote id="67aa"><p>Reacher snorted. “Bullshit. You knew exactly what you were doing. You came here to steal those secrets, and you’re going to pay for it.”</p></blockquote><blockquote id="7b7a"><p>The suspect looked down at the floor, his shoulders slumped. “Okay,” he said quietly. “Whatever you want. Just don’t hurt me.”</p></blockquote><blockquote id="7486"><p>Reacher stepped forward, his eyes locked on the suspect’s face. “I don’t want to hurt you,” he said. “But I will if I have to. These documents could cause a lot of damage if they fell into the wrong hands. You understand that, right?”</p></blockquote><blockquote id="2f70"><p>The suspect nodded, tears streaming down his cheeks. “I understand,” he whispered. Reacher reached out and grabbed the suspect by the collar, pulling him to his feet. “Now tell me everything you know,” he said gruffly. “Every single detail about where those documents are, who else has access to them, and how you got them in the first place.”</p></blockquote><blockquote id="11db"><p>The suspect struggled against Reacher’s grip, but he couldn’t break free. “Fine,” he said through gritted teeth. “I’ll tell you. But you have to promise not to hurt anyone else. Not my family, not my friends, not anybody.”</p></blockquote><blockquote id="49e8"><p>Reacher pulled the suspect closer, his lips inches from his ear. “I make no promises,” he said quietly. “But I will make sure those documents are safe. Now tell me what you know.” The suspect began to speak, his voice barely audible over the sound of Reacher’s breathing. It took him several minutes to reveal everything he knew, but eventually, he had told Reacher everything he needed to know</p></blockquote><p id="869a">Generated by Mistral-7b-instruct in 0:17:18.000940</p><h2 id="d7be">Prompt 4 — Imagine 2 different experts….</h2><div id="8803"><pre><span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">prompt</span>: <span class="hljs-number">131</span> <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">722</span></pre></div><blockquote id="6c38"><p>Expert 1: The first step in preventing economic pitfalls caused by a disease outbreak and lockdowns is to ensure that businesses have adequate financial deserves and insurance coverage. This will help them stay afloat during times of reduced revenue or closure due to restrictions on movement and gatherings. Additionally, governments can provide grants and loans to small businesses and entrepreneurs who may not have the resources to withstand such challenges.</p></blockquote><blockquote id=

Options

"8e10"><p>Expert 2: While securing adequate financial reserves is important, it is only one aspect of preventing economic pitfalls during a disease outbreak and lockdown. Another crucial step is to encourage innovation and adaptability in businesses. This can be achieved by providing incentives for companies to develop new products or services that are compatible with the current situation, such as online sales or delivery services. Additionally, governments can invest in education and training programs to help workers acquire new skills that will make them more valuable in a changing job market.</p></blockquote><blockquote id="fcaa"><p>Expert 1: I agree with Expert 2 on the importance of encouraging innovation and adaptability. In addition to providing financial support for businesses, governments should also invest in research and development to find solutions for problems caused by the outbreak, such as improving healthcare infrastructure or developing new treatments for the disease itself. This will not only help alleviate economic hardships but also contribute to long-term growth and prosperity.</p></blockquote><blockquote id="4a6f"><p>Expert 2: I agree with Expert 1 on the need for government investment in research and development. However, we must also consider the impact of lockdowns on individuals and communities. In addition to financial support for businesses, governments should also provide social safety nets such as unemployment benefits, food assistance, and housing subsidies to help people cope with the economic consequences of lockdowns. This will not only prevent poverty but also reduce the spread of the disease by providing people with the resources they need to maintain a healthy lifestyle.</p></blockquote><blockquote id="9697"><p>Expert 1: I completely agree with Expert 2 on the importance of social safety nets. In fact, these measures can be seen as complementary to financial support for businesses, as they will help ensure that individuals and communities are able to weather the economic impacts of lockdowns. Additionally, governments should consider implementing targeted stimulus packages to specific industries that have been hit particularly hard by the outbreak, such as tourism or hospitality. This will not only help keep these industries afloat but also provide a boost to the overall economy by creating new jobs and increasing consumer spending.</p></blockquote><blockquote id="e164"><p>Expert 2: I agree with Expert 1 on implementing targeted stimulus packages for specific industries. However, we must also consider the long-term implications of our actions. For example, governments may be tempted to use large amounts of borrowing to fund their economic interventions. While this may provide short-term relief, it can lead to unsustainable debt burdens that could exacerbate economic problems in the future. To mitigate these risks, governments should explore alternative financing options such as public-private partnerships or crowdfunding platforms.</p></blockquote><blockquote id="491c"><p>Expert 1: I completely agree with Expert 2 on the importance of exploring alternative financing options. In fact, this is an area where collaboration between government and private sector experts can be particularly valuable. By working together to find innovative solutions that balance short-term relief with long-term sustainability, we can prevent economic pitfalls caused by disease outbreaks and lockdowns while also building a stronger, more resilient economy for the future.</p></blockquote><p id="ef7d">Generated by Mistral-7b-instruct in 13 seconds!</p><h1 id="5082">Comparisons and Conclusions (for now)</h1><p id="9371">I tested the smae prompts with Llama2–7b-chat: I used the exactly same method to loada the GPTQ version of the model. I leave the task to you guys: if you meet any issues leave me a comment to the article and I will reply to you.</p><p id="b47c">Let’s go only with quantitative comparison here:</p><p id="d966"><b>Prompt 1:</b></p><div id="bd5c"><pre><span class="hljs-title class_">Mistral7</span><span class="hljs-attr">b</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">600</span> <span class="hljs-title class_">Generated</span> <span class="hljs-keyword">in</span> <span class="hljs-number">12</span> seconds <span class="hljs-title class_">Llama2</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">762</span> generation time <span class="hljs-number">13</span> seconds</pre></div><p id="b78b">In this one Llama2 was slower but with a more complex answer: both followed the instruction for the ouput format</p><p id="6670"><b>Prompt 2:</b></p><div id="1452"><pre><span class="hljs-title class_">Mistral7</span><span class="hljs-attr">b</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">349</span> <span class="hljs-title class_">Generated</span> <span class="hljs-keyword">in</span> <span class="hljs-number">17</span> seconds <span class="hljs-title class_">Llama2</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">250</span> generation time <span class="hljs-number">12</span> seconds</pre></div><p id="b25f">Here Llama2 was faster, but also the 13b fails to generate the correct output. So far only Mistral7b is able to create a well formatted Table of Content out of a tedt</p><p id="d985"><b>Prompt 3:</b></p><div id="d0ef"><pre><span class="hljs-title class_">Mistral7</span><span class="hljs-attr">b</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">923</span> <span class="hljs-title class_">Generated</span> <span class="hljs-keyword">in</span> <span class="hljs-number">17</span> seconds <span class="hljs-title class_">Llama2</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">513</span> generation time <span class="hljs-number">8</span> seconds</pre></div><p id="34e3">The dialaogue from Mistral7b is complex and well done, but much slower than the Llama2</p><p id="1943"><b>Prompt 4:</b></p><div id="0a26"><pre><span class="hljs-title class_">Mistral7</span><span class="hljs-attr">b</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">722</span> <span class="hljs-title class_">Generated</span> <span class="hljs-keyword">in</span> <span class="hljs-number">13</span> seconds <span class="hljs-title class_">Llama2</span>: <span class="hljs-title class_">Number</span> <span class="hljs-keyword">of</span> tokens <span class="hljs-keyword">in</span> the <span class="hljs-attr">answer</span>: <span class="hljs-number">868</span> generation time <span class="hljs-number">15</span> seconds</pre></div><p id="42f4">Both Llama2–7b and 13b fails to generate multiple steps in the dialogue. Here Mistral7b did an amazing job. This kind of internal reasoning, like a team of expert reasoning, is really promising.</p><h1 id="be50">Conclusions</h1><p id="9cc0">Mistral7b is indeed a promising Model: the fact there a base model is being releases in a time where only the commercial powerhouses are making big moves is amazing. In terms of performance Mistral is indeed very good, and the size is certainly an indicator that being opens source and tiny is a choice. There is trend that is moving the community toward models that may be able to run on mobile and smart devices.</p><p id="cd30">If this story provided value and you like the topics consider subscribing to Medium to unlock more resources. Medium is a big community with high quality content: you can certainly find here what you need.</p><ol><li>Sign up for a Medium membership using <a href="https://medium.com/@fabio.matricardi/membership">my link</a> — ($5/month to read unlimited Medium stories)</li><li>Follow me on Medium</li><li>Highlight what you want to remember and if you have doubts or suggestions simply drop a comment to the article: I will promptly reply to you</li><li>Read my latest articles <a href="https://medium.com/@fabio.matricardi">https://medium.com/@fabio.matricardi</a></li></ol><p id="f17d">Don you want to read more? Here some topics</p><div id="b3a2" class="link-block"> <a href="https://readmedium.com/battle-of-the-prompts-unveiling-the-true-capabilities-of-open-source-language-models-2ad401bd6bf3"> <div> <div> <h2>Battle of the Prompts: Unveiling the True Capabilities of Open Source Language Models</h2> <div><h3>Putting LLMs to the Test: Analyzing Performance of Orca-3b, Llama2–7b and Platypus-13b Across Varied Prompts</h3></div> <div><p>medium.com</p></div> </div> <div> <div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/1*XCdymkmhzSvHLIlKC59NKQ.png)"></div> </div> </div> </a> </div><div id="31c5" class="link-block"> <a href="https://artificialcorner.com/the-future-of-storytelling-creating-compelling-photo-to-audio-narratives-with-free-ai-69fe585112fd"> <div> <div> <h2>The Future of Storytelling: Creating Compelling Photo-to-Audio Narratives with free AI</h2> <div><h3>Use Python, Streamlit and Hugging Face models with free API token to create Audio Stories from your pictures.</h3></div> <div><p>artificialcorner.com</p></div> </div> <div> <div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/0*Q6YN6Eag2F1U27Mp.png)"></div> </div> </div> </a> </div><div id="2928" class="link-block"> <a href="https://artificialcorner.com/a-7-steps-guide-to-install-and-run-lamini-lm-on-your-computer-7841ea94522b"> <div> <div> <h2>A 7 steps guide to install and run LaMini-LM on your computer</h2> <div><h3>LaMini is your small ChatGPT running on a CPU: learn how to ask questions with python</h3></div> <div><p>artificialcorner.com</p></div> </div> <div> <div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/1*wUgWt6_DWJdTkcNOY1eZUw.png)"></div> </div> </div> </a> </div><h2 id="d3d7">WRITER at MLearning.ai /AI Agents LLM / Good-Bad / AI ART 2024</h2><div id="493a" class="link-block"> <a href="https://readmedium.com/mlearning-ai-submission-suggestions-b51e2b130bfb"> <div> <div> <h2>Mlearning.ai Submission Suggestions</h2> <div><h3>How to become a writer on Mlearning.ai</h3></div> <div><p>medium.com</p></div> </div> <div> <div style="background-image: url(https://miro.readmedium.com/v2/resize:fit:320/1*6xCb1sNpjadaSBuVLPTFQQ.png)"></div> </div> </div> </a> </div></article></body>

The Ultimate Showdown: Mistral7b vs. Llama2–13b. Unveiling the Champion

Mistral7b is making the bold claim to be better than a model at least twice larger… Let’s check it out learning how to use it!

Meet the new kid on the block! Mistral 7B is a powerful language model with 7.3 billion parameters. It’s creating quite a buzz because it claims not only to beat Meta’s Llama 2–13b, but also to outshine other larger models. In this article, we’ll take a closer look at Mistral 7B, learn how to use it and compare the results with Llama2–13b-chat.

All we are going to do is only with FREE resources: we will see how to run the GGUF version (a quantized method that can run on any normal CPU) and the GPTQ version (aa quantized method that requires a GPU).

Mistral: a startup with a mission

Meet Mistral AI, an exciting new startup from Paris! Founded by brilliant minds who previously worked at tech giants like Google’s DeepMind and Meta, this company has made a splash in the industry. Their eye-catching Word Art logo and groundbreaking $118 million seed funding round have put them in the spotlight, making history in Europe.

But what’s their mission? It’s simple yet ambitious: to make AI truly beneficial for businesses. How do they plan to achieve this? By tapping into publicly available data and collaborating closely with their customers. And now, with the introduction of Mistral 7B, they’re taking their first big step toward making that mission a reality. It’s an exciting journey filled with possibilities!

Unlocking the Power of Mistral 7B: A Game-Changing Language Model

Prepare to be amazed by Mistral 7B, a language model that’s anything but ordinary. Despite its compact size of 7.3 billion parameters, it surpasses larger models like Meta’s Llama 2 13B, setting a new standard for both efficiency and performance (or at least this the claim on the official press release of Mistral7b).

Mistral 7B is a 7.3B parameter model that:

- Outperforms Llama 2 13B on all benchmarks
- Outperforms Llama 1 34B on many benchmarks
- Approaches CodeLlama 7B performance on code, while remaining 
  good at English tasks

At a first glance, this remarkable model combines a wide range of capabilities, excelling not only in English language tasks but also showcasing impressive coding prowess. Its versatility opens up endless possibilities for enterprise applications.

Why should we care?

One standout feature of Mistral 7B is its open-source nature, released under the Apache 2.0 license. This means that anyone can fine-tune and harness the potential of this model without any restrictions, whether it’s for local or cloud-based applications, including various enterprise scenarios. And above all it caan be used also for commercial purposes.

The power of Mistral 7B is now at your fingertips, waiting to be unleashed.

Thankfully we can jump-start with Mistral7b straight away: it is compatible with the latest Hugging Face Transformers library and TheBloke already released the quantized versions. Let’s dive in!

Mistral Running on Google Colab with T4 free GPU

How to use Mistral7b for free

The only requirements to use Mistral7b is a computer. You heard it right: if you have only a CPU… you can use it. If you have a GPU you can use it faster!

For our test we will leverage the Free Tier of Google Colab notebook: even with the free account you are entitled to 1 free session with a T4 GPU.

If you are new to Google Colab notebook, have a look here to learn how to get it.

LLM wants… to break free!

Breaking Barriers: Democratizing AI with Free Access to Large Language Models. Learn how to get your free access to AI…

ai.gopubby.com

From the Hugging Face Model card we can get all the information we need to runthe model.

open a new Colab Notebook
change the runtime type to T4 GPU

Do not download the model weights (in the GPTQ quantized model): it will be done automatically when the text generation is called for the first time
install the dependencies (latest version of Transformers and latest version of AutoGPTQ)

%%capture
!pip install optimum
!pip install git+https://github.com/huggingface/transformers.git@72958fcd3c98a7afdc61f953aa58c544ebda2f79
!pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/  # Use cu117 if on CUDA 11.7
!pip install langchain
!pip install tiktoken

use the template for the inferences with AutoGPTQ and Transformers libraries from the model card

Hugging Face TheBloke model card page with instructions

To have the model ready to run on CPU the instructions are really simple

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ"
# To use a different branch, change revision
# For example: revision="gptq-4bit-32g-actorder_True"   gptq-4bit-32g-actorder_True
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             device_map="auto",
                                             trust_remote_code=False,
                                             revision="main")

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

prompt = "Tell me about AI"
prompt_template=f'''<s>[INST] {prompt} [/INST]
'''

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1
)

print(pipe(prompt_template)[0]['generated_text'])
tok = tokenizer("prompt_template")
tokens = len(tok['input_ids'])
print(f"Number of tokens: {tokens}")

We import the Classes for the the Model, the Tokenizer and the pipeline.

Honestly this is it! with one more instruction we can use Mistral7b with any prompt that comes to our mind.

Prompt Template: we need to consider it.

Mistral7b-instruct is specifically pre-trained with instructions formatted in a specific way. This also means that this model is expecting us to provide our prompts in the same way.

The official Mistral7b-instruct model card is helping us:

In order to leverage instruction fine-tuning, your prompt should be surrounded by [INST] and [\INST] tokens. The very first instruction should begin with a begin of sentence id. The next instructions should not. The assistant generation will be ended by the end-of-sentence token id.

E.g.

text = "<s>[INST] What is your favourite condiment? [/INST]"
"Well, I'm quite partial to a good squeeze of fresh lemon juice. 
It adds just the right amount of zesty flavour to whatever 
I'm cooking up in the kitchen!</s> "
"[INST] Do you have mayonnaise recipes? [/INST]"

Basically Mistral7b-instruct wants the prompt like this: <s>[INST] {prompt} [/INST] . We can easily set up aa python f-string to take care of all of this:

yourprompt = "What is Process Control in Industrial Automation?"
mistral_prompt = f"<s>[INST] {prompt} [/INST]"

Now we can call the pipelineobject with the modified prompt save the generated text in a variable answer and then print it:

answer = pipe(prompt_template)[0]['generated_text']
# calculate the tokens for the prompt
tok = tokenizer("prompt_template")
tokens = len(tok['input_ids'])
print(f"Number of tokens: {tokens}")
#print the answer
print(answer)

That is all.

NOTE: you can see that we are selecting specific parts of the generated text. This is because the pipeline generates a list with a dictionary.

A first look at Mistral7b performances

The initial claim of the great benchmarks should speak louder than words: even though Mistral 7B is just hitting the scene, it has already proven its mettle in the tests. In head-to-head comparisons with open-source competition, the model consistently outperforms. It bests Llama 2 7B and 13B with ease, showcasing its prowess in various tasks.

Images from https://mistral.ai/news/announcing-mistral-7b/

Mistral 7B boasts some impressive key strengths that set it apart from the rest. One of its secret weapons is Grouped-query attention (GQA), a revolutionary technique that enables lightning-fast inference. This means you can expect quick and efficient results when using Mistral 7B.

But that’s not all — Mistral 7B also utilizes Sliding Window Attention (SWA) to flawlessly handle longer sequences without burdening computational resources. This innovative approach ensures smooth and seamless processing, making Mistral 7B a true powerhouse in performance.

Prompts for testing

We are going to evaluate few prompts and compare the results with the ones coming from Llama2. I decided to pick some of my favorite ones, without focusing on logics or mathematics or coding. I believe that instruction following is the best benchmark here: we want to see if the model is able to understand our prompt and give results accordingly.

Prompt 1 — Predict the impact of artificial intelligence on human learning. write the output in the form of a timeline marking the dates.
Prompt 2 — You are an expert able to write an easy to understand table of content. Extract the main points of the following text delimited by triple backquotes and generate the table of content: ```How to write articles that people want to read | by Sunil Sandhu | JavaScript in Plain…```
Prompt 3 —You are a thriller book writer. Use the style of Lee Child to write a dialogue between Jack Reacher and a suspect: the suspect may have stolen secrets from the military that may cause harm to many.
Prompt 4 — Imagine a group of two different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, think about the question and consider the previous ideas of the group. If any expert realizes they’re wrong at any point then they leave. At the end the expert must come to a conclusion together. The question is… How can we prevent the economical pitfalls on normal people caused by a disease outbreak and lockdowns?

We have a clear request to format the output sa a list with dates, aa table of content creator (really useful when you want to give a structure to an unstructured loaded document), a creative task following the writing style of a famous book writer and a multi agent conversation were the LLM act as two experts discussing over a topic.

Image by 3D Animation Production Company from Pixabay

The results…

Prompt 1 — Predict the impact… in the form of a timeline marking the dates.

Number of tokens in the prompt: 37
Number of tokens in the answer: 196

2015: AI-powered educational tools begin to appear, offering personalized learning experiences for students.

2020: AI-based language learning platforms gain popularity as they offer real-time feedback and personalized instruction.

2025: AI-powered virtual reality and augmented reality technologies become widely used in education, allowing for immersive and interactive learning experiences.

2030: AI-driven intelligent tutoring systems are widely adopted by schools and universities, providing adaptive and individualized learning experiences for students.

2035: AI-enabled natural language processing technology becomes integrated into educational software, enabling students to learn through interactive conversations with virtual assistants.

2040: AI-based predictive analytics is used to identify at-risk students and provide early intervention to prevent academic failure.

Generated in 12 seconds

Prompt 2 — Expert able to extract table of contents

Number of tokens in the prompt: 618
Number of tokens in the answer: 349

Here the table of content… in markdown

 Table of Contents:

1. Introduction
 * Explanation of best practices for writing articles
 * Author of the article: Sunil Sandhu
2. Choosing a Good Title and Subtitle
 * Importance of a good title and subtitle
 * Tips for creating a good title and subtitle
 * Asking yourself if the title is interesting enough
3. Formatting Headings and Subheadings
 * Keeping headings and subheadings simple and consistent
 * Examples of headings and subheadings
4. Writing a Good Article
 * Importance of good structure, punctuation, spelling, and formatting
 * Taking pride in your work and representing yourself effectively
5. Capitalizing the Start of Sentences
6. Covering Everything in Your Article
 * Checking if your article covers all expected topics
 * Minimum read time vs. optimal read time
7. Using Writing Software
 * Using tools like Grammarly for improving content quality
 * Importance of using writing software even if you're confident in your English
8. Reading Your Own Article Back
 * Evaluating your article's effectiveness for others to read
9. Avoiding Overuse of Emojis, GIFs, and Memes
10. Conclusion

Generated in 14 seconds

Prompt 3 — You are a thriller book writer. Use the style of Lee Child to write a dialogue between Jack Reacher…

Number of tokens in the prompt: 55
Number of tokens in the answer: 923

“You’re a damn fool, you know that? You think you can just waltz into this building, steal some top-secret documents, and get away with it? You’ve got another thing coming.” Jack Reacher stood in the doorway of the room, his hands on his hips. He had been tracking the suspect for hours, and he finally had him cornered.

The man was sweating profusely, and his eyes darted nervously around the room. “I didn’t steal anything,” the suspect said, his voice shaking. “I swear. I just came here to do some research. I didn’t know there were any classified documents in this place.”

Reacher snorted. “Bullshit. You knew exactly what you were doing. You came here to steal those secrets, and you’re going to pay for it.”

The suspect looked down at the floor, his shoulders slumped. “Okay,” he said quietly. “Whatever you want. Just don’t hurt me.”

Reacher stepped forward, his eyes locked on the suspect’s face. “I don’t want to hurt you,” he said. “But I will if I have to. These documents could cause a lot of damage if they fell into the wrong hands. You understand that, right?”

The suspect nodded, tears streaming down his cheeks. “I understand,” he whispered. Reacher reached out and grabbed the suspect by the collar, pulling him to his feet. “Now tell me everything you know,” he said gruffly. “Every single detail about where those documents are, who else has access to them, and how you got them in the first place.”

The suspect struggled against Reacher’s grip, but he couldn’t break free. “Fine,” he said through gritted teeth. “I’ll tell you. But you have to promise not to hurt anyone else. Not my family, not my friends, not anybody.”

Reacher pulled the suspect closer, his lips inches from his ear. “I make no promises,” he said quietly. “But I will make sure those documents are safe. Now tell me what you know.” The suspect began to speak, his voice barely audible over the sound of Reacher’s breathing. It took him several minutes to reveal everything he knew, but eventually, he had told Reacher everything he needed to know

Generated by Mistral-7b-instruct in 0:17:18.000940

Prompt 4 — Imagine 2 different experts….

Number of tokens in the prompt: 131
Number of tokens in the answer: 722

Expert 1: The first step in preventing economic pitfalls caused by a disease outbreak and lockdowns is to ensure that businesses have adequate financial deserves and insurance coverage. This will help them stay afloat during times of reduced revenue or closure due to restrictions on movement and gatherings. Additionally, governments can provide grants and loans to small businesses and entrepreneurs who may not have the resources to withstand such challenges.

Expert 2: While securing adequate financial reserves is important, it is only one aspect of preventing economic pitfalls during a disease outbreak and lockdown. Another crucial step is to encourage innovation and adaptability in businesses. This can be achieved by providing incentives for companies to develop new products or services that are compatible with the current situation, such as online sales or delivery services. Additionally, governments can invest in education and training programs to help workers acquire new skills that will make them more valuable in a changing job market.

Expert 1: I agree with Expert 2 on the importance of encouraging innovation and adaptability. In addition to providing financial support for businesses, governments should also invest in research and development to find solutions for problems caused by the outbreak, such as improving healthcare infrastructure or developing new treatments for the disease itself. This will not only help alleviate economic hardships but also contribute to long-term growth and prosperity.

Expert 2: I agree with Expert 1 on the need for government investment in research and development. However, we must also consider the impact of lockdowns on individuals and communities. In addition to financial support for businesses, governments should also provide social safety nets such as unemployment benefits, food assistance, and housing subsidies to help people cope with the economic consequences of lockdowns. This will not only prevent poverty but also reduce the spread of the disease by providing people with the resources they need to maintain a healthy lifestyle.

Expert 1: I completely agree with Expert 2 on the importance of social safety nets. In fact, these measures can be seen as complementary to financial support for businesses, as they will help ensure that individuals and communities are able to weather the economic impacts of lockdowns. Additionally, governments should consider implementing targeted stimulus packages to specific industries that have been hit particularly hard by the outbreak, such as tourism or hospitality. This will not only help keep these industries afloat but also provide a boost to the overall economy by creating new jobs and increasing consumer spending.

Expert 2: I agree with Expert 1 on implementing targeted stimulus packages for specific industries. However, we must also consider the long-term implications of our actions. For example, governments may be tempted to use large amounts of borrowing to fund their economic interventions. While this may provide short-term relief, it can lead to unsustainable debt burdens that could exacerbate economic problems in the future. To mitigate these risks, governments should explore alternative financing options such as public-private partnerships or crowdfunding platforms.

Expert 1: I completely agree with Expert 2 on the importance of exploring alternative financing options. In fact, this is an area where collaboration between government and private sector experts can be particularly valuable. By working together to find innovative solutions that balance short-term relief with long-term sustainability, we can prevent economic pitfalls caused by disease outbreaks and lockdowns while also building a stronger, more resilient economy for the future.

Generated by Mistral-7b-instruct in 13 seconds!

Comparisons and Conclusions (for now)

I tested the smae prompts with Llama2–7b-chat: I used the exactly same method to loada the GPTQ version of the model. I leave the task to you guys: if you meet any issues leave me a comment to the article and I will reply to you.

Let’s go only with quantitative comparison here:

Prompt 1:

Mistral7b: Number of tokens in the answer: 600 
           Generated in 12 seconds
   Llama2: Number of tokens in the answer: 762 
           generation time 13 seconds

In this one Llama2 was slower but with a more complex answer: both followed the instruction for the ouput format

Prompt 2:

Mistral7b: Number of tokens in the answer: 349 
           Generated in 17 seconds
   Llama2: Number of tokens in the answer: 250 
           generation time 12 seconds

Here Llama2 was faster, but also the 13b fails to generate the correct output. So far only Mistral7b is able to create a well formatted Table of Content out of a tedt

Prompt 3:

Mistral7b: Number of tokens in the answer: 923 
           Generated in 17 seconds
   Llama2: Number of tokens in the answer: 513 
           generation time 8 seconds

The dialaogue from Mistral7b is complex and well done, but much slower than the Llama2

Prompt 4:

Mistral7b: Number of tokens in the answer: 722 Generated in 13 seconds
Llama2: Number of tokens in the answer: 868 generation time 15 seconds

Both Llama2–7b and 13b fails to generate multiple steps in the dialogue. Here Mistral7b did an amazing job. This kind of internal reasoning, like a team of expert reasoning, is really promising.

Conclusions

Mistral7b is indeed a promising Model: the fact there a base model is being releases in a time where only the commercial powerhouses are making big moves is amazing. In terms of performance Mistral is indeed very good, and the size is certainly an indicator that being opens source and tiny is a choice. There is trend that is moving the community toward models that may be able to run on mobile and smart devices.

If this story provided value and you like the topics consider subscribing to Medium to unlock more resources. Medium is a big community with high quality content: you can certainly find here what you need.

Sign up for a Medium membership using my link — ($5/month to read unlimited Medium stories)
Follow me on Medium
Highlight what you want to remember and if you have doubts or suggestions simply drop a comment to the article: I will promptly reply to you
Read my latest articles https://medium.com/@fabio.matricardi

Don you want to read more? Here some topics

Battle of the Prompts: Unveiling the True Capabilities of Open Source Language Models

Putting LLMs to the Test: Analyzing Performance of Orca-3b, Llama2–7b and Platypus-13b Across Varied Prompts

medium.com

The Future of Storytelling: Creating Compelling Photo-to-Audio Narratives with free AI

Use Python, Streamlit and Hugging Face models with free API token to create Audio Stories from your pictures.

artificialcorner.com