Harshmeet Singh Chandhok



Roadmap To Become A GenAI Expert In 2024

“In the future, every company will need to leverage generative AI in order to stay competitive and drive innovation.”

— Kai-Fu Lee, Former President of Google China.

As we enter a new era of technological innovation, Generative AI has gained enormous popularity in a short span and is becoming an integral part of the tech industry. From making realistic photos and videos to generating music and extracting insights from text, this technology has pushed the boundaries of creativity and problem-solving in many areas. Much of the research and many of the use cases are still in the exploration phase.

If you want to stay ahead of the crowd and become an expert in Generative AI in 2024, you’ve come to the right place. This roadmap walks you through the steps, resources, and concepts you need to master GenAI and unlock your full potential. Whether you’re a beginner or have some experience with AI, this complete guide has got you covered.

So let's suit up and get ready for an amazing journey towards becoming a Generative AI expert!!

Roadmap 🎯

The following roadmap develops the all-round skills required for GenAI roles in industry as well as in research:

Image by Author

1. Prompt engineering

Prompt engineering is the most important part of the process: it is the skill of crafting prompts that get more out of GenAI models. Good prompts let humans communicate clearly with the AI so that tasks get done as desired.
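To make the idea concrete, here is a minimal sketch of a structured prompt next to a vague one. The helper `build_prompt` and its four parts (role, task, context, output format) are assumptions chosen for illustration, not a standard recipe, and no model is actually called.

```python
def build_prompt(role, task, context, output_format):
    """Assemble a structured prompt from labeled parts (hypothetical helper)."""
    parts = [
        f"You are {role}.",
        f"Task: {task}",
        f"Context: {context}",
        f"Respond in this format: {output_format}",
    ]
    return "\n".join(parts)


# A vague prompt leaves the model guessing about audience and format.
vague_prompt = "Tell me about transformers."

# A structured prompt states who is answering, what is needed, and how.
structured_prompt = build_prompt(
    role="a machine learning tutor",
    task="Explain how self-attention works in a transformer.",
    context="The reader knows basic linear algebra but no deep learning.",
    output_format="three short bullet points",
)
print(structured_prompt)
```

You could paste both prompts into the same tool and compare how much more usable the second answer is.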

Many people fail to utilize the full potential of ChatGPT because they do not write good prompts.

Resources to learn from:

Note:

You can also explore how rewording a prompt can induce a GenAI model to produce output that its restrictions would otherwise block. This is known as prompt hacking.

2. Ethical AI

Developing an AI application without proper principles or guidelines can lead to misuse, which in turn can lead to crises, defamation, crime, and more. This is where the concept of Ethical AI, also known as Responsible AI, comes in.

Imagine someone using a deepfake AI app to blackmail you, demanding money in exchange for protecting your reputation.

Resources:

Note:

The basic principles stay the same, but they vary slightly from company to company. Have a look at the Responsible AI principles of tech giants such as Microsoft, Google, and Meta.

3. ML/DL to Transformer-Based Models

You should have a solid grounding, from basic ML through DL, so that you know exactly how classification and regression tasks, algorithms, activation functions, and so on work. Beyond that, you should understand how transformer models work, inside and out. Knowing about Computer Vision is optional.
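As a taste of what knowing transformers "inside and out" involves, here is a minimal sketch of scaled dot-product attention, the core operation of a transformer, for a single query vector. It uses only the standard library. The toy vectors are made up, and real models work on batched matrices with learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Similarity of the query to every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Weighted sum of the value vectors, one output dimension at a time.
    dims = range(len(values[0]))
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in dims]
    return weights, output


# Toy data: the query matches the first key more closely than the second.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
weights, output = attention([1.0, 0.0], keys, values)
print(weights, output)
```

The first weight comes out larger than the second, so the output leans towards the first value vector.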

Directly jumping into AI without knowing the foundations is futile.

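Resources to learn from:

Deep Learning (https://www.coursera.org/specializations/deep-learning)

Introduction - Hugging Face NLP Course (https://huggingface.co/learn/nlp-course/chapter1/1)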

Note:

Many companies will test your basic understanding of these concepts, as they are among the major building blocks of GenAI.

4. Look out for GenAI tools for NLP and Computer Vision

There are many GenAI tools out there that can show you how an end-to-end, real-time application works. Using tools like ChatGPT, HuggingChat, and Midjourney will also help you find out whether you are good at writing effective prompts.

This sparks the motivation to build your own end-to-end GenAI application.

Resources to learn from:

Computer Vision GenAI tools: DALL·E 3, Midjourney, DreamStudio (Stable Diffusion), Firefly (Photoshop)

NLP GenAI tools: ChatGPT, Bard, GitHub Copilot, HuggingChat

Note:

You can try the same prompts in different GenAI tools to compare their outputs and see how consistent they are. This will give you an overall idea of how robust your application can be.
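If you want a rough number for that comparison, a small sketch like the one below can help. It assumes you have pasted the responses in by hand (the two strings are made up), and it uses character-level similarity from the standard library's `difflib`. That is only a crude proxy: two answers can mean the same thing and still score low.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Return a rough 0-to-1 similarity ratio between two responses."""
    return SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()


# Hypothetical responses, pasted by hand from two different tools.
responses = {
    "tool_a": "Paris is the capital of France.",
    "tool_b": "The capital of France is Paris.",
}
score = similarity(responses["tool_a"], responses["tool_b"])
print(f"similarity: {score:.2f}")
```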

5. Open source LLMs

Open-source large language models (LLMs) are freely available models that you can use to build an application on your own data or for any use case. These models are pre-trained on huge amounts of data and can be fine-tuned for different tasks. Llama, Falcon, Mistral, and GPT-2 are examples of openly available models.

Many people limit themselves to OpenAI's tools and claim to know LLMs inside out, but that is not true: learning to customize open-source LLMs yourself is worthwhile.

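Resources to learn from:

Llama 2 - Meta AI (https://ai.meta.com/llama/)

Mistral AI | Open-weight models (https://mistral.ai/)

Open LLM Leaderboard, a Hugging Face Space by HuggingFaceH4 (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)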

Note:

Most good open-source LLMs can be found on Hugging Face, ready to use.

6. Finetuning LLMs

To make an LLM work well in a specific domain or on a specific task, you need to know how to fine-tune it. Fine-tuning means customizing a pre-trained model by training it further on task-specific data, which requires far fewer resources than training a model from scratch.

It's always good practice to fine-tune models on domain-specific data to reduce wrong model outputs.

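Resources to learn from:

Finetuning Large Language Models (https://www.deeplearning.ai/short-courses/finetuning-large-language-models/)

Efficient Fine-Tuning with LoRA: A Guide to Optimal Parameter Selection for Large Language Models (https://www.databricks.com/blog/efficient-fine-tuning-lora-guide-llms)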

Note:

A fine-tuned LLM may well stop following ethical AI guidelines. So choose a fine-tuning dataset that is free of harmful language, private information, biases, and the like.
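A first screening pass over a fine-tuning dataset could look like the sketch below. The regular expressions, the placeholder blocklist, and the sample records are all assumptions for illustration. Simple patterns like these catch only obvious emails, phone numbers, and listed words, and they cannot detect bias, so treat this as a starting point rather than a complete filter.

```python
import re

# Illustrative patterns only; real PII detection needs far more care.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{7,}\d")
BLOCKED_TERMS = {"badword"}  # placeholder; substitute a real list


def is_clean(record):
    """Return True if the record has no obvious private info or blocked terms."""
    text = record.lower()
    if EMAIL_PATTERN.search(text) or PHONE_PATTERN.search(text):
        return False
    words = set(re.findall(r"[a-z']+", text))
    return not (words & BLOCKED_TERMS)


raw_records = [
    "How do I reset my router?",
    "Contact me at jane.doe@example.com for details.",
    "Call +1 (555) 010-9999 after 5pm.",
    "That answer was a badword.",
]
clean_records = [record for record in raw_records if is_clean(record)]
print(clean_records)
```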

7. RAG and Advanced RAG

RAG (Retrieval-Augmented Generation) retrieves relevant information from a large collection of documents and passes it to the model along with the question, so that the generated answer is grounded in that information. It is mainly used for question answering, combining a retrieval step with a generation model to improve overall quality. You should also learn about vector databases (e.g., Chroma) when working with RAG.

ChatGPT on its own may not give you the latest information, but by using your own LLM with RAG you can keep your GenAI application up to date.

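Resources to learn from:

Building and Evaluating Advanced RAG Applications (https://www.deeplearning.ai/short-courses/building-evaluating-advanced-rag/)

Retrieval Augmented Generation LlamaIndex & LangChain Course (https://learn.activeloop.ai/courses/rag)

Chroma, the AI-native open-source embedding database (https://www.trychroma.com/)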

Note:

This step and Step 8 below go hand in hand, so it is best to learn them together. There are many vector databases and libraries out there, such as Weaviate, Pinecone, and FAISS, but for small-scale applications plain NumPy arrays can be used instead of a vector database.
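To show how small a small-scale retriever can be, here is a minimal sketch of the retrieval half of RAG. It swaps in plain Python word counts for both the embedding model and the NumPy arrays, so it runs with the standard library alone. The three documents and the question are made up, and a real system would use learned embeddings and send the final prompt to an LLM.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


documents = [
    "Quantization reduces the precision of model weights",
    "LangChain is a framework for building LLM applications",
    "Chroma is an open-source embedding database",
]
# The "vector store": each document kept next to its vector.
index = [(doc, embed(doc)) for doc in documents]


def retrieve(question, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


question = "which framework helps with building LLM applications"
context = retrieve(question)[0]
# The retrieved text is placed in the prompt that would go to the model.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

Once the collection grows beyond what fits comfortably in memory, a vector database takes over the job of the `index` list.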

8. LangChain and LlamaIndex

LangChain is a framework for developing LLM applications, and LlamaIndex is a framework that provides the tools needed to manage the end-to-end lifecycle of building LLM applications.
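The core idea behind such frameworks is chaining small steps into a pipeline. The sketch below illustrates that idea in plain Python. It does not use the LangChain or LlamaIndex APIs, and `make_chain`, `format_prompt`, `fake_llm`, and `parse_output` are hypothetical names invented for this example.

```python
from functools import reduce


def make_chain(*steps):
    """Compose steps left to right into one callable pipeline."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def format_prompt(question):
    """Step 1: wrap the user's question in a prompt template."""
    return f"Answer in one word: {question}"


def fake_llm(prompt):
    """Step 2: stand-in for a real model call; always returns a canned reply."""
    return "  Paris\n"


def parse_output(raw):
    """Step 3: clean up the raw model reply."""
    return raw.strip()


chain = make_chain(format_prompt, fake_llm, parse_output)
answer = chain("What is the capital of France?")
print(answer)
```

The frameworks add what this sketch leaves out, such as model integrations, document loaders, memory, and retrieval components.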

Using these frameworks simplifies the journey towards production.

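Resources to learn from:

LangChain for LLM Application Development (https://www.deeplearning.ai/short-courses/langchain-for-llm-application-development/)

LlamaIndex 0.9.34 documentation (https://docs.llamaindex.ai/en/stable/)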

Note:

Building pipelines the proper way with LangChain or LlamaIndex is a must-have skill that often comes up in interviews for GenAI roles.

9. Quantization of LLMs

Quantization in LLMs refers to reducing the precision of the numerical values in the model’s parameters, for example from 32-bit floating point to 8-bit integers. A quantized model is typically faster and more memory-efficient, for both training and inference.
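Here is a minimal sketch of one simple scheme, absmax quantization to 8-bit integers, written in plain Python. It is an illustration under stated assumptions: the four weights are made up, and production libraries use more refined methods than this.

```python
def quantize_int8(weights):
    """Absmax quantization: map floats to integers in [-127, 127]."""
    # The largest magnitude maps to 127; fall back to 1.0 for all-zero input.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.03, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding loses a little precision: at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage per parameter: 32-bit floats versus 8-bit integers.
memory_ratio = 32 / 8
print(quantized, max_error, memory_ratio)
```

The trade-off is visible in the numbers: each weight takes a quarter of the storage, at the price of a small rounding error.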

Many data scientists use the model as is, without quantization, which can cost them up to four times more.

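Resources to learn from:

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA (https://huggingface.co/blog/4bit-transformers-bitsandbytes)

Quantization (https://huggingface.co/docs/optimum/concept_guides/quantization)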

Note:

There are multiple quantization methods, so it helps to keep some code snippets at hand for quantizing your model whenever you build a large-scale application.

10. LLMOps

LLMOps lets you deploy, monitor, and maintain LLMs efficiently. A proper LLMOps pipeline can save a lot of money by using resources and infrastructure optimally.
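Monitoring is the easiest part to sketch. The example below wraps each model call to record latency and an approximate token count, then estimates cost. The `CallMonitor` class, the stand-in model, the word-count "tokenizer", and the price are all assumptions made up for illustration. A real pipeline would use the provider's tokenizer and pricing and would ship these records to a monitoring system.

```python
import time


class CallMonitor:
    """Record latency and token counts for each model call (hypothetical helper)."""

    def __init__(self, cost_per_1k_tokens):
        self.cost_per_1k_tokens = cost_per_1k_tokens
        self.records = []

    def track(self, model_fn, prompt):
        """Call the model, timing it and logging approximate token usage."""
        start = time.perf_counter()
        reply = model_fn(prompt)
        latency = time.perf_counter() - start
        # Whitespace word count is a crude stand-in for a real tokenizer.
        tokens = len(prompt.split()) + len(reply.split())
        self.records.append({"latency_s": latency, "tokens": tokens})
        return reply

    def total_cost(self):
        """Estimate the total spend from the logged token counts."""
        total_tokens = sum(record["tokens"] for record in self.records)
        return total_tokens / 1000 * self.cost_per_1k_tokens


def fake_model(prompt):
    """Stand-in for a deployed LLM endpoint."""
    return "This is a canned reply."


monitor = CallMonitor(cost_per_1k_tokens=0.5)  # made-up price
monitor.track(fake_model, "Summarize this ticket please")
monitor.track(fake_model, "Translate hello to French")
print(len(monitor.records), monitor.total_cost())
```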

What's the use of building an LLM if you don't know how to deploy it at different scales?

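Resources to learn from:

What Is LLMOps? | Databricks (https://www.databricks.com/glossary/llmops)

LLMOps (https://www.deeplearning.ai/short-courses/llmops/)

LLM Course (https://www.comet.com/site/llm-course/)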

Note:

LLMOps differs considerably from MLOps, because workflows change when you move from classical ML to LLMs.

Conclusion

Following this roadmap will help you excel in GenAI and make your profile stand out from the crowd.

Happy Learning !! 😀
