ML news: Week 29 January – 4 February



Research

Matryoshka Representation Learning. OpenAI's new embeddings can be shortened to the size you need. This is thought to be enabled by Matryoshka ("nesting doll") representation learning, a training strategy that learns features at several granularities within a single vector.
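As a rough illustration of the nesting idea, the sketch below shortens an embedding by keeping only its leading coordinates and rescaling it to unit length. The vectors are made-up numbers, not output from any real model, and `truncate_embedding` is a hypothetical helper name; whether truncation preserves quality depends on how the model was trained.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` coordinates and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 8-dimensional embeddings (illustrative values only).
doc = [0.9, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.01]
query = [0.8, 0.4, 0.1, 0.2, 0.01, 0.03, 0.02, 0.01]

full_similarity = cosine(doc, query)
short_similarity = cosine(truncate_embedding(doc, 4), truncate_embedding(query, 4))
```

In this toy case most of the signal sits in the leading coordinates, so the 4-dimensional similarity stays close to the 8-dimensional one.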

Vivim: a Video Vision Mamba for Medical Video Object Segmentation. A new framework called Vivim efficiently processes lengthy video sequences for medical video object segmentation. In comparison to conventional techniques, Vivim provides faster and more accurate segmentation results by effectively compressing spatiotemporal data using the state space model methodology.

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities. This study proposes improving a transformer for one modality with irrelevant data from another, e.g., using audio data to improve an image model. By connecting the transformers of the two modalities, the Multimodal Pathway lets the target modality benefit from what the other model has learned.

pix2gestalt: Amodal Segmentation by Synthesizing Wholes. pix2gestalt is a framework for zero-shot amodal segmentation: when an object is partially occluded, it reconstructs the object's full shape and appearance. Built on large-scale diffusion models, pix2gestalt performs well in difficult cases, including artistic images that break convention.

Large-Vocabulary 3D Diffusion Model with Transformer. The variety of objects that can be generated in 3D poses a significant challenge. This study scales the system to handle a considerably larger range of items in each 3D category and uses a modified architecture to improve sampling efficiency.

SliceGPT: Compress Large Language Models by Deleting Rows and Columns. Another promising compression technique, based on pruning rather than distillation. Importantly, it also works on models as small as Phi-2. According to the paper, you can remove up to 25% of the model parameters by deleting rows and columns of the weight matrices while retaining roughly 90% to 99% of the dense model's zero-shot performance, depending on the model.
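The row-and-column bookkeeping in the title can be shown with a toy sketch: if a hidden dimension is dropped from the columns of one weight matrix, the matching row must be dropped from the next matrix so the two still compose. This is not the SliceGPT algorithm itself, which first applies an orthogonal transformation so that the deleted dimensions carry little signal; here the weights are made up so that one hidden dimension is exactly zero, and `slice_pair` is a hypothetical helper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def slice_pair(w1, w2, keep):
    """Keep only the hidden dimensions listed in `keep`.

    Columns are deleted from w1 and the matching rows from w2,
    so the product of the two smaller matrices is still defined.
    """
    w1_small = [[row[j] for j in keep] for row in w1]
    w2_small = [w2[j] for j in keep]
    return w1_small, w2_small

# Toy weights (illustrative values): hidden dimension 2 is always zero.
w1 = [[1.0, 2.0, 0.0],
      [0.5, -1.0, 0.0]]   # 2 inputs -> 3 hidden
w2 = [[1.0, 0.0],
      [0.0, 1.0],
      [3.0, 3.0]]         # 3 hidden -> 2 outputs
x = [[2.0, 4.0]]

dense_out = matmul(matmul(x, w1), w2)
w1_small, w2_small = slice_pair(w1, w2, keep=[0, 1])
sliced_out = matmul(matmul(x, w1_small), w2_small)
```

Because the removed dimension contributes nothing in this example, the smaller pair reproduces the dense output exactly; in a real model the match is only approximate.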

Learning Universal Predictors. Meta-learning is the process of teaching systems to learn from experience and adapt quickly to new tasks. Using artificial data generated by a Universal Turing Machine, this Google project improves meta-learning and analyzes the results both theoretically and experimentally.

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion. CreativeSynth is an artistic image editing technique that combines text and image inputs seamlessly. Its diffusion approach, with specialized attention mechanisms built in, allows fine-grained editing of both style and content while preserving the essential elements of the original artwork.

Annotated Hands for Generative Models. By adding three extra channels carrying hand annotations to the training images, researchers have improved the ability of generative models, such as GANs and diffusion models, to produce realistic hand images.

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling. Many AI systems employ the concept of “up captioning” to enhance labels during training. This work from Apple applies the idea to pre-training by rephrasing C4 as instructions, Q&A pairs, and more. According to the study, rephrasing sped up convergence by 10x, making the model significantly more sample-efficient, albeit at the cost of running the rephrasing step itself.

Continual Learning with Pre-Trained Models: A Survey. This work provides an extensive overview of the most recent developments in continual learning, which focuses on adapting to new information over time while preserving prior knowledge.

MacGNN. This research introduces the MAcro Recommendation Graph (MAG) and Macro Graph Neural Networks (MacGNN). These methods greatly reduce the number of nodes by grouping nodes with similar behavior patterns into macro nodes, which addresses the computational cost of Graph Neural Networks.

Machine learning predicts which rivers, streams, and wetlands the Clean Water Act regulates. Our framework can support permitting, policy design, and the use of machine learning in regulatory implementation problems.

Weaver: Foundation Models for Creative Writing. Weaver is a family of models trained specifically for storytelling. On a storytelling benchmark, the largest model (34B parameters) outperforms GPT-4.

Text Image Inpainting via Global Structure-Guided Diffusion Models. This study introduces two datasets, one for scene text and one for handwritten text, along with a benchmark. Given original, damaged, and assistant images, the new Global Structure-guided Diffusion Model (GSDM) recovers clean text by exploiting text structure. Both image quality and recognition accuracy show notable gains.

Multi-granularity Correspondence Learning from Long-term Noisy Videos. With Norton, the multi-granularity noisy correspondence problem in video-language studies is addressed, offering a novel strategy for enhancing long-term video comprehension.

GPAvatar: Generalizable and Precise Head Avatar from Image(s). With the use of a Multi Tri-planes Attention module and a dynamic point-based expression field, GPAvatar presents a novel technique for generating 3D head avatars from photos.

MobileDiffusion: Rapid text-to-image generation on-device. With certain architectural modifications, Google has demonstrated a latent consistency diffusion model that it trained for sub-second generation times on mobile devices.

SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks. Shared Network Pre-training (SNP) enhances the joint learning of text and video. Compared to earlier models, this approach is more effective and adaptable and incorporates a novel technique called Significant Semantic Strengthening (S3) to improve comprehension of important terms in sentences.

Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation. An improved version of the Segment Anything Model (SAM) with a focus on hierarchical text segmentation is called Hi-SAM. Hi-SAM is an excellent text segmenter at several levels, ranging from strokes to paragraphs, and it can even analyze layouts.

News

Voltron Data acquires Claypot to unlock real-time AI with modular data systems. Today, San Francisco-based Voltron Data, a startup providing enterprises with a modular and composable approach to building systems for data analytics, confirmed to VentureBeat that it is acquiring the real-time AI platform Claypot. The terms of the deal were not disclosed.

FTC investigating Microsoft, Amazon, and Google investments into OpenAI and Anthropic. The commission wants to understand the tangled web of investments between cloud providers and AI startups.

Google’s New AI Is Learning to Diagnose Patients. The DeepMind team turns to medicine with an AI model named AMIE.

1/100th of the cost: CPU startup Tachyum claims that one of its processing units can rival dozens of Nvidia H200 GPUs, with a 99% saving that could turn the AI market on its head if true. The 5nm Prodigy processor can dynamically switch between AI, HPC, and cloud workloads and costs $23,000.

ChatGPT is violating Europe’s privacy laws, Italian DPA tells OpenAI. OpenAI has been told it is suspected of violating European Union privacy law, following a multi-month investigation of its AI chatbot, ChatGPT, by Italy’s data protection authority.

This whimsical clock is the playful gadget AI needs right now. The Poem/1 clock dreams up a new poem every minute to tell you the time. Do you need it? No. But you might want it.

iOS 17.4: Apple continues work on AI-powered Siri and Messages features, with help from ChatGPT. Apple is widely expected to unveil major new artificial intelligence features with iOS 18 in June. Code found by 9to5Mac in the first beta of iOS 17.4 shows that Apple is continuing to work on a new version of Siri powered by large language model technology, with a little help from other sources.

Opera to launch new AI-powered browser for iOS in Europe following Apple’s DMA changes. Opera revealed today that it will launch a new AI-powered browser built on its own engine for iOS in Europe. The Norway-based company announced the change following the news that Apple is going to allow alternative browser engines to run on iOS as a result of the requirements of the European Digital Markets Act (DMA).

Mistral CEO confirms ‘leak’ of new open source AI model nearing GPT-4 performance. The past few days have been a wild ride for the growing open-source AI community — even by its fast-moving and freewheeling standards.

Microsoft LASERs away LLM inaccuracies. Microsoft’s LASER method seems counterintuitive, but it makes models trained on large amounts of data smaller and more accurate.

LLaVA-1.6: Improved reasoning, OCR, and world knowledge. The latest version of the vision-language model LLaVA features improved reasoning, world knowledge, and OCR. According to the team, it even exceeds Gemini Pro on several benchmarks. The LLaVA team will release the model, code, and data.

ServiceNow’s statement on AI. ServiceNow, a company with a market capitalization of about $150 billion, revealed last week that its generative AI products produced the largest net new ACV contribution in a first full quarter of any of its new product family launches, including its initial Pro SKU. It’s exciting to see that enterprise-level AI applications are already contributing to significant revenue growth.

Bard’s latest updates: Access Gemini Pro globally and generate images. You can now generate images in Bard in English in most countries around the world, at no cost. This new capability is powered by Google's updated Imagen 2 model.

Amazon debuts ‘Rufus,’ an AI shopping assistant in its mobile app. Amazon announced today the launch of an AI-powered shopping assistant called Rufus, which has been trained on the e-commerce giant’s product catalog as well as information from around the web.

Resources

imp-v1-3b. Another multimodal model, trained using SigLIP and Phi-2. This one is small enough to run on-device and shows very promising performance.

WebDataset. WebDataset is a library for writing I/O pipelines for large datasets. Its sequential I/O and sharding features make it especially useful for streaming large-scale datasets to a DataLoader.
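To make the sharding and sequential I/O idea concrete, here is a minimal sketch that uses only Python's standard `tarfile` module rather than the WebDataset library itself. It assumes the common convention that files sharing a basename inside a tar shard form one sample; `write_shards`, `read_shards`, the shard naming pattern, and the sample data are all hypothetical.

```python
import io
import os
import tarfile
import tempfile

def write_shards(samples, out_dir, per_shard):
    """Write (key, {extension: bytes}) samples into numbered tar shards."""
    paths = []
    for start in range(0, len(samples), per_shard):
        path = os.path.join(out_dir, f"shard-{start // per_shard:06d}.tar")
        with tarfile.open(path, "w") as tar:
            for key, fields in samples[start:start + per_shard]:
                for ext, payload in fields.items():
                    info = tarfile.TarInfo(name=f"{key}.{ext}")
                    info.size = len(payload)
                    tar.addfile(info, io.BytesIO(payload))
        paths.append(path)
    return paths

def read_shards(paths):
    """Stream samples back by reading each shard front to back."""
    for path in paths:
        with tarfile.open(path, "r") as tar:
            current_key, current = None, {}
            for member in tar:
                key, ext = member.name.rsplit(".", 1)
                # A new basename means the previous sample is complete.
                if key != current_key and current:
                    yield current_key, current
                    current = {}
                current_key = key
                current[ext] = tar.extractfile(member).read()
            if current:
                yield current_key, current

# Made-up samples: a caption and a class label per key.
samples = [
    (f"sample{i:03d}", {"txt": f"caption {i}".encode(), "cls": str(i % 2).encode()})
    for i in range(5)
]
with tempfile.TemporaryDirectory() as tmp:
    shard_paths = write_shards(samples, tmp, per_shard=2)
    restored = list(read_shards(shard_paths))
```

Because each shard is read strictly in order, this layout suits streaming from disk or object storage; a real pipeline would add shuffling and decoding on top.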

LLMs-from-scratch. An unfinished yet intriguing series of exercises that teaches how to build a language model from the ground up.

Exploring ColBERT with RAGatouille. For RAG applications, ColBERT is a great paradigm for embedding queries and index data. This article runs some benchmarks and examines the method’s underlying intuition.
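The intuition behind ColBERT-style scoring can be sketched in a few lines: queries and documents are stored as one vector per token, and a document's score is the sum, over query tokens, of the best-matching document token similarity (often called MaxSim or late interaction). The vectors below are made-up 2-dimensional values, and this is a sketch of the scoring rule only, not of RAGatouille's API.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """For each query token take its best document-token match, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Hypothetical per-token embeddings (illustrative values only).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]],
    "doc_b": [[0.7, 0.7], [0.6, 0.8]],
}

ranked = sorted(docs, key=lambda name: maxsim_score(query, docs[name]), reverse=True)
```

Here `doc_a` ranks first because it contains an exact match for each query token, whereas `doc_b` only matches both tokens partially.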

mamba.rs. Inspired by efforts on the Llama models, this project uses pure Rust to run inference for Mamba on the CPU.

🦙 Code Llama. Code Llama is a code-specialized version of Llama 2 that was created by further training Llama 2 on its code-specific datasets, sampling more data from that same dataset for longer.

Eagle 7B: Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages (RWKV-v5). A brand new era for the RWKV-v5 architecture and linear transformers has arrived, with the strongest multilingual model in open source today.

InconsistencyMasks. Inconsistency Masks (IM) is a novel technique for image segmentation that works even with sparse data. Tested on the ISIC 2018 dataset, the method performs better than conventional methods and even surpasses models trained on fully labeled datasets.

distortion-generator. A novel image distortion technique balances privacy and accuracy in biometric systems, rendering facial photos unrecognizable to humans yet still identifiable to AI.

TaskingAI. TaskingAI brings Firebase’s simplicity to AI-native app development. The platform enables the creation of GPTs-like multi-tenant applications using a wide range of LLMs from various providers. It features distinct, modular functions such as Inference, Retrieval, Assistant, and Tool, seamlessly integrated to enhance the development process.

100x Faster Clustering with Lilac Garden. One difficulty in language model training is finding a sufficiently varied dataset. Visualizing that data is harder still. This tool supports data exploration through topic modeling and fast clustering, which helps improve filtering and overall quality.

float8_experimental. Lower-precision training is faster and cheaper, but it is also less stable. Quantized training has been the subject of several excellent recent studies. Building on those foundations, this repository offers float8 training through readable and hackable code.

Enchanted. Enchanted is an open-source, Ollama-compatible, elegant iOS/iPad mobile app for chatting with privately hosted models such as Llama 2, Mistral, Vicuna, Starling, and more. It’s essentially a ChatGPT-style app UI that connects to your private Ollama models. You can download Enchanted from the App Store or build it yourself from source.

Introduction to point processing. Whether you are doing medical image analysis or using Photoshop, you are using point processing.
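As a small illustration, a point operation maps each pixel to a new value using only that pixel's own intensity, with no reference to its neighbors. The sketch below assumes 8-bit grayscale images stored as nested lists; the helper names are hypothetical and not taken from the linked article.

```python
def apply_point_op(image, op):
    """Apply `op` to every pixel independently."""
    return [[op(pixel) for pixel in row] for row in image]

def negative(pixel):
    """Invert an 8-bit intensity."""
    return 255 - pixel

def gamma_correct(gamma):
    """Build a gamma-correction operation for 8-bit intensities."""
    def op(pixel):
        return round(255 * (pixel / 255) ** gamma)
    return op

# A tiny made-up 2x2 grayscale image.
image = [[0, 64],
         [128, 255]]

inverted = apply_point_op(image, negative)
darker = apply_point_op(image, gamma_correct(2.0))
```

With a gamma above 1 the mid-tones get darker while pure black and pure white stay fixed, which is the behavior of a typical brightness curve in an image editor.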

MF-MOS: A Motion-Focused Model for Moving Object Segmentation. A new model called MF-MOS uses LiDAR data to identify moving objects more effectively during autonomous driving. It separates motion from semantic information by using residual maps to capture motion and range images for semantic guidance.

Mctx: MCTS-in-JAX. Mctx is a library with a JAX-native implementation of Monte Carlo tree search (MCTS) algorithms such as AlphaZero, MuZero, and Gumbel MuZero. To speed up computation, the implementation fully supports JIT compilation.

FireLLaVA: the first commercially permissive OSS LLaVA model. FireLLaVA is a new open vision-language model that can be used in commercial applications because it was trained on commercially permissive data. It performs on par with the original LLaVA, but not quite as well as LLaVA 1.5.

uAgents: AI Agent Framework. uAgents is a library developed by Fetch.ai that allows for the creation of autonomous AI agents in Python. With simple and expressive decorators, you can have an agent that performs various tasks on a schedule or takes action on various events.

teknium/OpenHermes-2.5. Some of the top open models available have been trained using data from OpenHermes-2.5. The collection contains more than one million high-quality data points. It is now publicly available.

OLMo: Open Language Model. A state-of-the-art, truly open LLM and framework.

BAAI/bge-m3. The BGE-M3 project presents a flexible embedding model that performs very well across three dimensions: multi-functionality (dense, multi-vector, and sparse retrieval), multi-linguality (more than 100 languages), and multi-granularity (inputs ranging from brief phrases to documents of up to 8192 tokens). It supports a hybrid retrieval pipeline that combines its dense and sparse retrieval capabilities with re-ranking for better accuracy and generalization.
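One way to picture hybrid retrieval is as a weighted combination of a dense similarity and a sparse, token-weight overlap score, followed by re-ranking the candidates on the combined score. The sketch below is a generic illustration under that assumption: the weights, vectors, and token weights are made up, and the function names are hypothetical rather than part of the BGE-M3 API.

```python
def dense_score(query_vec, doc_vec):
    """Dot product of two dense embeddings."""
    return sum(q * d for q, d in zip(query_vec, doc_vec))

def sparse_score(query_weights, doc_weights):
    """Sum of weight products over tokens that appear in both texts."""
    return sum(w * doc_weights[tok] for tok, w in query_weights.items() if tok in doc_weights)

def hybrid_score(query, doc, w_dense=0.6, w_sparse=0.4):
    """Weighted sum of dense and sparse scores (weights are illustrative)."""
    return (w_dense * dense_score(query["dense"], doc["dense"])
            + w_sparse * sparse_score(query["sparse"], doc["sparse"]))

# Hypothetical query and candidate documents.
query = {"dense": [0.8, 0.6], "sparse": {"rust": 1.0, "cpu": 0.5}}
candidates = {
    "doc_1": {"dense": [0.6, 0.8], "sparse": {"rust": 0.9, "mamba": 0.4}},
    "doc_2": {"dense": [1.0, 0.0], "sparse": {"python": 0.7}},
}

reranked = sorted(candidates, key=lambda name: hybrid_score(query, candidates[name]), reverse=True)
```

The sparse term rewards exact token overlap that a dense embedding may blur, which is why `doc_1` ends up well ahead of `doc_2` even though their dense scores are fairly close.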

RAGs. RAGs is a Streamlit app that lets users build RAG pipelines over their data sources using natural language. All users need to do is specify the parameters and tasks they require from their RAG system. They can then query the resulting RAG agent, and it will answer questions about the data.

GPT Newspaper. GPT Newspaper is an autonomous agent designed to create personalized newspapers tailored to user preferences. It uses AI to curate, write, design, and edit content based on individual tastes and interests.

Perspectives

Many AI Safety Orgs Have Tried to Criminalize Currently-Existing Open-Source AI. Numerous teams are attempting to address the difficulties posed by the quickly developing field of artificial intelligence.

AlphaFold found thousands of possible psychedelics. Will its predictions help drug discovery? Researchers have doubted how useful the AI protein-structure tool will be in discovering medicines — now they are learning how to deploy it effectively.

Reaching carbon neutrality requires energy-efficient training of AI. Artificial intelligence (AI) models have achieved remarkable success, but their training requires a huge amount of energy.

What will robots think of us? Two recent science fiction novels humorously illustrate the importance of correct robot mental models.

What Can be Done in 59 Seconds: An Opportunity (and a Crisis). AI can already complete many tasks in less than a minute, so businesses and staff will need to emphasize using AI for good rather than evil.

The American Dynamism 50: AI. This list, compiled by a16z, features 50 companies addressing some of the most important issues facing the US in manufacturing, transportation, energy, and defense. They’re all using AI to speed up their work in one way or another. It is an excellent read if you’re interested in practical uses of artificial intelligence.

WEEKLY AI NEWS: RESEARCH, NEWS, RESOURCES, AND PERSPECTIVES

ML news: Week 29 January — 4 February

FTC investigates Microsoft, the leak of a new Mistral model, and much more



What do you think about it? Some news that captured your attention? Let me know in the comments
