| LLMs| ARTIFICIAL INTELLIGENCE| NLP|

Indirect Reasoning for LLMs: There Is Not Always a Direct Way to the Answer

Contrapositive and Contradiction for Automated Reasoning can help your model find the right answer

For every complex problem there is an answer that is clear, simple, and wrong. — H. L. Mencken

There are many prompt engineering techniques, but many of them have one thing in common: direct reasoning. What happens if we try the opposite? Can we find a solution to problems that the model couldn’t solve before?

We discuss this in this article.

Direct versus indirect reasoning

Large Language Models (LLMs) have shown incredible capabilities even in complex tasks such as language comprehension, logical reasoning, and mathematical reasoning. Their success is even more striking when you consider that they achieve it in a zero-shot or few-shot setting. This means that the models are capable of learning from context (in-context learning).

All You Need to Know about In-Context Learning

What is and how does it work what makes Large Language Models so powerful

towardsdatascience.com

This has led several groups to study how to strengthen these capabilities, and techniques such as Chain-of-Thought (CoT) and many others have emerged.

Prompt Engineering to Leverage In-Context Learning in Large Language Models

How to modify your text prompt to obtain the best from an LLM without training

pub.towardsai.net

Chain-of-Thought (CoT) encourages the model to explain the various intermediate steps leading to the final solution. The idea is that by unfolding the various steps, the model can arrive at the final solution correctly (whereas by jumping directly to the conclusion, the model often gets it wrong).

CoT and other techniques follow what is called the Direct Reasoning (DR) framework, where logical chains are built from the given facts to the final result. The problem with this approach is that not all problems can be solved this way. So the question arises: if we are faced with a problem that cannot be solved directly, can we take advantage of Indirect Reasoning (IR)?

Indirect Reasoning (IR) is a complementary, alternative approach to solving problems. It relies on logical equivalences to replace a statement that is hard to prove directly with one that is easier to prove. Two methods are most common. With the contrapositive, an implication p → q is equivalent to ¬q → ¬p, so proving ¬q → ¬p also proves p → q. With proof by contradiction, one assumes that the statement is false (for p → q, that p ∧ ¬q holds) and shows that this assumption leads to a contradiction, so the statement must be true.
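Both equivalences can be checked mechanically with a truth table. The snippet below is a minimal sketch in plain Python; the helper names `implies` and `check_equivalences` are made up for this illustration and do not come from the paper.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: a -> b is false only when a is true and b is false."""
    return (not a) or b


def check_equivalences() -> bool:
    """Return True if the three forms agree on every row of the truth table."""
    for p, q in product([False, True], repeat=2):
        direct = implies(p, q)                   # p -> q
        contrapositive = implies(not q, not p)   # not q -> not p
        no_counterexample = not (p and not q)    # "p and not q" is impossible
        if not (direct == contrapositive == no_counterexample):
            return False
    return True


print(check_equivalences())
```

The last form is the one used in a proof by contradiction: showing that "p and not q" can never hold is the same as showing p → q.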

Indirect reasoning to solve problems

G. H. Hardy described proof by contradiction as “one of a mathematician’s finest weapons”, saying “It is a far finer gambit than any chess gambit: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers the game.” — wikipedia

Indirect reasoning is a strategy that humans use routinely, and models could use it too to solve problems where direct reasoning techniques fail. How can we make LLMs benefit from IR?

A new study (https://arxiv.org/abs/2402.03667) shows how this is possible simply by using a new type of prompt.

Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated…

Recently, increasing attention has been focused drawn on to improve the ability of Large Language Models (LLMs) to…

arxiv.org

In mathematics and some practical applications, there are circumstances where direct proof may not be feasible or effective. In such cases, the methods of indirect proof are often used to verify a statement. There are two popular methods for indirect proof, which are: contrapositive method and contradiction method. (source)

The authors’ idea is to exploit both the contrapositive and contradiction to steer a model toward the solution when a direct proof is not feasible. Their goal is to let the model conduct factual reasoning in natural language: given a question Q, the model must arrive at an answer A through a reasoning process P that exploits known facts F and rules R (rules are often part of prior knowledge and are not necessarily made explicit).

How to adapt indirect reasoning to an LLM?

For the authors, the process is divided into two parts:

Rule augmentation. In this step, the model is prompted to augment the rule set with the contrapositives of the existing rules.
Indirect reasoning. Given the facts, the rules, and the question, the model performs indirect reasoning.
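The two steps above can be illustrated with a toy symbolic version. This is only a sketch under strong assumptions: in the paper both steps are carried out by the LLM through prompts, whereas here literals are plain strings, negation is a leading "not ", every rule has a single premise, and all function names are hypothetical.

```python
def negate(literal: str) -> str:
    """Toggle a leading 'not ' marker on a literal."""
    return literal[4:] if literal.startswith("not ") else "not " + literal


def augment_rules(rules):
    """Rule augmentation: add the contrapositive of every rule (premise, conclusion)."""
    augmented = list(rules)
    for premise, conclusion in rules:
        contrapositive = (negate(conclusion), negate(premise))
        if contrapositive not in augmented:
            augmented.append(contrapositive)
    return augmented


def forward_chain(facts, rules):
    """Direct reasoning: apply rules until no new literal can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def prove_by_contradiction(facts, rules, goal):
    """Indirect reasoning: assume the negation of the goal and look for a clash."""
    known = forward_chain(set(facts) | {negate(goal)}, rules)
    return any(negate(literal) in known for literal in known)


# Hypothetical toy problem, not taken from the paper's datasets.
rules = [("it rains", "the ground is wet")]
facts = {"not the ground is wet"}
goal = "not it rains"

print(goal in forward_chain(facts, rules))                 # direct, original rules
print(goal in forward_chain(facts, augment_rules(rules)))  # direct, augmented rules
print(prove_by_contradiction(facts, rules, goal))          # contradiction, original rules
```

On this toy problem the three lines print False, True, and True: direct chaining with the original rule gets stuck, while either the added contrapositive or the assumption of the opposite reaches the goal. Note that `prove_by_contradiction` would also return True if the facts were already inconsistent on their own, which a more careful version would check first.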

The authors therefore define both a zero-shot and a few-shot template for prompts so that IR can be used with an LLM.
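To give an idea of what such a template might look like in code, here is a minimal sketch of a prompt builder. The instruction wording, the function name `build_indirect_prompt`, and the layout are assumptions made for illustration; they are not the templates from the paper, which should be consulted for the exact text.

```python
def build_indirect_prompt(facts, rules, question, examples=()):
    """Assemble a prompt string.

    With `examples` empty this is a zero-shot prompt; passing solved
    demonstrations turns it into a few-shot prompt.
    """
    lines = []
    for demo in examples:  # few-shot: prepend worked examples
        lines.append(demo.strip())
        lines.append("")
    lines.append("Facts:")
    lines.extend(f"- {fact}" for fact in facts)
    lines.append("Rules:")
    lines.extend(f"- {rule}" for rule in rules)
    lines.append(f"Question: {question}")
    # Illustrative instruction only; the wording is an assumption.
    lines.append(
        "First list the contrapositive of each rule. "
        "Then assume the opposite of the statement in the question "
        "and check whether it contradicts the facts or rules."
    )
    lines.append("Answer:")
    return "\n".join(lines)


prompt = build_indirect_prompt(
    facts=["The ground is not wet."],
    rules=["If it rains, the ground is wet."],
    question="Is it true that it does not rain?",
)
print(prompt)
```

The resulting string can be sent to any chat or completion endpoint; the call itself is left out because it depends on the provider.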

Example of a few-shot prompt. Image source: https://arxiv.org/pdf/2402.03667.pdf

The evaluation of reasoning performance for a method includes the correctness investigation on the answer A and the reasoning process P. Therefore, here we use three metrics: accuracy of answer (AA), accuracy of reasoning processes (AP), and the overall accuracy (OA). (source)

In practice, the authors define three metrics based on the number of examples with a correct answer, with a correct reasoning process, and with both correct.
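As a rough sketch, the three metrics can be computed from per-example judgements as follows. The `Judgement` class and the choice to divide every count by the total number of examples are assumptions for this illustration; the article does not state the exact normalization used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    """Hypothetical record of how one example was graded."""
    answer_correct: bool
    process_correct: bool


def reasoning_metrics(judgements):
    """Return AA, AP, and OA as fractions of all examples (assumed normalization)."""
    n = len(judgements)
    if n == 0:
        raise ValueError("at least one judgement is required")
    aa = sum(j.answer_correct for j in judgements) / n
    ap = sum(j.process_correct for j in judgements) / n
    oa = sum(j.answer_correct and j.process_correct for j in judgements) / n
    return {"AA": aa, "AP": ap, "OA": oa}


graded = [
    Judgement(True, True),
    Judgement(True, False),   # right answer reached through a flawed process
    Judgement(False, False),
    Judgement(True, True),
]
print(reasoning_metrics(graded))
```

For the four made-up judgements above this gives AA = 0.75, AP = 0.5, and OA = 0.5, which shows why the answer accuracy alone can overstate how well the model actually reasons.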

The authors combine their prompt with both CoT and self-consistency to see how the model's behavior changes with indirect reasoning. They use both GPT-3.5 and Gemini as models and test them on both natural-language and mathematical datasets.
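Self-consistency, in its usual form, samples several reasoning paths for the same question and keeps the most frequent final answer. The sketch below shows only that voting step, under the assumption that the final answers have already been extracted as short strings; the function name is hypothetical and how the paper combines the vote with its prompt is not reproduced here.

```python
from collections import Counter


def self_consistent_answer(sampled_answers):
    """Return the most frequent final answer among several sampled reasoning paths."""
    if not sampled_answers:
        raise ValueError("at least one sampled answer is required")
    # Light normalization so that "True" and "true " count as the same vote.
    normalized = [answer.strip().lower() for answer in sampled_answers]
    return Counter(normalized).most_common(1)[0][0]


votes = ["True", "false", "true ", "True"]
print(self_consistent_answer(votes))
```

With the made-up votes above, three of the four samples agree after normalization, so the function returns "true".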

For example, they test the models on a dataset of natural-language questions (ProofWriter) and on one of mathematical problems (ProofMath) where proof by contradiction is needed to solve the problems. The use of IR increases the model’s capabilities when solving these types of problems.

Another interesting result is that rule augmentation helps the model even when it is used with DR alone.

Another advantage is that this approach reduces the number of steps needed to reach the conclusion (the process is therefore faster).

Parting thoughts

In recent times, various LLMs have been widely adopted to solve tasks such as factual reasoning, dialog generation, and multi-modal content generation. These approaches have generated noticeable economic value and social impact in multiple applications. (source)

LLMs have entered production and are used by the public; on the other hand, these models still have problems with factual reasoning. Several techniques have been developed over time to improve the reasoning capabilities of the models. Those techniques rely on direct reasoning; here the authors show that there are problems that cannot be solved with DR but benefit from indirect reasoning.

What do you think? Will you test this prompt method? Let me know in the comments

If you have found this interesting:

You can look for my other articles, and you can also connect with me or reach me on LinkedIn, where I am open to collaborations and projects. Check this repository containing weekly updated ML & AI news.

Here is the link to my GitHub repository, where I am collecting code and many resources related to machine learning, artificial intelligence, and more.

GitHub — SalvatoreRa/tutorial: Tutorials on machine learning, artificial intelligence, data science…

Tutorials on machine learning, artificial intelligence, data science with math explanation and reusable code (in python…

github.com

or you may be interested in one of my recent articles:

A Requiem for the Transformer?

Will be the transformer the model leading us to artificial general intelligence? Or will be replaced?

towardsdatascience.com

Cognition is Struggling: Natural and Artificial Brains Evolve from Constriction

Evolutive forces shape the brain, what if we apply the same forces to AI?

levelup.gitconnected.com

Tabula Rasa: Why Do Tree-Based Algorithms Outperform Neural Networks

Tree-based algorithms are the winner in tabular data: Why?

levelup.gitconnected.com

Grokking: Learning Is Generalization and Not Memorization

Understanding how a neural network learns helps us to avoid that the model from forgetting what it learns

levelup.gitconnected.com