Summary

Researchers from Salesforce AI, MIT, and Columbia have developed the "Chain of Density" (CoD) prompt for text summarization, which enhances the density and detail of summaries generated by large language models (LLMs), such as GPT-4, to a level comparable to human-written summaries.

Abstract

The "Chain of Density" (CoD) prompt is a novel approach to text summarization that iteratively refines initial summaries to increase their entity density without compromising readability. This method, which involves identifying missing entities and fusing them into the summary while maintaining a fixed word count, has been shown to produce summaries that are preferred by humans for their richness in detail. The process, detailed in a recent study, includes generating an initial summary, iteratively adding key entities, and compressing the content to fit a token budget. The effectiveness of CoD has been demonstrated through the summarization of CNN/DailyMail articles, with the results indicating a strong preference for CoD-generated summaries over those produced by standard prompts. The CoD technique is designed to create highly informative yet concise summaries that can stand alone without the need for reference to the original article.

Opinions

The researchers believe that humans prefer summaries with a higher entity density, as they provide more information in a concise format.
The study suggests that the CoD prompt can produce summaries that approach the density and quality of those written by humans.
There is an opinion that standard prompts for LLMs may not produce summaries that are as dense or detailed as those generated by the CoD method.
The researchers imply that maintaining a balance between abstraction, compression, and fusion is crucial in creating effective summaries.
The findings indicate that users favor summaries that present information densely but coherently, avoiding overloading sentences with excessive detail.
The study acknowledges that while individual evaluations of summaries may vary, broader patterns and preferences for entity-dense summaries are evident.
The is optimism that the CoD technique will lead to the development of more sophisticated and user-friendly summarization tools in the future.

“Chain of Density” for Effective Text Summarization with Large Language Models

If you ever played with LLMs to perform text summarization, you’d know that condensing large volumes of information into digestible summaries is a challenging task.

However, there is a light at the of the tunnel!

Researchers from Salesforce AI, MIT, and Columbia recently showed the Chain of Density (CoD) prompt to summarize news articles (and probably others), that are not only succinct but also rich in details.

During their study, the researchers found out that:

Humans prefer GPT-4 summaries that are more dense than those generated by a vanilla prompt and almost as dense as human written summaries.

In this article, I want to walk you through CoD’s entity-centric approach, encompassing key actors, locations, and concepts, without overwhelming you with excessive density that may hinder comprehension.

Let’s have a brief look at CoD before trying it out.

Understanding CoD Prompt

When you employ CoD prompt, LLM begins with the generation of an initial entity-sparse summary. This summary serves as a foundation, which is then iteratively refined to incorporate additional entities without expanding the length.

The process involves a balance of abstraction, compression, and fusion, ensuring that each iteration results in a summary with a higher entity-to-token ratio than its predecessor.

Here’s CoD’s secret recipe in pseudo-code form:

function generateCoDSummary(document, iterations):
    summary = generateInitialSummary(document)
    for i in 1 to iterations:
        missing_entities = identifyMissingEntities(document, summary)
        summary = fuseEntities(summary, missing_entities)
        if length(summary) > token_budget:
            summary = compressSummary(summary, token_budget)
    return summary

This function outlines the core steps:

generating an initial summary,
identifying and fusing missing entities,
compressing the summary to adhere to a token budget.

Results are evaluated on CNN/DailyMail articles with findings indicating a strong preference for summaries that are more entity-dense than those produced by standard prompts, approaching the density of human-crafted summaries.

This is great, as you can now craft news summaries that are as rich and flavorful as your morning espresso shot!

Let’s see it in action.

Effective Text Summarization with Chain of Density

We will now put the CoD prompt to test with a real article.

Here’s CoD prompt shared in the paper:

Article: {{ ARTICLE }}

You will generate increasingly concise, entity-dense summaries of the above Article.

Repeat the following 2 steps 5 times.

Step 1. Identify 1-3 informative Entities ("; " delimited) from the Article which are missing from the previously generated summary.
Step 2. Write a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities.

A Missing Entity is:
- Relevant: to the main story.
- Specific: descriptive yet concise (5 words or fewer).
- Novel: not in the previous summary.
- Faithful: present in the Article.
- Anywhere: located anywhere in the Article.

Guidelines:
- The first summary should be long (4-5 sentences, ~80 words) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., "this article discusses") to reach ~80 words.
- Make every word count: rewrite the previous summary to improve flow and make space for additional entities.
- Make space with fusion, compression, and removal of uninformative phrases like "the article discusses".
- The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.
- Missing entities can appear anywhere in the new summary.
- Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

Remember, use the exact same number of words for each summary.

Answer in JSON. The JSON should be a list (length 5) of dictionaries whose keys are "Missing_Entities" and "Denser_Summary".

Let’s apply this technique to this article from TechCrunch (you can try any other article.)

When you run the prompt with article provided, you will see that after every iteration, it will perform an analysis to identify new entities for the update as the following example:

# Iteration 4: Identify new entities and create a denser summary

# New missing entities to add
missing_entities_4 = "M12 venture fund; startup investors; accelerators"

# Create a denser summary including the new entities and keeping the same word count
denser_summary_4 = (
    "In response to the AI chip scarcity ..."
)

# Update the summaries list with the new information for iteration 4
summaries[3]["Missing_Entities"] = missing_entities_4
summaries[3]["Denser_Summary"] = denser_summary_4

summaries

Once it is finished, you will get a JSON output

[
  {
    "Missing_Entities": "Azure AI; Nvidia-based GPU clusters; Y Combinator",
    "Denser_Summary": "This article discusses the recent announcement from 
Microsoft, which revolves around an update to its startup program. 
Specifically, the program will now offer startups, particularly those 
associated with Y Combinator, access to Azure AI resources and Nvidia-based 
GPU clusters. The details of the program suggest an emphasis on providing 
these resources at no cost, highlighting Microsoft's strategic partnership 
with Y Combinator and its alumni for this initiative."
  },
  {
    "Missing_Entities": "Microsoft for Startups Founders Hub; private preview; Azure credits",
    "Denser_Summary": "In a recent update, Microsoft enhanced its startup 
program, Microsoft for Startups Founders Hub, with free access to Azure AI 
and Nvidia GPUs for select startups, starting with Y Combinator's network 
in private preview. This initiative underscores the tech giant's commitment 
to fostering innovation by providing critical resources, like Azure credits, 
to early-stage companies, thereby strengthening its partnership with Y 
Combinator and positioning Azure as a leading AI development platform."
  },
  {
    "Missing_Entities": "AI chip shortage; supercomputing resources; AI models",
    "Denser_Summary": "Addressing the AI chip shortage, Microsoft's latest 
Microsoft for Startups Founders Hub update offers free Azure AI 
supercomputing resources to selected startups, initiating with Y Combinator's 
affiliates in a private preview. The move, aimed at AI model development 
including large language models, leverages Azure credits to stimulate 
early-stage startup innovation, consolidating Azure's AI leadership."
  },
  {
    "Missing_Entities": "M12 venture fund; startup investors; accelerators",
    "Denser_Summary": "In response to the AI chip scarcity, Microsoft's 
Founders Hub now grants eligible startups, starting with Y Combinator, 
complimentary access to Azure's AI supercomputing and Nvidia GPUs. This 
initiative, part of the M12 venture fund's strategy, caters to AI model 
advancement and aims to scale startup innovation by integrating resources 
from startup investors and accelerators."
  },
  {
    "Missing_Entities": "time-bound access; test and trial operations; next wave of AI innovation",
    "Denser_Summary": "Microsoft confronts AI chip deficits by updating 
its Founders Hub to provide select startups, like Y Combinator, time-bound 
access to Azure's AI supercomputing facilities for model development, 
supported by M12 venture funds. This strategy aids startups to test and 
trial operations, fostering the next wave of AI innovation, and enhances 
collaboration with investors and accelerators."
  }
]

Have a look at the evolution of missing entities and the updates summaries, and try with different article length to better understand the behavior.

Conclusion

I think this is a good step in improving the quality of automated summaries by focusing on the balance of information density.

We understand that users prefer summaries that compactly present information without overloading each sentence with too many details, which can hamper readability and coherence.

One thing you should note that the study was confined to news summarization within a single domain and while individual summary evaluations were inconsistent, broader patterns emerged that align with earlier findings on LLM assessments.

Regardless, this opens the way for development of more nuanced and user-friendly summarization tools.

Hope this walkthrough was helpful, see you somewhere in the matrix.