“Chain of Density” for Effective Text Summarization with Large Language Models
If you ever played with LLMs to perform text summarization, you’d know that condensing large volumes of information into digestible summaries is a challenging task.
However, there is a light at the of the tunnel!
Researchers from Salesforce AI, MIT, and Columbia recently showed the Chain of Density (CoD) prompt to summarize news articles (and probably others), that are not only succinct but also rich in details.
During their study, the researchers found out that:
Humans prefer GPT-4 summaries that are more dense than those generated by a vanilla prompt and almost as dense as human written summaries.
In this article, I want to walk you through CoD’s entity-centric approach, encompassing key actors, locations, and concepts, without overwhelming you with excessive density that may hinder comprehension.
Let’s have a brief look at CoD before trying it out.

Understanding CoD Prompt
When you employ CoD prompt, LLM begins with the generation of an initial entity-sparse summary. This summary serves as a foundation, which is then iteratively refined to incorporate additional entities without expanding the length.
The process involves a balance of abstraction, compression, and fusion, ensuring that each iteration results in a summary with a higher entity-to-token ratio than its predecessor.
Here’s CoD’s secret recipe in pseudo-code form:
function generateCoDSummary(document, iterations):
summary = generateInitialSummary(document)
for i in 1 to iterations:
missing_entities = identifyMissingEntities(document, summary)
summary = fuseEntities(summary, missing_entities)
if length(summary) > token_budget:
summary = compressSummary(summary, token_budget)
return summaryThis function outlines the core steps:
- generating an initial summary,
- identifying and fusing missing entities,
- compressing the summary to adhere to a token budget.
Results are evaluated on CNN/DailyMail articles with findings indicating a strong preference for summaries that are more entity-dense than those produced by standard prompts, approaching the density of human-crafted summaries.
This is great, as you can now craft news summaries that are as rich and flavorful as your morning espresso shot!
Let’s see it in action.
Effective Text Summarization with Chain of Density
We will now put the CoD prompt to test with a real article.
Here’s CoD prompt shared in the paper:
Article: {{ ARTICLE }}
You will generate increasingly concise, entity-dense summaries of the above Article.
Repeat the following 2 steps 5 times.
Step 1. Identify 1-3 informative Entities ("; " delimited) from the Article which are missing from the previously generated summary.
Step 2. Write a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities.
A Missing Entity is:
- Relevant: to the main story.
- Specific: descriptive yet concise (5 words or fewer).
- Novel: not in the previous summary.
- Faithful: present in the Article.
- Anywhere: located anywhere in the Article.
Guidelines:
- The first summary should be long (4-5 sentences, ~80 words) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., "this article discusses") to reach ~80 words.
- Make every word count: rewrite the previous summary to improve flow and make space for additional entities.
- Make space with fusion, compression, and removal of uninformative phrases like "the article discusses".
- The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.
- Missing entities can appear anywhere in the new summary.
- Never drop entities from the previous summary. If space cannot be made, add fewer new entities.
Remember, use the exact same number of words for each summary.
Answer in JSON. The JSON should be a list (length 5) of dictionaries whose keys are "Missing_Entities" and "Denser_Summary".Let’s apply this technique to this article from TechCrunch (you can try any other article.)
When you run the prompt with article provided, you will see that after every iteration, it will perform an analysis to identify new entities for the update as the following example:
# Iteration 4: Identify new entities and create a denser summary
# New missing entities to add
missing_entities_4 = "M12 venture fund; startup investors; accelerators"
# Create a denser summary including the new entities and keeping the same word count
denser_summary_4 = (
"In response to the AI chip scarcity ..."
)
# Update the summaries list with the new information for iteration 4
summaries[3]["Missing_Entities"] = missing_entities_4
summaries[3]["Denser_Summary"] = denser_summary_4
summariesOnce it is finished, you will get a JSON output
[
{
"Missing_Entities": "Azure AI; Nvidia-based GPU clusters; Y Combinator",
"Denser_Summary": "This article discusses the recent announcement from
Microsoft, which revolves around an update to its startup program.
Specifically, the program will now offer startups, particularly those
associated with Y Combinator, access to Azure AI resources and Nvidia-based
GPU clusters. The details of the program suggest an emphasis on providing
these resources at no cost, highlighting Microsoft's strategic partnership
with Y Combinator and its alumni for this initiative."
},
{
"Missing_Entities": "Microsoft for Startups Founders Hub; private preview; Azure credits",
"Denser_Summary": "In a recent update, Microsoft enhanced its startup
program, Microsoft for Startups Founders Hub, with free access to Azure AI
and Nvidia GPUs for select startups, starting with Y Combinator's network
in private preview. This initiative underscores the tech giant's commitment
to fostering innovation by providing critical resources, like Azure credits,
to early-stage companies, thereby strengthening its partnership with Y
Combinator and positioning Azure as a leading AI development platform."
},
{
"Missing_Entities": "AI chip shortage; supercomputing resources; AI models",
"Denser_Summary": "Addressing the AI chip shortage, Microsoft's latest
Microsoft for Startups Founders Hub update offers free Azure AI
supercomputing resources to selected startups, initiating with Y Combinator's
affiliates in a private preview. The move, aimed at AI model development
including large language models, leverages Azure credits to stimulate
early-stage startup innovation, consolidating Azure's AI leadership."
},
{
"Missing_Entities": "M12 venture fund; startup investors; accelerators",
"Denser_Summary": "In response to the AI chip scarcity, Microsoft's
Founders Hub now grants eligible startups, starting with Y Combinator,
complimentary access to Azure's AI supercomputing and Nvidia GPUs. This
initiative, part of the M12 venture fund's strategy, caters to AI model
advancement and aims to scale startup innovation by integrating resources
from startup investors and accelerators."
},
{
"Missing_Entities": "time-bound access; test and trial operations; next wave of AI innovation",
"Denser_Summary": "Microsoft confronts AI chip deficits by updating
its Founders Hub to provide select startups, like Y Combinator, time-bound
access to Azure's AI supercomputing facilities for model development,
supported by M12 venture funds. This strategy aids startups to test and
trial operations, fostering the next wave of AI innovation, and enhances
collaboration with investors and accelerators."
}
]Have a look at the evolution of missing entities and the updates summaries, and try with different article length to better understand the behavior.

Conclusion
I think this is a good step in improving the quality of automated summaries by focusing on the balance of information density.
We understand that users prefer summaries that compactly present information without overloading each sentence with too many details, which can hamper readability and coherence.
One thing you should note that the study was confined to news summarization within a single domain and while individual summary evaluations were inconsistent, broader patterns emerged that align with earlier findings on LLM assessments.
Regardless, this opens the way for development of more nuanced and user-friendly summarization tools.
Hope this walkthrough was helpful, see you somewhere in the matrix.






