Penny Grubb


Abstract

ties. You can think of a community as a group of people, organizations, locations, or events that are closely related. For example, if you are building the graph from a movie script, the node representing the main character and the node representing her friend might be grouped as a community.</p><p id="98b4">After the communities are created, GraphRAG generates a summary for each community. Those summaries describe the relationships or topics within the group of nodes and their relations.</p><p id="14dd">We don’t stop after creating the first level of communities. Once the first level is built, GraphRAG treats those communities as the nodes for the next level and constructs higher-level communities from them. This approach creates overviews at different levels of granularity. If your question is high-level (e.g. what is the story’s theme), this approach can help find the answer in a broader context. We will discuss more details in the next section.</p><figure id="4800"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*Ceqgh7YqTrOE95AonGln_Q.png"><figcaption>Clustering illustration (source: <a href="https://arxiv.org/pdf/2404.16130">original paper</a>)</figcaption></figure><h2 id="70fb">3. MapReduce approach for information extraction</h2><p id="ef94">Finally, we can explain why the preceding techniques help improve the quality of the generated answers. GraphRAG supports two query modes: global search and local search.</p><p id="5d7f"><b>Global search: Community Summary -> Global answer</b></p><p id="d4c8">Global search aims to answer questions that require understanding at a higher level. The solution is to aggregate the insights across the community summaries. 
The global search approach is very different from traditional RAG, where the answer is based on semantically similar documents. Instead, we first generate an overview of the elements in the document and use the summarized result to answer the question.</p><figure id="4477"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*YZqEe6E0uA1zRSrnaVdZjw.png"><figcaption>Global search illustration (source: <a href="https://microsoft.github.io/graphrag/posts/query/0-global_search/">Microsoft</a>)</figcaption></figure><p id="662f"><b>Local search: Knowledge Graph -> Local answer</b></p><p id="04a8">On the other hand, local search starts from the entities in question and uses the knowledge graph to find the most relevant information. For example, given the entity in the query, we may first use the information of connected nodes. In the official implementation, there’s also an option to use graph embedding to find the most relevant nodes in the graph.</p><p id="f38c">Now that we have walked through all the interesting ideas behind GraphRAG, we can discuss what we can learn from it and how we can apply it in different scenarios.</p><figure id="543d"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*lcq4N_WFNy8XOhPrzeICHQ.png"><figcaption>Local search illustration (source: <a href="https://microsoft.github.io/graphrag/posts/query/1-local_search/">Microsoft</a>)</figcaption></figure><h1 id="e4fc">Implication: what can we learn from GraphRAG</h1><p id="1b50">Although GraphRAG is a powerful tool, there are still some reasons that we


may not want to use it directly:</p><p id="f709"><b>Indexing cost is high</b></p><p id="02af">GraphRAG uses an LLM to generate all the components in a graph, and its system prompt is also quite long (e.g. the entity extraction prompt has roughly 1500 tokens). Even if you have only a few documents, the system prompt is still a burden as it increases the number of input tokens. Besides building the knowledge graph itself, generating the community summaries also leads to a large number of output tokens.</p><p id="f486"><b>Not suitable for documents without obvious entities or documents that are well-structured</b></p><p id="a20a">Some documents make it hard to construct a knowledge graph, while others are already well organized, so you can leverage their structure directly. In these cases, building a knowledge graph index is not necessary. For example, if you are using API documents as the reference, a knowledge graph could be overkill because the raw document already describes the relationships clearly. Another example is spreadsheet data, where it is too complicated to express the relationships with a graph.</p><p id="f246">Finally, in a recent <a href="https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/">blog post</a>, Microsoft provides the following suggestion:</p><blockquote id="81e1"><p>The overall suitability of GraphRAG for any given use case, however, depends on whether the benefits of structured knowledge representations, readymade community summaries, and support for global queries outweigh the upfront costs of graph index construction.</p></blockquote><p id="2545">GraphRAG is not necessarily the go-to solution for all cases. 
Still, we can borrow some ideas from GraphRAG’s implementation even if it does not suit our use case:</p><p id="caf5"><b>Implication 1: Pre-summarize the information at different levels</b></p><p id="af02">We can pre-aggregate the insights across documents and use them to generate the answer. When building the summaries, we can create them with different amounts of detail and store the mapping between documents and summaries.</p><p id="aec8">At the query stage, we can first find the most relevant summaries using similarity search. Next, we can either use the mapping to find the corresponding documents or use the summaries themselves to generate the answer, as GraphRAG does.</p><p id="83f9"><b>Implication 2: Entity as the matching field</b></p><p id="a9bd">We can first use an LLM to list relevant entities for our documents. When a query arrives, we first extract the entities from the query and use them to find the related documents directly. For example, we can use an LLM to find the entities in the question and use full-text search or search filters to find the documents. If the documents do not have obvious entities, we can also try to generate extra metadata or tags.</p><p id="a953"><b>Summary</b></p><p id="bf0d">GraphRAG provides a novel approach to solving a drawback of traditional RAG, namely answering questions that require the global context of the documents. Besides that, its local search feature also provides an alternative to using only semantic similarity search. Even if we don’t use the tool itself directly, we can still use its concepts to improve a RAG implementation.</p></article></body>

How I Discover Good Stories

And how I avoid missing them when I’m busy

Photo: Penny Grubb

any topic can be absorbing when well written by someone who has done their homework

When kicking back with a cup of coffee, sometimes I like to surf fairly aimlessly and read what takes my fancy. I’ve discovered some fascinating new topics that way and I’ve run into some great writers. An unexpectedly interesting article can lead me off into a new world and engulf me for hours — any topic can be absorbing when well written by someone who has done their homework, and I love the feeling of taking on board new knowledge when I delve into unfamiliar areas.

having my cake and eating it

Therein lies the issue. It’s a free-reading path beset with rabbit holes that can consume half a day in the blink of an eye. I can’t risk that when life is too busy, but I still want my coffee break. I suppose I’m saying I want to have my cake and eat it. After all, cake and coffee make good companions.

When coffee break reading is restricted and on a deadline, there is no time to surf. I just want good, interesting stuff to read, and I want it right in front of me, right now. Here’s how I do it.

Strategy 1 — I let others do the preparatory work for me. Writers like Ellie Jacobson, Katie Michaelson, and Sahil Patel produce regular round-ups of good writers, stories, and articles.

Each collection is catalogued in different ways adding extras to the links provided. It makes it easy — indeed a piece of cake — to grab half an hour of good reading.

Strategy 2 — this is something I was doing long before piggybacking on other people’s compilations. I bookmark pieces from writers I know and enjoy, plus articles whose headline or tagline tickles my fancy. When I’m in need of reading material I dive into my bookmarks.

A random set from the current list, including why I bookmarked them, looks like this:

I find it energising to read about people who live life to the full — indeed the full to overflowing when we’re talking about Jan Sebastian.

Jan’s writing goes everywhere and touches everything — her own life, other people’s, and anything she finds in between. Oh, and she gives out some great tips and tricks too.

Vidya Sury, Collecting Smiles has been to places I’ve never been and, importantly, can bring these places to life on the page.

Vidya drops fascinating snippets into her accounts. It’s not hard to see why the smiles follow her wherever she goes.

I always bookmark Anne Bonfert. For her day job, she works in the sky. Don’t be fooled by the article’s headline, I don’t mean ‘in the mountains’, I mean ‘in the sky’. Imagine the photos!

I don’t usually follow 30-day challenges — I’m not organised enough — but I’m enjoying this one from Ellie Jacobson and the headline makes an irresistible sequel to Anne Bonfert.

Some writers such as Susan Alison and Dennett can always be relied upon to weave good stories around interesting photos. That reliability makes them the backbone of my bookmark list. Dennett’s pictures of wild birds are always both exotic and awesome. With Susan, there’s the added bonus of unexpected artwork popping up too.

Both my focused and my random reading have brought me some great tips on all manner of topics, and Kris Bedenian is someone I always check out when she’s writing from the culinary front line. She has ideas that I’d never have thought up for myself.

As an aside on culinary matters, and stepping briefly away from the list, I still use the recipe for tomato sauce that arrived in a comment from Uvebruce.

My current list also features Neera Handa Dr, Linda Acaster, Hayden Moore, Madelyn, Linda Caroll, A Palace Of Ideas, Sanghita Pal & Maria Rattray. Some of the specific stories I’ve bookmarked are listed at the end.

Does this strategy lead to duds — uselessly uninteresting badly written tripe masquerading as stories or articles — as well as good reading?

Yes, occasionally I find something like blatant advertising or I trip over bad grammar and misspellings galore before reaching the end of the first sentence. I don’t read on. I quietly push the offending article aside, unbookmark it, and forget about it. If it was something truly grotesque then I would probably report it, but that hasn’t happened yet.

Are any of these duds present in this collection?

In a word, no!

A dilemma between strategies

There’s an interesting variation in results between the two strategies. The articles I bookmark for myself, if they’re not by authors whose work I know I enjoy, will be topics or headlines that have piqued my interest. In others’ lists, I find my way to stories that I would not have picked up on my own.

I’ve neglected my reading recently — I’ve added to my bookmarks but not had time to look in detail. Life exploded into a plethora of urgent tasks, places to go, and appointments to keep. Coffee and cake were out of reach. When I finally sat back to read, I was nonplussed to find other people’s compilations taking up a larger proportion of the list than they have before.

I’ve always either read through the stories I’ve bookmarked or I’ve dived into someone else’s list of recommendations. I’ve never done both at once. So I did something else I’ve never done before. As I worked my way through the list, I added links and comments to this article. I’ve taken advantage of other people’s lists for long enough — it’s time I reciprocated and produced one of my own.

I hope you’ll find something new and entertaining either in one of the links above or from the rest of the list detailed below — or preferably both.

More from my current list

Neera Handa Dr

Linda Acaster

Hayden Moore

Madelyn

Linda Caroll

A Palace Of Ideas

Sanghita Pal

Maria Rattray
