ELI5 Medical Texts with GPT-3

Explain difficult concepts in layman’s terms in Doctor.ai

Photo by National Cancer Institute on Unsplash

We humans get sick now and then, so we go to the doctor. However, doctors have more or better information than patients (which is also why patients go to see doctors in the first place). This information asymmetry puts patients at a disadvantage, leads to wasteful medical treatment and, more importantly, can erode patients’ trust in doctors.

To combat this information asymmetry in healthcare, we need not only to broaden access to good medical information, but also to deliver it in a form simple enough for the general public to understand. This has motivated four fellow Neo4j engineers and me to develop the medical chatbot called Doctor.ai. This chatbot manages a large amount of patients’ medical records as well as public medical information in its Neo4j knowledge graph. With the help of Alan AI and GPT-3, Doctor.ai is able to understand spoken English, German, Chinese and Japanese, generate the correct Cypher to query its Neo4j backend and present the answers to the users. It can answer a wide range of questions, such as the details of past visits as well as disease-associated pathogens, drugs, symptoms and genes.

However, I have noticed that the disease descriptions in Doctor.ai are full of academic jargon. They come from the KEGG Disease database. That database is built mainly for scientific research and not for informing the public. So the texts there are just too hard for the general public to understand. For example, the description of small cell lung cancer looks like this.

Lung cancer is a leading cause of cancer death among men and women in industrialized countries. Small cell lung carcinoma (SCLC) is a highly aggressive neoplasm, which accounts for approximately 25% of all lung cancer cases. Molecular mechanisms altered in SCLC include induced expression of oncogene, MYC, and loss of tumorsuppressor genes, such as p53, PTEN, RB, and FHIT. The overexpression of MYC proteins in SCLC is largely a result of gene amplification. Such overexpression leads to more rapid proliferation and loss of terminal differentiation. Mutation or deletion of p53 or PTEN can lead to more rapid proliferation and reduced apoptosis. The retinoblastoma gene RB1 encodes a nuclear phosphoprotein that helps to regulate cell-cycle progression. The fragile histidine triad gene FHIT encodes the enzyme diadenosine triphosphate hydrolase, which is thought to have an indirect role in proapoptosis and cell-cycle control.

This description contains technical terms such as “neoplasm”, “oncogene” and “overexpression”, plus a list of gene names that hardly mean anything to an average reader. It may even take a biomedical student years of academic training to fully digest the whole text.

Fortunately, GPT-3 came to my rescue. On the GPT-3 examples page, you can see an example called Summarize for a 2nd grader. This function is like our personal Explain Like I’m Five (ELI5). It takes in a complicated text and explains it in much simpler terms.

Figure 1. Summarize lung cancer description for a second-grade student in the GPT-3 Playground. Image by author.

For example, the description of lung cancer above can be simplified by GPT-3 into either of these two versions.

Lung cancer is a type of cancer that starts in the lungs. Small cell lung carcinoma (SCLC) is a very aggressive type of lung cancer that makes up about 25% of all lung cancer cases. There are some genes that help control how fast cells grow and when they die. People with SCLC often have changes in these genes. This can make the cancer grow and spread more quickly.

And the second version is this.

Cancer of the lungs is one of the leading causes of cancer death among men and women in countries where people live close together and there is a lot of industry. Small cell lung cancer (SCLC) is a very aggressive cancer that makes up about 25% of all lung cancer cases. There are changes in the way some genes work in SCLC that cause the cancer to grow more quickly and make it harder to treat. Some of these genes cause the cancer to grow more quickly, and some cause the cancer to resist treatments.

In this tutorial, I am going to show you how to construct a Python wrapper for this function and integrate it into Doctor.ai. The wrapper is hosted on my GitHub repository here.

GitHub - dgg32/gpt-3_eli5

github.com

And the Doctor.ai frontend with GPT-3’s ELI5 is here.

GitHub - dgg32/doctorai_eli5

This is the React front-end app for Doctor.ai with Alan Integrate Alan Speech-to-Text to Doctor.ai. Runs the app in the…

github.com

1. Python wrapper

The prompt for ELI5 is quite simple. It begins with the line “Summarize this for a second-grade student:” and then comes the main text. You can test your prompt in the GPT-3 Playground and see the result (Figure 1). Be aware that GPT-3 will generate a different result every time even from identical text when the Temperature is above 0, as shown above. You can control its artistic freedom by adjusting the Temperature parameter. The higher the value, the more variation you will see in the output.

Now let’s wrap this API call in Python so that we can use it programmatically.

First set the environment variable OPENAI_API_KEY in your system. On Linux or Mac, you can run this.

And then run the script like this.

The script takes an input file as its only parameter. In the input file, each line is one complicated description. The script goes through the file line by line and calls GPT-3 for each one. Finally, it displays the outputs.
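A minimal sketch of such a wrapper might look like the following. It is not the code from the repository. The engine name, temperature and token limit are assumptions, and `call_gpt3` assumes the pre-1.0 `openai` Python package with its `openai.Completion.create` interface. The completion function is passed in as a parameter so that the file-handling logic can be tried out without spending any tokens.

```python
import os
import sys
from typing import Callable, Iterable, List

PROMPT_HEADER = "Summarize this for a second-grade student:"


def build_prompt(text: str) -> str:
    """Prepend the ELI5 instruction line to the text to simplify."""
    return f"{PROMPT_HEADER}\n\n{text.strip()}"


def call_gpt3(prompt: str) -> str:
    """Send one prompt to GPT-3 (assumes the pre-1.0 `openai` package)."""
    import openai  # imported lazily so the helpers work without the package

    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.Completion.create(
        engine="text-davinci-002",  # assumed engine name
        prompt=prompt,
        temperature=0.7,  # assumed value; higher means more variation
        max_tokens=256,  # assumed cap on the output length
    )
    return response["choices"][0]["text"].strip()


def eli5_lines(
    lines: Iterable[str], complete: Callable[[str], str] = call_gpt3
) -> List[str]:
    """Simplify every non-empty line with the given completion function."""
    return [complete(build_prompt(line)) for line in lines if line.strip()]


def main(path: str) -> None:
    """Read one description per line from `path` and print the simplified texts."""
    with open(path, encoding="utf-8") as handle:
        for simplified in eli5_lines(handle):
            print(simplified)


# Only run when an input file is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Passing a stand-in such as `lambda prompt: prompt` as `complete` shows exactly which prompts would be sent before any real API call is made.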

2. Integrate GPT-3’s ELI5 in Doctor.ai

GPT-3 has been an integral part of Doctor.ai for some time now. Doctor.ai uses GPT-3 primarily to convert English to Cypher. GPT-3 also translates between English and German, Chinese and Japanese in both directions. In this project, I will add the ELI5 function to the frontend.

However, I first need to build the ELI5 functionality into the prompt. There are two routes. The first is to let the user call this functionality by saying fixed keywords such as “ELI5”, “Explain … in simple terms” or the like. This approach is less ambiguous, but it takes over work that belongs to GPT-3 and is less flexible too. The second is to let GPT-3 infer our intention and prepend the keyword “ELI5” to the Cypher query. The frontend code then sends only the Cypher to Neo4j, gets the answer and calls GPT-3’s ELI5 to explain it. Here I implement the second approach and build it into the code from my previous article. First, I implement the async function eli5.

Then in the DoctorAi_gpt3 component, I integrate the functionality this way.

Figure 2. The structure of the DoctorAi_gpt3.js file. Image by author.

The upgrade is quite straightforward. I add three English-to-Cypher training pairs in the prompt header. Each answer consists of the keyword “ELI5” and the Cypher. Afterwards, the script checks whether “ELI5” exists. If yes, it takes notice and strips the “ELI5” keyword so that only the Cypher is left. Then it queries the Neo4j backend. If ELI5 is needed, it calls the eli5 function and delivers the result.
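The keyword handling described above can be sketched as follows. The actual frontend is a JavaScript file, so this Python version only illustrates the logic. The function names and the `run_cypher` and `eli5` callables are hypothetical stand-ins for the Neo4j query and the GPT-3 call, and the sketch assumes that “ELI5” appears at the very start of GPT-3’s answer.

```python
from typing import Callable, Tuple

ELI5_KEYWORD = "ELI5"


def split_eli5(gpt3_answer: str) -> Tuple[bool, str]:
    """Return (needs_eli5, cypher) for an answer that may start with ELI5."""
    answer = gpt3_answer.strip()
    if answer.startswith(ELI5_KEYWORD):
        # Remember the request and keep only the bare Cypher.
        return True, answer[len(ELI5_KEYWORD):].strip()
    return False, answer


def answer_question(
    gpt3_answer: str,
    run_cypher: Callable[[str], str],  # hypothetical Neo4j query function
    eli5: Callable[[str], str],  # hypothetical GPT-3 ELI5 function
) -> str:
    """Query the backend with the bare Cypher, then simplify if requested."""
    needs_eli5, cypher = split_eli5(gpt3_answer)
    result = run_cypher(cypher)
    return eli5(result) if needs_eli5 else result
```

Keeping the keyword check separate from the database call means that Neo4j only ever receives valid Cypher, whether or not a simplification was requested.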

3. Test ELI5 in Doctor.ai

After the integration, let’s test the ELI5 from GPT-3. The setup instructions can be found in my articles here and here. I tried three diseases, each with a differently phrased request: small cell lung cancer, COVID-19 and Down syndrome.

Figure 3. Doctor.ai with ELI5. Image by author.

The original descriptions for COVID-19 and Down syndrome are:

Coronavirus disease of 2019 (COVID-19) is a highly contagious respiratory infection that is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). SARS-CoV-2 infects alveolar epithelial cells [mainly alveolar epithelial type 2 (AEC2) cells] through the angiotensin-converting enzyme 2 (ACE2) receptor. Upon the occupancy of ACE2 by SARS-CoV-2, the increased serum level of free Angiotensin II (Ang II) due to a reduction of ACE2-mediated degradation promotes activation of the NF-kappa B pathway via Ang II type 1 receptor (AT1R), followed by interleukin-6 (IL-6) production. SARS-CoV-2 also activates the innate immune system; macrophage stimulation triggers the overproduction of pro-inflammatory cytokines, including IL-6, and the “cytokine storm”, which results in systemic inflammatory response syndrome and multiple organ failure. The combined effects of complement activation, dysregulated neutrophilia, endothelial injury, and hypercoagulability appear to be intertwined to drive the severe features of COVID-19.

and

Down syndrome (DS), a genetic condition characterized by mental retardation and distinctive facial appearance, is caused by trisomy of chromosome 21 (HSA21). Down syndrome (DS) is the most common chromosomal malformation in newborns. Throughout the world, the overall prevalence of DS is 1 per 1,000 live births, although in recent years this figure has been increasing. Roughly 95% of cases of DS are due to the presence of an extra (third) copy of HSA21. Most often, the non-disjunction event leading to DS occurs in maternal meiosis I. In about 5% of patients, 1 copy is translocated to another acrocentric chromosome, most often chromosome 14 or 21. In 2 to 4% of cases with free trisomy 21 there is recognizable mosaicism for a trisomic and a normal cell line. DS occurs at a much higher incidence in older mothers. Nonetheless, the vast majority of DS births are to younger mothers. Clinical and experimental studies have shown that age independent DNA hypo-methylation is associated with chromosomal instability and abnormal segregation. Recent studies have linked the increased frequency of polymorphism of methylenetetrahydrofolate reductase (MTHFR) and methionine synthase gene (MTRR) in mothers with DS child. The phenotypic features of DS are quite variable from person to person and include learning disability, heart defects, early-onset Alzheimer’s disease and childhood leukaemia. This phenotypic variation is likely to be caused by a combination of environmental and genetic causes. Genetic polymorphisms in both Hsa21 and non-Hsa21 genes may account for much of this variation. Trisomy of Hsa21 has a significant impact on the development of many tissues, most notably the heart and the brain. A recent paper has suggested that RCAN1 and DYRK1A, localized in the Down syndrome critical region (DSCR) of HSA21, may have an impact on the development of multiple tissues.

As you can see, GPT-3 did a good job of simplifying small cell lung cancer. The description is concise and complete. For COVID-19, however, the original text is entirely about biochemistry, and GPT-3 could do little with it. It shortened the description but kept much of the jargon intact and went straight into molecular biology that interests few readers. Furthermore, the text ends abruptly and feels incomplete.

For Down syndrome, the ELI5 version is easier to understand than the original. For example, it simplifies “at a much higher incidence” into “more often”. But it has small issues too. It uses the shorthand “HSA21” without telling the reader what it is, because it omits the full name from the first sentence. This can lead to confusion. The ELI5 of COVID-19 did not have this problem. GPT-3 also left out interesting statistics such as the prevalence of Down syndrome.

Conclusion

Users will quickly abandon a chatbot that gives only hard-to-understand replies, so the ELI5 function is essential. Doctor.ai should not only bring authoritative medical information to the public, but also put it in plain English. In this project, I did just that with GPT-3.

GPT-3 is so versatile that it has multiple roles in Doctor.ai. It translates other languages to English, converts English to Cypher, extracts the subject-verb-object relationships from raw texts, and now does ELI5. All functions are delivered via one unified API. I just needed to state my purposes either with examples or with a declarative header.

However, as we can see in the article, the ELI5 of GPT-3 is not always perfect. An ideal ELI5 should be able to summarize the key messages of the original text and avoid rare and highly technical terms. When the original text was very technical, GPT-3 seemed to lose its ability to do that, as shown above in the COVID-19 example.

The output length is a tricky parameter to adjust. If it is too short, GPT-3 will omit some important information, as you can see in the Down syndrome example. If it is too long, it will burn a larger hole in our pocket. OpenAI charges $0.06 per 1,000 tokens for the Davinci engine. This rate seems deceptively low, but I noticed that my bill went up rapidly. So keep an eye on your usage.
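To get a feel for how quickly the rate above adds up, here is a rough cost estimate. It assumes that both the prompt and the completion tokens are billed at the quoted Davinci rate. The token counts in the usage note are made-up illustrative numbers, not measurements from Doctor.ai.

```python
from decimal import Decimal

# Rate quoted above for the Davinci engine.
DAVINCI_USD_PER_1K_TOKENS = Decimal("0.06")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost in USD of one request from its token counts."""
    total_tokens = prompt_tokens + completion_tokens
    return DAVINCI_USD_PER_1K_TOKENS * total_tokens / 1000
```

For a hypothetical request with 300 prompt tokens and 200 completion tokens, `estimate_cost(300, 200)` comes to $0.03, so a thousand such requests would cost about $30.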
