Create a Receipt Parser Using OCR and a Large Language Model
In this tutorial, I will walk through how I use OCR to capture data from receipts and then use a Large Language Model (LLM) to extract pertinent details such as the total amount, the date and time of the receipt, and other relevant information.
To perform OCR, I will utilize the docTR tool from Mindee as outlined below.
To retrieve the information from the receipt, I will use Azure’s OpenAI capabilities.
Construct the OCR Output Data
Let’s begin by installing docTR and the necessary libraries on your machine. I will not go through the installation process in detail, as you can find comprehensive instructions in the provided Git repository.
Let’s check that the installation succeeded by running the code below with the provided receipt image in JPEG format.

import os
import json
# Let's pick the desired backend
# os.environ['USE_TF'] = '1'
os.environ['USE_TORCH'] = '1'
import matplotlib.pyplot as plt
from doctr.io import DocumentFile
from doctr.models import ocr_predictor
# Read the file
doc = DocumentFile.from_images("receipt.jpg")
print(f"Number of pages: {len(doc)}")
If there is no error, you will get this output:
Number of pages: 1
Let’s proceed with the instantiation of a pre-trained model.
# Instantiate a pretrained model
predictor = ocr_predictor(pretrained=True)
Export the output as a JSON-like Python dictionary.
result = predictor(doc)
# JSON export
json_export = result.export()
print(json_export)
You will get this output:
{'pages': [{'page_idx': 0, 'dimensions': (600, 600), 'orientation': {'value': None, 'confidence': None}, 'language': {'value': None, 'confidence': None}, 'blocks': [{'geometry': ((0.2734375, 0.0), (0.6875, 0.1162109375)), 'lines': [{'geometry': ((0.33984375, 0.0), (0.6171875, 0.0234375)), 'words': [{'value': '#01-901', 'confidence': 0.9932250380516052, 'geometry': ((0.33984375, 0.001953125), (0.416015625, 0.0234375))}, {'value': 'SINGAPORE', 'confidence': 0.9812156558036804, 'geometry': ((0.4208984375, 0.001953125), (0.54296875, 0.01953125))}, {'value': '380011', 'confidence': 0.562835156917572, 'geometry': ((0.5458984375, 0.0), (0.6171875, 0.017578125))}]}, {'geometry': ((0.2734375, 0.017578125), (0.6875, 0.05078125)), 'words': [{'value': 'GST', 'confidence': 0.9999666213989258, 'geometry': ((0.2734375, 0.02734375), (0.3212890625, 0.0498046875))}, {'value': 'Reg:', 'confidence': 0.9997168183326721, 'geometry': ((0.322265625, 0.02734375), (0.3671875, 0.05078125))}, {'value': 'M2-0065333-5', 'confidence': 0.6861922740936279, 'geometry': ((0.3720703125, 0.0234375), (0.5087890625, 0.0439453125))}, {'value': 'UEN:', 'confidence': 0.9687079787254333, 'geometry': ((0.5087890625, 0.0205078125), (0.5625, 0.0419921875))}, {'value': '198304925E', 'confidence': 0.9952959418296814, 'geometry': ((0.56640625, 0.017578125), (0.6875, 0.0380859375))}]}, {'geometry': ((0.3603515625, 0.0439453125), (0.6015625, 0.0693359375)), 'words': [{'value': 'Phone', 'confidence': 0.9936328530311584, 'geometry': ((0.3603515625, 0.0498046875), (0.423828125, 0.068359375))}, {'value': ':', 'confidence': 0.9998807907104492, 'geometry': ((0.423828125, 0.0478515625), (0.4404296875, 0.0693359375))}, {'value': '67472780', 'confidence': 0.9968281388282776, 'geometry': ((0.4365234375, 0.0458984375), (0.5380859375, 0.0673828125))}, {'value': 'Fax:-', 'confidence': 0.9917964935302734, 'geometry': ((0.5380859375, 0.0439453125), (0.6015625, 0.0654296875))}]}, {'geometry': ((0.3720703125, 0.0703125), 
(0.5888671875, 0.095703125)), 'words': [{'value': 'Manager:', 'confidence': 0.6913022398948669, 'geometry': ((0.3720703125, 0.07421875), (0.4609375, 0.095703125))}, {'value': 'SIVAKUMAR', 'confidence': 0.9983320832252502, 'geometry': ((0.4658203125, 0.0703125), (0.5888671875, 0.0908203125))}]}, {'geometry': ((0.373046875, 0.09375), (0.5869140625, 0.1162109375)), 'words': [{'value': 'Contact', 'confidence': 0.992266833782196, 'geometry': ((0.373046875, 0.0966796875), (0.4482421875, 0.115234375))}, {'value': 'No.:', 'confidence': 0.9826020002365112, 'geometry': ((0.4482421875, 0.09375), (0.4912109375, 0.1162109375))}, {'value': '88008584', 'confidence': 0.8402541875839233, 'geometry': ((0.494140625, 0.09375), (0.5869140625, 0.1123046875))}]}], 'artefacts': []}, {'geometry': ((0.3046875, 0.134765625), (0.6611328125, 0.2314453125)), 'lines': [{'geometry': ((0.3056640625, 0.134765625), (0.6572265625, 0.16015625)), 'words': [{'value': 'Terminal:', 'confidence': 0.8031894564628601, 'geometry': ((0.3056640625, 0.142578125), (0.396484375, 0.16015625))}, {'value': 'BK0003', 'confidence': 0.8097429275512695, 'geometry': ((0.404296875, 0.140625), (0.4814453125, 0.1591796875))}, {'value': '13/02/2022', 'confidence': 0.8739034533500671, 'geometry': ((0.4892578125, 0.1376953125), (0.595703125, 0.158203125))}, {'value': '19:21', 'confidence': 0.9997132420539856, 'geometry': ((0.603515625, 0.134765625), (0.6572265625, 0.15625))}]}, {'geometry': ((0.3046875, 0.1630859375), (0.6611328125, 0.1904296875)), 'words': [{'value': 'ReceiptTaxInvoice', 'confidence': 0.4457036852836609, 'geometry': ((0.3046875, 0.166015625), (0.4892578125, 0.1904296875))}, {'value': 'BKA3500490695', 'confidence': 0.504152774810791, 'geometry': ((0.49609375, 0.1630859375), (0.6611328125, 0.18359375))}]}, {'geometry': ((0.3662109375, 0.1884765625), (0.59765625, 0.208984375)), 'words': [{'value': 'Quotation', 'confidence': 0.8169445991516113, 'geometry': ((0.3662109375, 0.1904296875), (0.458984375, 
0.208984375))}, {'value': 'No.', 'confidence': 0.9977673292160034, 'geometry': ((0.4609375, 0.1884765625), (0.498046875, 0.208984375))}, {'value': ':', 'confidence': 0.9996732473373413, 'geometry': ((0.4990234375, 0.189453125), (0.5126953125, 0.2080078125))}, {'value': 'S031362', 'confidence': 0.5456238985061646, 'geometry': ((0.51171875, 0.1884765625), (0.59765625, 0.20703125))}]}, {'geometry': ((0.34375, 0.208984375), (0.6220703125, 0.2314453125)), 'words': [{'value': 'Cashier:', 'confidence': 0.9858759045600891, 'geometry': ((0.34375, 0.212890625), (0.4228515625, 0.2314453125))}, {'value': 'HONG', 'confidence': 0.9993447661399841, 'geometry': ((0.43359375, 0.2099609375), (0.4990234375, 0.2314453125))}, {'value': 'THI', 'confidence': 0.9992380142211914, 'geometry': ((0.5, 0.2099609375), (0.537109375, 0.2294921875))}, {'value': 'BE', 'confidence': 0.9985008239746094, 'geometry': ((0.5390625, 0.208984375), (0.572265625, 0.2314453125))}, {'value': 'DAO', 'confidence': 0.9940517544746399, 'geometry': ((0.5732421875, 0.208984375), (0.6220703125, 0.228515625))}]}], 'artefacts': []}, {'geometry': ((0.2451171875, 0.24609375), (0.40234375, 0.26953125)), 'lines': [{'geometry': ((0.2451171875, 0.24609375), (0.40234375, 0.26953125)), 'words': [{'value': 'No', 'confidence': 0.9999253749847412, 'geometry': ((0.2451171875, 0.24609375), (0.2822265625, 0.26953125))}, {'value': 'Description', 'confidence': 0.9901004433631897, 'geometry': ((0.294921875, 0.248046875), (0.40234375, 0.26953125))}]}], 'artefacts': []}, {'geometry': ((0.564453125, 0.2421875), (0.7177734375, 0.26953125)), 'lines': [{'geometry': ((0.564453125, 0.2421875), (0.7177734375, 0.26953125)), 'words': [{'value': 'Qty', 'confidence': 0.9939969778060913, 'geometry': ((0.564453125, 0.2421875), (0.6064453125, 0.26953125))}, {'value': 'Amount', 'confidence': 0.9966546297073364, 'geometry': ((0.640625, 0.2431640625), (0.7177734375, 0.26171875))}]}], 'artefacts': []}, {'geometry': ((0.2578125, 0.2724609375), 
(0.5908203125, 0.298828125)), 'lines': [{'geometry': ((0.2578125, 0.2724609375), (0.5908203125, 0.298828125)), 'words': [{'value': '1.', 'confidence': 0.9985117316246033, 'geometry': ((0.2578125, 0.2744140625), (0.2919921875, 0.298828125))}, {'value': '#OTIS', 'confidence': 0.9894990921020508, 'geometry': ((0.2919921875, 0.275390625), (0.3642578125, 0.2978515625))}, {'value': 'BARISTA', 'confidence': 0.42725348472595215, 'geometry': ((0.3662109375, 0.2763671875), (0.458984375, 0.2939453125))}, {'value': 'OAT', 'confidence': 0.999354898929596, 'geometry': ((0.4609375, 0.2744140625), (0.5068359375, 0.2939453125))}, {'value': 'MILK', 'confidence': 0.9774147272109985, 'geometry': ((0.5087890625, 0.2724609375), (0.5634765625, 0.2939453125))}, {'value': '1L', 'confidence': 0.9945043325424194, 'geometry': ((0.5595703125, 0.2724609375), (0.5908203125, 0.29296875))}]}], 'artefacts': []}, {'geometry': ((0.2490234375, 0.30859375), (0.45703125, 0.40234375)), 'lines': [{'geometry': ((0.306640625, 0.30859375), (0.45703125, 0.326171875)), 'words': [{'value': '9421906089017', 'confidence': 0.9027230143547058, 'geometry': ((0.306640625, 0.30859375), (0.45703125, 0.326171875))}]}, {'geometry': ((0.3046875, 0.3330078125), (0.4208984375, 0.3544921875)), 'words': [{'value': '2', 'confidence': 0.9997554421424866, 'geometry': ((0.3046875, 0.3330078125), (0.322265625, 0.3544921875))}, {'value': 'for', 'confidence': 0.9995049238204956, 'geometry': ((0.3212890625, 0.3330078125), (0.3525390625, 0.3544921875))}, {'value': '$11.95', 'confidence': 0.9979997277259827, 'geometry': ((0.353515625, 0.333984375), (0.4208984375, 0.3525390625))}]}, {'geometry': ((0.2490234375, 0.3798828125), (0.3828125, 0.40234375)), 'words': [{'value': 'Total', 'confidence': 0.9654089212417603, 'geometry': ((0.2490234375, 0.3798828125), (0.302734375, 0.40234375))}, {'value': 'Amount', 'confidence': 0.9976258873939514, 'geometry': ((0.3056640625, 0.3818359375), (0.3828125, 0.400390625))}]}], 'artefacts': []}, 
{'geometry': ((0.529296875, 0.3056640625), (0.607421875, 0.32421875)), 'lines': [{'geometry': ((0.529296875, 0.3056640625), (0.607421875, 0.32421875)), 'words': [{'value': '4x6.95', 'confidence': 0.629564642906189, 'geometry': ((0.529296875, 0.3056640625), (0.607421875, 0.32421875))}]}], 'artefacts': []}, {'geometry': ((0.6513671875, 0.3017578125), (0.724609375, 0.47265625)), 'lines': [{'geometry': ((0.662109375, 0.3017578125), (0.7216796875, 0.3232421875)), 'words': [{'value': '27.80', 'confidence': 0.9991148114204407, 'geometry': ((0.662109375, 0.3017578125), (0.7216796875, 0.3232421875))}]}, {'geometry': ((0.66796875, 0.328125), (0.7216796875, 0.349609375)), 'words': [{'value': '-3.90', 'confidence': 0.9843301177024841, 'geometry': ((0.66796875, 0.328125), (0.7216796875, 0.349609375))}]}, {'geometry': ((0.6513671875, 0.375), (0.7216796875, 0.3974609375)), 'words': [{'value': '$23.90', 'confidence': 0.9994686245918274, 'geometry': ((0.6513671875, 0.375), (0.7216796875, 0.3974609375))}]}, {'geometry': ((0.65234375, 0.4111328125), (0.72265625, 0.4326171875)), 'words': [{'value': '$23.90', 'confidence': 0.9990628361701965, 'geometry': ((0.65234375, 0.4111328125), (0.72265625, 0.4326171875))}]}, {'geometry': ((0.666015625, 0.4501953125), (0.724609375, 0.47265625)), 'words': [{'value': '$0.00', 'confidence': 0.9990418553352356, 'geometry': ((0.666015625, 0.4501953125), (0.724609375, 0.47265625))}]}], 'artefacts': []}, {'geometry': ((0.248046875, 0.416015625), (0.4931640625, 0.560546875)), 'lines': [{'geometry': ((0.2509765625, 0.416015625), (0.4931640625, 0.4384765625)), 'words': [{'value': 'MASIERICOOID', 'confidence': 0.16996675729751587, 'geometry': ((0.2509765625, 0.416015625), (0.4931640625, 0.4384765625))}]}, {'geometry': ((0.2490234375, 0.455078125), (0.376953125, 0.48046875)), 'words': [{'value': 'Change', 'confidence': 0.9970219731330872, 'geometry': ((0.2490234375, 0.4560546875), (0.330078125, 0.48046875))}, {'value': 'Due', 'confidence': 0.9999706745147705, 
'geometry': ((0.33203125, 0.455078125), (0.376953125, 0.4775390625))}]}, {'geometry': ((0.2490234375, 0.48828125), (0.447265625, 0.51171875)), 'words': [{'value': 'Items', 'confidence': 0.9890830516815186, 'geometry': ((0.2490234375, 0.490234375), (0.306640625, 0.51171875))}, {'value': 'Purchased', 'confidence': 0.9993000030517578, 'geometry': ((0.310546875, 0.4892578125), (0.4189453125, 0.509765625))}, {'value': ':', 'confidence': 0.9981997013092041, 'geometry': ((0.419921875, 0.490234375), (0.43359375, 0.509765625))}, {'value': '4', 'confidence': 0.9994581341743469, 'geometry': ((0.4296875, 0.48828125), (0.447265625, 0.509765625))}]}, {'geometry': ((0.248046875, 0.53125), (0.3935546875, 0.560546875)), 'words': [{'value': '#Total', 'confidence': 0.9086952209472656, 'geometry': ((0.248046875, 0.53125), (0.322265625, 0.560546875))}, {'value': 'Saving', 'confidence': 0.9651548862457275, 'geometry': ((0.3232421875, 0.5341796875), (0.3935546875, 0.5595703125))}]}], 'artefacts': []}, {'geometry': ((0.4296875, 0.5322265625), (0.4970703125, 0.5546875)), 'lines': [{'geometry': ((0.4296875, 0.5322265625), (0.4970703125, 0.5546875)), 'words': [{'value': '-', 'confidence': 0.43670952320098877, 'geometry': ((0.4296875, 0.5361328125), (0.4453125, 0.55078125))}, {'value': '$3.90', 'confidence': 0.9483895301818848, 'geometry': ((0.4365234375, 0.5322265625), (0.4970703125, 0.5546875))}]}], 'artefacts': []}, {'geometry': ((0.2509765625, 0.564453125), (0.6005859375, 0.5908203125)), 'lines': [{'geometry': ((0.2509765625, 0.564453125), (0.6005859375, 0.5908203125)), 'words': [{'value': 'GST', 'confidence': 0.9998136162757874, 'geometry': ((0.2509765625, 0.568359375), (0.30078125, 0.5908203125))}, {'value': '%', 'confidence': 0.999920129776001, 'geometry': ((0.3017578125, 0.5673828125), (0.3271484375, 0.5908203125))}, {'value': 'Exclude', 'confidence': 0.8899426460266113, 'geometry': ((0.353515625, 0.568359375), (0.43359375, 0.5869140625))}, {'value': 'GST', 'confidence': 
0.9998469352722168, 'geometry': ((0.435546875, 0.5654296875), (0.4853515625, 0.587890625))}, {'value': 'GST', 'confidence': 0.998401939868927, 'geometry': ((0.5048828125, 0.564453125), (0.5546875, 0.5869140625))}, {'value': 'Amt', 'confidence': 0.850462794303894, 'geometry': ((0.5546875, 0.564453125), (0.6005859375, 0.5869140625))}]}], 'artefacts': []}, {'geometry': ((0.6533203125, 0.5654296875), (0.734375, 0.6171875)), 'lines': [{'geometry': ((0.6533203125, 0.5654296875), (0.7314453125, 0.583984375)), 'words': [{'value': 'Amount', 'confidence': 0.9848493337631226, 'geometry': ((0.6533203125, 0.5654296875), (0.7314453125, 0.583984375))}]}, {'geometry': ((0.6630859375, 0.5947265625), (0.734375, 0.6171875)), 'words': [{'value': '$23.90', 'confidence': 0.9978439807891846, 'geometry': ((0.6630859375, 0.5947265625), (0.734375, 0.6171875))}]}], 'artefacts': []}, {'geometry': ((0.2802734375, 0.599609375), (0.2978515625, 0.6220703125)), 'lines': [{'geometry': ((0.2802734375, 0.599609375), (0.2978515625, 0.6220703125)), 'words': [{'value': '7', 'confidence': 0.9998346567153931, 'geometry': ((0.2802734375, 0.599609375), (0.2978515625, 0.6220703125))}]}], 'artefacts': []}, {'geometry': ((0.4140625, 0.5986328125), (0.484375, 0.6201171875)), 'lines': [{'geometry': ((0.4140625, 0.5986328125), (0.484375, 0.6201171875)), 'words': [{'value': '$22.34', 'confidence': 0.9993184804916382, 'geometry': ((0.4140625, 0.5986328125), (0.484375, 0.6201171875))}]}], 'artefacts': []}, {'geometry': ((0.541015625, 0.5966796875), (0.6005859375, 0.6181640625)), 'lines': [{'geometry': ((0.541015625, 0.5966796875), (0.6005859375, 0.6181640625)), 'words': [{'value': '$1.56', 'confidence': 0.9944227337837219, 'geometry': ((0.541015625, 0.5966796875), (0.6005859375, 0.6181640625))}]}], 'artefacts': []}, {'geometry': ((0.4404296875, 0.666015625), (0.5400390625, 0.6875)), 'lines': [{'geometry': ((0.4404296875, 0.666015625), (0.5400390625, 0.6875)), 'words': [{'value': 'MASTER', 'confidence': 
0.8670670986175537, 'geometry': ((0.4404296875, 0.666015625), (0.5400390625, 0.6875))}]}], 'artefacts': []}, {'geometry': ((0.248046875, 0.701171875), (0.6865234375, 0.74609375)), 'lines': [{'geometry': ((0.248046875, 0.701171875), (0.642578125, 0.7216796875)), 'words': [{'value': 'DatelTime:', 'confidence': 0.8654562830924988, 'geometry': ((0.248046875, 0.7041015625), (0.337890625, 0.7216796875))}, {'value': '13022022192100', 'confidence': 0.6854404211044312, 'geometry': ((0.35546875, 0.7041015625), (0.525390625, 0.71875))}, {'value': '(Contactiess)', 'confidence': 0.5816012024879456, 'geometry': ((0.5361328125, 0.701171875), (0.642578125, 0.7216796875))}]}, {'geometry': ((0.248046875, 0.7255859375), (0.6865234375, 0.74609375)), 'words': [{'value': 'Mercid', 'confidence': 0.8570956587791443, 'geometry': ((0.248046875, 0.7275390625), (0.3134765625, 0.74609375))}, {'value': '000001050644651', 'confidence': 0.7285884022712708, 'geometry': ((0.34375, 0.7265625), (0.4970703125, 0.744140625))}, {'value': 'Terminal', 'confidence': 0.9665992259979248, 'geometry': ((0.5068359375, 0.7265625), (0.5830078125, 0.744140625))}, {'value': '-', 'confidence': 0.93905109167099, 'geometry': ((0.591796875, 0.728515625), (0.6015625, 0.7421875))}, {'value': '51523260', 'confidence': 0.9988250136375427, 'geometry': ((0.6015625, 0.7255859375), (0.6865234375, 0.7431640625))}]}], 'artefacts': []}, {'geometry': ((0.2490234375, 0.75), (0.4697265625, 0.79296875)), 'lines': [{'geometry': ((0.25, 0.75), (0.4111328125, 0.7724609375)), 'words': [{'value': 'Approval', 'confidence': 0.9914907813072205, 'geometry': ((0.25, 0.7509765625), (0.326171875, 0.7724609375))}, {'value': ':', 'confidence': 0.9201642274856567, 'geometry': ((0.3330078125, 0.7509765625), (0.3466796875, 0.7705078125))}, {'value': 'R69046', 'confidence': 0.9995259046554565, 'geometry': ((0.3427734375, 0.75), (0.4111328125, 0.7685546875))}]}, {'geometry': ((0.2490234375, 0.771484375), (0.4697265625, 0.79296875)), 'words': [{'value': 
'RefNo', 'confidence': 0.9922246932983398, 'geometry': ((0.2490234375, 0.771484375), (0.3125, 0.79296875))}, {'value': '000011076745', 'confidence': 0.9994035959243774, 'geometry': ((0.34375, 0.771484375), (0.4697265625, 0.7890625))}]}], 'artefacts': []}, {'geometry': ((0.5078125, 0.748046875), (0.576171875, 0.814453125)), 'lines': [{'geometry': ((0.5078125, 0.748046875), (0.5595703125, 0.7666015625)), 'words': [{'value': 'Batch', 'confidence': 0.9954745173454285, 'geometry': ((0.5078125, 0.748046875), (0.5595703125, 0.7666015625))}]}, {'geometry': ((0.5078125, 0.7685546875), (0.552734375, 0.7880859375)), 'words': [{'value': 'Card', 'confidence': 0.9997015595436096, 'geometry': ((0.5078125, 0.7685546875), (0.552734375, 0.7880859375))}]}, {'geometry': ((0.5078125, 0.7958984375), (0.576171875, 0.814453125)), 'words': [{'value': 'Amount', 'confidence': 0.9982516169548035, 'geometry': ((0.5078125, 0.7958984375), (0.576171875, 0.814453125))}]}], 'artefacts': []}, {'geometry': ((0.6015625, 0.7470703125), (0.6669921875, 0.7646484375)), 'lines': [{'geometry': ((0.6015625, 0.7470703125), (0.6669921875, 0.7646484375)), 'words': [{'value': '000435', 'confidence': 0.9871779680252075, 'geometry': ((0.6015625, 0.7470703125), (0.6669921875, 0.7646484375))}]}], 'artefacts': []}, {'geometry': ((0.65625, 0.7685546875), (0.7373046875, 0.857421875)), 'lines': [{'geometry': ((0.6728515625, 0.7685546875), (0.7138671875, 0.783203125)), 'words': [{'value': '1641', 'confidence': 0.9989182949066162, 'geometry': ((0.6728515625, 0.7685546875), (0.7138671875, 0.783203125))}]}, {'geometry': ((0.666015625, 0.7939453125), (0.732421875, 0.8125)), 'words': [{'value': '$23.90', 'confidence': 0.9973084926605225, 'geometry': ((0.666015625, 0.7939453125), (0.732421875, 0.8125))}]}, {'geometry': ((0.65625, 0.8330078125), (0.7373046875, 0.857421875)), 'words': [{'value': '$23.90', 'confidence': 0.9831066131591797, 'geometry': ((0.65625, 0.8330078125), (0.7373046875, 0.857421875))}]}], 'artefacts': []}, 
{'geometry': ((0.4208984375, 0.8369140625), (0.57421875, 0.9228515625)), 'lines': [{'geometry': ((0.4345703125, 0.8369140625), (0.560546875, 0.8603515625)), 'words': [{'value': 'Net', 'confidence': 0.9999843835830688, 'geometry': ((0.4345703125, 0.8369140625), (0.4765625, 0.8603515625))}, {'value': 'Amount', 'confidence': 0.9871867895126343, 'geometry': ((0.4775390625, 0.8369140625), (0.560546875, 0.8583984375))}]}, {'geometry': ((0.4208984375, 0.8984375), (0.57421875, 0.9228515625)), 'words': [{'value': 'APPROVED', 'confidence': 0.9999109506607056, 'geometry': ((0.4208984375, 0.8984375), (0.57421875, 0.9228515625))}]}], 'artefacts': []}]}]}
Let’s print the output using matplotlib.
synthetic_pages = result.synthesize()
plt.figure(figsize=(18, 16)) # Adjust the width and height as needed
plt.imshow(synthetic_pages[0]); plt.axis('off'); plt.show()
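The export shown above is nested as pages, then blocks, then lines, then words. As a minimal sketch of how that nesting can be walked, the snippet below rebuilds one text string per OCR line. The `sample_export` dictionary is a hand-written stand-in that copies a few values from the output above, and `export_to_text_lines` is a hypothetical helper name, not part of docTR.

```python
# Hand-written sample that mirrors the shape of the export above
# (pages -> blocks -> lines -> words); only the "value" fields are kept here.
sample_export = {
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {"words": [{"value": "Terminal:"}, {"value": "BK0003"},
                                   {"value": "13/02/2022"}, {"value": "19:21"}]},
                        {"words": [{"value": "Cashier:"}, {"value": "HONG"}]},
                    ]
                },
                {"lines": [{"words": [{"value": "Qty"}, {"value": "Amount"}]}]},
            ]
        }
    ]
}


def export_to_text_lines(export):
    """Return one string per OCR line, joining the words with spaces."""
    text_lines = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [word["value"] for word in line.get("words", [])]
                text_lines.append(" ".join(words))
    return text_lines


for text in export_to_text_lines(sample_export):
    print(text)
```

Passing the real `json_export` dictionary instead of the sample should work the same way, assuming it has the shape printed above.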
We need to remove the irrelevant information from the JSON output, such as the dimensions, orientation, language, and the geometry associated with blocks and lines. My focus is solely on the data under words: the value and geometry, without the confidence, as highlighted in the box below.

Let’s proceed to remove the irrelevant information from the JSON output.
# Define a function to remove fields recursively
def remove_fields(obj, fields):
    if isinstance(obj, list):
        for item in obj:
            remove_fields(item, fields)
    elif isinstance(obj, dict):
        for key in list(obj.keys()):
            if key in fields:
                del obj[key]
            else:
                remove_fields(obj[key], fields)

# Helper that removes the 'geometry' key at every level, including words.
# It is defined here but not called below, because word-level geometry is kept.
def remove_geometry(data):
    if isinstance(data, list):
        for item in data:
            remove_geometry(item)
    elif isinstance(data, dict):
        if 'geometry' in data:
            del data['geometry']
        for key, value in data.items():
            remove_geometry(value)

# Fields to remove
fields_to_remove = ['confidence', 'page_idx', 'dimensions', 'orientation', 'language', 'artefacts']
# Remove the specified fields
remove_fields(json_export, fields_to_remove)
# Remove 'geometry' from 'blocks' and 'lines' only
for page in json_export['pages']:
    for block in page['blocks']:
        if 'geometry' in block:
            del block['geometry']
        for line in block.get('lines', []):
            if 'geometry' in line:
                del line['geometry']
# Convert the modified data to a compact JSON string
modified_json = json.dumps(json_export, separators=(',', ':'))
# Print the modified JSON
print(modified_json)
Subsequently, save the output to a file named OCR.txt.
# Convert the JSON data to a string
json_export_str = str(modified_json)
# Write the JSON data to a file
with open("OCR.txt", "w") as file:
    file.write(json_export_str)
The resulting output will now appear as follows:

Now we are ready to provide this information to the LLM.
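Before handing the file over, it can be worth confirming that only the expected keys survived the clean-up. The snippet below is a minimal sketch of such a check. `collect_keys` is a hypothetical helper, and `sample_text` is a hand-written stand-in for the contents of OCR.txt with a single word in it.

```python
import json


def collect_keys(obj):
    """Return the set of every dictionary key found anywhere in obj."""
    keys = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            keys.add(key)
            keys |= collect_keys(value)
    elif isinstance(obj, list):
        for item in obj:
            keys |= collect_keys(item)
    return keys


# Hypothetical stand-in for the contents of OCR.txt (same shape, one word).
# json.dumps writes the geometry tuples as nested lists.
sample_text = (
    '{"pages":[{"blocks":[{"lines":[{"words":'
    '[{"value":"GST","geometry":[[0.27,0.02],[0.32,0.04]]}]}]}]}]}'
)

found = collect_keys(json.loads(sample_text))
expected = {"pages", "blocks", "lines", "words", "value", "geometry"}

# An empty list means no unexpected key is left in the data.
print(sorted(found - expected))
```

To run the same check on the real file, replace `sample_text` with the text read from OCR.txt.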
Input into the LLM
We will proceed by importing the LangChain libraries and entering the Azure OpenAI API key.
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.prompts import PromptTemplate
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.chat_models import AzureChatOpenAI
from langchain.chains import RetrievalQA
import os
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_VERSION"] = ""
os.environ["OPENAI_API_BASE"] = ""
os.environ["OPENAI_API_KEY"] = ""
We load the OCR.txt file, split its contents, and insert them into the FAISS database as vectors with OpenAI embeddings.
embedding_model = OpenAIEmbeddings(chunk_size=10)
OCR_Content = TextLoader('OCR.txt').load()
text_splitter = CharacterTextSplitter(chunk_overlap=100)
content = text_splitter.split_documents(OCR_Content)
faiss_db = FAISS.from_documents(content, embedding_model)
retriever = faiss_db.as_retriever(search_type="similarity", search_kwargs={"k": 4})
We set the temperature to 0 and use the gpt-4 deployment. Additionally, we establish the prompt template.
Within the prompt, I explicitly stated:
Analyze the JSON receipt data provided and group “value” entries with similar “geometry” proximity under “words,” then summarize this information into one concise sentence.
llm = AzureChatOpenAI(
    temperature=0,
    deployment_name="gpt-4",
)
prompt_template = """
Task: Analyze the JSON receipt data provided and group "value" entries with similar "geometry" proximity under "words," then summarize this information into one concise sentence.
JSON Data:
{context}
User questions:
{question}
Respond to the user in JSON format and include the key-value pairs:
"""
QA_PROMPT = PromptTemplate(
    template=prompt_template, input_variables=['context', 'question']
)
We will use RetrievalQA with a specific question to extract information such as the amount, receipt number, date & time, and line items.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type_kwargs={"prompt": QA_PROMPT},
    verbose=True
)
question = """
Please extract the following details:
Amount,
Receipt/Invoice number,
Date & Time,
Line Items
"""
result = qa_chain({"query": question})
print(result["result"])
Here is the output:

It extracted the amount, the receipt number, and the receipt date and time accurately. Further fine-tuning is needed to improve the output for the line items.
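One possible direction for the line items, which is not part of the pipeline above, is to group words into rows deterministically before prompting, so the model no longer has to infer proximity from raw coordinates. The snippet below is a minimal sketch under that assumption. `group_words_into_rows` and the `tolerance` value are hypothetical choices, and the sample words are copied from the OCR output shown earlier.

```python
def group_words_into_rows(words, tolerance=0.015):
    """Group words whose vertical centres lie within `tolerance` of a row's first word.

    Each word is a dict with "value" and "geometry" ((x_min, y_min), (x_max, y_max)),
    using the relative coordinates shown in the OCR output.
    """
    def y_center(word):
        (_, y_min), (_, y_max) = word["geometry"]
        return (y_min + y_max) / 2

    def x_min(word):
        return word["geometry"][0][0]

    rows = []
    for word in sorted(words, key=y_center):
        # Compare against the centre of the first word placed in the current row.
        if rows and abs(y_center(word) - rows[-1]["y"]) <= tolerance:
            rows[-1]["words"].append(word)
        else:
            rows.append({"y": y_center(word), "words": [word]})

    # Within each row, read the words from left to right.
    return [
        " ".join(w["value"] for w in sorted(row["words"], key=x_min))
        for row in rows
    ]


# Words copied from the OCR output above; they sit in different blocks there.
sample_words = [
    {"value": "Total", "geometry": ((0.2490234375, 0.3798828125), (0.302734375, 0.40234375))},
    {"value": "27.80", "geometry": ((0.662109375, 0.3017578125), (0.7216796875, 0.3232421875))},
    {"value": "$23.90", "geometry": ((0.6513671875, 0.375), (0.7216796875, 0.3974609375))},
    {"value": "9421906089017", "geometry": ((0.306640625, 0.30859375), (0.45703125, 0.326171875))},
    {"value": "Amount", "geometry": ((0.3056640625, 0.3818359375), (0.3828125, 0.400390625))},
    {"value": "4x6.95", "geometry": ((0.529296875, 0.3056640625), (0.607421875, 0.32421875))},
]

for row in group_words_into_rows(sample_words):
    print(row)
# prints:
# 9421906089017 4x6.95 27.80
# Total Amount $23.90
```

The tolerance would likely need tuning per receipt layout, and skewed scans may need a different grouping rule.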





