Getting the bounding box of the recognized words using python-tesseract
Use pytesseract.image_to_data() import pytesseract from pytesseract import Output import cv2 img = cv2.imread(‘image.jpg’) d = pytesseract.image_to_data(img, output_type=Output.DICT) n_boxes = len(d[‘level’]) for i in range(n_boxes): (x, y, w, h) = (d[‘left’][i], d[‘top’][i], d[‘width’][i], d[‘height’][i]) cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2) cv2.imshow(‘img’, img) cv2.waitKey(0) Among the data returned by pytesseract.image_to_data(): … Read more