
This article covers how to detect paragraphs in a text document image with a non-consistent text structure, using Python and OpenCV.

Problem description


I am trying to identify paragraphs of text in a .pdf document by first converting it into an image then using OpenCV. But I am getting bounding boxes on lines of text instead of paragraphs. How can I set some threshold or some other limit to get paragraphs instead of lines?

Here is the sample input image:

Here is the output I am getting for the above sample:

I am trying to get a single bounding box on the paragraph in the middle. I am using this code.

import cv2
import numpy as np

large = cv2.imread('sample image.png')
rgb = cv2.pyrDown(large)
small = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)

# kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
kernel = np.ones((5, 5), np.uint8)
grad = cv2.morphologyEx(small, cv2.MORPH_GRADIENT, kernel)

_, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)

# Use RETR_EXTERNAL instead of RETR_CCOMP. findContours returns 3 values in
# OpenCV 3.x and 2 values in 2.x/4.x; taking the last two works on all of them.
contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]

mask = np.zeros(bw.shape, dtype=np.uint8)

for idx in range(len(contours)):
    x, y, w, h = cv2.boundingRect(contours[idx])
    mask[y:y+h, x:x+w] = 0
    cv2.drawContours(mask, contours, idx, (255, 255, 255), -1)
    r = float(cv2.countNonZero(mask[y:y+h, x:x+w])) / (w * h)

    if r > 0.45 and w > 8 and h > 8:
        cv2.rectangle(rgb, (x, y), (x+w-1, y+h-1), (0, 255, 0), 2)


cv2.imshow('rects', rgb)
cv2.waitKey(0)

Solution

This is a classic use for dilate. Whenever you want to connect multiple items together, you can dilate them to join adjacent contours into a single contour. Here's a simple approach:

1. Convert the image to grayscale and apply a Gaussian blur
2. Apply Otsu's threshold
3. Dilate to connect adjacent words together
4. Find contours and draw their bounding rectangles

Otsu's threshold

Here's where the magic happens. We can assume that a paragraph is a group of words that are close together. To exploit this, we dilate the thresholded image so that adjacent words and lines merge into a single contour.

Result

import cv2
import numpy as np

# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Create rectangular structuring element and dilate
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
dilate = cv2.dilate(thresh, kernel, iterations=4)

# Find contours and draw rectangle
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
