
Remove background text and noise from an image using image processing with OpenCV

Problem description


    I have these images

    I want to remove the text in the background so that only the captcha characters remain (i.e. K6PwKA, YabVzu). The goal is to identify these characters later using Tesseract.

    This is what I have tried, but the accuracy isn't good.

    import cv2
    import pytesseract
    
    pytesseract.pytesseract.tesseract_cmd = r"C:\Users\HPO2KOR\AppData\Local\Tesseract-OCR\tesseract.exe"
    img = cv2.imread("untitled.png")
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray_filtered = cv2.inRange(gray_image, 0, 75)
    cv2.imwrite("cleaned.png", gray_filtered)
    

    How can I improve this?

    Note: I tried all the suggestions I received for this question and none of them worked for me.

    EDIT: Following Elias's suggestion, I used Photoshop to find the color of the captcha text after converting it to grayscale, which came out somewhere in the range [100, 105]. I then thresholded the image based on this range, but the result did not give satisfactory output from Tesseract.

    gray_filtered = cv2.inRange(gray_image, 100, 105)
    cv2.imwrite("cleaned.png", gray_filtered)
    gray_inv = ~gray_filtered
    cv2.imwrite("cleaned.png", gray_inv)
    data = pytesseract.image_to_string(gray_inv, lang='eng')
    

    Output :

    'KEP wKA'
    

    Result :

    EDIT 2 :

    def get_text(img_name):
        lower = (100, 100, 100)
        upper = (104, 104, 104) 
        img = cv2.imread(img_name)
        img_rgb_inrange = cv2.inRange(img, lower, upper)
        neg_rgb_image = ~img_rgb_inrange
        cv2.imwrite('neg_img_rgb_inrange.png', neg_rgb_image)
        data = pytesseract.image_to_string(neg_rgb_image, lang='eng')
        return data
    

    gives :

    and the text as

    GXuMuUZ
    

    Is there any way to soften it a little?

    Solution

    Here are two potential approaches and a method to correct distorted text:

    Method #1: Morphological operations + contour filtering

    1. Obtain binary image. Load image, grayscale, then Otsu's threshold.

    2. Remove background text contours. Create a rectangular kernel with cv2.getStructuringElement() and then perform a morphological opening to remove noise.

    3. Filter and remove small noise. Find contours and filter using contour area to remove small particles. We effectively remove the noise by filling in each small contour with cv2.drawContours().

    4. Perform OCR. We invert the image, then apply a slight Gaussian blur. We then OCR using Pytesseract with the --psm 6 configuration option to treat the image as a single block of text. See the Tesseract documentation on improving output quality for other ways to improve detection, and the Pytesseract configuration options for additional settings.


    Input image -> Binary -> Morph opening

    Contour area filtering -> Invert -> Apply blur to get result

    Result from OCR

    YabVzu
    

    Code

    import cv2
    import pytesseract
    import numpy as np
    
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    # Load image, grayscale, Otsu's threshold
    image = cv2.imread('2.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    
    # Morph open to remove noise
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    
    # Find contours and remove small noise
    cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        area = cv2.contourArea(c)
        if area < 50:
            cv2.drawContours(opening, [c], -1, 0, -1)
    
    # Invert and apply slight Gaussian blur
    result = 255 - opening
    result = cv2.GaussianBlur(result, (3,3), 0)
    
    # Perform OCR
    data = pytesseract.image_to_string(result, lang='eng', config='--psm 6')
    print(data)
    
    cv2.imshow('thresh', thresh)
    cv2.imshow('opening', opening)
    cv2.imshow('result', result)
    cv2.waitKey()     
    

    Method #2: Color segmentation

    With the observation that the desired text to extract has a distinguishable contrast from the noise in the image, we can use color thresholding to isolate the text. The idea is to convert to HSV format, then color threshold to obtain a mask using a lower/upper color range. From there we use the same process to OCR with Pytesseract.


    Input image -> Mask -> Result

    Code

    import cv2
    import pytesseract
    import numpy as np
    
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    # Load image, convert to HSV, color threshold to get mask
    image = cv2.imread('2.png')
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 0, 0])
    upper = np.array([100, 175, 110])
    mask = cv2.inRange(hsv, lower, upper)
    
    # Invert image and OCR
    invert = 255 - mask
    data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
    print(data)
    
    cv2.imshow('mask', mask)
    cv2.imshow('invert', invert)
    cv2.waitKey()
    

    Correcting distorted text

    OCR works best when the text is horizontal. To ensure that the text is in an ideal format for OCR, we can perform a perspective transform. After removing all the noise to isolate the text, we can perform a morph close to combine the individual text contours into a single contour. From here we can find the rotated bounding box using cv2.minAreaRect and then perform a four-point perspective transform using imutils.perspective.four_point_transform. Continuing from the cleaned mask, here are the results:

    Mask -> Morph close -> Detected rotated bounding box -> Result

    Output with the other image

    Updated code to include perspective transform

    import cv2
    import pytesseract
    import numpy as np
    from imutils.perspective import four_point_transform
    
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    # Load image, convert to HSV, color threshold to get mask
    image = cv2.imread('1.png')
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 0, 0])
    upper = np.array([100, 175, 110])
    mask = cv2.inRange(hsv, lower, upper)
    
    # Morph close to connect individual text into a single contour
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
    close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel, iterations=3)
    
    # Find rotated bounding box then perspective transform
    cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    rect = cv2.minAreaRect(cnts[0])
    box = cv2.boxPoints(rect)
    box = box.astype(np.int32)  # np.int0 was removed in NumPy 2.0
    cv2.drawContours(image,[box],0,(36,255,12),2)
    warped = four_point_transform(255 - mask, box.reshape(4, 2))
    
    # OCR
    data = pytesseract.image_to_string(warped, lang='eng', config='--psm 6')
    print(data)
    
    cv2.imshow('mask', mask)
    cv2.imshow('close', close)
    cv2.imshow('warped', warped)
    cv2.imshow('image', image)
    cv2.waitKey()
    

    Note: The color threshold range was determined using this HSV threshold script

    import cv2
    import numpy as np
    
    def nothing(x):
        pass
    
    # Load image
    image = cv2.imread('2.png')
    
    # Create a window
    cv2.namedWindow('image')
    
    # Create trackbars for color change
    # Hue is from 0-179 for Opencv
    cv2.createTrackbar('HMin', 'image', 0, 179, nothing)
    cv2.createTrackbar('SMin', 'image', 0, 255, nothing)
    cv2.createTrackbar('VMin', 'image', 0, 255, nothing)
    cv2.createTrackbar('HMax', 'image', 0, 179, nothing)
    cv2.createTrackbar('SMax', 'image', 0, 255, nothing)
    cv2.createTrackbar('VMax', 'image', 0, 255, nothing)
    
    # Set default value for Max HSV trackbars
    cv2.setTrackbarPos('HMax', 'image', 179)
    cv2.setTrackbarPos('SMax', 'image', 255)
    cv2.setTrackbarPos('VMax', 'image', 255)
    
    # Initialize HSV min/max values
    hMin = sMin = vMin = hMax = sMax = vMax = 0
    phMin = psMin = pvMin = phMax = psMax = pvMax = 0
    
    while(1):
        # Get current positions of all trackbars
        hMin = cv2.getTrackbarPos('HMin', 'image')
        sMin = cv2.getTrackbarPos('SMin', 'image')
        vMin = cv2.getTrackbarPos('VMin', 'image')
        hMax = cv2.getTrackbarPos('HMax', 'image')
        sMax = cv2.getTrackbarPos('SMax', 'image')
        vMax = cv2.getTrackbarPos('VMax', 'image')
    
        # Set minimum and maximum HSV values to display
        lower = np.array([hMin, sMin, vMin])
        upper = np.array([hMax, sMax, vMax])
    
        # Convert to HSV format and color threshold
        hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, lower, upper)
        result = cv2.bitwise_and(image, image, mask=mask)
    
        # Print if there is a change in HSV value
        if((phMin != hMin) | (psMin != sMin) | (pvMin != vMin) | (phMax != hMax) | (psMax != sMax) | (pvMax != vMax) ):
            print("(hMin = %d , sMin = %d, vMin = %d), (hMax = %d , sMax = %d, vMax = %d)" % (hMin , sMin , vMin, hMax, sMax , vMax))
            phMin = hMin
            psMin = sMin
            pvMin = vMin
            phMax = hMax
            psMax = sMax
            pvMax = vMax
    
        # Display result image
        cv2.imshow('image', result)
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
    
    cv2.destroyAllWindows()
    
