Problem description
Every now and then my mom has to sift through photos of this type, extract the number from each image, and rename the file to that number.
I'm trying to use OpenCV, Python, and Tesseract to get the process done, but I'm really lost trying to extract the portion of the image with the numbers. How could I do this? Any suggestions are welcome; I'm really new to OpenCV.
I tried to extract the white rectangular board using thresholds and contours, but to no avail, because the RGB value I choose for the threshold doesn't always work and I don't know how to choose the contour.
I have been looking at this paper, http://yoni.wexlers.org/papers/2010TextDetection.pdf , and it looks promising.
Recommended answer
I have been having another look at this, and had a couple of inspirations along the way...
Tesseract can accept custom dictionaries, and if you dig a little more, it appears that from v3.0 it accepts the command-line parameter "digits" to make it recognise digits only, which seems a useful idea for your needs.
It may not be necessary to find the boards with the digits on them. It may be easier to run Tesseract multiple times on various slices of the image and let it have a try itself, as that is what it is supposed to do.
So, I decided to preprocess the image by changing everything that is within 25% of black to pure black, and everything else to pure white. That gives a pre-processed image containing only black and white pixels.
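To make the "within 25% of black" rule concrete, here is a minimal sketch that applies it to a plain-text (P2) greyscale PGM image. It is an illustration under stated assumptions, not what ImageMagick does internally: the function name binarize_pgm is hypothetical, the input is assumed to be greyscale with only whole-line comments, and a simple per-pixel cutoff stands in for the colour-distance comparison that -fuzz performs.

```shell
# binarize_pgm [PERCENT]: read a plain-text (P2) PGM on stdin and write a
# binarized P2 PGM on stdout. Pixels at or below PERCENT (default 25) of the
# maximum grey value become 0 (black); all other pixels become maxval (white).
# Assumption: comments, if any, occupy whole lines starting with '#'.
binarize_pgm() {
    awk -v pct="${1:-25}" '
        /^#/ { next }                                  # skip comment lines
        { for (i = 1; i <= NF; i++) tok[++n] = $i }    # collect every token
        END {
            maxval = tok[4] + 0                        # tokens: P2, width, height, maxval
            cutoff = maxval * pct / 100
            print tok[1]
            print tok[2], tok[3]
            print maxval
            for (i = 5; i <= n; i++)
                print ((tok[i] + 0 <= cutoff) ? 0 : maxval)
        }'
}

# Usage: three pixels with grey values 10, 63 and 64 out of 255
printf 'P2\n3 1\n255\n10 63 64\n' | binarize_pgm 25
```

With a maximum value of 255 the cutoff is 63.75, so the first two pixels go black and the third goes white.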
Next, I generate a series of images and pass them, one at a time, to Tesseract. I decided to assume that the digits probably occupy between 10% and 40% of the image height, so I made a loop over strips that are 40, 30, 20 and 10% of the image height. I then slide each strip down the image from top to bottom in 20 steps, passing each position to Tesseract, until the strip is essentially across the bottom of the image.
[Animation: the 40% strips; each frame of the animation is passed to Tesseract]
[Animation: the 20% strips; each frame of the animation is passed to Tesseract]
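The strip geometry described above can be checked without touching any image. The sketch below only does the arithmetic; the function name list_strips is hypothetical, and the guard against a zero step is an addition for very small images rather than part of the original description.

```shell
# list_strips H: print "percent strip_height offset" for every strip that
# would be cut from an image H pixels tall, using strips of 40, 30, 20 and
# 10% of the height, each slid from top to bottom in 20 steps.
list_strips() {
    h=$1
    for pc in 40 30 20 10; do
        sh=$(( h * pc / 100 ))        # strip height in pixels
        omax=$(( h - sh ))            # offset at which the strip touches the bottom
        step=$(( omax / 20 ))         # distance moved between consecutive strips
        [ "$step" -lt 1 ] && step=1   # assumption: avoid a zero step on tiny images
        off=0
        while [ "$off" -lt "$omax" ]; do
            echo "$pc $sh $off"
            off=$(( off + step ))
        done
    done
}
```

For a 1000-pixel-tall image, list_strips 1000 prints 80 lines (20 per strip height), starting with "40 400 0" and "40 400 30". That is 80 Tesseract runs per photo, which is the price of not locating the board first.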
Having got the strips, I resize them to suit Tesseract's sweet spot and clean out noise. Then I pass them into Tesseract and assess the quality of the recognition, somewhat crudely, by counting the number of digits it found. Finally, I sort the output by number of digits, on the assumption that more digits probably means a better read.
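The crude scoring step can be sketched in isolation as follows. The function name score_text and the sample strings are hypothetical stand-ins for Tesseract's output; the 1 to 5 digit window is the one the script below uses.

```shell
# score_text TEXT: strip everything except digits from TEXT and print
# "count:digits" when between 1 and 5 digits remain; print nothing otherwise.
score_text() {
    digits=$(printf '%s' "$1" | tr -cd '0-9')
    ndigits=${#digits}
    if [ "$ndigits" -gt 0 ] && [ "$ndigits" -lt 6 ]; then
        echo "$ndigits:$digits"
    fi
}

# Made-up OCR results for four strips, ranked by digit count
{
    score_text "27 55"
    score_text "no digits here"
    score_text "2755"
    score_text "5.1"
} | sort -n
```

Note that whitespace and punctuation are dropped before counting, so "27 55" and "2755" score identically. Counting digits says nothing about whether they are the right digits, which is why the ranked output still needs a human glance.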
There are some rough edges and bits that you could tinker with, but it is a start!
#!/bin/bash
image=${1-c1.jpg}

# Make everything that is nearly black go fully black, everything else goes white. Median for noise
# convert -delay 500 c1.jpg c2.jpg c3.jpg -normalize -fuzz 25% -fill black -opaque black -fuzz 0 -fill white +opaque black -median 9 out.gif
convert "${image}" -normalize \
   -fuzz 25% -fill black -opaque black \
   -fuzz 0   -fill white +opaque black \
   -median 9 tmp_$$.png

# Get height of image - h
h=$(identify -format "%h" "${image}")

# Generate strips that are 40%, 30%, 20% and 10% of image height
for pc in 40 30 20 10; do
   # Calculate height of this strip in pixels - sh
   ((sh=(h*pc)/100))
   # Calculate offset from top of picture to top of bottom strip - omax
   ((omax=h-sh))
   # Calculate step size, there will be 20 steps
   ((step=omax/20))
   # Guard against a zero step (and an endless loop) on very small images
   ((step<1)) && step=1

   # Cut strips sh pixels high from the picture starting at top and working down in 20 steps
   for (( off=0; off<omax; off+=step )); do
      t=$(printf "%05d" $off)
      # Extract strip and resize to 80 pixels tall for tesseract
      convert tmp_$$.png -crop x${sh}+0+${off} \
          -resize x80 -median 3 -median 3 -median 3 \
          -threshold 90% +repage slice_${pc}_${t}.png

      # Run slice through tesseract, seeking only digits
      tesseract slice_${pc}_${t}.png temp digits quiet

      # Now try and assess quality of output :-) ... by counting number of digits
      # Keep only the digit characters from tesseract's output
      digits=$(tr -cd '0-9' < temp.txt)
      ndigits=${#digits}
      [ $ndigits -gt 0 ] && [ $ndigits -lt 6 ] && echo $ndigits:$digits
   done
done | sort -n
Output for Cow 618 (first number is the number of digits found)
2:11
2:11
3:573
5:33613 <--- not bad
Output for Cow 2755 (first number is the number of digits found)
2:51
3:071
3:191
3:517
4:2155 <--- pretty close
4:2755 <--- nailed that puppy :-)
4:2755 <--- nailed that puppy :-)
4:5212
5:12755 <--- pretty close
Output for Cow 3174 (first number is the number of digits found)
3:554
3:734
5:12732
5:31741 <--- pretty close
Cool question - thank you!