Hello,
My project involves OCR, and I am looking for ways to post-process the image that the scanner produces in order to get a better OCR success rate (an OCR error rate of 1% means about 100 mistakes on a page of roughly 10,000 characters, and that is hard to correct by hand when there are many pages).
The scanner driver itself offers adjustments for contrast, brightness, and gamma correction that may improve the OCR success rate, but the best values seem to depend on the specific document, and I have not been able to find the optimum combination manually.
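As a rough illustration of what these three driver settings typically do to a gray value, here is a minimal Python sketch. The function names `adjust` and `adjust_image`, the parameter ranges, and the order of operations are assumptions made for illustration; an actual scanner driver may define and combine these adjustments differently.

```python
def adjust(pixel, brightness=0.0, contrast=1.0, gamma=1.0):
    """Apply contrast, brightness and gamma to one 8-bit gray value (0-255).

    Hypothetical helper: the formulas are common textbook definitions,
    not those of any particular scanner driver.
    """
    x = pixel / 255.0                    # normalise to the range 0..1
    x = (x - 0.5) * contrast + 0.5       # stretch or squeeze around mid-gray
    x = x + brightness                   # shift all values up or down
    x = min(max(x, 0.0), 1.0)            # clip back into 0..1
    x = x ** (1.0 / gamma)               # gamma > 1 lightens the mid-tones
    return round(x * 255)


def adjust_image(image, **params):
    """Apply adjust() to every pixel of an image given as a list of rows."""
    return [[adjust(p, **params) for p in row] for row in image]
```

With the default parameters the image is returned unchanged, which makes it easy to vary one setting at a time and compare the OCR result for each value.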
I don't know exactly how the OCR program (ABBYY FineReader) reads the characters, but I suppose something can be done to help it by using image editing software that is more sophisticated than the adjustments available through the scanner driver or the OCR program.
Basically, it would be beneficial to:
- increase the contrast between the black characters of the text and the white background of the page
- make the white of the page true white and the black of the characters true black
- smooth the edges of the characters
- remove artifacts, such as small dots, that would confuse the OCR program
- automatically find the optimum combination of contrast, brightness, gamma correction, and sharpness
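To make two of the items above concrete (true black and white, and removal of small dots), here is a minimal pure-Python sketch, assuming the scan is available as a grayscale image stored as a list of rows of 0-255 values. It picks a threshold automatically with Otsu's method, which is one common choice and not necessarily what any OCR program does internally, and then deletes isolated black pixels. The function names are hypothetical.

```python
def otsu_threshold(image):
    """Return the gray level that best separates dark from light pixels
    by maximising the between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]                # pixels with value <= t
        sum_dark += t * hist[t]
        w_light = total - w_dark         # pixels with value > t
        if w_dark == 0:
            continue                     # no dark class yet
        if w_light == 0:
            break                        # no light class left
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, t):
    """Map pixels <= t to true black (0) and all others to true white (255)."""
    return [[0 if p <= t else 255 for p in row] for row in image]


def despeckle(binary):
    """Turn black pixels that have no black 8-neighbour into white."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 0:
                continue
            has_black_neighbour = any(
                binary[ny][nx] == 0
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbour:
                out[y][x] = 255          # isolated dot: treat as noise
    return out


# Tiny synthetic "page": a 2x2 dark stroke plus one isolated dark speck.
page = [
    [230, 228, 231, 229, 232],
    [229,  40,  42, 230, 231],
    [231,  41,  39, 229, 230],
    [230, 229, 231, 232,  60],
    [228, 230, 229, 231, 230],
]
clean = despeckle(binarize(page, otsu_threshold(page)))
```

A single-pixel filter like this would also erase genuine one-pixel marks such as very small punctuation in a low-resolution scan, so the minimum speck size would have to be tuned to the scan resolution.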
Is there anything that can be done?
Thanks