Hello folks! Once again, this tutorial is all about building special functions for processing images: OCR, barcodes, grey-scaling and so on. Strictly speaking this guide does not belong in the techno-blog, but I felt compelled to put the document here to spare myself the same headache again. It's been a while since I last debugged the ExactImage libraries, so I had forgotten the how-tos needed for a quick set-up, and I never bothered to take notes back then. Just recently, when I decided to upgrade my deployed OS to a newer version, I didn't expect that it would take me almost 3 days to compile those object modules and libraries. Really a headache. So this time I've realized that it's not good practice to skip even small notes (patches, revisions, repositories and so on) about which files were patched in a program, especially if it's free.
Anyway, let me share with you the usefulness of the ExactCode software. The software is a fast, modern and generic image processing library. Its codecs allow library users to implement their own data sources and destinations, such as in-memory locations or network transfers. It is known as a viable alternative to ImageMagick. The needed code was prototyped in C++, just for speed, and achieved processing times of about 1/20th of what ImageMagick consumed. It also explores several new algorithms, e.g. for de-screening, data-dependent triangulation scaling, lossless JPEG transforms and others needed for fast image processing.
Below are the instructions on how to install and build ExactImage, which includes programs for fast image processing. I've also attached a video showing how the OCR program (hocr2pdf) makes different texts searchable in a PDF viewer. You can use each program from the command line, as I've written how-tos and instructions in the testing portion of this blog. You may cut and paste all the included examples and see for yourself whether it does its job better than expected. But hey, don't forget to jot down some notes before you forget the procedure. Otherwise you will have a headache in the future when you want to try it once more. A bit of advice, folks!
Linux OS: Fedora 18, 64-bit
Server: Core i7
ExactCode image processing library
root@localhost# wget http://exactcode.de/exact-image.0.8.x.tar.bz2
root@localhost# svn co https://exactcode.de/exact-image/trunk exact-image.8.x
root@localhost# yum install gcc gcc-c++ libstdc++ perl perl-devel perl-ExtUtils-Embed
root@localhost# tar -jxvf exact-image.8.x.tar.bz2
root@localhost# cd exact-image.8.x/
root@localhost# ./configure --prefix=/usr/local/scanner
root@localhost# make && make install
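Because the build above installs under a custom prefix, the new programs may not be on your PATH afterwards. The following is only a small sketch for checking that: `find_tool` is a hypothetical helper name, and it assumes the install step places executables in PREFIX/bin, which is the usual convention but is not confirmed here for ExactImage.

```shell
# Hypothetical helper, not part of ExactImage.
# Usage: find_tool PREFIX TOOL
# Prints the full path of TOOL if it exists as an executable file in
# PREFIX/bin (assumed install location), otherwise reports an error.
find_tool() {
    prefix=$1; tool=$2
    candidate="$prefix/bin/$tool"
    if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
        echo "$candidate"
    else
        echo "$tool not found under $prefix/bin" >&2
        return 1
    fi
}
```

For example, `find_tool /usr/local/scanner hocr2pdf` should print the path of the program if the install went where expected; in that case adding /usr/local/scanner/bin to PATH lets you call hocr2pdf directly.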
This CLI-based program can create a searchable PDF from hOCR input.
hocr2pdf: a command line front-end for the image processing library that creates perfectly laid-out, searchable PDF files from hOCR (annotated HTML) input obtained from an OCR system.
(1) The hOCR (annotated HTML) input must be provided on STDIN, and the image data is read from the file named by the -i or --input argument. For example:
root@localhost# hocr2pdf -i scan.tiff -o test.pdf < cuneiform-out.hocr
(2) By default the text layer is hidden behind the real image data. Including the image data can be disabled with -n, --no-image, so that only the text recognized by the OCR is visible, e.g. for debugging or to save storage space:
root@localhost# hocr2pdf -i scan.tiff -n -o test.pdf < cuneiform-out.hocr
(3) Imprecise OCR data or justified text can leave large gaps between the letters of individual words, which can be a problem. hocr2pdf in ExactImage includes a special mode, activated with the command line argument -s, --sloppy-text, that groups glyphs between whitespace into words, which can help PDF viewers produce better results when cutting and pasting text:
root@localhost# hocr2pdf -i scan.tiff -s -o test.pdf < cuneiform-out.hocr
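Since the three calls above differ only in their optional flags, they can be wrapped in one small shell function. This is just a sketch: `make_pdf` and the `DRY_RUN` variable are names made up for this example, and only the -i, -o, -n and -s options described above are assumed to exist.

```shell
# Hypothetical wrapper, not part of ExactImage.
# Usage: make_pdf IMAGE HOCR OUTPUT [hocr2pdf flags such as -n or -s]
# With DRY_RUN=1 the command is only printed, not executed.
make_pdf() {
    image=$1; hocr=$2; output=$3
    shift 3
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Rebuild the command line as text so it can be inspected first.
        cmd="hocr2pdf -i $image"
        for flag in "$@"; do
            cmd="$cmd $flag"
        done
        echo "$cmd -o $output < $hocr"
    else
        # The hOCR file goes to STDIN, the image is passed with -i.
        hocr2pdf -i "$image" "$@" -o "$output" < "$hocr"
    fi
}
```

Setting DRY_RUN=1 and calling `make_pdf scan.tiff cuneiform-out.hocr test.pdf -n -s` only prints the command, which is handy for checking the flags before a long batch run.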
(0) Exact-Image.8.8 files
(1) Plain image (TIFF file)
(2) hOCR generated text
(3) hocr2pdf script, which is called for every OCR run
(4) OCR searchable text in a PDF viewer
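PNG error
Note: the libpng12 support in ExactImage is deprecated and causes a compilation error, so it is better to delete "png.hh" and "png.cc" from the codecs directory of the source tree before building:
root@localhost# cd exact-image.8.x/codecs
root@localhost# rm -f png.hh png.cc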
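If you would rather not delete the files outright, a gentler variant is to move them out of the source tree so they can be put back later. This is only a sketch: `park_codec` is a hypothetical helper, and the backup directory is whatever location you choose.

```shell
# Hypothetical helper, not part of ExactImage.
# Usage: park_codec CODECS_DIR CODEC_NAME BACKUP_DIR
# Moves CODEC_NAME.hh and CODEC_NAME.cc out of CODECS_DIR into BACKUP_DIR
# instead of deleting them; files that do not exist are skipped.
park_codec() {
    codecs_dir=$1; name=$2; backup_dir=$3
    mkdir -p "$backup_dir" || return 1
    for ext in hh cc; do
        file="$codecs_dir/$name.$ext"
        if [ -f "$file" ]; then
            mv "$file" "$backup_dir/" || return 1
        fi
    done
}
```

For example, `park_codec exact-image.8.x/codecs png /root/exact-image-backup` keeps both PNG files in the backup directory, and moving them back restores the original tree.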