
Tag: gscan2pdf

Linux OCR with Tesseract

I'm scanning old Flor y Fauna newsletters for my Dutch Hardwood Investment Wiki. I need to do this because most of these newsletters, although produced digitally, are available in the Sicirec archive only in paper form. The only graphic these newsletters sport is a simple header, so I want to convert the scans to text and put that text in a wiki article for each newsletter; I don't want to upload dozens of image-heavy PDFs just to show the original (crappy) layout.