Tag: OCR
Linux OCR with Tesseract
I'm scanning old Flor y Fauna newsletters for my Dutch Hardwood Investment Wiki. I need to do this because most of these newsletters, although produced digitally, are available in the Sicirec archive only in paper form. The only graphical item these newsletters sport is a simple header, so I want to convert the scans to text and put the text in a wiki article for each newsletter; I don't want to upload dozens of image-heavy PDFs just to show the original (crappy) layout.
