Tag: scanning
Linux OCR with Tesseract
I'm scanning old Flor y Fauna newsletters for my Dutch Hardwood Investment Wiki. I need to do this because most of these newsletters, although produced digitally, are available in the Sicirec archive only in paper form. The only graphical item these newsletters sport is a simple header, so I want to convert the scans to text and put the text in a wiki article for each newsletter; I don't want to upload dozens of image-heavy PDFs just to show the original (crappy) layout.
Removing unwanted grey values in scanning white papers
When doing automated scanning, like I do for properly organizing paper administration, the resulting images can get quite large because the background contains near-white detail that compresses poorly. ImageMagick has a nice solution for that: -white-threshold x%, which forces every pixel brighter than x% to pure white. It also has -black-threshold, should that be necessary.
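To make the option concrete, here is a minimal sketch of how a scan could be cleaned from a script. Everything except the -white-threshold option itself is an assumption for illustration: the helper names, the file names, the 90% default and the use of the classic `convert` executable (ImageMagick 7 installs usually call it `magick` instead).

```python
import shutil
import subprocess


def white_threshold_command(src, dst, percent=90, tool="convert"):
    """Build the ImageMagick argument list that whitens near-white pixels.

    `percent` and `tool` are illustrative defaults, not values from the post.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Pixels brighter than `percent`% are forced to pure white.
    return [tool, src, "-white-threshold", f"{percent}%", dst]


def clean_scan(src, dst, percent=90):
    """Run the command if the tool is on PATH; return True if it ran."""
    cmd = white_threshold_command(src, dst, percent)
    if shutil.which(cmd[0]) is None:
        return False  # ImageMagick not found, nothing was done
    subprocess.run(cmd, check=True)
    return True
```

Calling `clean_scan("scan.png", "scan-clean.png", 85)` would run `convert scan.png -white-threshold 85% scan-clean.png`. The right percentage depends on the scanner and the paper, so it is worth trying a few values on one page before processing a whole batch.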