simply-thai.com - Thai Market

Simply Thai Computers

Digitizing Documents
Digitizing is the process of turning real-world, physical information into a format that computers can recognize. The converted information is called digital because it consists of ones and zeros (binary digits).
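As a small illustration of "ones and zeros", the sketch below prints the bit pattern a computer stores for a short piece of text. The function name `to_bits` is made up for this example, and UTF-8 encoding is an assumption; other encodings would give different bits for some characters.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


# Each letter becomes one byte, i.e. eight binary digits.
print(to_bits("OCR"))  # prints: 01001111 01000011 01010010
```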

Turning paper and other hard-copy documents into digital, editable text files requires a scanner and OCR (Optical Character Recognition) software. Without such software, scanned documents are seen only as images (pictures) that cannot be edited. Worse, image files (e.g. GIF and JPEG) are significantly larger than text files, so transfer times (uploading and downloading) on the Internet are considerably greater. It is, therefore, essential to convert scanned documents into editable text files, such as Microsoft Word documents, before sending them over the Internet.
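To give a rough sense of the size gap, here is a back-of-the-envelope calculation. All the figures are assumptions chosen for illustration, not measurements from our tests: a letter-size page (8.5 x 11 inches), scanned at 300 dpi in 24-bit color and stored uncompressed, compared with about 3,000 characters of plain text at one byte each. Compressed formats such as JPEG are far smaller than the raw figure, but typically still much larger than the text itself.

```python
def raw_scan_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Size in bytes of an uncompressed scan of a page (assumed dimensions)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


def plain_text_bytes(characters: int, bytes_per_char: int = 1) -> int:
    """Size in bytes of plain text, assuming a fixed number of bytes per character."""
    return characters * bytes_per_char


scan = raw_scan_bytes(8.5, 11, 300, 24)   # assumed page size, resolution, color depth
text = plain_text_bytes(3000)             # assumed character count for one page

print(f"raw scan:   {scan:,} bytes")      # 25,245,000 bytes
print(f"plain text: {text:,} bytes")      # 3,000 bytes
print(f"ratio:      about {scan // text:,} to 1")
```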

What is OCR?
Optical character recognition (OCR) is the process of turning a scanned image into computer-editable text so that you do not have to retype the text manually. When initially scanned, text documents are nothing more than electronic pictures, or photographs, each of which is composed of many tiny dots (pixels). The characters you see in such images cannot be edited, because word-processing programs are unable to recognize them as alphanumeric characters. In short, that contract you just finished scanning for editing purposes is seen only as one very large image, and might just as well have been a picture of a tree. To create an editable text file, your scans must be processed through an OCR program.
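The idea of turning dots into characters can be shown with a toy example. The sketch below is not how OmniPage or TextBridge work internally (commercial OCR engines are far more sophisticated); it is a minimal template-matching illustration under strong assumptions: every glyph is exactly 3 pixels wide and 5 high, glyphs are separated by one blank column, and only the four letters in the hypothetical `TEMPLATES` table exist.

```python
# Toy 3x5 bitmaps: '#' is a dark pixel, '.' is a light pixel.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}
GLYPH_W = 3   # assumed fixed glyph width in pixels
GLYPH_H = 5   # assumed fixed glyph height in pixels


def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def render(text):
    """Draw `text` as a bitmap: glyphs side by side, one blank column between them."""
    return [".".join(TEMPLATES[ch][r] for ch in text) for r in range(GLYPH_H)]


def recognize(image):
    """Cut the bitmap into fixed-width glyphs and pick the closest template for each."""
    result = []
    for x in range(0, len(image[0]), GLYPH_W + 1):
        glyph = [row[x:x + GLYPH_W] for row in image]
        best = min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))
        result.append(best)
    return "".join(result)


page = render("LOT")        # stands in for a scanned image
for row in page:
    print(row)
print(recognize(page))      # prints: LOT
```

Because each glyph is matched to its nearest template rather than an exact copy, a single stray pixel (a speck of dust on the scanner glass, say) does not necessarily change the result.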

The OCR Test
Simply-Thai.com Computers has tested the two most popular programs on the market today: OmniPage Pro by Caere Corp, and TextBridge Pro Millennium by ScanSoft.

An HP ScanJet 6300C USB flatbed scanner was used to perform the tests.

The first test involved a simple text document, with each program responding almost exactly the same, taking roughly 30 seconds to convert the scanned page into an editable MS Word document. As well, character (text) recognition was flawless, with both OmniPage and TextBridge returning a score of 100% accuracy. This is a remarkable achievement when compared to the early days of OCR technology.

Again, this test involved a simple typed document containing no pictures.
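For readers who want to score their own scans, character accuracy is commonly computed as one minus the number of character errors divided by the length of the correct text, with errors counted as the edit distance between the two. The sketch below follows that approach; it is an assumption about how one might measure accuracy, not a description of how the scores above were obtained, and the function names are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Fraction of the reference text recovered correctly, between 0.0 and 1.0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1 - errors / len(reference))


print(character_accuracy("contract", "contract"))  # 1.0
print(character_accuracy("contract", "c0ntract"))  # 0.875 (one wrong character out of eight)
```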

The second test involved a page from a magazine containing both text and pictures in a multi-column layout.

Once again, the two programs took about the same length of time to process the scan: about 60 seconds in total, double that of the simple text document. Character recognition was again flawless, with each utility registering a perfect score.

The difference was in replicating the layout of the magazine page. While not perfect, OmniPage retained the characteristics and layout of the original document, reproducing the columns and images on a single page in Microsoft Word. TextBridge, by contrast, spread the scan over two pages of an MS Word document, failing to replicate the exact appearance of the original. On closer inspection, TextBridge had inserted unnecessary line breaks, which lengthened the document.

Newsworthy: On March 13, 2000, ScanSoft acquired the assets of Caere Corporation, with the result that this same company now owns the two most popular OCR programs on the market.

Conclusion
For speed and accuracy, the two OCR programs were found to be virtually identical. If replicating the layout of multi-column articles is your thing, though, you might want to consider OmniPage Pro. Based on our own tests, this particular utility would appear to have a slight edge over TextBridge Pro, but certainly not enough to justify a price tag of $499.00 versus only $79.99 for TextBridge Pro. In fact, you'd pretty much have to be experiencing a total electrical blackout above the shoulders to fork over the additional $400.00 plus for OmniPage Pro.

Conclusion: stick with TextBridge; you'll be much richer for the experience.

To learn more about these products, as well as a sister utility known as OmniForm, visit the ScanSoft web site.

 



copyright - 1998-2008 - simply-thai.com - Privacy Policy
 All Rights Reserved

We are Hosted by
www.hostingbangkok.com