Lehigh University


Daniel P. Lopresti

“Robust Document Image Understanding Technologies” (with H. Baird, B. Davison, and W. M. Pottenger), Proceedings of the First ACM Workshop on Hardcopy Document Processing (in association with the Thirteenth Conference on Information and Knowledge Management), November 2004, Washington, DC, pp. 9-14.


Abstract
No existing document image understanding technology, whether experimental or commercially available, can guarantee high accuracy across the full range of documents of interest to industrial and government agency users. Ideally, users should be able to search, access, examine, and navigate among document images as effectively as they can among encoded data files, using familiar interfaces and tools as fully as possible. We are investigating novel algorithms and software tools at the frontiers of document image analysis, information retrieval, text mining, and visualization that will assist in the full integration of such documents into collections of textual document images as well as "born digital" documents. Our approaches emphasize versatility first: that is, methods which work reliably across the broadest possible range of documents.





© 2004 P.C. Rossin College of Engineering & Applied Science
Computer Science & Engineering, Packard Laboratory, Lehigh University, Bethlehem PA 18015