Re: Transparency and IP

From: Samantha Atkins (samantha@objectent.com)
Date: Wed Sep 13 2000 - 00:53:28 MDT


David Lubkin wrote:
>
> On 9/12/00, at 3:45 PM, hal@finney.org wrote:
>
> >Whether or not this was truly a concern of David Brin several years ago,
> >it has apparently become an issue for the industry as a whole. Still it
> >seems that novelists have several years breathing room before they have
> >to worry. It is hardly practical today to unbind, scan and text-convert
> >books into electronic form.
>
> Au contraire. First, from a technical standpoint, it's trivial. Chop
> the spine off a book. Put the pages on a scanner with a document feeder.
> Use a batch-mode OCR program. Anyone can do it for a few hundred dollars.
>

Somehow I doubt you have actually tried this. The results are not very
good at all. You would have to position the pages well (no mean trick,
as they usually aren't exactly 8-1/2 x 11), flip the stack of pages over
to scan both sides, and have the OCR or post-processing paste the
results together into a continuous narrative. Then you would have to
post-process a lot more with a really good dictionary/grammar program to
try to fix the 10% or so minimum OCR errors likely from the process thus
far, and you would still have a pretty major editing job to make the
results really good.
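
To make the "paste the results together" step concrete, here is a rough
Python sketch. It assumes a single-sided document feeder, so the fronts
come out in order on the first pass and, after the stack is flipped, the
backs come out last sheet first. It also assumes the OCR program hands
back one string of text per page. The function names are hypothetical
and the hyphen handling is a crude heuristic, not what any real OCR
package does.

```python
def interleave_duplex(fronts, backs_as_scanned):
    """Merge two single-sided scan passes into reading order.

    fronts: page texts from the first pass (pages 1, 3, 5, ...).
    backs_as_scanned: page texts from the second pass, made after
    flipping the whole stack, so the back of the LAST sheet comes first.
    """
    if len(fronts) != len(backs_as_scanned):
        raise ValueError("both passes must cover the same number of sheets")
    # Undo the reversal caused by flipping the stack.
    backs = list(reversed(backs_as_scanned))
    pages = []
    for front, back in zip(fronts, backs):
        pages.append(front)
        pages.append(back)
    return pages


def join_pages(pages):
    """Paste page texts into one continuous narrative.

    Assumption: a page ending in "-" is a word split across the page
    break. This will also wrongly join genuine hyphenated compounds.
    """
    text = ""
    for page in pages:
        page = page.strip()
        if not page:
            continue  # skip blank pages
        if text.endswith("-"):
            text = text[:-1] + page
        elif text:
            text = text + " " + page
        else:
            text = page
    return text
```

Calling join_pages(interleave_duplex(fronts, backs)) gives one block of
text in reading order. It does nothing about misaligned pages or the OCR
errors themselves, which is where most of the work remains.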

If you know better I would very much like to know how to improve on
this. I've been wanting my library online for many years now.

- samantha


