I have been converting PDF files, notes, web clippings and any other text to MP3 files so I can take them with me and listen on the go. Recently, I have had a couple of books that I simply could not finish, mainly because they are so rich in content. Instead of buying audiobook versions of books I already own, I figured there must be a way to convert them to MP3s. On Jan 31, I decided to do just that and converted a book. For this project, I used the following:
- X-Acto knife
- Optional Rototrim cutter
- Fujitsu ScanSnap S510M scanner with bundled ABBYY FineReader OCR software
- TextAloud text-to-speech conversion software
The whole process took me around 3 hours, with the largest share spent carefully cutting the book apart one page at a time. Here is the step-by-step:
- Cut each page. Using the X-Acto knife, cut each page out of the book as evenly as you can. In my case, I did not bother ensuring that every page came out the same size while cutting; afterwards, I used the Rototrim to trim all the pages to a uniform size. If you do not have a Rototrim, you just have to be more careful during cutting so that every page comes out uniform. I did a final check to ensure that all pages were uniform in size and in the correct order. This process took about an hour and a half to complete.
- Scan pages to searchable PDF. I put the S510M through its paces. I set the ScanSnap software to scan in duplex mode and save to searchable PDF (the S510M uses the ABBYY FineReader OCR software that came bundled). I then scanned 50 pages at a time, which is the maximum the sheet feeder could accommodate. I was amazed at how accurate the scans were. I managed to scan all 257 pages of the book without a single error. Yes, the S510M with ABBYY FineReader does an amazing job in this department. This process took only 30 minutes to complete.
- Remove page numbers and headers. I opened the searchable PDF file using Adobe Acrobat Professional 8.0, which also came bundled with the S510M. I selected all the text and pasted it into my favorite text editor, TextEdit. In TextEdit, I removed the header and page number that appeared on each page, then saved the edited file. I left the chapter numbers alone, since they come in useful in the next step. Another 30 minutes.
- Convert text to speech. With the text file from the previous step open, I fired up TextAloud. I then selected each chapter in the text file and pasted it into TextAloud for conversion. I had bought two optional voices for TextAloud: Kate and Paul. For this audiobook, I chose Paul. It took me another 45 minutes to complete this process.
- Bind the original pages of the book (optional). 15 minutes using a desktop binding machine.
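The header cleanup and the chapter-by-chapter selection were done by hand in TextEdit, but the same idea can be scripted. The following is only a rough sketch in Python, not what I actually used: the function names, the "Chapter N" heading pattern and the rule that a line containing nothing but a short number is a page number are all assumptions you would need to adjust for your own book.

```python
import re

# Assumed patterns (hypothetical): tune these for the book being cleaned.
# A line holding nothing but a 1-4 digit number is treated as a page number,
# so a genuine body line that is only a number would be dropped as well.
PAGE_NUMBER = re.compile(r"^\s*\d{1,4}\s*$")
CHAPTER_HEADING = re.compile(r"^\s*Chapter\s+\d+\b", re.IGNORECASE)


def strip_headers(text, running_headers):
    """Drop page-number lines and lines that exactly match a known running header."""
    headers = {h.strip().lower() for h in running_headers}
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if PAGE_NUMBER.match(stripped):
            continue  # page number on its own line
        if stripped.lower() in headers:
            continue  # running header repeated at the top of a page
        kept.append(line)
    return "\n".join(kept)


def split_chapters(text):
    """Split cleaned text into chunks, starting a new chunk at each chapter heading."""
    chapters = []
    current = []
    for line in text.splitlines():
        if CHAPTER_HEADING.match(line) and current:
            chapters.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chapters.append("\n".join(current).strip())
    # Discard chunks that ended up empty after stripping whitespace.
    return [c for c in chapters if c]
```

Each chunk returned by `split_chapters` could then be saved to its own text file and converted one chapter at a time. Any text before the first chapter heading (title page, preface) ends up as its own chunk at the front of the list.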
In about 3 hours from start to finish, I was done with the project, with all 13 chapters of the book stored on my iPhone. Each chapter averaged about 5 MB once converted to MP3.
I checked Amazon and found that I have had the book since October, yet in more than 2 months I had not managed to finish reading it. With the MP3s of the book on the iPhone, I am now on my second reading.
Do you have any book worth digitizing?
Fantastic man!
I read your "How to digitalize" post and I was deeply impressed.
I'm currently living in Japan, and I ordered an ebook reader called BeBook and I'm in love.
I have access to tons of English books online, but Japanese PDFs are very hard to come by. So I'm thinking of digitalizing them to see what happens. Does your machine digitalize Japanese text into searchable text?
I'm going to see if your method is a fast way to convert my Japanese libraries to PDF so I can take them on my ebook reader.
If you have any more tips on what machinery you are using, how you have improved on your methods, or resources, feel free to contact me!!
On the road to digitalization.
Posted by: Exrulez | May 30, 2009 at 08:56 PM
I think you have a thorough understanding of this matter. You describe everything in detail here.
Posted by: RamonGustav | August 24, 2010 at 01:01 PM
Happy New Year! I liked it and hope the author writes more.
Posted by: school_dubl | December 29, 2010 at 09:16 AM