Help! Text boxes into text?
Thread poster: Eleni Makantani
Eleni Makantani
Greece
Local time: 17:41
English to Greek
Apr 23, 2012

Hello everyone,

I am working on an MS Word file that would have a high repetition rate if processed in Trados, which I would very much like to use.

The only problem is that all the text in that file is in text boxes (apparently it was a PDF file converted to Word by the client using OCR; the PDF is not available), which makes it extremely hard to work with in Word+Trados. It cannot be worked on in TagEditor either, as there appear to be a million tags, even between single words.

The question is: do you know any way to convert text boxes into plain text without losing their formatting or content?

Many thanks for any answer!


 
Jean Lachaud
United States
Local time: 10:41
English to French
try Werecat Apr 23, 2012

http://www.volny.cz/ddaduc/werecat.html

Please read the warnings carefully.

I use Werecat with Wordfast, so if it actually works with your version of Word, it ought to work with Trados too, I suppose.

FWIW, I used Werecat recently in Word 2010/Win 7, and it worked as usual, but it involved a limited number of text boxes.


 
Tony M
France
Local time: 16:41
Member
French to English
SITE LOCALIZER
Werecat Apr 23, 2012

PDF OCR to DOC is a pain when it uses text boxes to 're-create' the formatting! Much better to turn off the option at the time of conversion, but of course, by the time we get to see it, it's too late.

Werecat is a very helpful little utility, downloadable freeware, which was originally designed for Wordfast users. It extracts text from text boxes (with tags) into a Word .DOC that you can then translate as normal, and then it will put the text back into the right places for you!
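
For readers who want to see what "extracting text from text boxes" can look like, here is a minimal Python sketch. It is not how Werecat works, and it rests on an assumption the thread does not state: that the file has been saved as .docx, where text-box content is stored in `w:txbxContent` elements inside `word/document.xml`. The function name `textbox_paragraphs` is made up for this illustration. It only reads the text; it drops formatting and does not put anything back.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by .docx files (WordprocessingML and markup compatibility)
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
MC = "http://schemas.openxmlformats.org/markup-compatibility/2006"


def _paragraph_text(paragraph):
    # Join the text of every w:t run inside one paragraph
    return "".join(t.text or "" for t in paragraph.iter(f"{{{W}}}t"))


def textbox_paragraphs(docx_file):
    """Hypothetical helper: return one list of paragraph strings per text box.

    Assumes a .docx file (path or file-like object), not a legacy .doc.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    boxes = []

    def walk(node):
        for child in node:
            if child.tag == f"{{{MC}}}Fallback":
                # Word often stores a legacy copy of the same box here; skip it
                continue
            if child.tag == f"{{{W}}}txbxContent":
                boxes.append(
                    [_paragraph_text(p) for p in child.iter(f"{{{W}}}p")]
                )
            else:
                walk(child)

    walk(root)
    return boxes
```

Calling `textbox_paragraphs("job.docx")` (a hypothetical file name) would give one list per text box, in document order, which can be handy for a quick word count or repetition check before deciding on a workflow.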

There are some provisos: you mustn't add or remove any hard returns, otherwise this messes up the re-insertion; if your translation makes this unavoidable, then you MUST repair them after cleaning and before re-insertion.
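
The hard-return rule above can be checked mechanically before re-insertion. The following is a rough Python sketch, not part of Werecat: it assumes you already have the source and translated text of each text box as plain strings, and the helper names `count_hard_returns` and `hard_return_mismatches` are invented for this example.

```python
def count_hard_returns(text):
    """Count paragraph breaks, treating \r\n, \r and \n as one break each."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.count("\n")


def hard_return_mismatches(source_boxes, target_boxes):
    """Return (index, source_count, target_count) for every text box whose
    number of hard returns changed during translation."""
    if len(source_boxes) != len(target_boxes):
        raise ValueError("source and target must contain the same number of text boxes")
    mismatches = []
    for index, (src, tgt) in enumerate(zip(source_boxes, target_boxes)):
        src_count = count_hard_returns(src)
        tgt_count = count_hard_returns(tgt)
        if src_count != tgt_count:
            mismatches.append((index, src_count, tgt_count))
    return mismatches
```

An empty result means no hard returns were added or removed; any entry points at a box to repair before putting the text back.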

Otherwise, it works superbly well for .DOC and .PPT files (at least up to Office XP; I don't know about the latest versions...)

If you are not sure of yourself, feel free to send me the files and I'll pre- and post-process them for you.


 
Eleni Makantani
Greece
Local time: 17:41
English to Greek
TOPIC STARTER
Thanks to both of you Apr 23, 2012

Thank you for your answers, I will certainly try out Werecat to see how it works. I also very much appreciate Tony's offer to help. In the meantime, I found my way around the problem:

I converted the problematic Word file back into PDF using the doPDF freeware, and then I OCR-ed it again, taking care to avoid text boxes. Inconvenient as it may sound, this procedure worked wonders! I guess that a cool head and imagination are first-rate qualities in our line of business...

Thank you again!

[Edited at 2012-04-23 21:53 GMT]


 
Maria Ramon
United States
Local time: 09:41
Dutch to English
Wordfast PRO Apr 24, 2012

Wordfast PRO works wonders when there are text boxes in Word documents.
That is what I would recommend using.


 
Sergei Leshchinsky
Ukraine
Local time: 17:41
Member (2008)
English to Russian
Try a smarter PDF -> DOC converter, Apr 24, 2012

... if you have the source PDF file.

(Try SolidDocuments PDFtoWord.)

[Edited at 2012-04-24 07:01 GMT]


 

