Internet Recipe Hunting and Recipe Reformatting Tips III – Tuning a Multiple-Page Complex PDF File
By Harry {doc} Babad and Julie M. Willingham
Okay, now comes the hard part: Our objective is fixing a multiple-page PDF recipe to compact it and make it easy to use. [NOTE: I’ve now tried all of these methods in Acrobat Pro 8 and they work well, and the interface is faster. Killing frames is more accurate and, in general, editing goes smoother. …doc]
Reminder: There are three reasons to tune a recipe that you printed to PDF: to trim unneeded information from the recipe, shrinking the file size; to add an image to the recipe (see Part I of this tutorial); and to modify any part of the recipe, such as changing ingredients, modifying cooking instructions, or adding background information.
For this part of my tutorial, I’ll work with Darlene Schmidt’s Easy Thai Green Curry Chicken, a multiple-hotlink recipe from http://thaifood.about.com/od/thairecipes/ss/greencurry.htm.
§ § § § § § § § § § § § § § § § § §
Introduction
Darlene’s web-posted recipe provides us with detailed and well-illustrated information on making this wonderful Thai dish. The information is aimed at beginner-level cooks. By illustrating techniques, as well as the core details of the recipe, it’s just the thing to get you started in Thai cooking.
Core Recipe Details, You Ask — These include the recipe’s background (at times), an ingredients list, preparation of ingredients, cooking procedure, and, sometimes, an image of the final dish. Some sites also provide nutritional information by portion of the final recipe.
Darlene’s Web Pages: The downloaded group of eight links contains about 50 to 80% duplicated and, to me, unneeded information. My download resulted in 16 pages of material. Note that the number of pages you get when printing a website to PDF depends on the font size you’ve selected in your browser to view the web page. Since I always enlarge the size of the web image before printing, usually by two sizes, I may get more pages than you would. My excuses: the file is easier to read, and, when downloaded at the larger font size, easier to manipulate.
I could have copied the essentials into an MS Word document by dragging-and-dropping text and/or images from the web page. It is a fast and easy way to capture and trim complex web-based material into a more pleasing format and more focused document, but you lose most of the original’s formatting. (See Part II of this article.)
What follows is not line-by-line or page-by-page instruction (e.g., an operating procedure). It’s about cleaning up a complex, multi-page recipe you’ve downloaded as a PDF, by removing extraneous material. Then I illustrate how to complete the customization by shifting content from the remaining pages to the white space now in the PDF to further reduce its size. I provide only stepwise notes in a sequence that you can use to guide your efforts. (Not even I can get away with a 25-30-page macC article—our editor is very firm on that.)
Introduction To The Tools Of The Trade
I use Adobe Acrobat Professional as my primary tool when modifying PDFs, but all of the changes I describe can be done using Acrobat Standard. (I own the professional version for its other features, which are unrelated to recipe manipulation.) I also sometimes use software for cropping images or changing their resolution, such as Photoshop Elements or GraphicConverter. There are also shareware tools that allow PDF file modification; check them out at the MacUpdate site.
Okay, Here’s What We’re Going To Do
If you were doing this for real, rather than as a tutorial, you would download each of the eight links as separate PDF files, making sure their names are distinct. (Save them to your desktop by printing them to PDF.) Fortunately, only two things matter when working with a complex PDF file collection: (1) the methods you’ll learn to use, and (2) an image, in your mind, of how you want the final recipe to look.
The examples that follow in this Part III tutorial use only the first three downloaded links from Darlene’s recipe. To collect only the minimum needed recipe information, use the “print this page” feature provided on the About.com website. This feature removes sidebar material and, at times, images. (Okay, as you read in Part I, it’s easy to add an image back into a PDF.) But even with “print this page,” lengthy hotlinked recipes still carry too much extraneous material for my taste.
List of Preparation Topics (Hotlinks)
Thai Green Curry Ingredients
Pounding Coriander Seeds
Processing the Ingredients
Processed Ingredients for Thai Green Curry Chicken recipe
Simmering Oil, Paste, and Coconut milk in Wok
Boiling Chicken - Thai Green Curry
Adding Peppers - Thai Green Curry
Thai Green Curry Chicken
You can also download each PDF recipe part, edit each individually, and combine the results when you’re done. It does not make a difference in which order you work on the recipe pieces. But I prefer to work with a combined PDF because it is easier for me to be consistent in the changes and deletions I make.
Combining PDF Files: When combining PDFs, I use Acrobat’s “create PDF from multiple files” feature [File > Create PDF > From multiple files]. This is less time consuming than importing the sections one at a time.
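If you would rather script the combining step than click through Acrobat, it can be sketched in Python. This is only a sketch under assumptions that are mine, not part of the Acrobat workflow: it assumes the third-party pypdf library is installed, and the file names are hypothetical stand-ins for whatever distinct names you gave your downloads. The helpers natural_key, merge_order, and combine_pdfs are names I made up for illustration; the sort just keeps a step 10 from landing ahead of step 2.

```python
import re


def natural_key(name):
    """Sort key that compares embedded numbers by value, not as text."""
    parts = re.split(r"(\d+)", name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def merge_order(names):
    """Return the file names in the order they should be combined."""
    return sorted(names, key=natural_key)


def combine_pdfs(names, output_path):
    """Append each PDF, in step order, to one combined file.

    Assumes the third-party pypdf library. It is imported here so the
    ordering helpers above still work if pypdf is not installed.
    """
    from pypdf import PdfWriter

    writer = PdfWriter()
    for name in merge_order(names):
        writer.append(name)  # adds every page of that file
    writer.write(output_path)


# Hypothetical file names, shown only to illustrate the ordering.
print(merge_order(["greencurry-step10.pdf",
                   "greencurry-step2.pdf",
                   "greencurry-step1.pdf"]))
```

Calling combine_pdfs with your own list of downloaded files and an output name should give one combined PDF to edit, much as the Acrobat menu command does.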
The Hard Way - One Page at a Time
Now all we have to do is edit the pages using Advanced Editing Tools in Acrobat. For the purposes of this article, I’m going to use screen shots of recipe topics 1-3. These are the first three hotlinks in the Thai Green Curry Chicken recipe. Page 1 is the most complex. It contains introductory material, as well as the ingredients list. It also contains credits for both the recipe author and the site. Pages 2 and 3 each contain only an image and a short description for a simple preparation instruction. The rest of the content of these pages is replicated information from Page 1.
An Overview Of What We’re Going To Do
Using Acrobat’s Advanced Editing Tools, you will simplify the PDF page you’ve downloaded. You will also use cut-and-paste operations, and mouse around to select the sections of the PDF you want to modify or delete.
Acrobat Tools Used — You will primarily use the Touchup Objects [TTT] tool. To change text wording, the tool of choice is the Correct Textual Mistakes [TxT] tool. To add an external image to a PDF, the Hand Tool is essential because the TTT does not work. [See Part I for details.]
The Cleanup Sequence:
Darlene’s recipe Steps 1-3, when combined, actually filled parts of 6 PDF pages at the font size I selected in my browser.
Semantics Can Be a Drag: I’ll be using the term “PDF pages,” as opposed to the actual number of hotlinks (preparation steps). When you view the recipe steps on the website, each web page simply scrolls off the screen, but when printed it may be longer than a single PDF page. In the tasks that follow, I use the word “pages” to refer to the actual material I downloaded (i.e., printed to PDF). Be aware that even though I start this tutorial with 6 pages of raw material, my final recipe will be only two pages long.
Step 1, Dump the Unneeded Pages: First find pages that have nothing useful on them. Of the initial 6 pages, I deleted pages 4 and 6, whose only useful information was the copyright statement:
“©2006 About, Inc., a part of The New York Times Company. All rights reserved.”
Since I wanted to put the copyright information at the end of my final document, I kept it on Page 2 of 6. So, now I’m down to only 4 pages.
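For readers who like to check the page arithmetic, or to do the page dumping outside Acrobat, here is a minimal Python sketch of this step. It is my own illustration, not what I did in Acrobat: it assumes the third-party pypdf library is installed, and pages_to_keep and drop_pages are hypothetical helper names. Page numbers are counted from 1, as in this article.

```python
def pages_to_keep(total_pages, pages_to_delete):
    """Return the 1-based page numbers that survive the deletion."""
    return [n for n in range(1, total_pages + 1) if n not in pages_to_delete]


def drop_pages(source_path, output_path, pages_to_delete):
    """Write a copy of the PDF without the listed pages.

    Assumes the third-party pypdf library. It is imported here so the
    page arithmetic above can be checked without pypdf installed.
    """
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(source_path)
    writer = PdfWriter()
    for n in pages_to_keep(len(reader.pages), pages_to_delete):
        writer.add_page(reader.pages[n - 1])  # pypdf counts pages from 0
    writer.write(output_path)


# Six pages, with pages 4 and 6 dumped, leaves four pages.
print(pages_to_keep(6, {4, 6}))
```

The surviving pages are original pages 1, 2, 3, and 5, which become pages 1 through 4 of the trimmed file.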
Step 2, Delete Unneeded Material From The Remaining Pages — These pages contain both the Thai Green Curry Chicken links list from the original web page(s) and links to other Thai recipes that Darlene has posted. They also contain some added navigation links to other parts of About.com. This is material I don’t want in my final recipe.
In addition, I wanted to keep the copyright information for later use. Page 2 contained no unique site or author header material. Note: I could have copied the copyright material to my multi-clipboard tool so it wouldn’t get lost, and then tossed page 2, but I didn’t.
I kept that statement on page 2 until I could move it to the last page of the final recipe. This was one of my last steps because I was not sure how the last page of the final recipe would be formatted, or how much space it would contain.
A Header Full of “Junk” I Don’t Need
Unneeded Links to Other Recipes
When I first tried to delete the step lists (list of links) on pages 2, 3, and 4, I ran into a problem: the TTT selected more text than I wanted to delete.
Oops! What I achieved was a frame-like (box) boundary around an artificial grouping that was an artifact of the web page structure that was replicated in my PDF. Okay, there’s a simple fix: drag the frame (selection box) “out” of the way. Since none of the individual words in the box were highlighted, dragging the frame away did not move text.
Selecting the Link List for Deletion – Trial One. I grabbed more text than I wanted but could not highlight the individual words to delete.
Success: After Dragging the Frame Out of My Way, I Was Able to Just Grab and Highlight the Links List.
This time the TTT highlighted all the words in the author’s list of links. Backspace and zap! It’s gone.
Then I selected, using the TTT, the rest of the unneeded links to other Thai recipes (e.g., desserts, curries, and more) on page 2. Zap! They’re gone. All I have left on the page are a few words for preparing the lemongrass, a part of the preparation instructions that I want to save. I also kept one copy of the copyright information for later use.
Sometimes using the TTT is like using a saber rather than a foil; you just “hit” too much of the target area. One elegant way to focus the TTT is to pretend you are using it to draw a line through only a line of the unwanted text (similar to doing a strikeout). Sometimes even that more focused method does not work. There is, however, a workaround that even more narrowly focuses the TTT. By holding down the Option key when “selecting” text to tweak, you select less material. But, once in a while, even that does not work completely. Two orphan underlines were left where I’d deleted the text. So you Option-TTT again and grab these underlines, at times working in a zoomed view.
As you move to pages 3 and 4, you’ll notice that they, in part, contain general header material that is a duplicate of that part of page 1. Get rid of this material also.
Image Of Some Unneeded Header and Other Material On Pages 2 And 3
Without the unneeded material, what is left are images of the grinding of the coriander and processing the ingredients. There are also a few words for a caption for each image. In addition, there are several lines of preparation instructions one needs to keep, as well as the phrases that might make good sub-headers or lead captions for each recipe section.
It took three to four additional TTT grabs to remove the unwanted text on each page, leaving me with lots of white space I can use later to shrink the recipe to two pages. (We still have 4 pages now.)
Note: Look carefully at the text or image you’ve selected. You don’t want to accidentally delete materials you need. Remember, Acrobat allows you to undo many mistakes. If you capture more material than you want to delete, try the grab again.
Cleaning Up Page 1: You’ll recall that each link Darlene provided contained all sorts of header material associated with About.com and the specific recipe. On page 1, I’d like to save some of this stuff because it better reflects the web page on which I found the recipe, but I want to make it pretty.
I have two tasks that I want to accomplish on page 1. First, I want to thin out the path information to this page. Second, I want to move the retained material into the newly created white space so I can copy the preparation methods (a few lines) from page 2 to page 1. This allows me to toss page 2, leaving behind a three-page recipe segment.
What You See Is What You Get, After Cleanup
TTT Actions Taken, A Sequence —
Move the About.com logo out of the way.
Delete the path information at the top of the page.
Move the words “Thai Food” to the upper left-hand corner.
Move the recipe title under “Thai Food.” [Save]
Delete the words “Your Guide to Thai Food” and “FREE Newsletter. Sign Up Now!”
Move Darlene Schmidt’s name next to “Thai Food.”
Grab the remainder of the recipe information and move it up on the page.
Delete Darlene’s name and “previous/next” from under the picture.
Copy Step 1 preparation information from page 2 and paste it into page 1. Drag the preparation material to where it belongs under the ingredients.
Now Pages 2 and 3 — I then needed to consolidate the pictures and short preparation steps in the now remaining pages 2 and 3.
TTT Actions Taken:
Select all the material in the center of the page and drag it to the top of the page. [Save.] Paste the contents of page 3 into page 2. I had to tweak the spacing a bit to make things fit after pasting. For that I used the Correct Textual Mistakes [TxT] tool.
Last Thoughts On Fixing Complex PDFs
When all was done, my PDF copy looked as good as the one I created in Part II using MS Word, although the process was slower than working in Word.
On the other hand, when working with a simple one- or two-page PDF, I can tweak a recipe faster in Acrobat than in Word. For merely adding a picture to a recipe that does not have one, Acrobat is easier to use. And using the TouchUp Text tool, it’s easy to break up a long line of text into two sections, leaving more space to paste a picture.
In conclusion, if you collect recipes online and don’t want them laden with the extra stuff that comes along with capturing whole, unedited pages, use these tools and shape the recipes to your liking. Both Microsoft Word and Adobe Acrobat work well for this purpose, each offering its own share of tools.
Use what works best for you. I have both, and therefore use both, for tuning recipes I may never cook but often want to share with my friends or vicariously savor.