Using the Services Menu

An often overlooked feature of Mac OS X is the Services architecture. Included in the Application menu of many Mac OS X applications is the Services sub-menu, which provides the means for transferring selected data from one application to another for processing. The Mac OS X version of SiteSucker takes advantage of the Services architecture by allowing other applications to download a selected URL using SiteSucker.

For example, to have SiteSucker download the current page displayed in Safari, select the page's URL in the address box at the top of the Safari window and choose "Download with SiteSucker" from the Services sub-menu under the Safari menu. You can also activate this service by selecting the address and pressing Command-Shift-S. Safari will send the selected URL to SiteSucker, and SiteSucker will download the URL using its current preference settings. If SiteSucker is busy downloading another site, the requested URL will be added to the queue.

Here are some troubleshooting tips in case you have problems getting this feature to work.

Using AppleScript

SiteSucker supports AppleScript, which can be used to set SiteSucker preferences and automate downloads. SiteSucker also provides a Script menu from which you can quickly run your scripts. When you select an AppleScript in the Script menu, SiteSucker runs it.

To add a script to this menu, create it using the Script Editor and save it as a compiled script in the Scripts folder inside the folder containing the SiteSucker application. You can also include aliases to compiled scripts in this folder. If you are running under Mac OS X and the Scripts folder does not exist, SiteSucker will search for scripts in ~/Library/Scripts/Applications/SiteSucker (where "~" is your home directory).

You can place any kind of script in the Scripts folder, including scripts that have nothing to do with SiteSucker. For example, you could add an "Empty Trash" script to the Scripts folder, and then whenever you select it in the Script menu, the Finder would empty the trash.

From the Script menu, you can also set up schedules that direct SiteSucker to run your scripts at some future date and time or run them periodically. For instance, you could create a schedule that runs your "Empty Trash" script every day at 9 AM. Or you could use SiteSucker as a backup utility by having it download your Web site every week. (For more information on setting up schedules, see Scheduling Scripts.)

The following sections describe how to use AppleScript to automate SiteSucker.

Downloading Files

The download command tells SiteSucker to start downloading. To have SiteSucker download the URL that is currently in the Web URL text field, simply enter the following command:

download

To have SiteSucker download a specific URL, enter a command like the following:

download "http://www.mysite.com/main.html"

If SiteSucker is already downloading a site, the URL will be added to the queue.
If the command includes a list of URLs, SiteSucker will start downloading the first URL and add the others to the queue. To download more than one URL, enter a command like the following:
download {"http://www.mysite.com/main.html", ¬

You can also specify which paths should be included or excluded from the download. To do this, you would enter a command like the following:
download "http://www.mysite.com/" including ¬

This command will download www.mysite.com and any files that it references which are within the http://www.theregister.com/2004/ directory without downloading files inside the http://www.theregister.com/2004/02/ and http://www.theregister.com/2004/03/05/ directories.

Note: The "including" and "excluding" options for the download command override the Paths settings in the SiteSucker preferences.

Sometimes you may not want your script to continue until the download is complete. For example, you might want to wait until all your files are downloaded before your script displays these files in a Web browser. To have your script wait until SiteSucker is finished downloading, enter commands like the following:
download "http://www.mysite.com/main.html"
repeat while downloading of front window

Opening a URL Clipping File

The open command can be used to open a URL clipping file. This will copy the contents of the clipping file into the Web URL text field. If the Drag Triggers Download preference is on, SiteSucker will start downloading the URL automatically. If the command includes a list of clipping files, SiteSucker will start downloading the first URL and add the others to the queue.

The following statements are examples of commands that can be used to open clipping files:

open file "Hard Disk:Clipping Files:mysite.webloc"
open {file "Hard Disk:Clipping Files:mysite.webloc", ¬

Getting and Setting SiteSucker Preferences

The set command can be used to set SiteSucker preferences. If SiteSucker is already downloading a site, the new preference settings will take effect after the download is finished. This lets you create a single script that sets up SiteSucker preferences for one site, downloads it, changes the preference settings for another site, and then downloads the other site.

To set SiteSucker preferences, enter commands like the following:

set log errors of preferences to true
set log warnings of preferences to false
set log history of preferences to false
set check all links of preferences to false
set localize html of preferences to false
set replace files of preferences to never replace
set replace files of preferences to always replace
set replace files of preferences to with newer
set drag triggers download of preferences to true
set ignore robot exclusions of preferences to false
set ambiguous URLs are files of preferences to false
set site login dialog of preferences to never show
set site login dialog of preferences to always show
set site login dialog of preferences to show when necessary
set identity of preferences to "None"
set identity of preferences to "SiteSucker"
set identity of preferences to "Safari"
set download folder of preferences to "Hard Disk:Sites:"
set ask before downloading of preferences to true
set limit to level of preferences to 1
set limit to level of preferences to 32767 (* No Limit *)
set limit to directory of preferences to no limit
set limit to directory of preferences to web url host
set limit to directory of preferences to web url directory
set minimum file size of preferences to 0
set minimum file size of preferences to 16
set download attempts of preferences to 2
set download timeout of preferences to 60
set download delay of preferences to 0
set download delay of preferences to 40
set file types of preferences to all types
set file types of preferences to include types
set file types of preferences to exclude types
set included extensions of preferences to "jpg gif css"
set excluded extensions of preferences to "zip mov mpg"
set included paths of preferences to ""
set included paths of preferences to ¬
set excluded paths of preferences to ¬
set excluded paths of preferences to ¬

To get a preference value and then restore it later, enter commands like the following:

set old_limit_to_level to limit to level of preferences
set limit to level of preferences to 3
download "http://www.mysite.com/main.html"
set limit to level of preferences to old_limit_to_level

Getting and Setting SiteSucker Window Information

You can use AppleScript to get and set SiteSucker window information. For example, you can create a script to move the SiteSucker window or find out how many files have been downloaded.

To set SiteSucker window information, enter commands like the following:

set bounds of first window to {144, 255, 1024, 474}
set web url of window 1 to "http://store.apple.com/"

You can use the following commands to display the status of the SiteSucker window:

set winBounds to bounds of window 1
set dialogText to "paused = " & paused of some window & return & ¬
display dialog dialogText buttons {"OK"} default button 1

Stopping a Download

To stop a download, simply enter the following command:

stop

Quitting SiteSucker

To quit SiteSucker, simply enter the following command:

quit

Scheduling Scripts
You can set up schedules for running your scripts by selecting Schedule Scripts from the Script menu. (For more information on creating scripts, see Using AppleScript.) This command displays the Schedule Scripts dialog, which shows a list of scripts and the date and time when each script is scheduled to run. If a script has a check mark next to its name, its schedule is enabled. You can use the following buttons to modify your schedules.

Add

Click the Add button to add a new schedule to the list. This will display the Set Schedule dialog. The dialog includes a pop-up menu that lists all the scripts in the SiteSucker Scripts folder located inside the SiteSucker application folder. Select a script and set a date and time when SiteSucker should run the script. You can also direct SiteSucker to run the script repeatedly at a regular interval. SiteSucker, however, will only run the script if the Enabled box is checked.
When a non-repeating schedule comes due, SiteSucker will run the script and then turn off the schedule's enabled flag. For a repeating schedule, it will run the script and then update the date and time when the script will run next. If you launch SiteSucker and a schedule is past due, SiteSucker will immediately run the script.

Edit

Click the Edit button to edit an existing schedule. This will display the Set Schedule dialog.

Remove

Click the Remove button to remove an existing schedule from the list.

Using a Local HTML File

SiteSucker allows you to use a local HTML file to tailor your download. For example, to download two levels from two different sites that link to one another and localize the files, you could create a simple HTML document like the following:

http://xyz.com/

Set the SiteSucker preferences as follows:
Now, drag your local HTML file to a browser like Safari to generate a file URL (e.g., file:///Volumes/...), then drag the file URL from the browser to the SiteSucker window to start the download. (You can save this file URL for later use by dragging it from the browser to the Finder to create a clipping file. Whenever you drag this clipping file into the SiteSucker window, the file URL will be copied into the Web URL text field.) SiteSucker will download two levels from both sites and fix the links so that they point to your locally downloaded files.

Here is another example of how to use a local HTML file to tailor your download. To download the contents of two different folders at the same site and localize the files, you could create an HTML document like the following:

http://www.xyz.com/archive/images/

Set the SiteSucker preferences as follows:
Now, drag your local HTML file to a browser like Safari to generate a file URL (e.g., file:///Volumes/...), then drag the file URL from the browser to SiteSucker to start the download. SiteSucker will download the contents of the "manuals" and "images" folders and fix the links so that they point to your locally downloaded files.

Assigning Applications to Downloaded Files

When running Mac OS 9, SiteSucker relies on the Internet Config settings to tell it how to save downloaded files. The Internet Config settings can be accessed in the File Exchange control panel. For example, the control panel might specify that a file with an "html" extension is a TEXT file that should be opened using the Internet Explorer application.

To assign a different application to HTML files, open the File Exchange control panel and click on the PC Exchange tab. You will see a list that shows extensions and how they are mapped to applications and file types. Scroll down the list until you see the "html" extension. If it's not in the list, click on the Add button to add it. If it's already there, click on it in the list to select it and then click on the Change button. In the dialog that's displayed, choose the application and file type you would like the extension mapped to. For example, you might click on Netscape Communicator in the application list and then choose TEXT from the File Type pop-up menu. You must quit and re-launch SiteSucker before these changes will take effect.

With the settings described above, SiteSucker will always create a Netscape Communicator TEXT document whenever it downloads a file that has an "html" extension. You should also use File Exchange to assign applications for "htm", "jpg", "gif", and other common extensions. For more information on file extensions, visit "Dot What? The file extension database" or "FileInfo.net - The File Extensions Resource".
When running under Mac OS X, SiteSucker does not assign a file type or creator to downloaded files but lets the Finder determine which application should open a file based on the file extension. To change the application that opens a file, select the file in the Finder and choose Get Info from the File menu. Click "Open with" in the Info window. Choose an application in the pop-up menu or choose Other to select an application not in the pop-up menu. To apply this change to all documents of this type, click the Change All button.

Downloading Password-protected Sites

To download a password-protected site, you can include your user ID and password in the address that's entered into the Web URL text field. The URL for a password-protected site should have the following form:

http://user:password@www.mysite.com/index.html

where "user" is your user ID and "password" is your password. If your user ID is an e-mail address, you will need to encode the @ symbol before you can include it in the URL. To do this, simply replace the "@" with "%40".

If you do not include this information in the address, SiteSucker will display a login dialog that you can use to enter your user ID and password. If you check "Remember this password" in the login dialog, the user ID and password that you enter will be stored in the Keychain.
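The URL form and the "%40" substitution described above can be sketched in Python. This is only an illustration of how such an address might be assembled before pasting it into the Web URL text field; it is not part of SiteSucker. The function name url_with_credentials and the sample user ID and password are hypothetical, and the sketch assumes the starting URL does not already contain credentials.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def url_with_credentials(url: str, user: str, password: str) -> str:
    """Return url with a percent-encoded user ID and password placed before the host.

    Assumes the input URL carries no credentials of its own.
    """
    parts = urlsplit(url)
    # safe="" makes quote() encode reserved characters such as "@", ":" and "/",
    # so an e-mail style user ID like jane@example.com becomes jane%40example.com.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical credentials, used only to show the encoding of "@".
print(url_with_credentials("http://www.mysite.com/index.html", "jane@example.com", "secret"))
```

Encoding the password as well as the user ID is a design choice of this sketch: characters such as ":" or "/" in either field would otherwise be misread as URL delimiters.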
The Site Login Dialog option in the preferences determines when the Site Login Dialog is displayed. If the preference is set to Never Show, SiteSucker will never log in to a site that requires a password and will always report an Unauthorized Access error for the site. If the preference is set to Always Show, SiteSucker will always display the Site Login Dialog for a site that requires a password and will fill in the user ID and password for the site if they were found in the Keychain. If the preference is set to Show When Necessary, SiteSucker will only display the Site Login Dialog if it was unable to find the login information in the Keychain. Otherwise, it will automatically log in to the site using the user ID and password found in the Keychain.

Using an HTTP Proxy

SiteSucker can send requests through an HTTP proxy server. To use this feature, simply set up the proxy server settings in the Network Preferences (Mac OS X) or Internet Control Panel (Mac OS 9) and SiteSucker will direct your requests to the specified proxy server. If the HTTP proxy server requires password authentication, SiteSucker will display a login dialog that you can use to enter your user ID and password.

How Cookies Are Handled

Simply put, a cookie is some text that a Web server stores on your hard disk. Cookies allow a Web site to store information on a user's machine and later retrieve it. For example, a Web site might use a cookie to store a unique ID number on each visitor's machine.

When running under Mac OS X, SiteSucker includes cookies in its server requests. When you first start a download, SiteSucker examines the Cookies.plist file in ~/Library/Cookies (where "~" is your home directory) for any cookies that were stored there by the site that you are trying to download. SiteSucker then adds any cookies that it finds to the download request.
Furthermore, if SiteSucker encounters a Set-Cookie header as part of an HTTP response, it adds this new cookie to subsequent downloads.
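To make the Set-Cookie behavior more concrete, here is a minimal Python sketch of the general idea: cookies received in Set-Cookie response headers are collected and then sent back in the Cookie header of later requests. It does not show how SiteSucker itself is implemented. The function name cookie_header_from_set_cookie and the sample cookie names and values are hypothetical.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_values):
    """Build a Cookie request-header value from a list of Set-Cookie header values."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        # load() parses "name=value; Path=/" and stores attributes such as Path
        # separately from the cookie's own name and value.
        jar.load(value)
    # Only name=value pairs are sent back to the server; attributes are not.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Hypothetical Set-Cookie values, as a server might send them in two responses.
received = ["visitor_id=12345; Path=/", "theme=dark; Path=/"]
print(cookie_header_from_set_cookie(received))
```

A real client would also honor each cookie's domain, path, and expiry before sending it; the sketch leaves those checks out to keep the request and response relationship visible.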