MAF

User Manual

Mozilla Archive Format is an add-on for the Firefox and SeaMonkey browsers that adds the ability to open and save web archives, and provides several improvements to the standard browser's save system.

For a quick overview, see the features page.

Note: This manual covers the latest version of Mozilla Archive Format. Some features may not be available in the version from the Firefox Add-ons website.

Introduction

Web archives are a convenient means to preserve web pages. They store all the text, images, and other resources of a web page in a single file. When a web archive is moved or renamed, the saved pages are unchanged.

Saving web archives

Mozilla Archive Format adds two new file type options to the Save As dialog box:

Web Archive, MAFF zipped

This option saves one or more pages inside a single MAFF archive. MAFF archives are compressed using the universal, cross-platform ZIP specification for saving multiple files in one archive.

MAFF archives can be opened in the browser. If multiple tabs were saved, opening a MAFF archive opens all the tabs, exactly the way they were when they were saved.

It is possible to view the original location from which the page was saved. The contents of the archive, including any embedded media files, can be inspected and extracted using any ZIP utility.
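
Because a MAFF archive is an ordinary ZIP file, its contents can also be inspected with a script. The following is a minimal sketch in Python using only the standard zipfile module; the function names are hypothetical, and nothing here is part of the add-on itself.

```python
import zipfile


def list_archive_entries(archive):
    """Return (name, original size, compressed size) for every entry.

    `archive` may be a path to a .maff file or an open binary file object.
    """
    with zipfile.ZipFile(archive) as zipped:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in zipped.infolist()
        ]


def extract_archive(archive, destination):
    """Extract every entry of the archive into the destination folder."""
    with zipfile.ZipFile(archive) as zipped:
        zipped.extractall(destination)
```

For example, calling list_archive_entries on a saved archive shows each stored file along with how much space compression saved for it.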

The Mozilla Archive Format extension generates MAFF archives using the fast, native ZIP implementation embedded in the Mozilla browser. The resulting files are usually smaller than the equivalent MHTML archives, and opening this kind of file is faster.

However, Microsoft Internet Explorer cannot open MAFF files natively.

Web Archive, MHTML

This web archive format, also known as MHT, is used by Microsoft's Internet Explorer browser. This option saves a single page inside a MIME HTML file, or MHTML archive.

MHTML files are encoded, not compressed. The encoding usually increases the size of the saved media files compared to the original. At present, the contents of an MHTML archive can be decoded by only a limited number of web browsers, or by using special utilities. However, MHTML archive format has the advantage that it can be shared with those who use only Internet Explorer.
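
The size increase can be illustrated with a short Python sketch. It assumes that media files are stored using base64, the usual MIME encoding for binary content; this manual does not state which encoding the add-on uses for each file, so treat the numbers as an illustration only.

```python
import base64


def base64_overhead(data: bytes) -> float:
    """Return the ratio of the base64-encoded size to the original size."""
    # encodebytes produces MIME-style output: lines of up to 76
    # characters, each followed by a newline.
    encoded = base64.encodebytes(data)
    return len(encoded) / len(data)


# Hypothetical stand-in for the bytes of an image file.
sample = bytes(range(256)) * 1000
ratio = base64_overhead(sample)
```

Base64 maps every 3 bytes of input to 4 characters of output, so the encoded data is roughly one third larger than the original, before counting line breaks.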

Additional information saved in web archives

When you save a page as a web archive, the following additional information about the tab or tabs saved is stored in the archive:

If you re-save an already archived page to a different file, the date and time saved, and the location of the original archive, are preserved.

Opening web archives

After Mozilla Archive Format is installed, web archives can be opened like any other saved web page.

Creating file associations

On Windows, Mozilla Archive Format can create file associations to open web archives by double-clicking the file names in Windows Explorer.

File associations can be controlled from the welcome page that is displayed when the extension is installed for the first time, or when the Refresh File Associations link is selected from the Actions options pane.

In the welcome page, you can choose whether file associations are created, for the MAFF and MHTML formats separately.

File associations are always created explicitly for the current user of the system. In addition, if the current user has administration privileges, default file associations for all users are also created.

File associations are not removed when uninstalling the add-on or the browser itself.

Viewing information about archived pages: Web page information panel

When you display an archived page, an additional icon appears in the address bar of the browser by default. To display information about a saved web page, first choose its tab, if more than one page was saved. Then left-click on the icon to display the following information about that page:

The original location is a link. You can left-click it to open the original page in the same tab, or you can use the appropriate key combinations to open the link in a new tab or a new window.

From the web page information panel you can choose Browse Open Archives. The Archives dialog provides additional information about all the web pages in the archive. The information displayed can be selected with the mouse and copied to the clipboard.

To display the icon in the lower right-hand corner of the status bar, choose the corresponding add-on option. You can also control the visibility and position of the icon from the interface options.

Integration with other extensions

One of the key features of Mozilla Archive Format is that it integrates not only with the browser, but also with other extensions.

Some of the other extensions, like UnMHT, must be installed separately. Other extensions, like Save Complete, are embedded and updated together with Mozilla Archive Format. Here is a list of extensions that inter-operate with the Mozilla Archive Format extension:

Multiple Tab Handler, by Shimoda Hiroshi

This extension adds a multiple selection interface and a new context menu to the Firefox tab bar.

Mozilla Archive Format integrates with the tab selection context menu and adds an entry to save the selected tabs in an archive. For MHTML archives, multiple files are created, while for MAFF archives all the tabs are saved in a single file.

Save Complete, by Stephen Augenstein

The Save Complete extension is integrated with Mozilla Archive Format, but must be enabled from the internal configuration settings.

This extension replaces the system used by the browser to save complete web pages. The new system correctly handles style sheets that reference image files. Without it, those images would not be saved, causing some pages to appear differently from the original.

File Title, by Pavel Cvrcek

The functionality of the File Title extension is also available from the Mozilla Archive Format options.

This extension replaces the default file name suggested in the Save As dialog box with the title of the page being saved.

Title Save, by gm

This extension is similar to File Title, but does not affect the default behavior. It adds a new item in the File menu to use the title of the page, instead of the file name, in the Save As dialog box.

You can use the new command to save MAFF and MHTML archives too.

You may install this extension if you want to selectively use the page title instead of the original file name. In this case, ensure that the browser's default naming strategy is selected in the options, otherwise the title of the page might be used in all cases.

UnMHT, by Arai

This extension adds new options in the File menu to save MHTML archives, and provides other advanced features.

If UnMHT is installed, you can continue to use Mozilla Archive Format to create and open MAFF archives, while MHTML archives are opened with UnMHT.

Converting previously saved pages to other file formats

You probably already have some web pages saved among your local files. These pages are often stored as file / folder pairs (like Page.html and Page_files), and you may want to convert them to a web archive format for easier maintenance as a single file.

You may also want to convert saved pages from one web archive format to another, for example from MHTML to MAFF to save disk space, or from MAFF to MHTML to achieve compatibility with Internet Explorer.

Converting single pages

Converting a single page that was previously saved locally is easy. Just open the page in the browser and re-save it in another file format. The Mozilla Archive Format extension handles the details of the conversion process, and preserves the information about the original source, if available.

When converting a web page that is not stored in an archive, the following information is preserved:

When converting a web archive to another archive format, all the information that is supported by the destination file format is preserved.

When saving an archived page as a complete page outside of an archive, if the integrated Save Complete extension is enabled, the original source location is stored in a comment inside the saved page.

Converting multiple pages

If you have many saved pages that you want to convert to another file format, you can use the Saved Pages Conversion Wizard. You can start the wizard using the Tools » Mozilla Archive Format » Convert Saved Pages menu item. If the Mozilla Archive Format sub-menu is hidden, you must first enable it from the interface options.

Important considerations: The wizard allows you to convert all the pages located in one folder, optionally including all its sub-folders. The wizard automates the manual tasks of opening each page and saving them using another file format. When using the conversion wizard, the following information must be considered:

Selecting which files to convert

First select the source and destination file formats. Then select the folder in which the source files are located. You can decide to look in sub-folders of the selected folder, or you can convert only the files that are placed directly inside the selected folder.

The selected source format determines how the wizard looks for source files. The MAFF and MHTML web archive formats are recognized by their extensions: .maff for MAFF, and either .mht or .mhtml for MHTML. Complete web pages are recognized by their associated support folder, for example Page.html and Page_files, or Page (without extension) and Page_files. Web pages saved as single files, without support folders, are recognized by their extension only.

If you are using your browser in a language other than English, the recognition of additional support folder suffixes will be enabled. For example, if you are using your browser in French, a support folder named Page_fichiers is recognized, in addition to the English Page_files.

If you previously saved pages using a browser in a different language than the current one, the support folder names may not be recognized correctly, and you might have to adjust the list of recognized suffixes in the internal configuration settings.
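
The recognition rules described above can be sketched in Python as follows. This is an illustration only, not the wizard's actual code: the suffix list, the set of extensions accepted for single-file pages, and all function names are assumptions.

```python
from pathlib import Path

# Assumption: English suffix plus one localized suffix, as in the
# French example above. The real list comes from the add-on settings.
SUPPORT_SUFFIXES = ["_files", "_fichiers"]

# Assumption: extensions treated as single-file web pages.
PAGE_EXTENSIONS = {".html", ".htm", ".xhtml"}


def find_support_folder(page: Path, suffixes=SUPPORT_SUFFIXES):
    """Return the support folder next to a saved page, or None."""
    for suffix in suffixes:
        # Page.html and Page (no extension) both share the base name "Page".
        candidate = page.with_name(page.stem + suffix)
        if candidate.is_dir():
            return candidate
    return None


def classify_source(path: Path):
    """Return 'maff', 'mhtml', 'complete', 'single', or None."""
    if not path.is_file():
        return None
    extension = path.suffix.lower()
    if extension == ".maff":
        return "maff"
    if extension in (".mht", ".mhtml"):
        return "mhtml"
    if find_support_folder(path) is not None:
        return "complete"
    if extension in PAGE_EXTENSIONS:
        return "single"
    return None
```

Adding a suffix to SUPPORT_SUFFIXES in this sketch plays the same role as adjusting the list of recognized suffixes for pages saved with a browser in another language.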

The selected destination format determines how the wizard will assign the output file names. The extension in the source file name, if present, is always replaced with the correct extension for the destination file format. For MHTML, the internal configuration settings determine whether the .mht or .mhtml extension is used.

Then select the destination folder. You may want to place the converted files in a different folder from the original files; that option is particularly useful if you are converting from a read-only source, such as a CD-ROM or a DVD. The original folder structure is always preserved, so that if a source file is located in a sub-folder of the original folder, the converted file will be located in a sub-folder in the destination folder with the same name as the sub-folder in the original folder.
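
As an illustration of how output names and folders relate to the source, here is a minimal Python sketch. The function and parameter names are hypothetical, and the sketch covers only the two archive formats as destinations.

```python
from pathlib import PurePosixPath


def destination_path(source, source_root, dest_root, dest_format,
                     use_mhtml_extension=False):
    """Map a source file to its converted file in the destination folder.

    PurePosixPath is used so the sketch behaves the same on every
    platform; it only computes paths and never touches the disk.
    """
    extensions = {
        "maff": ".maff",
        # Mirrors the setting that chooses between .mht and .mhtml.
        "mhtml": ".mhtml" if use_mhtml_extension else ".mht",
    }
    # Keep the sub-folder structure relative to the source folder.
    relative = PurePosixPath(source).relative_to(source_root)
    # Replace the source extension, or append one if there is none.
    converted = relative.with_suffix(extensions[dest_format])
    return PurePosixPath(dest_root) / converted
```

With these assumptions, a source file /cdrom/news/Page.html converted to MAFF with /home/user/converted as the destination folder maps to /home/user/converted/news/Page.maff.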

You may also choose to place the converted files in the same folder as the original files. Each converted file will be placed in the same folder as its original, with the same file name but with a different extension. In that case, you may want the originals to be moved after conversion, by selecting a folder that will be used as a bin for the original files that have been successfully converted.

If you are converting from the MAFF file format and use of the "jar:" protocol is enabled in the internal configuration settings, you will not be able to move the source files to another folder, since the browser will lock the files in place until it is closed. If you want to move the source files when converting from MAFF to another format, you should disable the use of the "jar:" protocol for the duration of the conversion process.

The conversion wizard will never delete or overwrite the source files. Since in unusual cases the converted pages may not be entirely faithful to the original, you should always keep a backup of your source files, even after a successful conversion.

Finally, the source folder is scanned to locate the original files. Depending on how many files are present in the source, this operation may require some time. If you are working with large folder trees, you may want to repeat the wizard multiple times, converting one sub-folder at a time.

Before the actual conversion begins, you have the option of fine-tuning your selection. You can also verify that the source files have been identified correctly. In addition to the source file name, support folder name, and sub-folder in the list of files, you may display other columns like the full source, destination, and bin paths.

If for any reason the destination file or support folder is already present, or if a file or support folder is already present in the folder where the source file would be moved after conversion, the source file name will appear in the list, but the selection checkbox will be disabled. This often indicates that the source file was converted successfully during a previous run of the wizard.

Completing the conversion

After you have selected the files to be converted, click the Finish button to start the conversion process. Depending on the number of files, this process may require some time.

You can cancel the conversion at any time by closing the wizard or by using the Back button. Canceling the operation may require some time.

When the operation is finished, you can see the count of how many files have been successfully converted and how many conversions failed. The icon near each file name indicates its current status: not selected, already converted, waiting for conversion, currently converting, conversion failed, or conversion succeeded.

Detailed information about the reasons for conversion failures is available in the Error Console, accessible from the Tools » Error Console menu item.

If you are satisfied with the results, click the Finish button to close the window. You may also use the Back button to retry the conversion process with the same or different settings.

Options

Main

Default format when saving in a single file:

This option controls the default file type selected in the Save In Archive As dialog box. The file type can also be changed at any time in the dialog itself.

The file type specified in the standard Save As dialog box is not affected by this option.

When saving complete web page contents:

This option controls which method is used to find all the web resources (images, styles, sub-frames, ...) that are included in the web page being saved.

  1. Take an exact snapshot. (default) This option provides the most accurate save mode. It captures the current state of the page and creates an exact replica, including the current values of form fields and video and audio embedded in the page. This save mode works especially well for pages that make extensive use of scripts or use dynamic technologies like AJAX.

    The resulting page will be static, as scripts are disabled by the save operation to preserve the integrity of the result when it is displayed again. If you need to keep scripts, you can enable the integrated Save Complete component from the internal configuration settings.

  2. Use browser's standard save system. With this setting, the web pages are saved by the browser. How much of the web page is actually saved depends on the version of the browser being used.

    When this option is enabled, Mozilla Archive Format will not be able to create MHTML files according to the original specification. Other browsers will not be able to display a properly formatted page, in particular if the saved page contains nested CSS style sheets or inner frames.

Note that the selected method of saving is used not only when saving archives, but also when saving complete pages using the Web Page, complete file type.

For the suggested file name:

This preference controls which method is used to select the default file name in the Save As and Save In Archive As dialog boxes.

  1. Use the title of the page. (default) With this setting, the title of the page is preferred to the original file name. This is done for all HTML and XHTML pages, unless the server from which you are downloading the page explicitly asked the browser to use a specific file name. Note that if other extensions affecting this behavior are installed, this setting may not work as expected.
  2. Use browser's standard naming strategy. With this setting, the Mozilla Archive Format extension does not alter the current behavior, which is determined by the browser or by other installed extensions. If no other extension affecting this behavior is installed, the original name of the file is suggested instead of the title of the page.
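
The choice between the two naming strategies can be sketched in Python as follows. This is a simplified illustration, not the add-on's code: the function name, its parameters, and the rule for replacing characters that are invalid in file names are all assumptions.

```python
import re


def suggest_file_name(title, original_name, server_name=None, use_title=True):
    """Pick the default name offered in the save dialog.

    server_name is a file name explicitly requested by the server, if any.
    """
    if server_name:
        # A name requested by the server always takes precedence.
        return server_name
    if use_title and title:
        # Assumption: replace characters that are not allowed in file
        # names on common systems, then collapse repeated whitespace.
        cleaned = re.sub(r'[\\/:*?"<>|]+', " ", title)
        cleaned = " ".join(cleaned.split())
        if cleaned:
            return cleaned
    # Fall back to the original file name.
    return original_name
```

In this sketch, passing use_title=False corresponds to the browser's standard naming strategy when no other extension changes it.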

Interface

When an archived page is opened:

You can control which notifications are displayed to access additional information about an archived page.

  1. Show icon in the location bar. When viewing a page that is saved in an archive, an icon in the location bar will appear, allowing an information panel to be opened. The icon is hidden during normal browsing.
  2. Display an information bar. Every time an archived page is opened, an information bar will appear showing the original location, as well as the date and time the page was saved. The information bar can be closed, and it will not reappear until the page is reloaded.

Show Mozilla Archive Format menu items in:

You can select which menus will display the Mozilla Archive Format items. These items open a special Save In Archive As dialog. They are useful if you routinely use the standard Save As dialog to save only the text of a page, and need a separate option to save a page in an archive without changing the selection in the file type drop down list.

Actions

You can use the links in this options pane to launch the Saved Pages Conversion Wizard, to refresh file associations on Windows, or to visit the official web site.

Internal configuration settings

These configuration settings are not available from the options dialog. To access them, type about:config into the browser's address bar and press the Enter key. Then type maf into the filter field to show only the settings that apply to Mozilla Archive Format. Do not change these settings unless you clearly understand the reason for doing so. Non-default settings may adversely affect functionality or performance.

Note: In previous versions of the add-on, the same internal settings were available with slightly different names. If you customized the settings using the old names, they will still appear in about:config, but they will have no effect on newer versions. You have to customize the settings again, using their new names, for them to take effect. However, this is rarely required because new versions are designed to work optimally without the need to customize the settings.

extensions.maf.advanced.datafoldersuffixes

This option influences the Saved Pages Conversion Wizard when selecting which files to convert.

Complete web pages are recognized because they have an associated support folder, with the same base name as the main file and a different suffix, for example Page.html and Page_files.

This option displays a comma-separated list of the recognized suffixes. An additional suffix that depends on the current browser language may also be recognized without it being explicitly listed in this option.

extensions.maf.advanced.maff.compression

Controls the compression level used when saving files in a MAFF archive.

  1. dynamic (default)  Use maximum compression for all files, but do not re-compress media files.
  2. best   Use maximum compression for all files.
  3. none   Store all the files uncompressed.
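
The three levels can be illustrated with Python's standard zipfile module. This is a sketch under stated assumptions, not the add-on's implementation: "do not re-compress" is interpreted here as storing media files uncompressed, and the list of media extensions and the function names are hypothetical.

```python
import os
import zipfile

# Assumption: formats that are already compressed internally.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".mp3", ".mp4", ".ogg", ".webm"}


def choose_compression(file_name, mode="dynamic"):
    """Return the zipfile compression method for one entry."""
    if mode == "none":
        return zipfile.ZIP_STORED
    if mode == "best":
        return zipfile.ZIP_DEFLATED
    if mode == "dynamic":
        extension = os.path.splitext(file_name)[1].lower()
        if extension in MEDIA_EXTENSIONS:
            return zipfile.ZIP_STORED
        return zipfile.ZIP_DEFLATED
    raise ValueError("unknown compression mode: " + mode)


def write_entries(archive, entries, mode="dynamic"):
    """Write a {name: bytes} mapping into a new ZIP archive."""
    with zipfile.ZipFile(archive, "w") as zipped:
        for name, data in entries.items():
            # compresslevel=9 requests maximum deflate compression;
            # it has no effect on stored entries.
            zipped.writestr(name, data,
                            compress_type=choose_compression(name, mode),
                            compresslevel=9)
```

Skipping compression for media files trades a negligible size difference for less work when saving, since such files rarely shrink when compressed again.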

extensions.maf.advanced.maff.extendedmetadata

With this preference enabled, history, text zoom, and scroll position are saved for each page. At present this additional information is ignored when the archive is opened.

extensions.maf.advanced.maff.ignorecharacterset

When this setting is enabled, the character set specified for pages saved inside MAFF archives is ignored. Enabling this option may be useful for troubleshooting problems with internationalization. This option will cause saved pages to be displayed incorrectly in most cases.

extensions.maf.advanced.maff.usejarprotocol

If this preference is enabled, when you open a MAFF archive its contents will be accessed directly using the "jar:" protocol, without being extracted.

However, if you enable this option, the archive files you open will be locked, and you will be unable to move, rename or delete them until the browser is closed.
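
For reference, a "jar:" address combines the address of the archive file with the path of an entry inside it, separated by "!/". The following Python sketch builds such an address; the function name and the example paths are hypothetical, and the exact addresses the add-on produces may differ.

```python
from urllib.parse import quote


def jar_url(archive_file_url, entry_path):
    """Build a jar: address for one entry inside an archive.

    archive_file_url is the file: address of the archive itself.
    """
    # Percent-encode the entry path but keep its slashes.
    entry = quote(entry_path.lstrip("/"))
    return "jar:" + archive_file_url + "!/" + entry


# Hypothetical archive location and entry name.
example = jar_url("file:///home/user/Page.maff", "page1/index.html")
```

Because the browser reads the entry straight from the archive file in this mode, the file stays open while the page is in use, which is consistent with the locking behavior described above.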

extensions.maf.advanced.mhtml.usemhtmlextension

If this option is selected, and you do not type a file extension in the Save As dialog box, or file extensions are hidden, the complete .mhtml extension will be appended to the file name of MHTML archives, instead of the more common .mht extension.

extensions.maf.advanced.temp.clearonexit

This option is enabled by default. If disabled, the contents of the temporary directory are preserved after the browser exits, and must be emptied manually.

extensions.maf.advanced.temp.folder

Use this preference to choose the location of the temporary files required to open and save the web archives. The contents of this folder will be lost if extensions.maf.advanced.temp.clearonexit is true.

If not specified, this location defaults to a sub-folder of the system temporary folder, which is different for every browser profile.

If customized, the absolute path to the specified location is remembered, regardless of the system temporary folder path.

extensions.maf.save.keepscripts

When this setting is enabled, and the option to take an exact snapshot of the page is also enabled, the integrated Save Complete component written by Stephen Augenstein will be enabled. This will attempt to preserve the dynamic features of the page by keeping all the scripts and the original page source code. However, content generated by scripts may be missing from the resulting page.

More documentation

This document provides only user documentation for the add-on. Technical documentation about the internals of Mozilla Archive Format and the MAFF file format is available in separate documents: the API documentation and the MAFF specification.