HyperText Studio Title

Technote 10005: Introduction to cleaning HTML code

HyperText Studio's clean feature lets you create your own clean sets to remove or modify sets of tags and attributes in a document. You can also modify the default clean sets. This technote takes you through the process of creating a custom clean set and introduces many of the concepts used in HTML cleaning.

Note: This technote assumes some knowledge of HTML code.

Opening the associated file

  1. Download the file used in this tutorial by clicking here. Unzip the file and extract faq.aspx. This file is part of the HyperText Studio tutorial. It was originally produced in Microsoft Word and has already been cleaned with HyperText Studio's Microsoft Word clean set.
  2. Select File | Open and open faq.aspx.
  3. Go to Source view.

Creating a Custom Clean Set

A clean set contains the rules used to clean a document.

  1. Select Tools | Reformat Code.
  2. Select Clean Code.
  3. Click Browse.
  4. Click New. The list of currently saved settings displays.
  5. Type myclean as the name of the Clean Set.
  6. Click Browse.

    The Clean Settings dialog box opens.
  7. Leave the dialog box open.

Creating a New Match

A clean rule is made up of a match, which finds a tag to work with, and an action, which does something with the tag that has been matched.

In our case, some of the style sheet classes that were created while cleaning the Microsoft Word 2000 code are redundant, so you are going to remove them. You will create four matches, with their corresponding actions, to remove all span tags that have the class attribute set to "class5", "class6", or "class7", and to remove the "MsoNormal" class from p tags.
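One way to picture these four rules is as plain data: each rule pairs a match (tag, attribute, value) with an action. The sketch below is illustrative Python only, not HyperText Studio's own file format or API. The names Match, Rule, RULES, and find_rule, and the action labels "remove_element" and "remove_attribute", are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Hypothetical model of a clean rule match."""
    tag: str        # element name to look for, e.g. "span"
    attribute: str  # attribute that must be present, e.g. "class"
    value: str      # value that attribute must have, e.g. "class5"

    def applies_to(self, tag: str, attrs: dict) -> bool:
        # A tag matches when its name and the attribute value both agree.
        return tag.lower() == self.tag and attrs.get(self.attribute) == self.value


@dataclass(frozen=True)
class Rule:
    """A match paired with an action label (labels are assumptions)."""
    match: Match
    action: str


# The four rules built in this technote, written out as data.
RULES = [
    Rule(Match("span", "class", "class5"), "remove_element"),
    Rule(Match("span", "class", "class6"), "remove_element"),
    Rule(Match("span", "class", "class7"), "remove_element"),
    Rule(Match("p", "class", "MsoNormal"), "remove_attribute"),
]


def find_rule(tag: str, attrs: dict):
    """Return the first rule whose match applies to the tag, or None."""
    for rule in RULES:
        if rule.match.applies_to(tag, attrs):
            return rule
    return None
```

For example, find_rule("span", {"class": "class6"}) returns the second rule, while a span with any other class returns None and is left alone.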

  1. Click New Match. The Clean Rule Match dialog box opens.
  2. Fill in the boxes so that the match finds span tags whose class attribute is set to class5.
  3. Click OK. Leave the Clean Settings dialog box open.

Creating a New Action

Now that a match has been created, you need to specify what happens to the matched tag. This is called an action.

  1. Click New Action.
  2. Select Remove Element from the Action Type drop-down list.
  3. Click OK to close the Clean Rule Action dialog box. Leave the Clean Settings dialog box open.

Adding New Matches and Actions

  1. Repeat the steps in Creating a New Match and Creating a New Action to create two more rules: one that removes span tags whose class attribute is set to class6, and one that removes span tags whose class attribute is set to class7.
  2. To remove the MsoNormal class from the p tag, create a match following the same steps, using p instead of span. When creating the action, select Remove Attribute.
  3. Your dialog should match the one below:
  4. Leave the dialog box open.
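To make the effect of these rules concrete, here is a minimal Python sketch that applies equivalent logic to a fragment of HTML using the standard library's html.parser module. It is not how HyperText Studio is implemented. It assumes that Remove Element drops the matched start and end tags but keeps the enclosed text, which the tool may or may not do. The class name CleanSketch and the function clean are hypothetical, and the sketch ignores comments, doctypes, and self-closing tags.

```python
from html import escape
from html.parser import HTMLParser

# Classes whose span tags are removed (taken from the rules in this technote).
REMOVE_SPAN_CLASSES = {"class5", "class6", "class7"}


class CleanSketch(HTMLParser):
    """Illustrative cleaner; assumes 'Remove Element' keeps the inner text."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_stack = []  # one flag per open span: True if it was dropped

    def handle_starttag(self, tag, attrs):
        values = dict(attrs)
        if tag == "span":
            drop = values.get("class") in REMOVE_SPAN_CLASSES
            self.span_stack.append(drop)
            if drop:
                return  # Remove Element: skip the start tag
        if tag == "p" and values.get("class") == "MsoNormal":
            # Remove Attribute: keep the p tag, drop its class attribute
            attrs = [(k, v) for k, v in attrs if k != "class"]
        self.out.append(self._render(tag, attrs))

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack and self.span_stack.pop():
            return  # skip the end tag of a dropped span
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    @staticmethod
    def _render(tag, attrs):
        parts = [tag]
        for key, value in attrs:
            parts.append(key if value is None else f'{key}="{escape(value, quote=True)}"')
        return "<" + " ".join(parts) + ">"


def clean(html_text: str) -> str:
    parser = CleanSketch()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Under these assumptions, clean('<p class="MsoNormal"><span class="class5">Hello</span> <span class="keep">world</span></p>') returns '<p>Hello <span class="keep">world</span></p>'.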

Setting Clean Options

You can modify the clean options so that when the references to the classes are removed from the HTML, the corresponding CSS styles are removed as well.

  1. Select the Options tab.
  2. If necessary, select Remove Unused Selectors.
  3. Click OK to close the Clean Settings dialog box.
  4. Click OK to close the Clean Settings list.
  5. Select myclean from the drop-down list, leaving Reformat Code checked.
  6. Click OK to execute myclean.

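Your code should now look a lot cleaner: the class5, class6, and class7 span tags and the MsoNormal class on p tags should be gone, along with any CSS selectors that are no longer used.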
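The idea behind Remove Unused Selectors can be sketched in a few lines of Python: collect the class names still referenced in the HTML, then keep only the CSS rules whose class selectors are all still in use. This is a simplified illustration, not the tool's actual algorithm. The function names used_classes and remove_unused_selectors are hypothetical, and the regular expressions assume flat CSS (no nested blocks or at-rules) and double-quoted class attributes.

```python
import re


def used_classes(html_text: str) -> set:
    """Collect every class name referenced by a class="..." attribute."""
    classes = set()
    for value in re.findall(r'class\s*=\s*"([^"]*)"', html_text):
        classes.update(value.split())
    return classes


def remove_unused_selectors(css_text: str, html_text: str) -> str:
    """Keep CSS rules whose class selectors all still appear in the HTML.

    Simplification: rules without any class selector (e.g. "h1") are
    always kept, and a comma-separated selector list is treated as a unit.
    """
    used = used_classes(html_text)
    kept = []
    for selector, body in re.findall(r'([^{}]+)\{([^{}]*)\}', css_text):
        needed = re.findall(r'\.([A-Za-z_][\w-]*)', selector)
        if all(name in used for name in needed):
            kept.append(f"{selector.strip()} {{ {body.strip()} }}")
    return "\n".join(kept)


css = (
    "span.class5 { color: red; }\n"
    "p.MsoNormal { margin: 0; }\n"
    "span.keep { font-weight: bold; }\n"
    "h1 { font-size: 2em; }"
)
html_after_clean = '<h1>FAQ</h1><p>Hello <span class="keep">world</span></p>'
cleaned_css = remove_unused_selectors(css, html_after_clean)
```

Here cleaned_css retains only the span.keep and h1 rules, because class5 and MsoNormal are no longer referenced anywhere in the cleaned HTML.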

 

Copyright © 1991-2011 Olson Software Limited. All rights reserved.