Sitellite Application Framework

Class: Messy

Source Location: Program_Root/HTML/Messy.php

Class Overview

XML_HTMLSax
   |
   --Messy





Class Details

[line 28]




Class Variables

$levels = array ()

[line 199]

This array is used to compare opening and closing tags within the document structure, and to try to repair them by inserting missing tags where necessary.



Tags:

access:  public

Type:   mixed
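
The repair idea described for $levels can be illustrated with a stack of open tags. The following is only a sketch under stated assumptions: repair_tag_events() and its event format are hypothetical and are not part of Messy, whose actual repair logic is not shown on this page.

```php
<?php
// Hypothetical helper, not part of Messy: repairs a flat list of tag events
// by tracking open tags on a stack, similar in spirit to $levels.
function repair_tag_events(array $events)
{
    $stack = array();   // names of the tags that are currently open
    $fixed = array();

    foreach ($events as $event) {
        list($type, $tag) = $event;

        if ($type === 'open') {
            $stack[] = $tag;
            $fixed[] = $event;
            continue;
        }

        // Closing tag with no matching open tag: drop it.
        if (!in_array($tag, $stack, true)) {
            continue;
        }

        // Insert closing tags for anything left open inside this tag.
        while (($top = array_pop($stack)) !== $tag) {
            $fixed[] = array('close', $top);
        }
        $fixed[] = array('close', $tag);
    }

    // Close whatever is still open at the end of the document.
    while ($stack) {
        $fixed[] = array('close', array_pop($stack));
    }

    return $fixed;
}

$events = array(
    array('open', 'div'),
    array('open', 'p'),
    array('close', 'div'),   // the closing p tag is missing
);
$repaired = repair_tag_events($events);
```

Here the missing closing p tag is inserted before the closing div tag, so the repaired sequence nests correctly.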



$output = array ()

[line 37]

The output from the last call to parse().



Tags:

access:  public

Type:   mixed



$safe =  true

[line 211]

This tells Messy whether to use the stripTags and stripAttrs lists or the stripTagsSafe and stripAttrsSafe lists, which contain additional tags and attributes that are considered potentially unsafe. The default (true) uses the latter, stricter lists.




Tags:

access:  public

Type:   mixed
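
As a rough illustration of the choice $safe controls, the sketch below selects between two shortened lists. active_strip_tags() is a hypothetical helper written for this example, not a Messy method, and the lists are abbreviated versions of the documented defaults.

```php
<?php
// Hypothetical helper, not part of Messy: picks a strip list the way the
// $safe description suggests. The lists are shortened for illustration.
function active_strip_tags($safe)
{
    $stripTags     = array('font', 'blink', 'span');
    $stripTagsSafe = array('font', 'blink', 'span', 'script', 'iframe');

    return $safe ? $stripTagsSafe : $stripTags;
}

// With $safe = true, script tags are on the strip list.
$strictList = active_strip_tags(true);

// With $safe = false, only the presentational tags are.
$lenientList = active_strip_tags(false);

var_dump(in_array('script', $strictList, true));
var_dump(in_array('script', $lenientList, true));
```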



$selfClosing = array (
      'img',
      'br',
      'hr',
      'meta',
      'link',
      'area',
   )

[line 46]

Contains a list of tags that are self-closing (i.e. they do not contain any data, such as a br tag).




Tags:

access:  public

Type:   mixed



$stripAttrs = array (
   )

[line 138]

Contains a list of attributes that should be stripped from the output when $safe is false. It is empty by default.



Tags:

access:  public

Type:   mixed



$stripAttrsSafe = array (
      'onclick',
      'onsubmit',
      'onselect',
      'onchange',
      'onmouseover',
      'onmouseout',
      'onfocus',
      'onblur',
      'ondblclick',
      'onhelp',
      'onkeydown',
      'onkeypress',
      'onkeyup',
      'onmousedown',
      'onmousemove',
      'onmouseup',
      'onresize',
      'dataformatas',
      'data',
      'datafld',
      'datasrc',
      'dynsrc',
   )

[line 148]

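Contains a list of attributes that should be stripped from the output when $safe is true (the default). It holds attributes that are considered potentially unsafe, such as inline event handlers.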



Tags:

access:  public

Type:   mixed



$stripTags = array (
      'font',
      'spacer',
      'blink',
      'xml:namespace',
      'o:p',
      'st1:city',
      'st1:address',
      'st1:street',
      'st1:state',
      'st1:place',
      'st1:placename',
      'st1:placetype',
      'st1:personname',
      'st1:country-region',
      'v:shapetype',
      'span',
      'del',
      'frame',
      'frameset',
      'layer',
      'ilayer',
      'link',
      'meta',
      'xml',
      'minmax_bound',
   )

[line 62]

Contains a list of tags that should be stripped from the output when $safe is false.



Tags:

access:  public

Type:   mixed



$stripTagsSafe = array (
      'font',
      'spacer',
      'blink',
      'xml:namespace',
      'o:p',
      'st1:city',
      'st1:address',
      'st1:street',
      'st1:state',
      'st1:place',
      'st1:placename',
      'st1:placetype',
      'st1:personname',
      'st1:country-region',
      'v:shapetype',
      'span',
      'del',
      'script',
      'applet',
      'object',
      'iframe',
      'frame',
      'frameset',
      'layer',
      'ilayer',
      'embed',
      'bgsound',
      'link',
      'meta',
      'xml',
      'minmax_bound',
   )

[line 97]

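Contains a list of tags that should be stripped from the output when $safe is true (the default). In addition to the entries in $stripTags, it includes the potentially unsafe tags script, applet, object, iframe, embed, and bgsound.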



Tags:

access:  public

Type:   mixed



$transform = array (
      'b' => 'strong',
      'i' => 'em',
      'center' => array (
         'tag' => 'div',
         'attrs' => array (
            'align' => 'center',
         ),
      ),
   )

[line 180]

Contains a list of tags that should be transformed into other tags in the output.



Tags:

access:  public

Type:   mixed
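
The default value mixes two entry forms: a plain string (a new tag name) and an array holding a new tag name plus attributes. The sketch below shows one way such a map could be applied. apply_transform() is a hypothetical helper, and merging the extra attributes over the existing ones is an assumption rather than documented behaviour.

```php
<?php
// Same map as the default $transform value documented above.
$transform = array(
    'b'      => 'strong',
    'i'      => 'em',
    'center' => array(
        'tag'   => 'div',
        'attrs' => array('align' => 'center'),
    ),
);

// Hypothetical helper, not part of Messy.
function apply_transform($tag, array $attrs, array $transform)
{
    // Tags without a rule pass through unchanged.
    if (!isset($transform[$tag])) {
        return array($tag, $attrs);
    }

    $rule = $transform[$tag];

    // String form: only the tag name changes.
    if (is_string($rule)) {
        return array($rule, $attrs);
    }

    // Array form: new tag name plus attributes to add (assumed here to be
    // merged over the existing attributes).
    return array($rule['tag'], array_merge($attrs, $rule['attrs']));
}

list($tag1, $attrs1) = apply_transform('b', array(), $transform);
list($tag2, $attrs2) = apply_transform('center', array('id' => 'x'), $transform);
```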





Class Methods


constructor Messy [line 219]

Messy Messy( )

Constructor method.



Tags:

access:  public



method clean [line 586]

string clean( string $doc, [boolean $isXml = false])

Returns a "clean" version of the HTML or XML data provided by calling parse() and then toXML() for you and returning the result.



Tags:

access:  public


Parameters:

string   $doc  
boolean   $isXml  
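
A minimal usage sketch, assuming the Messy class has already been loaded by your application (how it is loaded is not covered on this page). The input string is made up for illustration, and the exact cleaned output is not shown because it depends on the configured lists.

```php
<?php
$dirty   = '<p><font size="2"><b>Hello</b> <i>world</i></font>';
$cleaned = null;

// Assumption: the Messy class has already been loaded by your application.
if (class_exists('Messy')) {
    $messy   = new Messy();
    $cleaned = $messy->clean($dirty);

    // Equivalent two-step form, per the description above:
    //   $messy->parse($dirty);
    //   $cleaned = $messy->toXML();

    echo $cleaned;
}
```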


method pad [line 493]

string pad( integer $length)

Returns a string of empty space, whose length is determined by the $length parameter.



Tags:

access:  public


Parameters:

integer   $length  
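
For illustration only, a stand-in with the same shape can be written with str_repeat(). pad_sketch() is hypothetical, and it assumes the padding consists of space characters, which this page does not state.

```php
<?php
// Hypothetical stand-in for pad(); assumes the padding is made of spaces.
function pad_sketch($length)
{
    // Negative lengths are treated as zero.
    return str_repeat(' ', max(0, (int) $length));
}

$indent = pad_sketch(4);
```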


method parse [line 241]

array parse( string $data, [boolean $isXml = false])

Parses the given HTML or XML $data into an array of "tokens", which are associative arrays with the following properties: tag (the name of the tag), attributes (a key/value array of tag attributes/properties), level (the depth of this tag within the document), type (one of 'open', 'complete' for a self-closing tag, 'cdata' for character data, or 'close'), and value (the contents of the tag). The token array is also stored in the $output property of your Messy object.




Tags:

access:  public


Parameters:

string   $data  
boolean   $isXml  
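
To make the token structure concrete, the sketch below writes out by hand what the tokens for a small fragment might look like and prints them as an indented outline. The key names come from the description above, but the specific values (for example, levels starting at 1 and an empty tag name for character data) are assumptions, not captured parse() output.

```php
<?php
// Hand-written tokens for: <p class="intro">Hello<br /></p>
// Values are illustrative assumptions, not real parse() output.
$tokens = array(
    array('tag' => 'p',  'attributes' => array('class' => 'intro'),
          'level' => 1, 'type' => 'open',     'value' => ''),
    array('tag' => '',   'attributes' => array(),
          'level' => 2, 'type' => 'cdata',    'value' => 'Hello'),
    array('tag' => 'br', 'attributes' => array(),
          'level' => 2, 'type' => 'complete', 'value' => ''),
    array('tag' => 'p',  'attributes' => array(),
          'level' => 1, 'type' => 'close',    'value' => ''),
);

// Build one outline line per token, indented by its level.
$lines = array();
foreach ($tokens as $token) {
    $label = ($token['type'] === 'cdata')
        ? '"' . $token['value'] . '"'
        : $token['tag'];
    $lines[] = str_repeat('  ', $token['level'] - 1)
        . $token['type'] . ': ' . $label;
}

echo implode("\n", $lines), "\n";
```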


method toXML [line 506]

string toXML( )

Uses the internal $output array from a previous call to parse() and returns an XML representation of the document.



Tags:

access:  public



method toXMLDoc [line 604]

object reference &toXMLDoc( )

Uses the internal $output array from a previous call to parse() and returns an XMLDoc object representation of the document. Should an error occur, sets $error, $err_code, $err_line, etc. from the SloppyDOM error values and returns false. Errors are a real possibility because there is no guarantee that "cleaned up" markup is correctly formatted markup.




Tags:

access:  public




Copyright © 2007, SIMIAN systems Inc.
All rights reserved.
Documentation generated on Tue, 13 Feb 2007 17:18:12 -0600 by Sitellite AppDoc and phpDocumentor 1.2.2