Package org.htmlparser.nodes
Class TagNode
java.lang.Object
org.htmlparser.nodes.AbstractNode
org.htmlparser.nodes.TagNode
- All Implemented Interfaces:
Serializable, Cloneable, Node, Tag
- Direct Known Subclasses:
BaseHrefTag, CompositeTag, DoctypeTag, FrameTag, ImageTag, InputTag, JspTag, MetaTag, ProcessingInstructionTag
TagNode represents a generic tag.
If no scanner is registered for a given tag name, this is what you get.
This is also the base class for all tags created by the parser.
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type  Method  Description
void accept(NodeVisitor visitor) - Default tag visiting code.
boolean breaksFlow() - Determines if the given tag breaks the flow of text.
String getAttribute(String name) - Returns the value of an attribute.
Attribute getAttributeEx(String name) - Returns the attribute with the given name.
Vector getAttributesEx() - Gets the attributes in the tag.
String[] getEnders() - Return the set of tag names that cause this tag to finish.
int getEndingLineNumber() - Get the line number where this tag ends.
Tag getEndTag() - Get the end tag for this (composite) tag.
String[] getEndTagEnders() - Return the set of end tag names that cause this tag to finish.
String[] getIds() - Return the set of names handled by this tag.
String getRawTagName() - Return the name of this tag.
int getStartingLineNumber() - Get the line number where this tag starts.
int getTagBegin() - Gets the nodeBegin.
int getTagEnd() - Gets the nodeEnd.
String getTagName() - Return the name of this tag.
String getText() - Return the text contained in this tag.
Scanner getThisScanner() - Return the scanner associated with this tag.
boolean isEmptyXmlTag() - Is this an empty xml tag of the form <tag/>.
boolean isEndTag() - Predicate to determine if this tag is an end tag (i.e. </HTML>).
void removeAttribute(String key) - Remove the attribute with the given key, if it exists.
void setAttribute(String key, String value) - Set attribute with given key, value pair.
void setAttribute(String key, String value, char quote) - Set attribute with given key, value pair where the value is quoted by quote.
void setAttribute(Attribute attribute) - Set an attribute.
void setAttributeEx(Attribute attribute) - Set an attribute.
void setAttributesEx(Vector attribs) - Sets the attributes.
void setEmptyXmlTag(boolean emptyXmlTag) - Set this tag to be an empty xml node, or not.
void setEndTag(Tag) - Set the end tag for this (composite) tag.
void setTagBegin(int tagBegin) - Sets the nodeBegin.
void setTagEnd(int tagEnd) - Sets the nodeEnd.
void setTagName(String name) - Set the name of this tag.
void setText(String text) - Parses the given text to create the tag contents.
void setThisScanner(Scanner scanner) - Set the scanner associated with this tag.
String toHtml(boolean verbatim) - Render the tag as HTML.
String toPlainTextString() - Get the plain text from this node.
String toString() - Print the contents of the tag.
Methods inherited from class org.htmlparser.nodes.AbstractNode
clone, collectInto, doSemanticAction, getChildren, getEndPosition, getFirstChild, getLastChild, getNextSibling, getPage, getParent, getPreviousSibling, getStartPosition, setChildren, setEndPosition, setPage, setParent, setStartPosition, toHtml
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.htmlparser.Node
clone, collectInto, doSemanticAction, getChildren, getEndPosition, getFirstChild, getLastChild, getNextSibling, getPage, getParent, getPreviousSibling, getStartPosition, setChildren, setEndPosition, setPage, setParent, setStartPosition, toHtml
-
Field Details
-
mDefaultScanner
The default scanner for non-composite tags.
-
mAttributes
The tag attributes. Objects of type Attribute. The first element is the tag name, subsequent elements being either whitespace or real attributes.
-
breakTags
Set of tags that break the flow.
-
-
Constructor Details
-
TagNode
public TagNode()
Create an empty tag.
-
TagNode
Create a tag with the location and attributes provided.
- Parameters:
page - The page this tag was read from.
start - The starting offset of this node within the page.
end - The ending offset of this node within the page.
attributes - The list of attributes that were parsed in this tag.
-
TagNode
Create a tag like the one provided.
- Parameters:
tag - The tag to emulate.
scanner - The scanner for this tag.
-
-
Method Details
-
getAttribute
Returns the value of an attribute.
- Specified by:
getAttribute in interface Tag
- Parameters:
name - Name of attribute, case insensitive.
- Returns:
The value associated with the attribute or null if it does not exist, or is a stand-alone or
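The lookup rule for getAttribute (case-insensitive name, null when the attribute is absent or stand-alone) can be modelled without the library. The following is a minimal sketch and not HTMLParser's implementation: the class AttributeLookupSketch and its nested Attr type are hypothetical stand-ins for the attribute vector, and skipping slot zero (the tag name) during the search is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a tag's attribute list; not part of org.htmlparser. */
final class AttributeLookupSketch {
    /** A name/value pair; value is null for a stand-alone attribute. */
    static final class Attr {
        final String name;
        final String value;

        Attr(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Element zero models the tag name; later elements model real attributes.
    private final List<Attr> attributes = new ArrayList<>();

    AttributeLookupSketch(String tagName) {
        attributes.add(new Attr(tagName, null));
    }

    void add(String name, String value) {
        attributes.add(new Attr(name, value));
    }

    /** Returns the value for name (ignoring case), or null if absent or stand-alone. */
    String getAttribute(String name) {
        // Assumption: start at 1 so the tag name in slot zero is never matched.
        for (int i = 1; i < attributes.size(); i++) {
            Attr a = attributes.get(i);
            if (a.name.equalsIgnoreCase(name)) {
                return a.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        AttributeLookupSketch tag = new AttributeLookupSketch("a");
        tag.add("HREF", "index.html");
        tag.add("download", null); // stand-alone attribute, no value
        System.out.println(tag.getAttribute("href"));
        System.out.println(tag.getAttribute("download"));
        System.out.println(tag.getAttribute("title"));
    }
}
```

Because a stand-alone attribute and a missing attribute both yield null in this model, a caller that needs to tell them apart would look up the attribute object itself, which is the role getAttributeEx plays in the real API.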
-
setAttribute
Set attribute with given key, value pair. Figures out a quote character to use if necessary.
- Specified by:
setAttribute in interface Tag
- Parameters:
key - The name of the attribute.
value - The value of the attribute.
-
removeAttribute
Remove the attribute with the given key, if it exists.
- Specified by:
removeAttribute in interface Tag
- Parameters:
key - The name of the attribute.
-
setAttribute
Set attribute with given key, value pair where the value is quoted by quote.
- Specified by:
setAttribute in interface Tag
- Parameters:
key - The name of the attribute.
value - The value of the attribute.
quote - The quote character to be used around value. If zero, it is an unquoted value.
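The quote parameter is a char, so "zero" here means the character with numeric value 0, not the digit '0'. As a rough illustration of that convention, the sketch below renders a key/value pair with an optional quote character. QuoteSketch and its render method are hypothetical names introduced for this example; they are not the library's rendering code.

```java
/** Hypothetical helper showing how a quote character of zero can mean "unquoted". */
final class QuoteSketch {
    /** Renders key=value, wrapping value in quote unless quote is the NUL character. */
    static String render(String key, String value, char quote) {
        StringBuilder sb = new StringBuilder(key).append('=');
        if (quote != 0) {
            sb.append(quote);
        }
        sb.append(value);
        if (quote != 0) {
            sb.append(quote);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("width", "100", (char) 0)); // unquoted value
        System.out.println(render("alt", "a b", '"'));        // double-quoted value
        System.out.println(render("title", "say \"hi\"", '\'')); // single-quoted value
    }
}
```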
-
getAttributeEx
Returns the attribute with the given name.
- Specified by:
getAttributeEx in interface Tag
- Parameters:
name - Name of attribute, case insensitive.
- Returns:
The attribute or null if it does not exist.
-
setAttributeEx
Set an attribute.
- Specified by:
setAttributeEx in interface Tag
- Parameters:
attribute - The attribute to set.
-
setAttribute
Set an attribute. This replaces an attribute of the same name. To set the zeroth attribute (the tag name), use setTagName().
- Parameters:
attribute - The attribute to set.
-
getAttributesEx
Gets the attributes in the tag.
- Specified by:
getAttributesEx in interface Tag
- Returns:
The list of Attributes in the tag. The first element is the tag name, subsequent elements being either whitespace or real attributes.
-
getTagName
Return the name of this tag.
Note: This value is converted to uppercase and does not begin with "/" if it is an end tag. Nor does it end with a slash in the case of an XML type tag. To get at the original text of the tag name use getRawTagName(). The conversion to uppercase is performed with an ENGLISH locale.
- Specified by:
getTagName in interface Tag
- Returns:
The tag name.
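The normalization described for getTagName can be sketched in a few lines of plain Java. This is an illustration of the documented rules only (strip a leading "/", strip a trailing "/", uppercase with an English locale); TagNameSketch and normalize are hypothetical names, and the library's own code may handle edge cases differently.

```java
import java.util.Locale;

/** Hypothetical sketch of the tag-name normalization rules; not library code. */
final class TagNameSketch {
    /** Turns a raw tag name such as "/html" or "br/" into its normalized form. */
    static String normalize(String raw) {
        String name = raw;
        if (name.startsWith("/")) {
            name = name.substring(1); // end tag marker
        }
        if (name.endsWith("/")) {
            name = name.substring(0, name.length() - 1); // XML-style empty tag marker
        }
        // A fixed locale keeps the result independent of the JVM's default locale.
        return name.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(normalize("/html"));
        System.out.println(normalize("br/"));
        System.out.println(normalize("title"));
    }
}
```

Fixing the locale matters because String.toUpperCase() with no argument uses the default locale; under a Turkish locale, for example, a lowercase "i" uppercases to a dotted capital "İ", so "title" would no longer compare equal to "TITLE".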
-
getRawTagName
Return the name of this tag.
- Specified by:
getRawTagName in interface Tag
- Returns:
The tag name or null if this tag contains nothing or only whitespace.
-
setTagName
Set the name of this tag. This creates or replaces the first attribute of the tag (the zeroth element of the attribute vector).
- Specified by:
setTagName in interface Tag
- Parameters:
name - The tag name.
-
getText
Return the text contained in this tag.
- Specified by:
getText in interface Node
- Overrides:
getText in class AbstractNode
- Returns:
The complete contents of the tag (within the angle brackets).
-
setAttributesEx
Sets the attributes. NOTE: Values of the extended hashtable are two element arrays of String, with the first element being the original name (not uppercased), and the second element being the value.
- Specified by:
setAttributesEx in interface Tag
- Parameters:
attribs - The attribute collection to set.
-
setTagBegin
public void setTagBegin(int tagBegin)
Sets the nodeBegin.
- Parameters:
tagBegin - The nodeBegin to set.
-
getTagBegin
public int getTagBegin()
Gets the nodeBegin.
- Returns:
The nodeBegin value.
-
setTagEnd
public void setTagEnd(int tagEnd)
Sets the nodeEnd.
- Parameters:
tagEnd - The nodeEnd to set.
-
getTagEnd
public int getTagEnd()
Gets the nodeEnd.
- Returns:
The nodeEnd value.
-
setText
Parses the given text to create the tag contents.
- Specified by:
setText in interface Node
- Overrides:
setText in class AbstractNode
- Parameters:
text - A string of the form <TAGNAME xx="yy">.
-
toPlainTextString
Get the plain text from this node.
- Specified by:
toPlainTextString in interface Node
- Specified by:
toPlainTextString in class AbstractNode
- Returns:
An empty string (tag contents do not display in a browser). If you want this tag's HTML equivalent, use toHtml().
-
toHtml
Render the tag as HTML. A call to a tag's toHtml() method will render it in HTML.
- Specified by:
toHtml in interface Node
- Specified by:
toHtml in class AbstractNode
- Parameters:
verbatim - If true, return as close to the original page text as possible.
- Returns:
The tag as an HTML fragment.
-
toString
Print the contents of the tag.
- Specified by:
toString in interface Node
- Specified by:
toString in class AbstractNode
- Returns:
A string describing the tag. For text that looks like HTML use toHtml().
-
breaksFlow
public boolean breaksFlow()
Determines if the given tag breaks the flow of text.
- Specified by:
breaksFlow in interface Tag
- Returns:
true if following text would start on a new line, false otherwise.
-
accept
Default tag visiting code. Based on isEndTag(), calls either visitTag() or visitEndTag().
- Specified by:
accept in interface Node
- Specified by:
accept in class AbstractNode
- Parameters:
visitor - The visitor that is visiting this node.
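The dispatch that accept performs can be pictured with a small self-contained model. The sketch below is an assumption-laden illustration, not the library's code: VisitorSketch, its Visitor interface, SketchTag, and Recorder are all hypothetical types that mirror only the rule stated above (end tags go to visitEndTag, everything else to visitTag).

```java
/** Hypothetical model of visitor dispatch keyed on isEndTag(); not library code. */
final class VisitorSketch {
    /** Stand-in for a visitor with separate callbacks for start and end tags. */
    interface Visitor {
        void visitTag(SketchTag tag);
        void visitEndTag(SketchTag tag);
    }

    /** Stand-in for a tag that knows whether it is an end tag. */
    static final class SketchTag {
        final String name;
        final boolean endTag;

        SketchTag(String name, boolean endTag) {
            this.name = name;
            this.endTag = endTag;
        }

        boolean isEndTag() {
            return endTag;
        }

        /** Routes the visitor to the callback that matches the tag kind. */
        void accept(Visitor visitor) {
            if (isEndTag()) {
                visitor.visitEndTag(this);
            } else {
                visitor.visitTag(this);
            }
        }
    }

    /** Visitor that records what it saw, so the dispatch can be observed. */
    static final class Recorder implements Visitor {
        final StringBuilder log = new StringBuilder();

        @Override
        public void visitTag(SketchTag tag) {
            log.append('<').append(tag.name).append('>');
        }

        @Override
        public void visitEndTag(SketchTag tag) {
            log.append("</").append(tag.name).append('>');
        }
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        new SketchTag("HTML", false).accept(recorder);
        new SketchTag("HTML", true).accept(recorder);
        System.out.println(recorder.log);
    }
}
```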
-
isEmptyXmlTag
public boolean isEmptyXmlTag()
Is this an empty xml tag of the form <tag/>.
- Specified by:
isEmptyXmlTag in interface Tag
- Returns:
true if the last character of the last attribute is a '/'.
-
setEmptyXmlTag
public void setEmptyXmlTag(boolean emptyXmlTag)
Set this tag to be an empty xml node, or not. Adds or removes an ending slash on the tag.
- Specified by:
setEmptyXmlTag in interface Tag
- Parameters:
emptyXmlTag - If true, ensures there is an ending slash in the node, i.e. <tag/>, otherwise removes it.
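The pair isEmptyXmlTag / setEmptyXmlTag amounts to testing for, adding, or removing a trailing slash on the last element of the attribute list. The sketch below models that with a plain list of strings. It is a simplified illustration under stated assumptions: EmptyXmlTagSketch is a hypothetical class, the list of strings stands in for the attribute vector, and appending the slash as its own element is a choice made for this example rather than a claim about how the library stores it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the trailing-slash rule for empty XML tags; not library code. */
final class EmptyXmlTagSketch {
    /** True if the last character of the last element is a slash. */
    static boolean isEmptyXmlTag(List<String> parts) {
        if (parts.isEmpty()) {
            return false;
        }
        String last = parts.get(parts.size() - 1);
        return !last.isEmpty() && last.charAt(last.length() - 1) == '/';
    }

    /** Ensures a trailing slash is present (true) or absent (false). */
    static void setEmptyXmlTag(List<String> parts, boolean emptyXmlTag) {
        boolean hasSlash = isEmptyXmlTag(parts);
        if (emptyXmlTag && !hasSlash) {
            parts.add("/"); // assumption: the slash becomes its own trailing element
        } else if (!emptyXmlTag && hasSlash) {
            int lastIndex = parts.size() - 1;
            String last = parts.get(lastIndex);
            String trimmed = last.substring(0, last.length() - 1);
            if (trimmed.isEmpty()) {
                parts.remove(lastIndex); // the element was only the slash
            } else {
                parts.set(lastIndex, trimmed);
            }
        }
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>(Arrays.asList("br"));
        setEmptyXmlTag(parts, true);
        System.out.println(parts + " " + isEmptyXmlTag(parts));
        setEmptyXmlTag(parts, false);
        System.out.println(parts + " " + isEmptyXmlTag(parts));
    }
}
```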
-
isEndTag
public boolean isEndTag()
Predicate to determine if this tag is an end tag (i.e. </HTML>).
-
getStartingLineNumber
public int getStartingLineNumber()
Get the line number where this tag starts.
- Specified by:
getStartingLineNumber in interface Tag
- Returns:
The (zero based) line number in the page where this tag starts.
-
getEndingLineNumber
public int getEndingLineNumber()
Get the line number where this tag ends.
- Specified by:
getEndingLineNumber in interface Tag
- Returns:
The (zero based) line number in the page where this tag ends.
-
getIds
Return the set of names handled by this tag. Since this is a generic tag, it has no ids.
-
getEnders
Return the set of tag names that cause this tag to finish. These are the normal (non end tags) that if encountered while scanning (a composite tag) will cause the generation of a virtual tag. Since this is a non-composite tag, the default is no enders.
-
getEndTagEnders
Return the set of end tag names that cause this tag to finish. These are the end tags that if encountered while scanning (a composite tag) will cause the generation of a virtual tag. Since this is a non-composite tag, it has no end tag enders.
- Specified by:
getEndTagEnders in interface Tag
- Returns:
The names of following end tags that stop further scanning.
-
getThisScanner
Return the scanner associated with this tag.
- Specified by:
getThisScanner in interface Tag
- Returns:
The scanner associated with this tag.
-
setThisScanner
Set the scanner associated with this tag.
- Specified by:
setThisScanner in interface Tag
- Parameters:
scanner - The scanner for this tag.
-
getEndTag
Get the end tag for this (composite) tag. For a non-composite tag this always returns null.
-
setEndTag
Set the end tag for this (composite) tag. For a non-composite tag this is a no-op.
-