<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<!--
 HTMLParser Library $Name: v1_6_20060319 $ - A java-based parser for HTML
 http://sourceforge.org/projects/htmlparser
 Copyright (C) 2004 Somik Raha

 Revision Control Information

 $Source: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/package.html,v $
 $Author: derrickoswald $
 $Date: 2005/04/05 00:48:12 $
 $Revision: 1.22 $

 This library is free software; you can redistribute it and/or
 modify it under the terms of the GNU Lesser General Public
 License as published by the Free Software Foundation; either
 version 2.1 of the License, or (at your option) any later version.

 This library is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 Lesser General Public License for more details.

 You should have received a copy of the GNU Lesser General Public
 License along with this library; if not, write to the Free Software
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
-->
</head>
<body>
The basic API classes which will be used by most developers when working with
the HTML Parser.
<p>The {@link org.htmlparser.Parser} class is the main high-level class that
provides simplified access to the contents of an HTML page.
A wide range of methods is available to customize the operation of the Parser,
as well as to access specific pieces of the page as
{@link org.htmlparser.Node Nodes}.</p>
<p>The {@link org.htmlparser.NodeFactory} interface specifies the requirements
for a developer to have the Parser or Lexer generate nodes. Three types of
nodes are required: {@link org.htmlparser.Text}, {@link org.htmlparser.Remark}
and {@link org.htmlparser.Tag Tags}. Tags contain lists
of child nodes and {@link org.htmlparser.Attribute attributes}.</p>
<p>The only provided implementation of the NodeFactory interface
is the {@link org.htmlparser.PrototypicalNodeFactory}, which
operates by holding example nodes and cloning them as needed to satisfy the
requests for nodes by the Parser. By default, a Lexer is its own NodeFactory,
returning new {@link org.htmlparser.nodes.TextNode},
{@link org.htmlparser.nodes.RemarkNode} and undifferentiated
{@link org.htmlparser.nodes.TagNode TagNodes} (see the
{@link org.htmlparser.nodes nodes} package), but when the parser uses a lexer
it replaces this behaviour with a PrototypicalNodeFactory to return a rich
set of specific tags (see the {@link org.htmlparser.tags tags} package).</p>
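<p>The prototype-and-clone approach described above can be sketched in plain
Java. This is a minimal illustration, not the library's implementation:
<code>SketchTag</code>, <code>SketchLinkTag</code> and
<code>SketchFactory</code> are hypothetical stand-ins invented for this
example, and the upper-case name lookup is an assumption.</p>

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Stand-in types for illustration only; these are NOT the org.htmlparser classes.
class PrototypeFactorySketch {

    /** A minimal cloneable tag node (hypothetical). */
    static class SketchTag implements Cloneable {
        String name;

        SketchTag(String name) {
            this.name = name;
        }

        @Override
        public SketchTag clone() {
            try {
                // Object.clone() preserves the runtime class of the prototype.
                return (SketchTag) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // unreachable: Cloneable is implemented
            }
        }
    }

    /** A specialised tag, analogous to the specific tags a rich factory returns. */
    static class SketchLinkTag extends SketchTag {
        SketchLinkTag() {
            super("A");
        }
    }

    /** Holds one example node per tag name and clones it on request. */
    static class SketchFactory {
        private final Map<String, SketchTag> prototypes = new HashMap<>();

        void register(SketchTag prototype) {
            prototypes.put(prototype.name.toUpperCase(Locale.ROOT), prototype);
        }

        SketchTag createTag(String name) {
            String key = name.toUpperCase(Locale.ROOT);
            SketchTag prototype = prototypes.get(key);
            if (prototype == null) {
                return new SketchTag(key); // undifferentiated fallback node
            }
            return prototype.clone();
        }
    }

    public static void main(String[] args) {
        SketchFactory factory = new SketchFactory();
        factory.register(new SketchLinkTag());
        System.out.println(factory.createTag("a").getClass().getSimpleName());
        System.out.println(factory.createTag("div").getClass().getSimpleName());
    }
}
```

<p>Because <code>Object.clone()</code> preserves the runtime class, registering
a specialised prototype is enough for the factory to hand back specialised
copies; names with no registered prototype fall back to the generic node.</p>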
<p>The {@link org.htmlparser.NodeFilter} interface is used by the filtering
code to determine whether a node meets given criteria. Some generic examples of
filters can be found in the {@link org.htmlparser.filters filters} package.</p>
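<p>The filtering idea can be sketched as a single-method predicate applied to
each node. The types below (<code>SketchNode</code>, <code>SketchFilter</code>
and the helper methods) are hypothetical stand-ins written for this example,
not the classes in the filters package; they only illustrate how an
accept-or-reject test and a simple combinator fit together.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; these are NOT the org.htmlparser classes.
class NodeFilterSketch {

    /** A minimal node: a tag name (null for plain text) plus its text. */
    static class SketchNode {
        final String tagName;
        final String text;

        SketchNode(String tagName, String text) {
            this.tagName = tagName;
            this.text = text;
        }
    }

    /** Decides whether a node meets some criteria. */
    interface SketchFilter {
        boolean accept(SketchNode node);
    }

    /** Accepts nodes whose tag name matches, ignoring case. */
    static SketchFilter tagNamed(String name) {
        return node -> node.tagName != null && node.tagName.equalsIgnoreCase(name);
    }

    /** Accepts nodes whose text contains the given fragment. */
    static SketchFilter textContains(String fragment) {
        return node -> node.text != null && node.text.contains(fragment);
    }

    /** Accepts only the nodes that both filters accept. */
    static SketchFilter and(SketchFilter left, SketchFilter right) {
        return node -> left.accept(node) && right.accept(node);
    }

    /** Returns the nodes accepted by the filter, in their original order. */
    static List<SketchNode> select(List<SketchNode> nodes, SketchFilter filter) {
        List<SketchNode> accepted = new ArrayList<>();
        for (SketchNode node : nodes) {
            if (filter.accept(node)) {
                accepted.add(node);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<SketchNode> nodes = Arrays.asList(
                new SketchNode("A", "home page"),
                new SketchNode(null, "plain text"),
                new SketchNode("a", "contact"));
        SketchFilter filter = and(tagNamed("a"), textContains("home"));
        for (SketchNode node : select(nodes, filter)) {
            System.out.println(node.tagName + ": " + node.text);
        }
    }
}
```

<p>Keeping the filter to one method makes small filters easy to combine, as the
<code>and</code> helper shows.</p>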
</body>
</html>