<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<!--
 HTMLParser Library $Name: v1_6_20060319 $ - A java-based parser for HTML
 http://sourceforge.org/projects/htmlparser
 Copyright (C) 2004 Derrick Oswald

 Revision Control Information

 $Source: /cvsroot/htmlparser/htmlparser/src/org/htmlparser/http/package.html,v $
 $Author: derrickoswald $
 $Date: 2005/06/19 12:01:14 $
 $Revision: 1.3 $

 This library is free software; you can redistribute it and/or
 modify it under the terms of the GNU Lesser General Public
 License as published by the Free Software Foundation; either
 version 2.1 of the License, or (at your option) any later version.

 This library is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 Lesser General Public License for more details.

 You should have received a copy of the GNU Lesser General Public
 License along with this library; if not, write to the Free Software
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
-->
</head>
<body>
The http package is responsible for HTTP connections to servers.
The Lexer and Parser provide many ways to supply text to be parsed,
but this package deals only with the case where a URL is supplied as a
string, with the expectation that the Lexer or Parser will perform
the HTTP connection.
<p>The {@link org.htmlparser.http.ConnectionManager} class adds
<ul>
<li>cookie</li>
<li>proxy</li>
<li>password-protected URL</li>
</ul>
capabilities when accessing the internet via the
<a href="http://www.ietf.org/rfc/rfc2616.txt">HTTP protocol</a>.
Each of these capabilities requires conditioning the HTTP connection
before it is opened.
An HTTP header utility class is also included.
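<p>As a rough illustration of what conditioning a connection can involve,
the sketch below builds the request headers usually associated with cookies,
password-protected URLs and authenticating proxies. It uses only the standard
Java library. The class <code>ConditioningSketch</code> and its methods are
hypothetical and are not part of this package, and the use of HTTP Basic
credentials is an assumption about common practice rather than a statement
about the ConnectionManager source.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of org.htmlparser.http.
class ConditioningSketch
{
    // Encode "user:password" as an HTTP Basic credential (an assumed scheme).
    static String basic (String user, String password)
    {
        byte[] raw = (user + ":" + password).getBytes (StandardCharsets.UTF_8);
        return ("Basic " + Base64.getEncoder ().encodeToString (raw));
    }

    // Collect the headers that would be set on a request before connecting.
    // Any argument may be null, in which case the matching header is omitted.
    static Map<String, String> headers (String cookies, String user,
        String password, String proxyUser, String proxyPassword)
    {
        Map<String, String> ret = new LinkedHashMap<String, String> ();
        if (null != cookies)
            ret.put ("Cookie", cookies);
        if ((null != user) && (null != password))
            ret.put ("Authorization", basic (user, password));
        if ((null != proxyUser) && (null != proxyPassword))
            ret.put ("Proxy-Authorization", basic (proxyUser, proxyPassword));
        return (ret);
    }

    public static void main (String[] args)
    {
        Map<String, String> result = headers ("USER=FreddyBaby",
            "FredB", "holy$cow", "FredBarnes", "secret");
        for (Map.Entry<String, String> entry : result.entrySet ())
            System.out.println (entry.getKey () + ": " + entry.getValue ());
    }
}
```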
<p>The {@link org.htmlparser.http.ConnectionMonitor} interface is a callback
mechanism by which the ConnectionManager notifies an interested application
just before and just after an HTTP connection is made. Possible uses include
conditioning the connection further, accessing HTTP header information, or
providing reporting or statistical functions. Callbacks are not performed for
FileURLConnections, which are also handled by the connection manager.
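<p>The callback pattern described above can be sketched with the standard
Java library alone. Everything named here (<code>MonitorSketch</code>,
<code>Monitor</code>, <code>Connector</code>, <code>open</code>,
<code>trace</code>) is hypothetical and only mirrors the description; it is
not the code in this package. The connect step is injected so that the sketch
runs without network access.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these are not the org.htmlparser.http types.
class MonitorSketch
{
    // Callback pair, modelled loosely on the description above.
    interface Monitor
    {
        void preConnect (HttpURLConnection connection);
        void postConnect (HttpURLConnection connection);
    }

    // The step that actually opens the link; injected so the sketch runs offline.
    interface Connector
    {
        void connect (URLConnection connection) throws IOException;
    }

    static URLConnection open (String url, Monitor monitor, Connector connector)
        throws IOException
    {
        // openConnection() only creates the object; nothing is sent yet
        URLConnection connection = URI.create (url).toURL ().openConnection ();
        // only HTTP connections trigger callbacks; file: connections do not
        boolean notify = (null != monitor) && (connection instanceof HttpURLConnection);
        if (notify)
            monitor.preConnect ((HttpURLConnection)connection);
        connector.connect (connection);
        if (notify)
            monitor.postConnect ((HttpURLConnection)connection);
        return (connection);
    }

    // Open each URL with a no-op connector and record the callbacks that fire.
    static List<String> trace (String... urls)
    {
        final List<String> events = new ArrayList<String> ();
        Monitor monitor = new Monitor ()
        {
            public void preConnect (HttpURLConnection connection)
            {
                events.add ("pre " + connection.getURL ());
            }
            public void postConnect (HttpURLConnection connection)
            {
                events.add ("post " + connection.getURL ());
            }
        };
        try
        {
            for (String url : urls)
                open (url, monitor, connection -> { });
        }
        catch (IOException ioe)
        {
            throw new UncheckedIOException (ioe);
        }
        return (events);
    }

    public static void main (String[] args)
    {
        for (String event : trace ("http://example.com/", "file:///tmp/page.html"))
            System.out.println (event);
    }
}
```

In real use the connector would call <code>connection.connect ()</code>; the
no-op version here only demonstrates the order in which the callbacks fire.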
<p>The {@link org.htmlparser.http.Cookie} class is a container for
cookie data sent in HTTP requests and received in HTTP responses. It may be
necessary to prime the ConnectionManager with cookies received via a
login procedure in order to access protected HTML content.
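<p>To make the idea of priming a cookie jar concrete, here is a minimal
sketch of a jar that stores cookies per domain and produces the value of a
<code>Cookie</code> request header for a given host. The class
<code>CookieJarSketch</code> is hypothetical, and its domain-matching rule (a
leading dot covers the domain and its subdomains) is a simplifying assumption,
not the matching logic of this package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the org.htmlparser.http.Cookie class.
class CookieJarSketch
{
    // each entry is { domain, name, value }
    private final List<String[]> cookies = new ArrayList<String[]> ();

    // Prime the jar with a cookie, for example one obtained from a login.
    void add (String domain, String name, String value)
    {
        cookies.add (new String[] { domain.toLowerCase (Locale.ROOT), name, value });
    }

    // Assumed rule: a leading dot covers the domain itself and its
    // subdomains; otherwise the host must match the domain exactly.
    static boolean matches (String domain, String host)
    {
        host = host.toLowerCase (Locale.ROOT);
        if (domain.startsWith ("."))
            return (host.endsWith (domain) || host.equals (domain.substring (1)));
        return (host.equals (domain));
    }

    // Build the Cookie header value for a host, or null if no cookie applies.
    String header (String host)
    {
        StringBuilder ret = new StringBuilder ();
        for (String[] cookie : cookies)
            if (matches (cookie[0], host))
            {
                if (0 != ret.length ())
                    ret.append ("; ");
                ret.append (cookie[1]).append ('=').append (cookie[2]);
            }
        return ((0 == ret.length ()) ? null : ret.toString ());
    }

    public static void main (String[] args)
    {
        CookieJarSketch jar = new CookieJarSketch ();
        jar.add ("www.freshmeat.net", "USER", "FreddyBaby");
        jar.add (".freshmeat.net", "PHPSESSID", "e5dbeb6152e70d99427f2458d8969f8b");
        System.out.println (jar.header ("www.freshmeat.net"));
        System.out.println (jar.header ("example.org"));
    }
}
```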
<p>
A typical use of this package might look something like this:
<pre>
import java.net.HttpURLConnection;
import org.htmlparser.Parser;
import org.htmlparser.http.ConnectionManager;
import org.htmlparser.http.ConnectionMonitor;
import org.htmlparser.http.Cookie;
import org.htmlparser.http.HttpHeader;

// get the connection manager shared by all parsers
ConnectionManager manager = Parser.getConnectionManager ();
// set up proxying
manager.setProxyHost ("proxyhost.mycompany.com");
manager.setProxyPort (8888);
manager.setProxyUser ("FredBarnes");
manager.setProxyPassword ("secret");
// set up cookies: one for a specific host, one for a whole domain
Cookie cookie = new Cookie ("USER", "FreddyBaby");
manager.setCookie (cookie, "www.freshmeat.net");
cookie = new Cookie ("PHPSESSID", "e5dbeb6152e70d99427f2458d8969f8b");
cookie.setDomain (".freshmeat.net");
manager.setCookie (cookie, null);
// set up security to access a password protected URL
manager.setUser ("FredB");
manager.setPassword ("holy$cow");
// set up (an inner class) for callbacks
ConnectionMonitor monitor = new ConnectionMonitor ()
    {
        // called just before the connection is made
        public void preConnect (HttpURLConnection connection)
        {
            System.out.println (HttpHeader.getRequestHeader (connection));
        }
        // called just after the connection is made
        public void postConnect (HttpURLConnection connection)
        {
            System.out.println (HttpHeader.getResponseHeader (connection));
        }
    };
manager.setMonitor (monitor);
// perform the connection
Parser parser = new Parser ("http://freshmeat.net");
</pre>
The ConnectionManager used by the Parser class is actually held by the
{@link org.htmlparser.lexer.Page#mConnectionManager Page} class.
It is accessible from the Parser (or the Page class) via
{@link org.htmlparser.Parser#getConnectionManager getConnectionManager()}.
It is a static (singleton) instance, so subsequent connections made by the
parser use the contents of the cookie jar from previous connections.
By default, cookie processing is not enabled. It can be enabled either by
setting a cookie or by calling
{@link org.htmlparser.http.ConnectionManager#setCookieProcessingEnabled setCookieProcessingEnabled()}.
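<p>The shared-instance behaviour and the two ways of enabling cookie
processing can be pictured with the following sketch. The class
<code>SharedManagerSketch</code> and all of its members are hypothetical; they
imitate the behaviour described above under simplifying assumptions and are
not the ConnectionManager implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the org.htmlparser.http.ConnectionManager.
class SharedManagerSketch
{
    // one instance shared by every caller, so the cookie jar persists
    private static final SharedManagerSketch INSTANCE = new SharedManagerSketch ();

    private final Map<String, String> jar = new LinkedHashMap<String, String> ();
    private boolean cookieProcessingEnabled = false; // off by default

    static SharedManagerSketch getInstance ()
    {
        return (INSTANCE);
    }

    boolean getCookieProcessingEnabled ()
    {
        return (cookieProcessingEnabled);
    }

    void setCookieProcessingEnabled (boolean enable)
    {
        cookieProcessingEnabled = enable;
    }

    // Setting a cookie explicitly also turns cookie processing on.
    void setCookie (String name, String value)
    {
        jar.put (name, value);
        cookieProcessingEnabled = true;
    }

    // Cookies arriving in a response are kept only when processing is enabled.
    void receive (String name, String value)
    {
        if (cookieProcessingEnabled)
            jar.put (name, value);
    }

    String get (String name)
    {
        return (jar.get (name));
    }

    int size ()
    {
        return (jar.size ());
    }

    public static void main (String[] args)
    {
        SharedManagerSketch manager = SharedManagerSketch.getInstance ();
        manager.receive ("IGNORED", "1");          // dropped: processing is off
        manager.setCookie ("USER", "FreddyBaby");  // enables processing
        SharedManagerSketch.getInstance ().receive ("PHPSESSID", "abc");
        System.out.println (manager.size () + " cookies in the shared jar");
    }
}
```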
</body>
</html>