Farid Hajji: Perl - Einführung, Anwendungen, Referenz
2nd edition, updated and expanded
Addison-Wesley Longman, ISBN 3-8273-1535-2
Tie/URL.pm
# Tie/URL.pm -- TIEHASH class for fetching URLs via the LWP library.

package Tie::URL;

use strict;
use vars qw($VERSION @ISA);

use LWP::UserAgent;
use HTTP::Request;      # used directly in FETCH
use Tie::Hash;

$VERSION = "0.01";
@ISA     = qw(Tie::StdHash);

use constant USERAGENT => 'TIE-URL/0.1';
use constant RTIMEOUT  => 60;   # seconds before a request times out

sub TIEHASH {
    my $class  = shift;
    my %params = (
        UserAgentName => USERAGENT,
        Timeout       => RTIMEOUT,
        @_);

    # To be on the safe side, we insist on this parameter!
    die "Tie::URL: Must specify UserMail Parameter!\n"
        unless exists $params{'UserMail'};

    my $ho = bless({}, $class);
    $ho->{'_ua'} = LWP::UserAgent->new();
    $ho->{'_ua'}->agent($params{'UserAgentName'});
    $ho->{'_ua'}->from($params{'UserMail'});

    # The two special parameters ProxyOff and CacheOff
    # switch off the proxy and the cache.
    unless (exists $params{'ProxyOff'}) {
        $ho->{'_ua'}->proxy('http', $params{'ProxyURL'})
            if defined $params{'ProxyURL'};
        $ho->{'_ua'}->no_proxy($params{'ProxyNO'})
            if defined $params{'ProxyNO'};
    }

    unless (exists $params{'CacheOff'}) {
        $ho->{'_cache'} = 1;
    }

    # If a timeout is given, it should be honored.
    if (exists $params{'Timeout'}) {
        $ho->{'_ua'}->timeout($params{'Timeout'});
    }

    return $ho;
}

sub FETCH {
    my $self = shift;
    my $url  = shift;

    # First check whether the URL is already in the cache...
    return $self->{$url} if exists $self->{$url};

    # Now fetch the URL.
    my $resp = $self->{'_ua'}->request(HTTP::Request->new('GET', $url));
    my $cont = $resp->is_success() ? $resp->content()
                                   : $resp->error_as_HTML();

    # Store the result in the cache, if the cache is switched on.
    $self->{$url} = $cont if exists $self->{'_cache'};

    # Return the value.
    return $cont;
}

sub STORE {
    warn "Sorry, HTTP PUT/POST Methods not yet implemented!\n";
}

1;
__END__

=head1 NAME

Tie::URL - Tie Hashes to URLs

=head1 SYNOPSIS

  use Tie::URL;

  tie %url, 'Tie::URL', UserMail => 'user@somewhere.org';

  tie %url, 'Tie::URL', UserMail => 'user@somewhere.org',
                        Timeout  => $time_to_wait,
                        CacheOff => 1,
                        ProxyOff => 1;

  tie %url, 'Tie::URL', UserMail => 'user@somewhere.org',
                        ProxyURL => 'http://proxy.isp.org:8080/',
                        ProxyNO  => 'isp.org';

=head1 DESCRIPTION

This module provides a TIEHASH class that ties a hash to URLs, using
the LWP::UserAgent module as the backend to fetch them.

Once a hash has been tied to this class, accessing a URL is simple:
read the value of the hash, using the wanted URL as the key. The value
is the content returned by the server for that URL or, if the request
fails, an HTML-formatted error message.

Responses are normally cached in memory inside the tied hash, so that
subsequent reads of the same URL are not sent to the server again.
Error messages are cached as well. The cache can be turned off by
adding the CacheOff parameter to the tie() call.

tie() accepts the following parameters:

=over

=item UserMail

The mail address of the user running this program.
This is the only mandatory parameter.

=item Timeout

The time to wait for the server to reply, in seconds. Defaults
to RTIMEOUT (60 seconds). You may wish to set this to a higher value
on heavily loaded networks or servers.

=item CacheOff

If this parameter is specified, with any value, every read of a URL
is sent to the server, even if the same URL was requested before. No
cache is maintained in memory.

When the cache is enabled, it currently keeps all responses in memory,
regardless of their Expires: HTTP header. This could change in the
future.

=item ProxyOff

A proxy server is normally used as soon as you set the ProxyURL
parameter, optionally together with the ProxyNO parameter. Adding
ProxyOff turns the use of proxy servers off, even if those parameters
are set.

=item ProxyURL

The URL of the proxy server to use for http requests. This is
sometimes necessary to pass through firewalls or to profit from the
caching proxy of your ISP, thus reducing waiting time and conserving
network bandwidth.

=item ProxyNO

Do not use the proxy server for hosts in the domain specified by this
parameter.

=back

keys() and related hash functions return the contents of the
cache, which holds no URLs if the cache has been disabled with
CacheOff. Note that keys() won't return all the URLs worldwide :-)

=head1 KNOWN BUGS

Caching should be smarter and use the HTTP Expires: headers provided
by the responding server.

Support for cookies is currently disabled, but could be added in
the TIEHASH constructor, as LWP::UserAgent supports cookie jars.

Writing a value to a tied hash only results in a warning being issued.
An HTTP POST or HTTP PUT request is NOT sent to the server.
This could be implemented in the future.

keys(), each() and values() also return the special keys
_ua and _cache, which are not URLs but an internal object and an
internal marker, respectively.

=head1 AUTHOR

Farid Hajji <farid.hajji@ob.kamp.net>

This module is copylefted under the same terms as Perl itself. See
the GNU General Public License, version 2, or the Artistic License.

=cut
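
The documentation above describes two things: reads are served from an in-memory cache, and keys() also returns the internal _ua and _cache entries. Both can be tried out without network access. The following is a minimal sketch, not part of Tie::URL. The class name Tie::CachedLookup and its Fetcher parameter are hypothetical, and a code reference stands in for the LWP request. The sketch keeps the cache in its own sub-hash, which is one possible way to keep internal entries out of keys().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical demo class (not part of Tie::URL): the same
# FETCH-with-cache idea, but bookkeeping and cache live in separate
# slots of the object, so keys() only ever sees cached URLs.
package Tie::CachedLookup;

sub TIEHASH {
    my ($class, %params) = @_;
    die "Tie::CachedLookup: Must specify Fetcher parameter!\n"
        unless ref $params{Fetcher} eq 'CODE';
    my $self = {
        fetcher => $params{Fetcher},            # stand-in for the LWP request
        caching => !exists $params{CacheOff},   # same switch name as Tie::URL
        cache   => {},                          # URL => content
    };
    return bless $self, $class;
}

sub FETCH {
    my ($self, $url) = @_;

    # Serve from the cache if this URL was fetched before.
    return $self->{cache}{$url} if exists $self->{cache}{$url};

    my $content = $self->{fetcher}->($url);
    $self->{cache}{$url} = $content if $self->{caching};
    return $content;
}

sub STORE { warn "Tie::CachedLookup: storing is not implemented\n" }

# All remaining hash operations work on the cache only.
sub EXISTS   { exists $_[0]{cache}{$_[1]} }
sub DELETE   { delete $_[0]{cache}{$_[1]} }
sub CLEAR    { %{ $_[0]{cache} } = () }
sub FIRSTKEY { my $reset = keys %{ $_[0]{cache} }; each %{ $_[0]{cache} } }
sub NEXTKEY  { each %{ $_[0]{cache} } }

package main;

my $calls = 0;
tie my %page, 'Tie::CachedLookup',
    Fetcher => sub { $calls++; return "content of $_[0]" };

my $first  = $page{'http://www.example.org/'};
my $second = $page{'http://www.example.org/'};   # served from the cache

print "fetcher calls: $calls\n";
print "same content: ", ($first eq $second ? "yes" : "no"), "\n";
print "cached keys: @{[ sort keys %page ]}\n";
```

With the cache enabled, the fetcher should be called only once for the two reads, and keys() should list only the URL. Passing CacheOff => 1 to tie() makes every read call the fetcher again and leaves keys() empty.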