Changeset 68

Timestamp: 06/06/01 12:27:49
Author: miyagawa
Message: use Email::Find 0.04

Files:

  • Apache-AntiSpam/trunk/Changes

--- Apache-AntiSpam/trunk/Changes (r61)
+++ Apache-AntiSpam/trunk/Changes (r68)
@@ -1,3 +1,6 @@
 Revision history for Perl extension Apache::AntiSpam.
+
+0.03  Wed Jun  6 12:24:31 JST 2001
+        * use Email::Find 0.04;
 
 0.02  Thu May 10 18:58:21 JST 2001
  • Apache-AntiSpam/trunk/README

--- Apache-AntiSpam/trunk/README (r61)
+++ Apache-AntiSpam/trunk/README (r68)
@@ -48,20 +48,13 @@
     *   remove mailto: tags using HTML::Parser.
 
-    *   Make it easy to subclass so can be configured address munging.
-
-CAVEATS
-    Email::Find 0.0[23] may take up to half-an-hour or so to extract emails
-    in complex documents, which can't be used for this kind of usage. (You
-    can't wait for an hour in front of the browser!)
-
-    Thus, Apache::Antispam localizes regex used by find_email() to more
-    speedy version if Email::Find's VERSION is lower than 0.03.
-
-    Email::Find 0.04, which Michael G Schwern is now working on, will solve
-    this problem of parsing speed.
+    *   Make it easy to subclass so that the antispamming method can be
+        configured.
 
 ACKNOWLEDGEMENTS
     The idea of this module is stolen from Apache::AddrMunge by Mark J
     Dominus. See http://perl.plover.com/AddrMunge/ for details.
+
+    Many thanks to Michael G. Schwern for kindly improving the matching
+    speed of Email::Find.
 
 AUTHOR
  • Apache-AntiSpam/trunk/lib/Apache/AntiSpam.pm

--- Apache-AntiSpam/trunk/lib/Apache/AntiSpam.pm (r61)
+++ Apache-AntiSpam/trunk/lib/Apache/AntiSpam.pm (r68)
@@ -3,13 +3,9 @@
 use strict;
 use vars qw($VERSION);
-$VERSION = '0.02';
+$VERSION = '0.03';
 
 use Apache::Constants qw(:common);
 use Apache::File;
-use Email::Find; # 0.04;
-
-# make compiler aware of constant
-use constant EMAIL_FIND_VERSION => $Email::Find::VERSION;
-use vars qw($ADDR_SPEC);
+use Email::Find 0.04;
 
 sub handler {
@@ -57,7 +53,4 @@
     $r->send_http_header;
 
-    # XXX encapsulation broken!
-    local $Email::Find::Addr_spec_re = $ADDR_SPEC
-        unless EMAIL_FIND_VERSION >= 0.04;
     local $/;           # slurp
     my $input = <$fh>;
@@ -67,21 +60,4 @@
     return OK;
 }
-
-BEGIN {
-    $ADDR_SPEC =<<'REGEX';
-(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\
-\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][
-^\\\x80-\xff\n\015"]*)*")(?:\.(?:[^(\040)<>@,;:".\\\[\]\000-\037\x
-80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\
-xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*"))*@(?:[^(\0
-40)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000
--\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])
-(?:\.(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;
-:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x
-80-\xff])*\]))*
-REGEX
-    ;
-    $ADDR_SPEC =~ s/\n//g;
-}
 
 1;
@@ -170,20 +146,11 @@
 =back
 
-=head1 CAVEATS
-
-Email::Find 0.0[23] may take up to half-an-hour or so to extract
-emails in complex documents, which can't be used for this kind of
-usage. (You can't wait for an hour in front of the browser!)
-
-Thus, Apache::Antispam localizes regex used by find_email() to more
-speedy version if Email::Find's VERSION is lower than 0.03.
-
-Email::Find 0.04, which Michael G. Schwern is now working on, will
-solve this problem of parsing speed.
-
 =head1 ACKNOWLEDGEMENTS
 
 The idea of this module is stolen from Apache::AddrMunge by Mark J
 Dominus.  See http://perl.plover.com/AddrMunge/ for details.
+
+Many thanks to Michael G. Schwern for kindly improving the matching
+speed of Email::Find.
 
 =head1 AUTHOR