
BLU Discuss list archive



Perl question



dsr at tao wrote:
>On Wed, May 21, 2003 at 04:52:40PM -0400, Eric Schwartz wrote:
>> I have a perl programming question, any help you guys could offer would be 
>> greatly appreciated.
>
>...

[okay up to here]

>
>> ($etapagerem) = $buffer
>
>1. There's no semicolon at the end of this statement.
>2. You are assigning a scalar to a list containing exactly one scalar,
>   which won't work because a scalar is not a list.

>
>>         =~ /BLACK CARTRIDGE\s*(?:<.*?>\s*)/s;
>
>Assuming you are continuing from the last line, you appear to be
>confused as to what constitutes a regexp. "man perlre" will explain them
>to you, if you read it carefully.

Assuming Eric is continuing the previous line, then #2 above is false.
In a scalar context, the (implicit) match operator returns true or
false depending on whether the regexp matched.  In a list context
(this case), it returns a list of the substrings captured by the
parenthesized subpatterns ($1, $2, ...).  Eric is doing the equivalent
of assigning $1 to $etapagerem after the regexp matching completes.
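
To make the context difference concrete, here is a minimal sketch.  The
sample string and the variable names $matched, $pages and $no_capture
are my own inventions for illustration; they are not from Eric's script
or from the printer's real output.

```perl
use strict;
use warnings;

# Hypothetical sample text; not taken from Eric's actual printer page.
my $buffer = "BLACK CARTRIDGE <b>42</b>";

# Scalar context: the match yields true or false.
my $matched = $buffer =~ /CARTRIDGE\s*<b>([0-9]+)/;

# List context: the match yields the captured substrings, so the
# parentheses on the left pick up the first capture (same as $1).
my ($pages) = $buffer =~ /CARTRIDGE\s*<b>([0-9]+)/;

# With no capturing parentheses, a successful list-context match
# returns the list (1) instead of any text.
my ($no_capture) = $buffer =~ /CARTRIDGE\s*<b>/;

print "scalar: $matched, list: $pages, no capture: $no_capture\n";
```

The parentheses around the variable on the left are what force list
context; without them the assignment would only record whether the
match succeeded.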

As for the regexp issue, I agree that his regexp is unlikely to do
anything useful.  However, I believe that it is well formed.  I
interpret it as follows:

/BLACK CARTRIDGE \s*   (?:<.*?>\s*)     /s;
 ^ exact string^ ^     ^                 ^
                 ^ 0 or more whitespace  ^
                       ^ a non-capturing subpattern which consists
                       of the shortest string which starts with '<',
                       ends with '>', and has 0 or more whitespace after it
                                         ^ allow '.' to match newline
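
The same reading can be written out with the /x modifier, which lets a
pattern carry its own comments.  This is only a sketch of my
interpretation: $eric_re and $sample are names I made up, and the
sample input is hypothetical.

```perl
use strict;
use warnings;

my $eric_re = qr/
    BLACK[ ]CARTRIDGE   # the exact string (space written as [ ] under x)
    \s*                 # 0 or more whitespace characters
    (?: < .*? > \s* )   # non-capturing group: shortest '<...>' run,
                        # then 0 or more whitespace characters
/xs;                    # s lets '.' match newline; x allows this layout

# Hypothetical input, only to show the pattern is well formed.
my $sample = "BLACK CARTRIDGE\n<td>\n";
print "matched\n" if $sample =~ $eric_re;
```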

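Eric has not specified any capturing subpatterns, so there is nothing
for the match to return except a success flag: in list context a
successful match without capturing parentheses returns the list (1),
so $etapagerem should end up being 1 (or undefined if the match fails)
rather than the page count.  Looking at the HTML excerpt that was sent
earlier, I might suggest: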

/BLACK CARTRIDGE.*?Pages Remaining.*?color.*?>([0-9]+)/s;

as a replacement.  That's not particularly elegant, but I think it will
probably work to capture the page info for 'BLACK'.  Without seeing
the rest of the HTML, I couldn't say for sure how to capture the
info for the other colors...
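
As a rough check of the suggested pattern, here is a sketch run against
a made-up fragment.  The HTML below is purely hypothetical (I am
guessing at a layout with a "color" attribute just before the number);
only the regexp and the $buffer and $etapagerem names come from this
thread.

```perl
use strict;
use warnings;

# Hypothetical HTML; the real printer status page is not quoted here.
my $buffer = <<'HTML';
<tr><td>BLACK CARTRIDGE</td></tr>
<tr><td>Pages Remaining</td>
<td><font color="#000000">1234</font></td></tr>
HTML

# List-context assignment picks up the single capture group.
my ($etapagerem) =
    $buffer =~ /BLACK CARTRIDGE.*?Pages Remaining.*?color.*?>([0-9]+)/s;

if (defined $etapagerem) {
    print "Black pages remaining: $etapagerem\n";
} else {
    print "No match; the page layout differs from what was assumed\n";
}
```

Testing for defined rather than truth matters here, since a cartridge
with 0 pages remaining would otherwise look like a failed match.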

				Bill Bogstad
				bogstad at pobox.com




BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org