"hidden variables" in perl



monkeyking
April 16th, 2009, 11:14 PM
I have a few small Perl questions about the strangeness of hidden variables in Perl.

First of all, when I declare variables at the beginning of my script:


use strict;
my (@nams, $len1, $len2);

Do the parentheses mean anything other than saving some typing compared to:


use strict;
my @nams;
my $len1;
my $len2;


Next is something that really puzzles me:
looping through a file like this:



while (<HANDLE>) {
    chomp;
    if (/^(cr.+)\:(\d+)\-(\d+)$/) {
        $cr = $1;
        $start = $2;
        $end = $3;
        foreach my $ids ($start .. $end) {
            print $ids;
        }
    }
}

I don't understand where the actual line being tokenized is. I have seen things like $_, but not here.
And what about "(/^(cr.+)\:(\d+)\-(\d+)$/)"?
Sorry for the strangeness in the line above, but it looks like my code is being interpreted as a smiley.
Thanks in advance.

johnl
April 16th, 2009, 11:38 PM
It's voodoo perl magic.

Ok, not really. I'm not super-knowledgeable about perl but I think I can help you here.

In a loop like while (<HANDLE>), each line that is read is stored in $_ (unless you assign it to a variable yourself). And if you don't pass an argument to certain functions, their default is to operate on $_.

So for example:



while (<HANDLE>) {
    # $_ here is one line read from the filehandle HANDLE
    chomp;                            # same as chomp($_): strips the trailing newline from $_
    if (/^(cr.+):(\d+)-(\d+)$/) {     # does $_ match this regex?
        $cr = $1;                     # store first capture in $cr
        $start = $2;                  # store second capture in $start
        $end = $3;                    # store third capture in $end
        foreach my $ids ($start .. $end) {
            print $ids;               # print every number from $start to $end
        }
    }
}
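
For comparison, here is a rough sketch of the same loop with every variable spelled out instead of relying on $_. The sub name expand_ranges and the sample data are made up for illustration, and an in-memory filehandle stands in for HANDLE, so treat it as one possible way to write it rather than the way.

```perl
use strict;
use warnings;

# Hypothetical helper: read "crNAME:START-END" lines from a filehandle and
# return every number in each range. Name and return shape are illustrative.
sub expand_ranges {
    my ($fh) = @_;
    my @ids;
    while (defined(my $line = <$fh>)) {          # explicit form of: while (<HANDLE>)
        chomp($line);                            # explicit form of: chomp;
        if ($line =~ /^(cr.+):(\d+)-(\d+)$/) {   # explicit form of: if (/.../)
            my ($cr, $start, $end) = ($1, $2, $3);
            push @ids, $start .. $end;
        }
    }
    return @ids;
}

# Sample input held in a string instead of a real file (assumed data).
my $data = "cr1:3-5\nnot a range\ncr2:10-11\n";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";
print join(',', expand_ranges($fh)), "\n";       # prints 3,4,5,10,11
close($fh);
```

Naming the line variable costs a little typing but makes it obvious what chomp and the match are working on.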

odyniec
April 16th, 2009, 11:54 PM
First of all, when I declare variables at the beginning of my script:


use strict;
my (@nams, $len1, $len2);

Do the parentheses mean anything other than saving some typing compared to:


use strict;
my @nams;
my $len1;
my $len2;

Apart from saving a few keystrokes, the parentheses also cause the three variables to be treated as a list. This is commonly used in subroutines to assign arguments to private variables:

sub somesub {
    my ($arg1, $arg2, $arg3) = @_;
    # ... rest of the subroutine ...
}

@_ represents the array of arguments passed to the subroutine.
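
A small sketch of why the parentheses matter here: with them the assignment is a list assignment, without them it is a scalar assignment and you get the number of arguments instead. The sub names first_arg and arg_count are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical subs, named only for this illustration.
sub first_arg {
    my ($first) = @_;   # list assignment: $first gets the first argument
    return $first;
}

sub arg_count {
    my $count = @_;     # scalar assignment: $count gets the number of arguments
    return $count;
}

print first_arg('a', 'b', 'c'), "\n";   # prints a
print arg_count('a', 'b', 'c'), "\n";   # prints 3
```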


I don't understand where the actual line being tokenized is. I have seen things like $_, but not here.

If no variable is specified, then it's $_. Many built-in operations (such as reading input, printing, and pattern matching) use $_ by default. So "while (<HANDLE>)" is equivalent to "while (defined($_ = <HANDLE>))", "if (/somepattern/)" is equivalent to "if ($_ =~ /somepattern/)", and so on.
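
The same default shows up outside of file reading. As a hedged illustration (the sub name shout_cr_words and the word list are made up), a foreach without a loop variable also puts each item in $_, and the match and uc both pick it up:

```perl
use strict;
use warnings;

# Hypothetical helper: return the upper-cased words that contain "cr".
sub shout_cr_words {
    my @out;
    foreach (@_) {           # no loop variable, so each item is aliased to $_
        next unless /cr/;    # same as: next unless $_ =~ /cr/;
        push @out, uc;       # same as: push @out, uc($_);
    }
    return @out;
}

print join(' ', shout_cr_words('cry', 'dog', 'across')), "\n";   # prints CRY ACROSS
```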


And what about "(/^(cr.+)\:(\d+)\-(\d+)$/)"?

It's a regular expression that the current input line ($_) is matched against. It matches a line that starts with "cr" followed by at least one more character, then a colon, then a sequence of digits (at least one), then a hyphen, and a second sequence of digits running to the end of the line. The three inner pairs of parentheses capture those pieces into $1, $2, and $3.
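
Here is the same pattern spread out with the /x modifier so that each piece can carry a comment. This is only a sketch: the names $range_re and parse_range and the sample lines are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;

# The thread's pattern, written with /x so whitespace and comments are ignored.
my $range_re = qr/
    ^         # start of the string
    (cr.+)    # capture 1: "cr" plus at least one more character
    :         # a literal colon
    (\d+)     # capture 2: one or more digits
    -         # a literal hyphen
    (\d+)     # capture 3: one or more digits
    $         # end of the string
/x;

# Hypothetical helper: return the three captures, or an empty list on no match.
sub parse_range {
    my ($line) = @_;
    return $line =~ $range_re ? ($1, $2, $3) : ();
}

my @parts = parse_range('cr12:100-250');   # made-up sample line
print "@parts\n";                          # prints cr12 100 250
my @none = parse_range('chr12:100-250');   # does not start with "cr"
print scalar(@none), "\n";                 # prints 0
```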

monkeyking
April 17th, 2009, 01:24 AM
Thanks johnl, the invisible vars
make more sense now.

I would still like some elaboration, however.
odyniec, you talk about lists; is this the same as an array?
So would the following be valid and good Perl programming?


my($var1,$var2) = @_;
#and
@var3 = ($var1,$var2);


I would like some elaboration on

"(/^(cr.+)\:(\d+)\-(\d+)$/)"

The "(/ /)" around it simply indicates that we have a regexp inside?
The "^" means the beginning of the line?
The "\:" or "\-" means delimited by, or split?
The "(cr.+)": what does the punctuation mean?
The "(\d+)": what does "sequence" mean? Would "1324 1234" still match?
The "$" means end of line, right?

Thanks for your replies.

odyniec
April 17th, 2009, 01:46 AM
odyniec, you talk about lists; is this the same as an array?

Not exactly -- see http://www.perlfoundation.org/perl5/index.cgi?array_vs_list.
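
In short, an array is a variable, while a list is just an ordered sequence of values that exists while an expression is being evaluated. A quick sketch of the difference, using the variable names from your snippet with made-up values:

```perl
use strict;
use warnings;

my ($var1, $var2) = ('x', 'y');    # list assignment to two scalars
my @var3 = ($var1, $var2);         # the list's values are copied into an array

push @var3, 'z';                   # an array is a variable, so it can grow
my $count  = @var3;                # an array in scalar context gives its length
my $second = ('a', 'b', 'c')[1];   # a list can be sliced, but it has no name

print "$count $second @var3\n";    # prints 3 b x y z
```

So yes, both lines in your snippet are valid; under "use strict" the array just needs a "my" as well.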


The "(/ /)" around it simply indicates that we have a regexp inside?

The slashes indicate it's a regexp. The parentheses are required by the if statement.


The "^" means the beginning of the line?

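Yes. Strictly speaking it anchors at the start of the string, but $_ holds exactly one line here, so the two are the same.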


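The "\:" or "\-" means delimited by, or split?

It means that the particular character must be present in the input ($_) to match the regexp. The backslashes only mark the colon and hyphen as literal characters; neither is special at that position, so a plain ":" and "-" would work too.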


The "(cr.+)": what does the punctuation mean?

"." - any character, "+" - occurring at least once. In other words, "cr" followed by at least one more character will match, e.g.: "cry", "crfoobar", "crisis of the economy", etc.


The "(\d+)": what does "sequence" mean? Would "1324 1234" still match?

"\d+" - at least one digit but no other character, so a space won't match.


The "$" means end of line, right?

Right.

See the perlre manual page for more information on Perl regexp syntax.
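
If you want to check answers like the ones above yourself, a quick test loop helps. This is just a sketch with a made-up helper name (matches) and made-up sample strings:

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a line matches the pattern from the thread.
sub matches {
    my ($line) = @_;
    return $line =~ /^(cr.+):(\d+)-(\d+)$/ ? 1 : 0;
}

# Sample inputs, invented for illustration.
my @samples = (
    'cr1:1324-1234',      # matches
    'cr1:1324 1234-5',    # no match: the space is not a digit
    'cr:5-9',             # no match: nothing between "cr" and the colon
    'xcr1:5-9',           # no match: does not start with "cr"
    'cr1:5-9 ',           # no match: trailing space before the end
);

for my $line (@samples) {
    printf "%-18s %s\n", "'$line'", matches($line) ? 'match' : 'no match';
}
```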

monkeyking
April 17th, 2009, 01:57 AM
Cheers,

thank you.

You have been most helpful.