I cannot seem to get my regular expressions in Perl to recognize
underscore (_) characters
Sorry to bring up a problem that is likely very simple, but I am trying to
write a series of regular expressions in Perl to extract certain types of
data from a file. For some reason, I cannot seem to get Perl to match
any lines of data that have an underscore (_) in them.
If I want to get lines that start with "Ch2 Flybase exon " or "Ch3 Flybase
exon " (the white spaces are tab characters), the following code works
well:
if ($_ =~ m/^Ch[ 2-3] Flybase exon /) {print outputFile;}
However, if I want to match the lines with more complex chromosome names
(i.e. more than just the letters 'Ch' followed by a number), such as:
Ch4_group1
Ch4_group2
Ch4_group3
Ch4_group4
Ch4_group5
ChXL_group1a
ChXL_group1e
ChXL_group3a
ChXL_group3b
ChXR_group3a
ChXR_group5
ChXR_group6
ChXR_group8
Unknown_group_1
Unknown_group_10
Unknown_group_100
Unknown_group_101
I have tried the following patterns without success:
if ($_ =~ m/^Ch4_group[1-5] Flybase exon /) {print outputFile;}
if ($_ =~ m/^ChX._group[0-9]+[a-z]* Flybase exon /) {print outputFile;}
if ($_ =~ m/^Unknown_group_[0-9]+ Flybase exon /) {print outputFile;}
if ($_ =~ m/^Unknown_singleton_[0-9]+ Flybase exon /) {print outputFile;}
I have also tried including a \ in front of the _'s, but this did not help.
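For reference, here is a minimal, self-contained sketch of the kind of test I would expect to pass. It rests on a few assumptions: the fields are separated by single tab characters, which the pattern writes explicitly as \t rather than as literal whitespace; the sample lines are made up; and the helper name matches_exon_line is hypothetical. As far as I know, an underscore has no special meaning in a Perl regular expression, so it should match literally with or without a backslash in front of it.

```perl
use strict;
use warnings;

# Chromosome names from the list above. With /x, whitespace and
# "#" comments inside the pattern are ignored.
my $chrom = qr/
    Ch[23]                              # Ch2, Ch3
  | Ch4_group[1-5]                      # Ch4_group1 .. Ch4_group5
  | ChX[LR]_group[0-9]+[a-z]*           # ChXL_group1a, ChXR_group5, ...
  | Unknown_(?:group|singleton)_[0-9]+  # Unknown_group_101, ...
/x;

# Hypothetical helper: returns 1 if the line starts with one of the
# names above followed by tab-separated "Flybase" and "exon" fields.
sub matches_exon_line {
    my ($line) = @_;
    return $line =~ /^(?:$chrom)\tFlybase\texon\t/ ? 1 : 0;
}

# Made-up sample lines; the first four use real tab characters.
my @samples = (
    join("\t", 'Ch2',              'Flybase', 'exon', '100', '200'),
    join("\t", 'Ch4_group3',       'Flybase', 'exon', '100', '200'),
    join("\t", 'ChXL_group1a',     'Flybase', 'exon', '100', '200'),
    join("\t", 'Unknown_group_10', 'Flybase', 'exon', '100', '200'),
    'Ch4_group3 Flybase exon 100 200',   # spaces instead of tabs
);

for my $line (@samples) {
    my $verdict = matches_exon_line($line) ? 'match' : 'no match';
    printf "%-8s %s\n", $verdict, $line;
}
```

If the tab-separated lines match here but the real data does not, the separators in the file (or in the pattern as typed) may not be what I think they are, rather than the underscore being the problem.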
Any suggestions would be very much appreciated.