Re: Perl 5's "non-greedy" matching can be TOO greedy!

From: Tom Christiansen

Date: December 15, 2000 12:27

Subject: Re: Perl 5's "non-greedy" matching can be TOO greedy!

Message ID: 11146.976912022@chthon


>I made a mistake in phrasing it this way, because it seemed to suggest that
>I thought it was an implementation bug that it returns "bbbbccccd" instead
>of "bccccd". I didn't make it clear that I was trying to approach this as
>a purely SEMANTIC question, considered in isolation from the implementation
>of the system.

You keep using "semantic". However, I do not think that that word
means what you think it means.

>The question is, "what interpretation makes the most sense,
>at a high level", not "why does the current behavior make sense".

Those are, all three of them, different things.

>It's not that there aren't justifications for the current behavior. It's a
>question of perspective -- from one perspective (mine), "bccccd" makes more
>sense semantically.

No, sir. You cannot use the S word for that. Here are the *SEMANTICS*
of pattern matching in Perl:

    When there's more than one match, the first match found (that is,
    the leftmost) is the winner, with ties being resolved in favor of
    the longer string for maximal matches and the shorter string for
    minimal matches.

This is *not* an "implementational detail". These *are* the semantics.
You are asking for *different* semantics.

What you are doing is simply an attempt to impose a sloppy
English-language description on the behavior of the code. Just because
you should happen to understand the English does not mean that this
describes the code. It's like people thinking /<.*?>/ will find a tag
because they are thinking in English, not Perl. Of course it won't.

>I believe it is more intuitive, at the highest level.

"Intuitive" is another one of those words frequently bandied about
that is nearly always misapplied.

    WRONG:      The frobnitz interface is more intuitive.

    RIGHT:      The nipple is the only intuitive human interface.

    CORRECTION: From my own historical experiences and resulting
                biases, the frobnitz interface would have been more
                what I personally, without regard to anyone else,
                would have been expecting.
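
The leftmost-wins rule stated above can be checked directly. What follows is only a minimal sketch: the input strings and the pattern /b.*?d/ are assumptions chosen to reproduce the two candidate results quoted in this message ("bbbbccccd" versus "bccccd"), not necessarily the ones from the original report. The last case is one hypothetical way in which /<.*?>/ can fail to find a tag.

```perl
use strict;
use warnings;

# Hypothetical input, chosen to reproduce the results quoted above.
my $text = "aaaabbbbccccd";

# Minimal quantifier: the match still starts at the leftmost possible
# position (the first "b"). The "?" only picks the shortest match
# available from that starting point.
my ($minimal) = $text =~ /(b.*?d)/;

# Maximal quantifier: same leftmost start, longest match from there.
my ($maximal) = $text =~ /(b.*d)/;

print "minimal: $minimal\n";    # bbbbccccd
print "maximal: $maximal\n";    # bbbbccccd

# The two differ only when several end points are possible
# from that same leftmost start.
my $two_ends = "aaaabbbbccccd-d";
my ($min2) = $two_ends =~ /(b.*?d)/;    # bbbbccccd
my ($max2) = $two_ends =~ /(b.*d)/;     # bbbbccccd-d
print "min2: $min2\nmax2: $max2\n";

# Same rule applied to /<.*?>/: the leftmost "<" wins, whether or
# not it opens a tag.
my $html = 'if a < b then <em>stop</em>';
my ($not_a_tag) = $html =~ /(<.*?>)/;
print "matched: $not_a_tag\n";          # < b then <em>
```

In each case the starting position is settled first, and the quantifier's greediness only chooses among matches that begin there. Getting "bccccd" from the first string would require a rule that prefers a later start, which is a different set of semantics.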

>From a different (more implementation-oriented) perspective, the current

No, this is not "implementation-oriented". It is merely the semantics.

>Hopefully, we can have a rational discussion about whether this semantic
>anomaly is real or imagined, what impact "fixing" it would have on the
>implementation (if it's deemed real), and whether it's worth "fixing".

I do not expect you to be rational, because I do not think we can agree
to your terms. There is no semantic anomaly, any more than thinking
that <.*> or <.*?> finds an HTML tag is some sort of "semantic anomaly".
It is the result of your mistranslating between English and code.

>Here's where I see the disconnect happening. I'm approaching this from a
>semantic perspective, asking myself "what should this match (ideally)?"

No, you're not. Please stop abusing the S word. It places you on no
moral high ground whatsoever.

--tom



