I am writing a link crawler and found this regular expression useful. I found it on another site and modified it a little.

string matchlinks = @"<a[^>]*?HREF\s*=\s*[""']?([^'"" >]+)[ '""]?[^>]*?>";
This will match any <a href=""> tag, even if it has other attributes in it or uses single quotes. I suggest stripping all line breaks (\r\n, \r, and \n) before trying to match. For the match options I combined RegexOptions.Singleline | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace. Enjoy!
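Here is a rough sketch of how the pattern might be used to pull the URLs out of a page. The names LinkFinder and FindLinks and the sample HTML are made up for illustration. The pattern is assumed to begin with `<a[^>`, and the capture group uses a greedy `+`: with a lazy `+?`, the trailing `[^>]*?` can swallow the rest of the tag and leave only the first character of the URL in group 1. Line breaks are replaced with a space rather than removed outright, which is also an assumption, so that attributes split across lines do not run together.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LinkFinder
{
    // Link pattern; the leading "<a[^>" and the greedy "+" in the group are assumptions.
    const string MatchLinks = @"<a[^>]*?HREF\s*=\s*[""']?([^'"" >]+)[ '""]?[^>]*?>";

    // Hypothetical helper: returns the href value of every <a> tag found in the HTML.
    public static List<string> FindLinks(string html)
    {
        // Flatten line breaks before matching (replaced with a space here).
        string flat = html.Replace("\r\n", " ").Replace("\r", " ").Replace("\n", " ");

        RegexOptions options = RegexOptions.Singleline
                             | RegexOptions.IgnoreCase
                             | RegexOptions.IgnorePatternWhitespace;

        var links = new List<string>();
        foreach (Match m in Regex.Matches(flat, MatchLinks, options))
        {
            // Group 1 holds the URL without its surrounding quotes.
            links.Add(m.Groups[1].Value);
        }
        return links;
    }

    static void Main()
    {
        // Made-up sample input: one single-quoted link, one upper-case double-quoted link.
        string html = "<p><a class='x' href='http://example.com/a'>A</a>\n"
                    + "<A HREF=\"/b.html\">B</A></p>";

        foreach (string link in FindLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

The spaces inside the character classes still count even with IgnorePatternWhitespace, because .NET always treats white space inside a character class literally. Keep in mind that a regular expression like this is only a quick heuristic; for messy real-world pages an HTML parser is likely to be more reliable.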
Friday, October 27, 2006
Find Links Using RegEx
Labels: c#, regex, regular expressions