[kwlug-disc] Help!

Hubert Chathi hubert at uhoreg.ca
Wed Dec 31 10:44:14 EST 2014


On Wed, 31 Dec 2014 08:42:15 -0500, Joe Wennechuk <youcanreachmehere at hotmail.com> said:

> Hello All, Slightly off topic, but I know you guys can help. I have
> applied for a job, and they have asked me to write a java class that
> searches html from websites for links. I am using this regex
> ...(Pattern pattern = Pattern.compile("<a[^>]*>(.*?)</a>",
> Pattern.DOTALL | Pattern.CASE_INSENSITIVE);) to find them but based on
> the constraints I don't think I'm doing it right, as I am not finding
> all of the links. Here are the constraints.. Can anyone help??
> Implementation constraints: * For simplification assume that the link
> is defined as '<[whitespace]a[whitespace]' or
> '<[whitespace]A[whitespace]'.  ('<a ', '< a h', '<A >', '<a attr=' are
> all valid links)

The one thing that I can see immediately is that your regexp does not
allow for whitespace between the "<" and the "a", so a link written as
'< a h' would be missed.
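
As a rough sketch of what I mean, and assuming the quoted constraint is
the whole definition of a link (optional whitespace after "<", then "a"
or "A", then at least one whitespace character), something like the
following might be closer. The class name LinkFinder and the method
findLinkStarts are made up for illustration; only java.util.regex is
real API here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original assignment.
public class LinkFinder {
    // "<", optional whitespace, "a" or "A", then one whitespace character.
    // This follows the quoted constraint literally (an assumption).
    private static final Pattern LINK_START =
            Pattern.compile("<\\s*a\\s", Pattern.CASE_INSENSITIVE);

    /** Returns the start offset of every link opening found in html. */
    public static List<Integer> findLinkStarts(String html) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = LINK_START.matcher(html);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // The first four samples are the examples given in the constraints.
        String[] samples = {"<a ", "< a h", "<A >", "<a attr=", "<abbr>"};
        for (String s : samples) {
            System.out.println("'" + s + "' -> " + findLinkStarts(s).size());
        }
    }
}
```

Requiring whitespace after the "a" also keeps tags such as "<abbr>" from
being counted, which the original "<a[^>]*>" would have matched. Note
that this only locates the opening of each link; it does not try to
capture the link text up to "</a>".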




