[kwlug-disc] Help!

Joe Wennechuk youcanreachmehere at hotmail.com
Wed Dec 31 11:21:22 EST 2014


I have to do it in Java and use Gradle to build it, and it must process the links in parallel. It only needs to count the links. So it should look like this:

$ java -jar linkcounter.jar http://google.com http://reddit.com http://hotmail.com

http://google.com/ 7
http://hotmail.com/ 5
http://reddit.com/ 375
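
A rough sketch of how the parallel part might be structured, under some assumptions rather than as a definitive answer. The class name LinkCounter, the methods countAll and fetch, and the pool cap of 8 threads are all made up for illustration. The link pattern follows the simplified definition quoted further down in this thread. Results are kept in a TreeMap because the sample output above appears to be sorted by URL; that is a guess. URLs are printed exactly as given on the command line (no trailing slash is added), fetch assumes Java 9+ for readAllBytes, and the Gradle build file is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; not taken from the original assignment.
public class LinkCounter {
    // '<', optional whitespace, 'a' or 'A', then at least one whitespace character.
    private static final Pattern LINK = Pattern.compile("<\\s*[aA]\\s");

    // Counts occurrences of the simplified link pattern in one page.
    public static int countLinks(String html) {
        Matcher m = LINK.matcher(html);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Downloads a page as text (assumes UTF-8 and Java 9+ for readAllBytes).
    public static String fetch(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Submits one task per URL to a thread pool, then collects counts sorted by URL.
    // The fetcher is passed in so the download step can be swapped out.
    public static Map<String, Integer> countAll(List<String> urls,
                                                Function<String, String> fetcher)
            throws InterruptedException, ExecutionException {
        // Pool size capped at 8 as an arbitrary choice for this sketch.
        int threads = Math.max(1, Math.min(urls.size(), 8));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Integer>> pending = new TreeMap<>();
            for (String url : urls) {
                pending.put(url, pool.submit(() -> countLinks(fetcher.apply(url))));
            }
            Map<String, Integer> counts = new TreeMap<>();
            for (Map.Entry<String, Future<Integer>> e : pending.entrySet()) {
                counts.put(e.getKey(), e.getValue().get()); // waits for that task
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> counts = countAll(Arrays.asList(args), LinkCounter::fetch);
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

All tasks are submitted before any result is awaited, so the downloads overlap; only the final printing is sequential.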



William Park <opengeometry at yahoo.ca> wrote:

On Wed, Dec 31, 2014 at 08:42:15AM -0500, Joe Wennechuk wrote:
> Hello All,
> Slightly off topic, but I know you guys can help. I have applied for a
> job, and they have asked me to write a Java class that searches HTML
> from websites for links. I am using this regex:
>
>   Pattern pattern = Pattern.compile("<a[^>]*>(.*?)</a>",
>       Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
>
> to find them, but based on the constraints I don't think I'm doing it
> right, as I am not finding all of the links. Here are the
> constraints. Can anyone help?
>
> Implementation constraints:
>   * For simplification, assume that the link is defined as
>     '<[whitespace]a[whitespace]' or '<[whitespace]A[whitespace]'.
>     ('<a ', '< a h', '<A >', '<a  attr=' are all valid links)

Are they testing your Java knowledge?
    - You are supposed to account for whitespace.  That may be the
      problem.

Or, do they just want the list of links?
    - Here, there are better ways to get the list of links, e.g.
        lynx -dump -listonly http://...
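
On the whitespace point, here is a minimal sketch of a pattern that follows the quoted definition literally: '<', optional whitespace, 'a' or 'A', then at least one whitespace character. The class name LinkPattern and the method countLinks are invented for illustration, and this is only one reading of the constraint. Unlike the original regex, it does not require a closing </a>, and it allows whitespace between '<' and 'a'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are not from the original assignment.
public class LinkPattern {
    // '<', optional whitespace, 'a' or 'A', then one whitespace character.
    // Requiring whitespace after the letter keeps tags like <abbr> from matching.
    private static final Pattern LINK = Pattern.compile("<\\s*[aA]\\s");

    // Returns how many times the simplified link pattern occurs in the text.
    public static int countLinks(String html) {
        Matcher m = LINK.matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The four examples from the constraint, plus tags that should not count.
        String sample = "<a href='x'>1</a> < a h> <A > <a  attr=1> <abbr> <b>";
        System.out.println(countLinks(sample)); // prints 4
    }
}
```

Read literally, the definition needs whitespace after the letter, so a bare <a> would not be counted by this sketch.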
--
William

