You are writing software that will spider a web site (that is, it visits all links on the homepage, then visits all links on those pages, and so on), traveling three levels deep. You have a list of all the links found in the first pass, in the second pass, and in the third pass.
You are given a vector
You are given a vector
You are finally given a vector
You are to return an int indicating the number of distinct pages (not including the initial homepage) visited during this crawl of the web site.
0) {"home.htm", "sitemap.htm", "contact.htm", "support/login.jsp"} {"2 locations.htm", "3 ../home.htm"} {"0 contact.htm"} Returns: 5
On the home page, the first pass finds that there are four links: "home.htm", "sitemap.htm", "contact.htm", and "login.jsp". Notice that the login page is in the "support" folder.
On the second pass, we find that the contact page has a link to "locations.htm", and the login page has a link back to the home page (which we have already visited).
On the third (and final) pass, we find that the locations page has a link back to the contact page (which we have already seen).
So, we take account of all pages found on the site: home, sitemap, contact, login, and locations. Thus, there are five pages.
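The walkthrough above depends on resolving each link relative to the folder of the page it appears on. The following is a minimal sketch of that step, not a reference implementation. The name `resolve` is hypothetical, and the sketch assumes that ".." only counts as "go up one folder" when it is a whole path component, and that a link never climbs above the site root.

```python
def resolve(base_page, link):
    """Resolve `link` relative to the folder that contains `base_page`.

    `base_page` is a path from the site root such as "support/login.jsp".
    An empty string stands for a page in the root folder (an assumption
    made for this sketch).
    """
    parts = base_page.split("/")[:-1]  # drop the file name, keep the folders
    for piece in link.split("/"):
        if piece == "..":
            if parts:                  # assumed never to go above the root
                parts.pop()
        else:
            parts.append(piece)
    return "/".join(parts)


# Links from example 0, resolved against the pages that contain them.
print(resolve("contact.htm", "locations.htm"))       # locations.htm
print(resolve("support/login.jsp", "../home.htm"))   # home.htm
print(resolve("locations.htm", "contact.htm"))       # contact.htm
```

Both the second-pass link from the login page and the third-pass link from the locations page resolve to pages already seen, which is why the count stays at five.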
1) {"index.html","products/all/INDEX.HTML","images/products/A101.GIF"} {"1 ../../index.html","1 ../../products"} {} Returns: 4
Note that the second link in secondPass is to products, which is the same as a directory name. This is allowed, and should be counted separately.
2) {".rc"} {} {} Returns: 1
3) {"a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a"} {"0 a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a"} {"0 ../a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a", "0 ..a/a", "0 a../a.."} Returns: 5
4) {"abc/ccba","ab/cba","..a"} {"0 cba","1 ccba"} {"0 cba","1 ccba"} Returns: 5
5) {"a","ab/ab","ab/ab/abc","abc/abc"} {"0 ab/ab","1 ab","1 ../ab/ab","2 ../../ab/abc"} {"0 ../ab/ab","2 ../abc/abc","1 ab/ab"} Returns: 6
6) {"a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a/a"} {"0 ../../../../../../../../a/a/a/a/a/a/a/a/a"} {"0 ../../../../../../../../a/a/a/a/a/a/a/a/a"} Returns: 1
7) {"index.asp", "contact.asp", "about/index.asp", "users/support.asp", "company/executiveteam.asp", "products/catalog.asp"} {"1 index.asp", "1 requestinfo.asp", "2 ../index.asp", "2 history.asp", "3 ../index.asp", "3 helpdesk.asp", "4 ../index.asp", "4 boardofdirectors.asp", "4 location.asp", "5 ../index.asp", "5 new/index.asp", "5 ../index.asp", "5 sale.asp"} {"10 ../../index.asp", "11 products/index.asp", "11 products/catalog.asp"} Returns: 14
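One way to reproduce the expected results in the examples is sketched below. It is an illustration under stated assumptions, not the official solution. The names `count_pages`, `first_pass`, `second_pass`, and `third_pass` are hypothetical. The sketch assumes that each second-pass entry has the form "i link", where i indexes the first-pass page containing the link, that each third-pass entry likewise indexes a second-pass entry, that ".." only matters as a whole path component, and that paths are compared as exact, case-sensitive strings.

```python
def count_pages(first_pass, second_pass, third_pass):
    """Count distinct pages reached in a three-level crawl (sketch)."""

    def resolve(base_page, link):
        # Start in the folder of the page that holds the link.
        parts = base_page.split("/")[:-1]
        for piece in link.split("/"):
            if piece == "..":
                if parts:              # assumed never to go above the root
                    parts.pop()
            else:
                parts.append(piece)
        return "/".join(parts)

    def resolve_pass(entries, previous):
        # Each entry is "index link"; index points into the previous pass.
        resolved = []
        for entry in entries:
            index, link = entry.split(" ", 1)
            resolved.append(resolve(previous[int(index)], link))
        return resolved

    # First-pass links are taken relative to the site root.
    first = [resolve("", link) for link in first_pass]
    second = resolve_pass(second_pass, first)
    third = resolve_pass(third_pass, second)

    # A set of normalized paths removes repeat visits.
    return len(set(first) | set(second) | set(third))


# Example 0 from the statement.
print(count_pages(
    ["home.htm", "sitemap.htm", "contact.htm", "support/login.jsp"],
    ["2 locations.htm", "3 ../home.htm"],
    ["0 contact.htm"],
))  # 5
```

Storing each page as its normalized path from the root means a name such as "products" is counted separately from files inside a "products" folder, as example 1 requires.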
This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2003, TopCoder, Inc. All rights reserved.