Crawlers/Spiders/Scrapers?
by LavaLeaker
November 8, 2014 at 4:11 AM UTC
Just wondering if I'm allowed to crawl avicus.net for an idea I have that could be fun for the community.
If you don't know, crawling is where you let a bot run around a website and download all the HTML files, then you search those HTML files for information that would be useful to you, such as links, keywords, images, etc.
And staff, if you'd like to know how much of a load this will put on your servers: I just ran one of these on Overcast to collect the top 2400 players AND download their heads in 128x res, and in total it used only 3.6 MB of bandwidth (2.3 MB for HTML files and 1.3 MB for images).
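To make the "search those HTML files" step concrete, here's a rough sketch of pulling links and image URLs out of a page that has already been downloaded. It only uses Python's standard library. The class name `LinkImageParser`, the helper `extract`, and the example.com URL are made up for illustration; this isn't the actual bot.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkImageParser(HTMLParser):
    """Collects <a href> and <img src> URLs from one HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative URLs
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))


def extract(html, base_url):
    """Return (links, images) found in an HTML string."""
    parser = LinkImageParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links, parser.images


# Hypothetical page snippet, not real site markup.
links, images = extract(
    '<a href="/players/1">player</a><img src="head.png">',
    "https://example.com/top",
)
```

The links you collect become the next pages to download, which is all "crawling" really is.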
Would this leak staff-only information? (The category for staff)
No. The staff page is visible on a permission-only basis, and since the site wouldn't see this bot as a mod, it would be unable to access anything to do with the staff page.
I don't see why not... Search engines do this daily. Though if your crawler is too aggressive, our servers may tag you as an attacker and block all connections.
What exactly do you consider aggressive? Would that be using 12 threads to download pages and crawl them later, or downloading one page, crawling it, and then downloading another without delay?
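For comparison, the gentlest option would be a third one: a single thread that pauses between requests. Here's a minimal sketch of that. The 1 second delay and the `example-crawler/0.1` user agent are assumptions picked for illustration; nothing in this thread says what rate the servers actually tolerate.

```python
import time
from urllib.request import Request, urlopen


def fetch(url, user_agent="example-crawler/0.1"):
    """Download one page; the user agent string is a made-up placeholder."""
    req = Request(url, headers={"User-Agent": user_agent})
    with urlopen(req, timeout=10) as resp:
        return resp.read()


def crawl_politely(urls, delay_seconds=1.0, fetch_page=fetch, sleep=time.sleep):
    """Fetch URLs one at a time, sleeping between consecutive requests.

    delay_seconds is an assumed value, not a documented server limit.
    fetch_page and sleep are injectable so the loop can be tried offline.
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # wait before every request after the first
        pages[url] = fetch_page(url)
    return pages
```

With 12 threads the server sees up to 12 requests in flight at once; with this loop it never sees more than one, and never more often than once per delay.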
He might steal all the secret government data, though. He'll leak info like he leaks cores!
This website is an archive of data gathered by Avicus Network LLC between the years of 2013 and 2017.
Copyright Ⓒ 2012-2017 Avicus Network LLC. All Rights Reserved