Avicus Archive

Crawlers/Spiders/Scrapers? by LavaLeaker November 8, 2014 at 4:11 AM UTC

Just wondering if I'm allowed to crawl avicus.net for an idea I have that could be fun for the community.

If you don't know, crawling is where you let a bot run around a website and download all the HTML files, then search those files for information that would be useful to you, such as links, keywords, images, etc.
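As a rough illustration of the "search those HTML files" step, here is a minimal sketch using only Python's standard library. The class name, the function name, and the sample HTML are made up for the example and are not taken from any real page; it only shows pulling link and image URLs out of a page that has already been downloaded.

```python
from html.parser import HTMLParser


class LinkAndImageCollector(HTMLParser):
    """Collects href values from <a> tags and src values from <img> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])


def extract(html_text):
    """Return (links, images) found in one HTML document."""
    collector = LinkAndImageCollector()
    collector.feed(html_text)
    return collector.links, collector.images


# Hypothetical sample page, standing in for a downloaded HTML file.
sample = '<a href="/players/1">One</a><img src="/heads/1.png"><a>no href</a>'
links, images = extract(sample)
print(links, images)
```

A crawler would then queue the collected links for download and repeat the same step on each new page.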

And staff, if you'd like to know how much load this would put on your servers: I just ran one of these on Overcast to collect the top 2400 players and download their heads at 128x res. In total it used only 3.6 MB of bandwidth: 2.3 MB for HTML files and 1.3 MB for images.

kycrafft November 8, 2014 at 4:11 AM UTC

wow we all kno u r juts going 2 hak avcus websit we arnt dumb

LavaLeaker November 8, 2014 at 4:11 AM UTC

wow we all kno u r juts going 2 hak avcus websit we arnt dumb
ur acc wil b 1st 4 dat

badgg November 8, 2014 at 5:11 AM UTC

Would this leak Staff only information? (The category for staff)

LavaLeaker November 8, 2014 at 5:11 AM UTC

Would this leak Staff only information? (The category for staff)
No. The staff page is visible on a permission only basis and since this bot wouldn't be viewed as a mod by the site it would be unable to access anything to do with the staff page.

LavaLeaker November 10, 2014 at 12:11 AM UTC

Bump because I really do need to know.

AndrewJKim November 10, 2014 at 12:11 AM UTC

I'm pretty sure you're allowed to do this.

keenanjt November 10, 2014 at 5:11 PM UTC

I don't see why not... Search engines do this daily. Though if your crawler is too aggressive, our servers may tag you as an attacker and block all connections.

StewieFG November 11, 2014 at 8:11 PM UTC

I don't see why not... Search engines do this daily. Though if your crawler is too aggressive, our servers may tag you as an attacker and block all connections.
Aren't u afk? :o

LavaLeaker November 12, 2014 at 4:11 AM UTC

I don't see why not... Search engines do this daily. Though if your crawler is too aggressive, our servers may tag you as an attacker and block all connections.
What exactly do you consider aggressive? Would that be using 12 threads to download pages and crawl them later, or downloading one page, crawling it, and then downloading another without any delay?
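For comparison, the gentler end of that range is one request at a time with a pause between requests. The sketch below assumes a 1 second default pause purely as a placeholder; nothing in this thread says what rate the servers actually tolerate. The `fetch` argument is a hypothetical stand-in for whatever function downloads a page, so the example runs without touching any real site.

```python
import time


def crawl_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between consecutive requests.

    delay_seconds is an assumed placeholder value, not a known server limit.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        pages[url] = fetch(url)
    return pages


# Demo with a fake fetch function and no real waiting.
demo_pages = crawl_politely(
    ["/a", "/b"],
    fetch=lambda url: "<html>" + url + "</html>",
    delay_seconds=0,
)
print(demo_pages)
```

The 12-thread approach would send many requests at once, which is the pattern most likely to look like an attack to a server.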

kycrafft November 12, 2014 at 4:11 AM UTC

I don't see why not... Search engines do this daily. Though if your crawler is too aggressive, our servers may tag you as an attacker and block all connections.
He might steal all the secret government data, though. He'll leak info like he leaks cores!