Hey guys, so I've been working on what is now a three-part tutorial here:
https://null-byte.wonderhowto.com/forum/creating-python-web-crawler-part-1-getting-sites-source-code-0175912/
https://null-byte.wonderhowto.com/forum/creating-python-web-crawler-part-2-traveling-new-sites-0175928/
https://null-byte.wonderhowto.com/forum/creating-python-web-crawler-part-3-narrowing-our-search-scope-0175935/
In it I detail the creation of a recursively defined Python web crawler that can navigate through sites and process their raw HTML docs. The ultimate goal of the crawler series is to create a tool to clone/archive an entire website, visiting 500+ pages a minute and pulling all their data.
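For anyone who hasn't read the tutorials, here is roughly the shape of the idea. This is only a minimal sketch using the standard library, not the exact code from the series: the names `LinkParser`, `fetch`, `crawl`, and the `max_depth` and `fetch_page` parameters are made up for illustration, and I'm assuming the crawler stays on the same host as the starting page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its raw HTML as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(url, visited, fetch_page=fetch, max_depth=2):
    """Recursively visit url and any same-host links, recording them in visited."""
    if max_depth < 0 or url in visited:
        return
    visited.add(url)

    parser = LinkParser()
    parser.feed(fetch_page(url))

    for href in parser.links:
        target = urljoin(url, href)  # resolve relative links against the current page
        # Assumption: only follow links that stay on the same host.
        if urlparse(target).netloc == urlparse(url).netloc:
            crawl(target, visited, fetch_page, max_depth - 1)
```

Calling `crawl(start_url, set())` fills the set with every page reached. The `visited` set is what keeps the recursion from looping forever when pages link back to each other, and `max_depth` keeps it from running into Python's recursion limit on a big site.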
In the tutorials I use Null-Byte as the example target because I consider it safe: no system admin would be too freaked out, since people run scans against this site all the time as part of other tutorials. However, I'm worried that I could be unleashing a DoS attack on the hosting servers by visiting so many links at once from one IP address.
I don't really have the best understanding of how DoS attacks work, so I'm just wondering if what I'm doing here is dangerous for the site. Null-Byte is awesome and I'd never want to risk messing with it.
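To put a number on it, 500 pages a minute works out to a bit over 8 requests per second from a single IP. If that turns out to be too aggressive, one option would be to put a delay between requests. Here is a minimal sketch of what I have in mind; `Throttle`, `min_interval`, and `requests_per_second` are hypothetical names that aren't in the tutorials, and the one-second gap in the usage note is just a placeholder, not a recommended value.

```python
import time


class Throttle:
    """Enforces a minimum gap between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds to leave between requests
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def requests_per_second(pages_per_minute):
    """Convert a pages-per-minute crawl rate into requests per second."""
    return pages_per_minute / 60
```

The crawler would create one `Throttle(1.0)` up front and call its `wait()` method right before each page download, which caps it at about 60 pages a minute instead of 500.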
Please let me know, I don't want to cause any harm.
Sharknado