EvilZone

Programming and Scripting => Scripting Languages => Topic started by: daxda on November 05, 2013, 08:53:00 AM

Title: [Source] pastebinScraper.py
Post by: daxda on November 05, 2013, 08:53:00 AM
Just finished another scraper. This one scrapes the latest created pasties from pastebin.com. The idea comes from joepie91, who has been talking about scraping the site, which motivated me to code a script of my own as well.

The script picks a random proxy from a defined list. If a connection to the target fails, that proxy is discarded, and when you exit the script the updated proxy list is saved to file. It connects to pastebin, filters out the links to the latest pasties, fetches their contents, and saves them to files.
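
For anyone who wants the proxy handling in a nutshell, here is a rough sketch of the idea. It is not the code from the gist: `ProxyPool`, `fetch_with_rotation` and the `fetch` callback are names made up for this illustration, and it is written for Python 3.

```python
import random


class ProxyPool:
    """Hypothetical helper: keeps a list of proxies and drops failing ones."""

    def __init__(self, proxies):
        self.proxies = list(proxies)

    def pick(self):
        # Choose a random proxy from whatever is still considered working.
        if not self.proxies:
            raise RuntimeError("no working proxies left")
        return random.choice(self.proxies)

    def discard(self, proxy):
        # Remove a proxy after a failed connection attempt.
        if proxy in self.proxies:
            self.proxies.remove(proxy)

    def save(self, path):
        # Write the remaining proxies back to file, one per line.
        with open(path, "w") as handle:
            for proxy in self.proxies:
                handle.write(proxy + "\n")


def fetch_with_rotation(pool, url, fetch):
    """Call fetch(url, proxy) until one proxy works, discarding failures.

    `fetch` is an assumed callback that raises OSError on connection
    problems (urllib.error.URLError is a subclass of OSError).
    """
    while True:
        proxy = pool.pick()
        try:
            return fetch(url, proxy)
        except OSError:
            pool.discard(proxy)
```

Calling `pool.save("Data/Proxies.txt")` when the script exits would then write the pruned list back to file. Passing the fetch function in as an argument keeps the rotation logic testable without touching the network.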

Attached to this post is the script with the files it depends on (*.py, Data/Results, Data/Proxies.txt, Data/User-Agents.txt).

As always, this code is free to use and modify, take from it what you want and do with it what you like. Improvements, critique, feedback and so forth are welcome.

Usage: python pastebinScraper.py [-s <sleep time in seconds>]
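
An optional `-s` switch like the one above could be parsed roughly as follows. This is only a sketch with `argparse`, not necessarily how the gist does it; `parse_args` and the 5 second default are assumptions for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the optional -s switch (sleep time between requests)."""
    parser = argparse.ArgumentParser(prog="pastebinScraper.py")
    parser.add_argument(
        "-s",
        dest="sleep",
        type=float,
        default=5.0,  # assumed default, not taken from the original script
        help="sleep time in seconds between requests",
    )
    return parser.parse_args(argv)
```

With no arguments the sleep time falls back to the assumed default; `parse_args(["-s", "2"])` gives a sleep time of 2 seconds.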

[gist]Daxda/7315302[/gist]
Title: Re: [Source] pastebinScraper.py
Post by: imation on November 05, 2013, 09:54:01 AM
nice, looks good
Title: Re: [Source] pastebinScraper.py
Post by: d4rkcat on November 05, 2013, 02:24:59 PM
I've been waiting for something like this, ace!
I'm gonna have to look over this code.

Many Thanks daxda!