
Show Posts



Messages - dsme

1
Projects and Discussion / Project
« on: February 05, 2014, 03:28:33 PM »
Hi,

We have a group project to do and came up with an idea.

Is there a way that a website can control a C++ application? Say you pushed a button on the website; could that trigger a specific command in the C++ program?
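For illustration, here is one way a website button could drive a native program: the button sends an HTTP request, and a small server-side handler shells out to the command. This is a Python 3 sketch with placeholder names (the `/run` path and the `echo` command are mine); a real project would invoke the C++ binary instead.

```python
# Sketch: a tiny web endpoint that runs a native command when visited.
# Placeholder: `echo` stands in for the real C++ binary.
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

class CommandHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/run':
            # a button on the website points at /run; the request lands here
            out = subprocess.check_output(['echo', 'command executed'])
            self.send_response(200)
            self.end_headers()
            self.wfile.write(out)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

# To serve for real:
# HTTPServer(('127.0.0.1', 8000), CommandHandler).serve_forever()
```

The same shape works with any web stack; the essential part is that the request handler, not the browser, launches the process.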

2
Game Hacking, Modding & Discussing / Re: "Mafia 2" mini-review
« on: January 15, 2014, 08:51:43 PM »
The first one was fucking awesome, will definitely be giving this a play.


EDIT: I wasn't happy when I completed the first one because I enjoyed it so much!

3
Creative Arts / Re: The Music Thread
« on: January 14, 2014, 04:44:19 PM »
Loving The Naked and Famous - Young Blood
http://www.youtube.com/watch?v=0YuSg4mts9E

4
Creative Arts / Re: The Music Thread
« on: January 13, 2014, 08:23:00 AM »
Lol I'm sure I'd have to be a seasoned listener in order to determine the differences.. Thanks for enlightening me

Haha, I've been listening for around 7 years.

5
Creative Arts / Re: The Music Thread
« on: January 13, 2014, 01:07:54 AM »
Lol wtf. Subgenres upon subgenres upon subgenres. People get so fucking ridiculous with that..

What the fuck is hardstyle anyway? I feel like it is somehow related to screamo


Well, all genres are a subgenre of one genre...


It's a form of electronic dance music (EDM) with a tempo of around 150 BPM. It isn't particularly popular, but its audience grows every year: last year was a good one for hardstyle, and I believe its audience has expanded drastically compared to 2012. How? I believe it's because of festivals in America (EDM Las Vegas and artists touring America) and Headhunterz (he's probably the best-known producer) being signed to Ultra Records.

Hardstyle has also become less 'raw' in the last year, with tracks like 'Wildstylez ft. Neils Geusebroek - Year of Summer', which in fact got a Gold Record this year at X-Qlusive Wildstylez and made it onto Dutch radio (alongside Brennan Heart & Wildstylez - Lose My Mind). To me, Gabber is hardstyle's sibling style, just much, much harder and rawer. Good festivals are Defqon 1 (which I am going to this year) and Qlimax, both of which are hosted by Q-Dance and held in The Netherlands.


Raw hardstyle track:
http://www.youtube.com/watch?v=5-2TcNJMfdw


Non-raw hardstyle:
http://www.youtube.com/watch?v=WA0t6ErCtus


Euphoric hardstyle:
http://www.youtube.com/watch?v=EENk51ofvMg (Wasted Penguinz just released their album Wistfulness, my favourite artist)


and now what will be becoming mainstream over the coming year or two:
http://www.youtube.com/watch?v=5HcvRIMU_Xs


I hope you can determine the differences.

6
Scripting Languages / Re: 'Recycle bin' feature in Linux
« on: January 13, 2014, 12:31:55 AM »
Am I missing something? What linux are you running where 'del' is the command to delete a file?


I don't think you understand: by naming the script 'del' and placing it in the /bin folder, I can execute the command 'del' to delete files.

7
Creative Arts / Re: The Music Thread
« on: January 12, 2014, 11:34:58 PM »
I fucking love hardstyle!

Listening to Crypsis - Break Down Low at the moment, which is from a hardstyle sub-genre: dubstyle.

http://www.youtube.com/watch?v=XUjPGZRp_Xc

Digging Brennan Heart's new masterpiece as well, Imaginary with John Mendelsohn. I'll be seeing him next month, along with DJ Coone, at the Arches in Glasgow.

http://www.youtube.com/watch?v=h9I-9Sj4sKs

8
Scripting Languages / 'Recycle bin' feature in Linux
« on: January 12, 2014, 11:26:16 PM »
This is a replica of the recycle-bin feature for the Linux command line.


How it works


Say you want to remove a file from the command line (non-permanently): you would run del (filename). The script then moves the file to the recycle-bin folder and writes its original location to a file called 'link2file', so that the file or folder can be restored later if we want it back.


If we wanted to restore a file from command line, we would run restore (filename).


If we want to trash files permanently, there are two options. With -a as a parameter, trash removes all files from the recycle bin without prompting the user to confirm each one; without -a, trash asks the user, for each individual file in the recycle bin, whether they wish to delete it.
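For comparison, the move-and-record step described above can be sketched in Python 3. The paths here are stand-ins for the demo; the shell scripts themselves use /root/dustbin and /root/link2file.

```python
import os, shutil, time

# Placeholder locations for this sketch only; the real scripts use
# /root/dustbin and /root/link2file.
DUSTBIN = '/tmp/dustbin_demo'
LINK2FILE = '/tmp/link2file_demo'

def soft_delete(path):
    # give each deletion its own folder, named after the current Unix time
    unique = str(int(time.time()))
    dest = os.path.join(DUSTBIN, unique)
    os.makedirs(dest, exist_ok=True)
    # record "unique|original absolute path" so restore knows where it came from
    with open(LINK2FILE, 'a') as log:
        log.write(unique + '|' + os.path.abspath(path) + '\n')
    shutil.move(path, dest)
    return unique
```

Restore is the reverse: look the name up in the log, then move the file from its unique folder back to the recorded path.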


del
Code:
#!/bin/sh
# This file should be stored in /usr/bin


#Check if the folder dustbin exists
if [ ! -e /root/dustbin ]


#If it doesn't, then
then


#Check for write permission on /root
if [ -w /root ]

#Make the directory dustbin
then mkdir /root/dustbin

#If not writable, continue silently
else :
fi

#If dustbin already exists, continue silently
else
:
fi


#If there is no parameter, display a usage message; otherwise continue
if [ -z "$1" ];


#Display error message
then echo "USAGE: del filename"

#If there is a parameter then continue with the following
else


#If the file or folder given as parameter one doesn't exist
if [ ! -e "$1" ];


#Display error message
then echo "File or folder does not exist."
#else
else


#Set a to the current working directory
a=$(pwd)


#Read user input y or n (yes or no)
read -p  "Are you sure you would like to delete $1? Please enter 'Y' for yes or 'N' for no." yn


#Set unique to time since 1970
unique=$(date +%s)


#Open case statement
case $yn in


#Note that the program accepts any string that begins with a y or an n.
#If user enters y or Y then move the file to the recycle bin and record its path in link2file
[Yy]* ) mkdir /root/dustbin/$unique; echo "$unique|$a/$1" >> /root/link2file; mv "$1" /root/dustbin/$unique; echo ""; echo "File deleted."; echo "";;


#Old method
#[Yy]* ) mv $1 /root/dustbin; echo "File deleted";;
#If user enters n or N then do not move file and exit
[Nn]* ) echo "File not deleted"; exit;;


#If user enters an invalid characters then display error message
* ) echo "Invalid input!";;


#Close case statement
esac


#End parameter check if statement
fi
fi


restore
Code:
#!/bin/sh
# This file should be stored in /usr/bin


#If there is no parameter, display a usage message; otherwise continue
if [ -z "$1" ];


#Display error message
then echo "USAGE: restore filename"


#TODO: test two files with the same name


else
 for line in `ls /root/dustbin`; do

#Search for parameter 1 in link2file; recover its unique folder name and original path
  unique=`grep "$1" /root/link2file | cut -d "|" -f 1`
  restore=`grep "$1" /root/link2file | cut -d "|" -f 2`
  echo $restore
#Recover the directory the file originally lived in
  location=`dirname "$restore"`
#Change directory to the file's unique folder inside /root/dustbin
  cd /root/dustbin/$unique
 
#If parameter one file or folder doesn't exist
if [ ! -e "$1" ]


#Display error message
then echo "File or folder does not exist in recycle bin."

#else
else


#Move the file back to its original location, prompting if a file already exists there
  mv -i "$1" "$location" && echo "File" $1 "restored"
#Remove the file's record from link2file
  sed --in-place "/$unique/d" /root/link2file
  done
fi
fi


trash
Code:
#!/bin/bash
# This file should be stored in /usr/bin


#If the -a parameter is specified
if [ "$1" == "-a" ]


#Then
then


#Change directory to /root/dustbin
  cd /root/dustbin
 
#Remove (force) all files and folders from the directory
  rm -rf *
 
#Change directory to root
  cd /root
 
#Remove file that stored locations of files prior to deletion
  rm link2file
 
#Re-create the file that stored locations of files prior to deletion
  touch link2file
 
#If the -a parameter isn't specified
else


#Change directory to /root/dustbin
  cd /root/dustbin
 
#List contents of the folder
  ls
 
 #For every file in the dustbin folder
for line in `ls /root/dustbin`
  do
read -p  "Are you sure you want to remove $line permanently?" yn

#Open case statement
case $yn in


#If the user enters Y then remove the listed file (or folder) recursively
[Yy]* ) rm -r $line; echo "File" $line "removed successfully";;


#If the user enters N then do nothing
[Nn]* ) :;;


#If the user enters anything other than Y or N then display error message
* ) echo "Invalid input!";;


esac
 done
fi


Again, I hope this helps people. The scripts should be placed in the /usr/bin folder (bin stands for binaries).

9
Scripting Languages / Re: [Python] Organize your download folder
« on: January 12, 2014, 10:24:59 PM »
Could be very useful, especially for people who have cluttered desktops... I know plenty of that sort of people  :o

10
Scripting Languages / Re: Get webpage content script
« on: January 12, 2014, 10:17:56 PM »

If "# Python sucks", why did you code in it? Also, where the HELL are the conventions? I see none. Variable names? Sooooo descriptive... And why do I have to ensure a directory exists? Can't the code do it? Basically a cool initiative, but really sloppy and awful code. Do you even lift, bro?


#Python sucks was a bit of banter to lighten the mood; it was a coursework assignment. I'm not a coder, and yeah, some of the variables aren't very descriptive, but it works.


It can, but it doesn't; I don't think I implemented that. I did state that it most definitely can be improved, but I thought it was worth a share.

No I don't.


Python has an HTMLParser module. Consider using it instead of regex.
Some people can get pretty mad about this, see here: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
 ;) (for actual reasons read the answers below)

Btw, there is also natural language processing library that can find e-mail-addresses, telephone numbers and names: http://nltk.org/
Might be too much for this simple tool, though.
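A minimal sketch of that HTMLParser suggestion, in Python 3 (in Python 2 the module is called HTMLParser; the class name here is mine):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag instead of regexing the raw HTML."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs, already unescaped
        if tag == 'a':
            for name, value in attrs:
                if name == 'href':
                    self.links.append(value)

collector = LinkCollector()
collector.feed('<a href="http://example.com">one</a> <a href="https://example.org/x">two</a>')
print(collector.links)  # → ['http://example.com', 'https://example.org/x']
```

Unlike the regexes in the script, the parser copes with attribute order, quoting, and whitespace variations for free.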

Code:
MAKE SURE DIRECTORY C:\temp exists.

This is a fail. You use a platform-independent language but hardcode a Windows path.
Use relative paths, and create the folder if it doesn't exist.

Code:
MUST INCLUDE HTTP://

This is also something you can check for and fix within two lines instead of screaming at the user:
if the string doesn't start with http, prepend http to the string.
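That two-line fix, sketched in Python 3 (the function name is mine):

```python
def normalise_url(url):
    # prepend a scheme instead of rejecting input that lacks one
    if not url.startswith(('http://', 'https://')):
        url = 'http://' + url
    return url

print(normalise_url('example.com'))   # → http://example.com
print(normalise_url('https://a.io'))  # → https://a.io
```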

Code:
### This code will work perfectly on Unix systems as the way they work

No, it won't.



I'm balls with regular expressions, so I just searched Google for something; it worked, so I used it.


As I said above, I didn't implement a check that C:\temp exists. There is plenty of room for improvement  :) . The script does everything the handout asked it to do.


I thought it would have definitely worked.

11
Scripting Languages / Get webpage content script
« on: January 12, 2014, 06:57:44 PM »
Hi,

I wrote a script that will perform the following

1. Download the webpage source code
2. Search for links in the source and print them to screen
3. Search for email addresses and UK phone numbers and print them to screen; for other phone-number formats the regular expressions will need to be modified.
4. Search for MD5 hashes (the script will try to crack them using a built-in mini dictionary)
5. Search for images and documents
6. Download images and documents

I structured the project as one main script controlling each individual part. I learned from this project that indentation is incredibly important in Python. I used IDLE as my Python editor. I am not a programmer, and the code can definitely be improved and made more efficient. I hope this helps someone.

MAKE SURE DIRECTORY C:\temp exists.

main.py controlled the running:
Code:
#
# Script: main.py
#


#A lot of importing
import webpage_getlinks, webpage_getemails, webpage_getstuff, webpage_md5, webpage_getimages
import webpage_downloadcontent, forensic_analysis, urllib, httplib, urlparse, urllib2


# Function main
def main():


#Display options and stuff
 print 'Below is the list of available options, please select one when asked to do so:'
 print '1. Get webpage content and print hyperlinks.'
 print '2. Get webpage content and print emails and numbers.'
 print '3. Get webpage content and find the MD5 hashes if any.'
 print 'The script will then try to crack them.'
 print '4. Get webpage content - images and documents.'
 print '5. Download webpage content - images and documents.'
 print '6. Perform Forensic Analysis.'
 print 'Please enter the number to work from:'
 #Get users option
 option = raw_input()
 print ''
 print 'You have selected option number', option


 print 'Please enter the URL to get content from (MUST INCLUDE HTTP://):'
 # Get user input
 url = raw_input()


 print ''
 print 'You have selected:', url
 print ''


 #Checks to make sure the site works but MUST have http://...
 #if it is continue if not error message


 try:
     urllib2.urlopen(url)
       #If option is equal to one then
     if option == '1':
          # 1. Get webpage content and print hyperlinks.
            webpage_getlinks.main(url)


     elif option == '2':
            # 2. Get webpage content and print emails and numbers.
            webpage_getemails.main(url)


     elif option == '3':
            # 3. Get webpage content and find the MD5 hashes if any. The script will then try to crack them.
            webpage_md5.main(url)


     elif option == '4':
            # 4. Get webpage content - images and documents.
            webpage_getimages.main(url)
     elif option == '5':
          # 5. Download webpage content - images and documents.
            webpage_downloadcontent.main(url)
     elif option == '6':
            # 6. Perform forensic analysis.
            forensic_analysis.main(url)
 except ValueError, ex:
     print 'There appears to be a problem with the server, please try again later!'
 except urllib2.URLError, ex:
     print 'There appears to be a problem with the server, please try again later!'




 
if __name__ == '__main__':
 main()


webpage_getlinks.py
Code:
# Python sucks 
import sys, re, urllib
import main
 
# Function print_links
def print_links(page): 
 # find all hyperlinks on a webpage passed in as input and print 
 print '[*] print_links()'
 # regex to match anchor tags whose href contains an http link
 links = re.findall(r'\<a.*href\=.*http\:.+', page)
 
 # sort and print the links 
 links.sort()
 #print [+], the numbers of links found and HyperLinks Found:
 print '[+]', str(len(links)), 'HyperLinks Found:'
 
 #print links 
 #Uncomment below when testing (commented out to keep everything tidy)
 for link in links: 
  print link
 print 'All non-encrypted hyperlinks found!'
 
def print_slinks(page): 
 # find all hyperlinks on a webpage passed in as input and print 
 print '[*] print_slinks()'
 # regex to match anchor tags whose href contains an https link
 links = re.findall(r'\<a.*href\=.*https\:.+', page)
 
 # sort and print the links 
 links.sort()
 #print [+], the numbers of links found and HyperLinks Found:
 print '[+]', str(len(links)), 'Secure HyperLinks Found:'
 
 #print links 
 #Uncomment below when testing (commented out to keep everything tidy)
 for link in links: 
  print link
 print 'All secure hyperlinks found!'
 
# Function wget
def wget(url):
     
 # Try to retrieve a webpage via its url, and return its contents 
 print '[*] wget()'
 # open file like url object from web, based on url 
 url_file = urllib.urlopen(url)
 # get webpage contents 
 page = url_file.read()
 # return the page contents to the caller
 return page
 
 
# Function main
def main(url): 


 page = wget(url)
 # Get the links 
 print_links(page)
  # Get the links 
 print_slinks(page) 
   
if __name__ == '__main__': 
 main(sys.argv[1])


webpage_getemails.py
Code:
# Python sucks
import sys, re, urllib
import main


# Function print_emails
def print_emails(page):
# Set blank space between functions
 print ""
 
 # find all emails on a webpage passed in as input and print
 print '[*] print_emails()'
 # regex to match email addresses (dots escaped so they match literally)
 emails = re.findall(r'\w+@\w+\.\w+\.\w+', page)


 # sort and print the links
 emails.sort()
 #print [+], the numbers of emails found and Emails Found:
 print '[+]', 'There were', str(len(emails)), 'emails found.'
 print 'Emails found:'


 #Uncomment below when testing (commented out to keep everything tidy)
 for email in emails:
  print email


# Function print_numbers
def print_numbers(page):
# Set blank space between functions
 print ""


   
 # find all numbers on a webpage passed in as input and print
 print '[*] print_numbers()'
 # regex to match UK-style phone numbers
 numbers = re.findall(r'([+].*[(].*[)]\d{3}[-\.\s]??\d{3}[-\.\s]??\d{4}|\(\d{3}\)\s*\d{3}[-\.\s]??)', page)
 
 # sort and print the links
 numbers.sort()
 #print [+] and the number of phone numbers found
 print '[+]', 'There were', str(len(numbers)), 'phone numbers found.'
 print 'Phone numbers found:'


 #print numbers
 #Uncomment below when testing (commented out to keep everything tidy)
 for number in numbers:
  print number


# Function wget
def wget(url):
   
 # Try to retrieve a webpage via its url, and return its contents
 print '[*] wget()'
 # open file like url object from web, based on url
 url_file = urllib.urlopen(url)
 # get webpage contents
 page = url_file.read()
 # return the page contents to the caller
 return page




# Function main
def main(url):
 
 page = wget(url)
 # Get the emails
 print_emails(page)
 # Get the numbers
 print_numbers(page)


if __name__ == '__main__':
 main(sys.argv[1])


webpage_getstuff.py
Code:
# Python sucks
import sys, hashlib


# Function crack_hashes
def crack_hashes(md5):
# Set blank space between functions
 print ""
 #Checks password hash, against a dictionary of common passwords
 print 'crack_hashes(): Cracking hash:', md5
 # set up list of common password words
 dic = ['123','1234','12345','123456','1234567','12345678','password','123123', 'qwerty','abc','abcd','abc123','111111','monkey','arsenal','letmein','trustno1','dragon','baseball','superman','iloveyou','starwars','montypython','cheese','football','password','batman']
 #passwd_found = False;
   
 for passwd in dic:
     passwds = hashlib.md5(passwd)
     #print passwds.hexdigest()


     if passwds.hexdigest() == md5:


         #passwd_found = True;
         #break
         print 'crack_hashes(): Password recovered:', passwd
     #else:
             
 
def main():
    print '[crack_hashes] Tests'
    md5 = 'd8578edf8458ce06fbc5bb76a58c5ca4'
    crack_hashes(md5)
    #sys.exit(0)   
 
if __name__ == '__main__':
    main()


webpage_md5.py
Code:
# Python sucks
#download the hases
import sys, re, urllib
import main
import webpage_crackmd5


# Function print_md5
def print_md5(page):
# Set blank space between functions
 print ""


 # find all MD5 hashes on a webpage passed in as input and print
 print '[*] print_md5()'
 # regex to match 32-character hex strings (candidate MD5 hashes)
 md5 = re.findall(r'([a-fA-F\d]{32})', page)


 # sort and print the links
 md5.sort()
 #print [+] and the number of MD5 hashes found
 print '[+]', 'There were', str(len(md5)), 'md5 hashes found.'
 print 'MD5 hashes found:'


 #print numbers
 #Uncomment below when testing (commented out to keep everything tidy)
 for i in md5:
  print i
 webpage_crackmd5.main(md5)
 
# Function wget
def wget(url):
   
 # Try to retrieve a webpage via its url, and return its contents
 print '[*] wget()'
 # open file like url object from web, based on url
 url_file = urllib.urlopen(url)
 # get webpage contents
 page = url_file.read()
 # return the page contents to the caller
 return page


# Function main
def main(url):


 page = wget(url)
 # Print md5
 print_md5(page)
 
if __name__ == '__main__':
 main(sys.argv[1])


webpage_getimages.py
Code:
# Python sucks
import sys, re, urllib
import main


# Function print_images
def print_images(page):
# Set blank space between functions
 print ""


 # find all images on a webpage passed in as input and print
 print '[*] print_images()'
 # regex to match src attributes (embedded images and resources)
 numbers = re.findall(r'src="([^"]+)"', page)


 # sort and print the links
 numbers.sort()
 #print [+] and the number of images found
 print '[+]', 'There were', str(len(numbers)), 'images found.'
 print 'Images found:'


 #print numbers
 #Uncomment below when testing (commented out to keep everything tidy)
 for number in numbers:
  print number


# Function print_documents
def print_documents(page):
# Set blank space between functions
 print ""


 # find all documents on a webpage passed in as input and print
 print '[*] print_documents()'
 # regex to match quoted paths ending in .docx
 numbers = re.findall(r'"(.*\.docx)"', page)




 # sort and print the links
 numbers.sort()
 #print [+] and the number of documents found
 print '[+]', 'There were', str(len(numbers)), 'documents found.'
 print 'Documents found:'


 #print numbers
 #Uncomment below when testing (commented out to keep everything tidy)
 for number in numbers:
  print number


# Function wget
def wget(url):
   
 # Try to retrieve a webpage via its url, and return its contents
 print '[*] wget()'
 # open file like url object from web, based on url
 url_file = urllib.urlopen(url)
 # get webpage contents
 page = url_file.read()
 # return the page contents to the caller
 return page


# Function main
def main(url):


 page = wget(url)
 # Get the emails
 print_images(page)
 print_documents(page)
 
if __name__ == '__main__':
 main(sys.argv[1])


webpage_downloadcontent.py
Code:
# Download images


import os
import main
import re
import urllib
import httplib, urlparse
   
def download_images(url, page):
 #Set link equal to the URL minus the page, for images given without a full path
 link = os.path.dirname(url)

 #Directory where downloaded content is stored (defined up front so both loops can use it)
 location = os.path.abspath("C:/temp/coursework/")


 #Set images equal to the images on the page (WILL ONLY FIND FULL LINKS)
 images = re.findall(r'src=".*http\://([^"]+)"', page)
 
 #Set aegami equal to the images on the page (WILL ONLY FIND IMAGES WITH NO PATH)
 aegami = re.findall(r'\ssrc\s*=\s*"([^/"]+\.(?:jpg|gif|png|jpeg|bmp))\s*"', page) #\ssrc\s*=\s*"([^/"]+\.[^/".]+" no need to specify image file extensions
 images.sort()


 #This is for adding URL to images with no full URL
 for image in aegami:
  #Set test = dirname of the URL + / + each individual file in the loop
  test = link+"/"+image


  #Set location equal to the directory for the downloaded content to be stored in
  location = os.path.abspath("C:/temp/coursework/")
 
  #Concatenate location (C:\temp\coursework) + filename - this is where the file is saved
  get = os.path.join(location, image)


  #Check the file actually exists by fetching its HTTP status code
  status = urllib.urlopen(test).getcode()


  #If status is 200 (OK)
  if status == 200:
      urllib.urlretrieve(test, get)
      print 'The file:', image, 'has been saved to:', get
  #If status is 404 (FILE DOES NOT EXIST)
  elif status == 404:
      print 'The file:', image, 'could not be saved. Does not exist!!'
  else:
      print 'Unknown Error:', status


  ######################### This is for files with a link ################
 for files in images:
  #Download the images
  #Set filename equal to the basename of files, so the actual file i.e image1.jpg of http://google.com/image1.jpg
  filename = os.path.basename(files)


  #Concatenate location (C:\temp\coursework) + filename - this is where the file is saved
  save = os.path.join(location, filename)


  #The regular expression strips http://, so add it back; without it the download does not work
  addhttp = "http://"+files
 
  #Check the file exists by fetching its HTTP status code (note: addhttp, not test)
  status = urllib.urlopen(addhttp).getcode()




  if status == 200:
      urllib.urlretrieve(addhttp, save)
      print 'The file:', filename, 'has been saved to:', save
  elif status == 404:
      print 'The file:', filename, 'could not be saved. Does not exist!!'
  else:
      print 'Unknown Error:', status


 print 'Download of images complete!'
 #Print white space
 print ''




def download_documents(url, page):
 #Set dirname equal to the URL minus the page, for documents given without a full link
 dirname = os.path.dirname(url)


 #Set documents equal to the documents on the page
 documents = re.findall(r'"(.*\.docx)"', page)


 documents.sort()


 #Download the documents, see above comments - pretty much same code but different
 #Variables and regex


 for doc in documents:
  test = dirname+"/"+doc
 
  location = os.path.abspath("C:/temp/coursework/")


  #Set filename equal to the basename of files, so the actual file i.e image1.jpg of http://google.com/image1.jpg
  name = os.path.basename(test)


  #Concatenate location (C:\temp\coursework) + filename - this is where the file is saved
  get = os.path.join(location, name)


  status = urllib.urlopen(test).getcode()
  #print status


  if status == 200:
      urllib.urlretrieve(test, get)
      print 'The file:', doc, 'has been saved to:', get
  elif status == 404:
      print 'The file:', doc, 'could not be saved. Does not exist!!'
  else:
      print 'Unknown Error:', status


 print 'Download of documents complete!'


# Function wget
def wget(url):
   
 # Try to retrieve a webpage via its url, and return its contents
 print ''
 print '[*] wget()'
 
 #Print white space
 print ''
 
 # open file like url object from web, based on url
 url_file = urllib.urlopen(url)
 
 # get webpage contents
 page = url_file.read()
 
 # return the page contents to the caller
 return page


def main(url):
 page = wget(url)
 
 download_images(url, page)
 download_documents(url, page)


if __name__ == '__main__':
 import sys
 main(sys.argv[1])


forensic_analysis.py
Code:
#Forensic Analysis
import hashlib, urllib, binascii, os, shutil
#The binascii module allows conversion between ASCII characters and hex/binary representations.


def forensic_analysis(url, page):


 #Set up the list of known-bad hashes: [0] = the hash, [1] = the file name
 bad_hashes = [('9d377b10ce778c4938b3c7e2c63a229a','badfile1.jpg'), ('6bbaa34b19edd6c6fa06cccf29b33125', 'badfile2.jpg'), ('28f6607fa6ec96acb89027056cc4c0f5', 'badfile3.jpg'), ('1d6d9c72e3476d336e657b50a77aee05', 'badfile4.jpg')]


 #Set location equal to the directory where downloads are stored
 location = 'C:\\temp\\coursework\\'
 
 #Counts how many files are in C:\\temp\\coursework
 path, dirs, files = os.walk("C:\\temp\\coursework").next()
 file_count = len(files)


 #Tell the user how many files there are in the directory
 print 'How many files:', file_count


 #For each file in the directory do
 for each in files:
     
  #Set filename equal to the downloads directory + each individual filename in the directory
  filename = location+each


  #Blank space
  print ''
  print 'Current filename:', filename


  #Open the file in binary mode so the signature bytes are read exactly
  fh = open(filename, 'rb')
 
  #Read the first four bytes
  file_sig = fh.read(4)


  #Convert from ascii to hex
  #test = binascii.hexlify(file_sig)


  #Extract file extension from file name
  fileName, fileExtension = os.path.splitext(each)


  #Print the file's signature (magic number) in hex; this is not a hash
  filesig_hex = binascii.hexlify(file_sig)
  print 'This is the file signature:', filesig_hex


  #Set hasher to hash files
  hasher = hashlib.md5()


  #if the file signature == (SIGNATURE - in this case JPG)
  if filesig_hex == 'ffd8ffe0':
          #Tell the user this is a jpg file
          print 'This file is JPG'


          #Set file extension to jpg
          fileExtension = '.jpg'


          #Set its new name to the filename plus its corrected extension
          newname = fileName+fileExtension


          #Print its new name
          print 'Newname:', newname
         
          #Set dst_dir to location\fixed
          dst_dir = os.path.join(location, "fixed")
          #Join location and file name together
          src_file = os.path.join(location, each)


          #Copy the original file to the new directory location\fixed
          shutil.copy(src_file, dst_dir)
         
          ### Note that because the way Windows works, we must make sure
          ### that the fixed directory is empty before running the script as
          ### the code will not be able to rename files otherwise.
          ### This code will work perfectly on Unix systems as the way they work
          ### Is overwrite is guaranteed.


          #Set dst_file to the new location + the file name
          dst_file = os.path.join(dst_dir, each)


          #Set the new name to new location + newname
          new_dst_file_name = os.path.join(dst_dir, newname)


          #Rename the original file to its new name
          os.rename(dst_file, new_dst_file_name)
         
          ##### need to copy file to new folder, rename it and then hash it^
         
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]
  #See above comments for the branches below; the code is the same, but I ran out of
  #time to put it into a function, which would take fewer lines and be more efficient.
  if filesig_hex == 'ffd8ffe1':
          print 'This file is JPG'
          fileExtension = '.jpg'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)


          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
         
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]
                 
  if filesig_hex == '424d3e':
          print 'This file is BMP'
          fileExtension = '.bmp'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]

  if filesig_hex == '47494638':
          print 'This file is GIF'
          fileExtension = '.gif'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]


  if filesig_hex == '000001':
          print 'This file is ICO'
          fileExtension = '.ico'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]


  if filesig_hex == '89504e':
          print 'This file is PNG'
          fileExtension = '.png'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]
                 
  if filesig_hex == 'd0cf11':
          print 'This file is DOC'
          fileExtension = '.doc'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]
                 
  if filesig_hex == '504b0304':
          print 'This file is DOCX'
          fileExtension = '.docx'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]


  if filesig_hex == '7b5c72':
          print 'This file is RTF'
          fileExtension = '.rtf'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)
          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)


          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]
  if filesig_hex == 'd0cf11':
          print 'This file is XLS'
          fileExtension = '.xls'
          newname = fileName+fileExtension
          print 'Newname:', newname
          dst_dir = os.path.join(location, "fixed")
          src_file = os.path.join(location, each)
          shutil.copy(src_file, dst_dir)


          dst_file = os.path.join(dst_dir, each)
          new_dst_file_name = os.path.join(dst_dir, newname)
          os.rename(dst_file, new_dst_file_name)
          ##### need to copy file to new folder, rename it and then hash it
          with open(new_dst_file_name, 'rb') as afile:
              buf = afile.read()
              hasher.update(buf)
          for i in bad_hashes:
              if hasher.hexdigest() == i[0]:
                  print 'This file hash matches bad hash', i[0], '. This file should be called:', i[1]
         
  else:
      print 'File signature not recognised. Be wary'
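For context, the filesig_hex value the checks above compare against is presumably the hex of the file's first few bytes. A minimal sketch of how such a magic number can be read and matched, written in Python 3 (the signature table here is illustrative, not the script's full list):

```python
import binascii

# Illustrative magic-number prefixes (hex of a file's leading bytes).
SIGNATURES = {
    '89504e': '.png',
    '47494638': '.gif',
    'd0cf11': '.doc',  # OLE2 container, shared by DOC/XLS/PPT
}

def sniff_extension(path, num_bytes=4):
    """Return the extension implied by the file's magic number, or None."""
    with open(path, 'rb') as f:
        head = binascii.hexlify(f.read(num_bytes)).decode('ascii')
    for sig, ext in SIGNATURES.items():
        if head.startswith(sig):
            return ext
    return None
```

A file whose first bytes are 89 50 4E 47 (a PNG header) sniffs as '.png' no matter what it is named, which is the whole point of the forensic check.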


# Function wget
def wget(url):
    # Retrieve a webpage via its URL and return its contents
    print '[*] wget()'
    # open a file-like object for the URL
    url_file = urllib.urlopen(url)
    # read the webpage contents
    page = url_file.read()
    # return the page contents to the caller
    return page
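Note that wget() above is Python 2: urllib.urlopen was removed in Python 3, where the same call lives in urllib.request. A rough Python 3 equivalent:

```python
import urllib.request

def wget(url):
    # Retrieve a web page via its URL and return its contents (as bytes)
    print('[*] wget()')
    with urllib.request.urlopen(url) as url_file:
        return url_file.read()
```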


def main(url):
    # Set page equal to the output of the wget function
    page = wget(url)

    # Call the forensic_analysis function and pass the arguments url and page
    forensic_analysis(url, page)


if __name__ == '__main__':
    # main() needs a URL, so take it from the command line
    # (assumes `import sys` at the top of the script)
    main(sys.argv[1])
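The bad-hash comparison used throughout the analysis can also be sketched in isolation (Python 3; MD5 and the shape of bad_hashes as (hash, real_name) pairs are inferred from the script, and the example entries below are invented):

```python
import hashlib

def check_bad_hashes(path, bad_hashes):
    """Return the name a bad file 'should be called', or None if clean."""
    hasher = hashlib.md5()  # fresh hasher per file, so digests don't accumulate
    with open(path, 'rb') as f:
        hasher.update(f.read())
    digest = hasher.hexdigest()
    for bad_hash, real_name in bad_hashes:
        if digest == bad_hash:
            return real_name
    return None
```

Creating the hasher inside the function matters: a single module-level hasher fed file after file would produce a running digest of everything hashed so far, never matching any per-file bad hash after the first file.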

Note that to perform the forensic analysis, please make sure you have downloaded some files first.

Forensic analysis works by checking the file signature and ensuring that the file actually is what its extension claims it to be. If, for example, a file is called file1.jpg, the script checks file1.jpg's signature to see whether it really is a JPG. If it is, the script simply reports that the file is a JPG; if it isn't, it tells the user that the file is not actually a JPG and reports what the file really is, based on its signature.
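To illustrate that check end to end (Python 3; the file name and the small signature table are made up for the demo):

```python
import binascii
import os
import tempfile

def claims_match(path):
    """Compare a file's claimed extension against its actual magic number."""
    magic = {'.jpg': 'ffd8ff', '.png': '89504e', '.gif': '47494638'}
    ext = os.path.splitext(path)[1].lower()
    if ext not in magic:
        return None  # no signature on record for this extension
    with open(path, 'rb') as f:
        head = binascii.hexlify(f.read(4)).decode('ascii')
    return head.startswith(magic[ext])

# Demo: a file named file1.jpg that really contains PNG bytes fails the check
tmpdir = tempfile.mkdtemp()
fake = os.path.join(tmpdir, 'file1.jpg')
with open(fake, 'wb') as f:
    f.write(b'\x89PNG\r\n\x1a\n')
print(claims_match(fake))  # False -- file1.jpg is not really a JPEG
```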

I haven't looked at the code for a couple of months and if you have any issues with it I am more than happy to help. Feel free to modify it to your personal use.

To download all the scripts:

Download


