sex.py
12/07/09 15:25
Smashing Email eXtractor 1.0
Extract valid email addresses from all kinds of files. With sex.py you can extract a list of emails from a badly formatted text file, or scan recursively through a directory and all of its contents. One scenario: download a website to your local hard drive and use sex.py to harvest all email addresses from it.
Highlights:
- Switch the search pattern used to match valid email addresses
- Scan a single file or multiple files from a directory (including subdirectories)
- Sort the addresses in the output file
- Exclude duplicates
- Change the verbosity level
To configure Smashing Email eXtractor, edit the variables in the source file.
verbose = n
0 no output
1 print the email addresses only, e.g. if you want to pipe them
2 output email addresses, current file and grand total
sort = n
0 write email addresses to destination file as found
1 sort addresses in alphabetical order
remove_duplicates = n
0 capture all addresses
1 remove duplicate addresses
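How the sort and remove_duplicates switches could act on the collected addresses can be sketched as follows. This is a minimal Python 3 sketch, not the code from sex.py: the function name `postprocess` is hypothetical, and it assumes that duplicates are exact, case-sensitive matches and that sorting is a plain string sort.

```python
def postprocess(addresses, sort=0, remove_duplicates=0):
    """Apply the two switches to a list of addresses (hypothetical helper)."""
    result = list(addresses)
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence of each address
        # and preserves the order in which addresses were found.
        result = list(dict.fromkeys(result))
    if sort:
        # Plain string sort: uppercase letters sort before lowercase ones.
        result.sort()
    return result


if __name__ == "__main__":
    found = ["forum@welt.de", "Bruno.Kahl@cducsu.de", "forum@welt.de"]
    print(postprocess(found, sort=1, remove_duplicates=1))
```

A plain string sort places addresses starting with an uppercase letter first, which is the same ordering seen in the addresses.txt listing of Example 1.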
Usage:
sex.py <source> <destination>
source: path to a file or directory (Example 1 uses a relative path)
destination: path to write the output file
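The source/destination handling described above can be sketched roughly like this. It is a Python 3 sketch under stated assumptions, not the actual implementation: the names `EMAIL_PATTERN`, `iter_files`, `harvest` and `extract_to_file` are hypothetical, the regular expression is a simplified stand-in for the real search pattern, and files are decoded as latin-1 so that binary files do not raise decoding errors.

```python
import os
import re

# Simplified stand-in pattern (assumption): local part, "@", dotted domain.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def iter_files(source):
    """Yield source itself if it is a file, otherwise every file below it."""
    if os.path.isfile(source):
        yield source
    else:
        # os.walk descends into all subdirectories of source.
        for dirpath, _dirnames, filenames in os.walk(source):
            for name in filenames:
                yield os.path.join(dirpath, name)


def harvest(source):
    """Collect every address-like string from all files under source."""
    found = []
    for path in iter_files(source):
        # latin-1 maps every byte to a character, so binary files are readable.
        with open(path, "r", encoding="latin-1") as handle:
            found.extend(EMAIL_PATTERN.findall(handle.read()))
    return found


def extract_to_file(source, destination):
    """Write one address per line to destination and return the count."""
    addresses = harvest(source)
    with open(destination, "w") as out:
        for address in addresses:
            out.write(address + "\n")
    return len(addresses)

# Example call (both paths are placeholders):
# extract_to_file("www.example.org/", "addresses.txt")
```

Treating a single file and a directory through the same generator keeps the extraction loop identical for both cases.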
Example 1:
$ wget --mirror -p --restrict-file-names=windows --html-extension --convert-links -v http://www.wolfgang-schaeuble.de/
$ python sex.py www.wolfgang-schaeuble.de/ addresses.txt
>> File: www.wolfgang-schaeuble.de/Audioplayer/swfobject.js
...
>> File: www.wolfgang-schaeuble.de/fileadmin/user_upload/PDF/050625nordkurier.pdf
Margareta.Moertl@cducsu.de
...
>> Extraced email addresses: 10
$ cat addresses.txt
Bruno.Kahl@cducsu.de
Margareta.Moertl@cducsu.de
aki-108@gmx.de
forum@welt.de
heike.nieske@cducsu.de
poststelle@bmi.bund.de
sebastian.pieper@cducsu.de
wolfgang.schaeuble.ma02@bundestag.de
wolfgang.schaeuble@bundestag.de
wolfgang.schaeuble@wk.bundestag.de
Example 2:
$ python sex.py shitty_formatted_list.txt shiny_email_list.txt
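To illustrate the kind of clean-up Example 2 performs, here is a small Python 3 sketch that pulls addresses out of a messy string. The pattern `EMAIL_PATTERN` and the function `extract_emails` are assumptions made for illustration; the pattern actually used by sex.py is not shown on this page and may match differently.

```python
import re

# Hypothetical, simplified pattern; it does not cover every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return all address-like substrings of text, in the order found."""
    return EMAIL_PATTERN.findall(text)


if __name__ == "__main__":
    messy = "Contact: <forum@welt.de>; poststelle@bmi.bund.de, not-an-address@"
    print(extract_emails(messy))  # ['forum@welt.de', 'poststelle@bmi.bund.de']
```

Surrounding punctuation such as angle brackets, semicolons and commas is dropped because it is not part of the character classes in the pattern.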
Download:
sex.py