DARKWEB + PYTHON: DISCOVER, ANALYZE AND EXTRACT
INFORMATION FROM HIDDEN SERVICES
Presentation · May 2019
DOI: 10.13140/RG.2.2.18959.94885
José Manuel Ortega
University of Alicante
All content following this page was uploaded by José Manuel Ortega on 25 January 2020.
@jmortegac, May 2019
www.sti-innsbruck.at
About me
http://jmortega.github.io/
Agenda
• Introduction to the Tor project and hidden services
• Discovering hidden services
• Modules and packages we can use in Python for connecting to the Tor network
• Tools for searching hidden services and automating the crawling process on the Tor network
Surface vs Deep vs Dark Web
What is Tor?
• Tor is a free tool that allows people to use the
internet anonymously.
• Tor anonymizes the origin of your traffic
Onion Routing
Tor is based on Onion Routing, a technique for
anonymous communication over a computer network.
Establish TOR circuit
The user's software (the Tor client) incrementally builds a circuit of encrypted connections through relays on the network.
When we connect to the Tor network, we do so through a circuit formed by 3 relays. The client wraps each packet in one encryption layer per relay, and each time the packet passes through a relay, that relay removes one layer.
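The layering idea can be illustrated with a toy sketch. It assumes three hypothetical per-hop keys and uses an XOR keystream built from SHA-256 in place of Tor's real cryptography, so it only shows the order in which layers are applied by the client and removed by the relays.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream by hashing key + counter (NOT real cryptography)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Apply or remove one layer; XOR with the same keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def wrap(message: bytes, hop_keys: list) -> bytes:
    """Client side: add one layer per relay, starting with the exit relay's layer."""
    for key in reversed(hop_keys):
        message = xor_layer(message, key)
    return message

def peel(cell: bytes, hop_keys: list) -> bytes:
    """Relay side: each relay, in path order, removes exactly one layer."""
    for key in hop_keys:
        cell = xor_layer(cell, key)
    return cell

# Hypothetical keys for the guard, middle and exit relays (illustration only).
hop_keys = [b"guard-key", b"middle-key", b"exit-key"]
cell = wrap(b"GET / HTTP/1.1", hop_keys)
print(peel(cell, hop_keys))
```

Because XOR layers commute, this toy does not enforce the hop order the way a real cipher would; it is only meant to make the wrap-then-peel flow concrete.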
Hidden services
https://metrics.torproject.org/hidserv-dir-onions-seen.html
Tor NODE List
https://www.dan.me.uk/tornodes
http://torstatus.blutmagie.de
https://onionite.now.sh
ExoneraTor
https://metrics.torproject.org/exonerator.html
Relay search
https://metrics.torproject.org/rs.html#simple
Discover hidden services
HiddenWiki: http://wikitjerrta4qgz4.onion/
Dark Links: http://wiki5kauuihowqi5.onion
Tor Links: http://torlinkbgs6aabns.onion
Dark Web Links: http://jdpskjmgy6kk4urv.onion/links.html
HDWiki: http://hdwikicorldcisiy.onion
OnionDir: http://dirnxxdraygbifgc.onion
DeepLink: http://deeplinkdeatbml7.onion
Ahmia: http://msydqstlz2kzerdg.onion
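Before querying directory entries like these, it can help to check that each address is well formed. The sketch below is one possible check, not part of any Tor library: the helper name `onion_host` is hypothetical, and it relies only on the fact that v2 onion hostnames (such as the ones listed in this deck) are 16 base32 characters while v3 hostnames are 56.

```python
import re
from urllib.parse import urlparse

# v2 onion hostnames are 16 base32 characters, v3 hostnames are 56.
ONION_HOST = re.compile(r"^(?:[a-z2-7]{16}|[a-z2-7]{56})\.onion$")

def onion_host(url: str):
    """Return the .onion hostname of a URL, or None if it is not well formed."""
    host = (urlparse(url).hostname or "").lower()
    # Keep only the last two labels so subdomains such as www. are ignored.
    base = ".".join(host.split(".")[-2:])
    return base if ONION_HOST.match(base) else None

# Two entries copied from the list above, used as sample input.
directories = {
    "HiddenWiki": "http://wikitjerrta4qgz4.onion/",
    "Ahmia": "http://msydqstlz2kzerdg.onion",
}
for name, url in directories.items():
    print(name, onion_host(url))
```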
Tor onion services
https://en.wikipedia.org/wiki/List_of_Tor_onion_services
https://en.wikipedia.org/wiki/The_Hidden_Wiki
Tor2web
https://www.onion.to
TOR browser
https://www.torproject.org/download/
Installing TOR
sudo apt-get update
sudo apt-get install tor
sudo /etc/init.d/tor restart
torrc
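As a rough illustration of what a minimal torrc might contain, the sketch below renders a few standard options as text. The function name `render_torrc` is hypothetical; the option names (SocksPort, ControlPort, HashedControlPassword, CookieAuthentication) are real torrc settings, and the port values are the ones used elsewhere in this deck.

```python
def render_torrc(socks_port=9050, control_port=9051, hashed_password=None):
    """Build the text of a minimal torrc from a few common options."""
    lines = [
        f"SocksPort {socks_port}",      # where client applications send traffic
        f"ControlPort {control_port}",  # where controllers such as Stem connect
    ]
    if hashed_password:
        # Value produced by: tor --hash-password <password>
        lines.append(f"HashedControlPassword {hashed_password}")
    else:
        # Fall back to cookie-file authentication for local controllers
        lines.append("CookieAuthentication 1")
    return "\n".join(lines) + "\n"

print(render_torrc())
```

Whether to use a hashed password or cookie authentication depends on how the controller will connect; the choice here is only an assumption for the example.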
Running TOR
$ tor --SocksPort 9050 --ControlPort 9051
Tor service
service tor start
service tor restart
service tor status
Connecting with TOR
Stem
https://stem.torproject.org/
TorRequest
https://github.com/erdiaker/torrequest
Requests + socks5
Stem
pip install stem
TOR descriptors
Server descriptor: complete information about a relay.
ExtraInfo descriptor: extra information about the relay.
Microdescriptor: contains only the information Tor clients need to communicate with the relay.
Consensus (network status): file issued by the directory authorities of the network, made up of multiple router status entries, one per relay.
Router status entry: information about a single relay in the network; each entry is included in the consensus file generated by the directory authorities.
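To make the router status entry more concrete, here is a minimal sketch that summarizes entries by nickname, address, port and flags, which are attributes Stem exposes on the entries returned by `get_network_statuses()`. The helper names and the sample entry are assumptions for illustration; `main()` needs Stem and a running Tor control port, so it is left uncalled.

```python
from types import SimpleNamespace

def summarize_entry(entry) -> str:
    """One-line summary of a router status entry (nickname, address, flags)."""
    flags = ",".join(sorted(entry.flags))
    return f"{entry.nickname} {entry.address}:{entry.or_port} [{flags}]"

def exit_relays(entries):
    """Keep only the entries that the consensus marks with the Exit flag."""
    return [e for e in entries if "Exit" in e.flags]

def main():
    # Imported here so the helpers above can be tried without Stem installed.
    from stem.control import Controller
    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        for entry in exit_relays(controller.get_network_statuses()):
            print(summarize_entry(entry))

# Stand-in entry with made-up values, only to show the output shape.
sample = SimpleNamespace(nickname="relayA", address="192.0.2.10",
                         or_port=9001, flags=["Running", "Exit"])
print(summarize_entry(sample))
# main()  # uncomment with Tor running and the control port enabled
```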
TOR spec
Stem
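from stem import Signal
from stem.control import Controller

# Connect to the control port configured in torrc (ControlPort 9051)
with Controller.from_port(port=9051) as controller:
    # Password set for the Tor control port in torrc
    controller.authenticate(password='your password set for tor controller port in torrc')
    print("Success!")
    # Ask Tor to use new circuits for future connections
    controller.signal(Signal.NEWNYM)
    print("New Tor connection processed")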
Periodic Tor IP Rotation
import time
from stem import Signal
from stem.control import Controller

def main():
    while True:
        # Tor rate-limits NEWNYM, so very short intervals bring no benefit
        time.sleep(20)
        print("Rotating IP")
        with Controller.from_port(port=9051) as controller:
            controller.authenticate()
            controller.signal(Signal.NEWNYM)  # request a new identity

if __name__ == '__main__':
    main()
Stem.Circuit status
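from stem.control import Controller

# The with block closes the control connection when done
with Controller.from_port(port=9051) as controller:
    controller.authenticate()
    # Raw GETINFO circuit-status reply: one line per circuit
    print(controller.get_info('circuit-status'))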
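The raw circuit-status text can be hard to read. A hedged alternative is Stem's `get_circuits()`, which returns circuit objects with `id`, `status`, `purpose` and `path` attributes. The helper names and the sample circuit below are hypothetical; `main()` needs Stem and a running Tor, so it is left uncalled.

```python
from types import SimpleNamespace

def describe_circuit(circ) -> str:
    """Render a circuit as 'id [purpose]: hop -> hop -> hop'."""
    # Each path element is a (fingerprint, nickname) pair; nickname may be None.
    hops = " -> ".join(nickname or fingerprint[:8]
                       for fingerprint, nickname in circ.path)
    return f"{circ.id} [{circ.purpose}]: {hops}"

def built_circuits(circuits):
    """Keep only circuits that have finished building."""
    return [c for c in circuits if c.status == "BUILT"]

def main():
    # Imported here so the helpers above can be tried without Stem installed.
    from stem.control import Controller
    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        for circ in built_circuits(controller.get_circuits()):
            print(describe_circuit(circ))

# Stand-in circuit with made-up fingerprints, only to show the output shape.
sample = SimpleNamespace(
    id="7", status="BUILT", purpose="GENERAL",
    path=[("A" * 40, "guard1"), ("B" * 40, None), ("C" * 40, "exit1")],
)
print(describe_circuit(sample))
# main()  # uncomment with Tor running and the control port enabled
```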
Stem.Network status
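from stem.control import Controller

with Controller.from_port(port=9051) as controller:
    # Password set for the Tor control port in torrc
    controller.authenticate(password='your password set for tor controller port in torrc')
    # Iterate over the router status entries of the current consensus
    for router_entry in controller.get_network_statuses():
        print(router_entry)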
Stem.circuits
Server descriptors
Introduction points
Tor nyx
https://nyx.torproject.org/
TorRequest
from torrequest import TorRequest

with TorRequest() as tr:
    response = tr.get('http://ipecho.net/plain')
    print(response.text)  # not your IP address
    tr.reset_identity()
    response = tr.get('http://ipecho.net/plain')
    print(response.text)  # another IP address
Requests
import requests  # SOCKS support requires: pip install requests[socks]

def get_tor_session():
    session = requests.Session()
    # Tor uses port 9050 as the default SOCKS port.
    # socks5h lets Tor resolve hostnames, which .onion addresses require.
    session.proxies = {'http': 'socks5h://127.0.0.1:9050',
                       'https': 'socks5h://127.0.0.1:9050'}
    return session

# The following prints your normal public IP
print(requests.get("http://httpbin.org/ip").text)

# Make a request through the Tor connection
# Should print an IP different from your public IP
session = get_tor_session()
print(session.get("http://httpbin.org/ip").text)

r = session.get('https://www.facebookcorewwwi.onion/')
print(r.headers)
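A Tor-backed session like the one above can also be used to check which addresses respond. The sketch below is one possible approach, with hypothetical helper names: it accepts any object with a `get(url, timeout=...)` method, and the `FakeSession` class is only a stand-in so the example runs offline. Treating any status below 500 as "active" is an assumption, not a rule.

```python
def is_active(session, url: str, timeout: float = 30.0) -> bool:
    """Return True if the service answers with an HTTP status below 500."""
    try:
        response = session.get(url, timeout=timeout)
    except Exception:
        # Connection errors and timeouts count as "not active". With requests,
        # catching requests.RequestException would be more precise.
        return False
    return response.status_code < 500

def filter_active(session, urls):
    """Keep only the URLs that currently respond."""
    return [u for u in urls if is_active(session, u)]

class FakeSession:
    """Stand-in for a Tor-backed requests session, for offline demonstration."""
    def get(self, url, timeout=None):
        if "down" in url:
            raise ConnectionError("unreachable")
        return type("Response", (), {"status_code": 200})()

# Made-up URLs; with Tor running, pass the real session from get_tor_session().
print(filter_active(FakeSession(), ["http://up.onion", "http://down.onion"]))
```

Hidden services are often slow to answer, so a generous timeout is a deliberate choice here.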
Analyze hidden services
1) Query the data sources.
2) Filter the addresses that are active.
3) Test each active address and analyze the response.
4) Store the URLs found on the websites.
5) Perform a crawling process against each service.
6) Apply patterns and regular expressions to detect specific content (for example, mail addresses).
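Steps 4 and 6 can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the names `LinkCollector` and `analyze` are hypothetical, the e-mail pattern is a loose approximation, and the sample page is made up. A real crawler would fetch the HTML through the Tor session first.

```python
import re
from html.parser import HTMLParser

# Loose e-mail pattern and v2/v3 onion hostname pattern (approximations).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ONION = re.compile(r"\b(?:[a-z2-7]{16}|[a-z2-7]{56})\.onion\b")

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (step 4: store URLs)."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def analyze(html: str) -> dict:
    """Extract links, e-mail addresses and onion hostnames from one page."""
    collector = LinkCollector()
    collector.feed(html)
    return {
        "links": collector.links,
        "emails": sorted(set(EMAIL.findall(html))),
        "onions": sorted(set(ONION.findall(html))),
    }

# Made-up sample page for illustration.
page = ('<a href="http://msydqstlz2kzerdg.onion/search">Ahmia</a> '
        'contact: admin@example.org')
print(analyze(page))
```

The links collected in step 4 would then feed the crawl queue for step 5.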
Ahmia search engine
https://ahmia.fi/
Torch search engine
http://xmh57jrzrnw6insl.onion
UnderDir Search engine
Search Hidden services
Other tools
POOPAK - TOR Hidden Service Crawler
https://github.com/teal33t/poopak
Tor spider
https://github.com/absingh31/Tor_Spider
Tor router
https://gitlab.com/edu4rdshl/tor-router
OnionScan
https://github.com/s-rah/onionscan
Dark Web map
https://www.hyperiongray.com/dark-web-map/
GitHub repositories
https://github.com/serfer2/python-deepweb
https://github.com/jmortega/python_dark_web