Kleissner Investments s.r.o.

Na Strzi 1702/65
140 00 Praha
Czech Republic

December 16, 2020


Version 5

LEAKS API
This document specifies the Leaks API which is part of the Identity Portal. It is a separate API from the
regular Search API, which is part of intelx.io. The high-level use cases are:

1. Search for a term and return each line where it appears.


2. Export leaked accounts as CSV.

Contents
LEAKS API
Document History
Authentication
Search data and return lines
Export Leaked Accounts
Terminate Search
Synchronous Export Leaked Accounts
Appendix 1: Buckets

Document History
Version  Date        Note
0.1      15.10.2020  Initial version.
0.2      26.10.2020  Data leak search: Return line in text encoding in new field “linea”
                     (before it was only in byte encoding). Add field “positionsize”.
0.3      26.10.2020  New terminate function and parameter to terminate previous searches.
0.4      27.10.2020  Updated list of buckets.
                     New function to synchronously export leaked accounts.
4        28.10.2020  New optional datefrom/dateto parameters.
5        16.12.2020  Authentication simplified.

Authentication
The API URL is https://3.intelx.io/.

The API key, specific to the end-user, must be specified as HTTP header “x-key”, or as &k=[key] in the
URL. Your key can be found at https://intelx.io/account?tab=developer. Note that your key must have the
“Identity Portal” license assigned to use this API.

The API will return HTTP 401 Unauthorized in case the API key is invalid or not authorized.
Search data and return lines
This is the API used by the “Search Data Leaks” tab of the Identity Portal. It searches for data and returns
the lines where it appears.

Request   GET /live/search/internal

          selector=[term]        Selector to search for.
          limit=[limit]          Max amount of records per bucket to return. Default 10.
          bucket=[bucket]        Optional bucket filter. Default empty.
          skipinvalid=true       Specify to skip invalid entries (recommended). Default false.
          analyze=false          Default false.
          terminate=[search ID]  Optional: ID of previous search to terminate to save system resources.
          datefrom=[date]        Optional: Date from/to (both required).
          dateto=[date]          Format is “YYYY-MM-DD HH:MM:SS”.

Response  200  Success with JSON LiveSearchResponse
          400  Invalid input

• Selector: Must be an email address, domain, social security number (US based), or credit card
number.
• Limit: In some cases, the API might return more results than specified in limit. If an upper hard limit
is required, it must be enforced on the client side.
• Bucket: Optional filter for searching only in the target bucket. See Appendix 1 for list.
• Terminate: When a user starts a new search and the previous one is to be discarded, the ID of the
previous search shall be specified in the “terminate” parameter to save system resources. Searches
may consume gigabytes of data, therefore any searches that are no longer required shall be
terminated. Searches can also be terminated manually via the /live/search/terminate function.
• Dates: From/to dates may be used as a filter. Note that an item’s date is set to when the original data
was published, if available, or otherwise to when it was indexed. This means that newly indexed
items are often backdated.

Response is JSON data returning the search job ID that can be used to retrieve the results:

type LiveSearchResponse struct {
    Status int       `json:"status"` // Status: 0 = Success, 2 = Selector invalid
    ID     uuid.UUID `json:"id"`     // ID of the search job. This is used to get the results.
}

Example request:

GET https://3.intelx.io/live/search/internal?selector=itsv.at&limit=100&bucket=&skipinvalid=true&analyze=false&k=[KEY]

Example response:

{
"status": 0,
"id": "84bc2d34-ff97-4acc-acc2-7c8737fef917"
}
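
Decoding this response in Go might look as follows. This is a sketch under two assumptions: the ID is held as a plain string instead of uuid.UUID to avoid a third-party import, and any status other than the two documented values is treated as an error. The function name parseSearchResponse is hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// LiveSearchResponse mirrors the structure above, with the ID kept as a string.
type LiveSearchResponse struct {
	Status int    `json:"status"`
	ID     string `json:"id"`
}

// parseSearchResponse returns the search job ID, or an error if the
// selector was rejected or the body could not be decoded.
func parseSearchResponse(body []byte) (string, error) {
	var r LiveSearchResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("decode response: %w", err)
	}
	switch r.Status {
	case 0: // Success
		return r.ID, nil
	case 2: // Selector invalid
		return "", errors.New("selector invalid")
	default: // not documented; treated as an error in this sketch
		return "", fmt.Errorf("unexpected status %d", r.Status)
	}
}

func main() {
	body := []byte(`{"status": 0, "id": "84bc2d34-ff97-4acc-acc2-7c8737fef917"}`)
	id, err := parseSearchResponse(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("search job ID:", id)
}
```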

Using the search job ID, the results can be fetched as JSON records for machine processing, or as HTML
encoded text for visualization to the end-user.
The following function retrieves the actual search results.

Request   GET /live/search/result

          id=[search job ID]  Search job ID returned by previous function.
          format=[0,1,2]      Return only specified fields:
                              0 = Text only, default
                              1 = Records only
                              2 = Return both fields

Response  200  Success with JSON LiveResult
          400  Invalid input

Depending on the format parameter the response uses the text field, the records field, or both. Text is
HTML encoded and intended for direct visualization to the end-user. Use format=1 for machine processing.

The response uses the following JSON structure. The status indicates whether results are available and
whether the client should keep polling for results. Status 0 means the response contains results; 1
means the response currently contains no results, but the client should continue polling. 2
(Terminated) and 3 (Search ID Not Found) tell the client to stop querying for results. Note that a
response with status 2 may still carry the last results.

type LiveResult struct {
    Status  int           `json:"status"`  // Status: 0 = Result available, 1 = No result but keep trying
                                           // 2 = Terminated, 3 = Search ID Not Found
    Text    string        `json:"text"`    // Result as text
    Records []interface{} `json:"records"` // Result as records, depending on the original function
}

The “records” field is an array of elements with the following structure:

type LineRecord struct {
    LineA        string         `json:"linea"`        // Line as text
    LineRaw      []byte         `json:"lineraw"`      // Raw text UTF-8. As []byte so the UTF-8 encoding is not altered in transit and the position in the line remains accurate.
    PositionLine int            `json:"positionline"` // Byte position in the raw line
    PositionSize int            `json:"positionsize"` // Byte size in the raw line
    Item         ixservice.Item `json:"item"`         // Full item details where the result was found
}

The original line is passed in two fields:


• “linea”: Encoded as text.
• “lineraw”: The raw line passed as []byte, to preserve the original UTF-8 encoding in transit. The
“positionline” is the byte offset in the raw line where the input term appears, and the “positionsize” is
the number of bytes at the offset that represents the input term. Note that the found term does not
necessarily equate with the input term; if a user searches for “www.example.com” it will also find
occurrences of “example.com”. The “lineraw” field is passed internally by JSON as base64 data.

For the documentation of all values in an item structure, please see the documentation of
/intelligent/search/result, page 10 ff:

https://github.com/IntelligenceX/SDK/blob/master/Intelligence%20X%20API.pdf

Example request to retrieve the result as text:

GET https://3.intelx.io/live/search/result?id=84bc2d34-ff97-4acc-acc2-7c8737fef917&format=0&k=[KEY]
Example response:

{
"status": 0,
"text": "\u003ca href=\"https://intelx.io?did=a7143fcd-5b26-429b-9841-0e26a02d4cad\" target=\"_blank\"\u003ea7143fcd-5b26-429b-9841-0e26a02d4cad\u003c/a\u003e Text File dumpster vip163@\u003cspan style=\"background-color: yellow;\"\u003eexample.com\u003c/span\u003e\n"
}

Example request to retrieve the result as records:

GET https://3.intelx.io/live/search/result?id=84bc2d34-ff97-4acc-acc2-7c8737fef917&format=1&k=[KEY]

Example response:

{
"status": 2,
"text": "",
"records": [
{
"linea": "155159789,,gabriel.nitu@itsv.at,FQYuFpg7Usg=,",
"lineraw": "MTU1MTU5Nzg5LCxnYWJyaWVsLm5pdHVAaXRzdi5hdCxGUVl1RnBnN1VzZz0s",
"positionline": 11,
"positionsize": 20,
"item": {
"systemid": "7f016419-3fac-4bfc-b30d-9f93901e5f76",
"owner": "b37bf0f6-ab90-4737-804a-ccf0b1b6d6de",
"storageid": "c6db12911c263f756608eca50fdede1921fa1c180b0ddbdcd27307a0072729fc283913de7211de8a4ae1fc0d550c8b9c211e2887e56daf2c0f60894baafc261c",
"instore": true,
"size": 4194275,
"accesslevel": 0,
"type": 1,
"media": 24,
"added": "2019-12-04T18:09:19.27645Z",
"date": "2019-12-04T18:05:34.265767Z",
"name": "Adobe October 2013.txt [Part 1337 of 1973]",
"description": "",
"xscore": 67,
"simhash": 9832427935425476012,
"bucket": "leaks.public.general",
"keyvalues": null,
"tags": [
{
"class": 4,
"value": "email"
},
{
"class": 0,
"value": "fr"
}
],
"relations": null
}
},
{
"linea": "gabriel.nitu@itsv.at:8098daaeb98a2e3238b0d956cf4ffc70348a9a35",
"lineraw": "Z2FicmllbC5uaXR1QGl0c3YuYXQ6ODA5OGRhYWViOThhMmUzMjM4YjBkOTU2Y2Y0ZmZjNzAzNDhhOWEzNQ==",
"positionline": 0,
"positionsize": 20,
"item": {
"systemid": "5a598aca-4952-46e0-8aa5-5e5dee98fd84",
"owner": "ad837f4f-8c83-409d-a389-337c0b62303c",
"storageid": "73dfa0d808da612bb702ce184926b831e56f52ee556b0d3c1a12d67f4e35e16ca1f7ea361a8a6da9014bb9b9b38339813213b38284812acc73212cdce9d63713",
"instore": true,
"size": 4194303,
"accesslevel": 0,
"type": 1,
"media": 24,
"added": "2020-10-02T19:28:04.041776Z",
"date": "2020-10-02T19:27:30.386528Z",
"name": "Dropbox.com.rar/sha1.txt [Part 197 of 245]",
"description": "",
"xscore": 63,
"simhash": 12638153115695167421,
"bucket": "leaks.private.general",
"keyvalues": null,
"tags": [
{
"class": 0,
"value": "fr"
},
{
"class": 4,
"value": "email"
}
],
"relations": null
}
}
]
}
Export Leaked Accounts
This is the API used by the “Export Leaked Accounts” tab of the Identity Portal. It only supports domains
and email addresses as input.

Request   GET /accounts/csv

          selector=[term]        Selector to search for.
          limit=[limit]          Max amount of records per bucket to return. Default 20000.
          bucket=[bucket]        Optional bucket filter. Default empty.
          terminate=[search ID]  Optional: ID of previous search to terminate to save system resources.
          datefrom=[date]        Optional: Date from/to (both required).
          dateto=[date]          Format is “YYYY-MM-DD HH:MM:SS”.

Response  200  Success with JSON LiveSearchResponse
          400  Invalid input

The response is the same as for /live/search/internal, returning a status and the search job ID. When a
user starts a new export and the previous one is to be discarded, the ID of the previous search shall be
specified in the “terminate” parameter to save system resources. Searches may consume gigabytes of
data, therefore any searches that are no longer required shall be terminated. Searches can also be
terminated manually via the /live/search/terminate function.

Example request:

GET https://3.intelx.io/accounts/csv?selector=itsv.at&k=[KEY]

Example response:

{
"status": 0,
"id": "63272ebf-d906-4faf-855a-0772c5459f9a"
}

To fetch the results, use the same /live/search/result endpoint as described before. Note that the “records”
field is an array of records with a different structure:

type CSVRecord struct {
    User         string    `json:"user"`         // Username
    Password     string    `json:"password"`     // Password
    PasswordType string    `json:"passwordtype"` // Password Type
    Bucket       string    `json:"bucket"`       // Bucket
    Date         time.Time `json:"date"`         // Date
    SourceShort  string    `json:"sourceshort"`  // Source Short
    SourceLong   string    `json:"sourcelong"`   // Source Long
    SystemID     uuid.UUID `json:"systemid"`     // System ID of the item that contains this result
}

Example request to retrieve the result as records:

GET https://3.intelx.io/live/search/result?id=63272ebf-d906-4faf-855a-0772c5459f9a&format=1&k=[KEY]

Example response:

{
"status": 0,
"text": "",
"records": [
{
"user": "demorales574@gmail.com",
"password": "badin1990",
"passwordtype": "Plaintext",
"bucket": "leaks.public.general",
"date": "2019-01-17T22:08:55+01:00",
"sourceshort": "Collection 1",
"sourcelong": "Collection 1/Collection #1_Games combos_Sharpening.tar.gz/Collection #1_Games combos_Sharpening/725.txt",
"systemid": "fe1e7f1b-ff8d-4ab0-91c2-7de07fa093f8"
},
{
"user": "demorales574@gmail.com",
"password": "badin1990",
"passwordtype": "Plaintext",
"bucket": "leaks.public.general",
"date": "2019-01-17T20:09:45+01:00",
"sourceshort": "Collection 1",
"sourcelong": "Collection 1/Collection #1_EU combos.tar.gz/Collection #1_EU combos/1080.txt",
"systemid": "59c153b9-51bd-40c3-a77d-632272b4a6bb"
}
]
}
Terminate Search
To terminate an active search or export, use this function. Terminating a search that is no longer needed
saves system resources. Since searches may read and process Gigabytes of data, it is highly appreciated
if users terminate searches that are no longer needed.

Request   GET /live/search/terminate

          id=[search job ID]  Search job ID returned by previous function.

Response  204  Success
          400  Invalid input

Terminating a search that is already terminated has no effect.
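
A terminate call could be sketched in Go as shown here. The function names are hypothetical, and treating any status other than the documented 204 and 400 as an error is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
)

// terminateURL builds the terminate endpoint URL for a search job ID.
func terminateURL(searchID string) string {
	q := url.Values{"id": {searchID}}
	return "https://3.intelx.io/live/search/terminate?" + q.Encode()
}

// checkTerminate interprets the documented response codes.
func checkTerminate(statusCode int) error {
	switch statusCode {
	case http.StatusNoContent: // 204 Success
		return nil
	case http.StatusBadRequest: // 400 Invalid input
		return errors.New("invalid input (HTTP 400)")
	default:
		return fmt.Errorf("unexpected HTTP status %d", statusCode)
	}
}

// terminate sends the request with the API key in the "x-key" header.
func terminate(client *http.Client, apiKey, searchID string) error {
	req, err := http.NewRequest(http.MethodGet, terminateURL(searchID), nil)
	if err != nil {
		return err
	}
	req.Header.Set("x-key", apiKey)
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return checkTerminate(resp.StatusCode)
}

func main() {
	// Only prints the URL; call terminate with a real key to send the request.
	fmt.Println(terminateURL("84bc2d34-ff97-4acc-acc2-7c8737fef917"))
}
```

Since terminating an already terminated search has no effect, a client can call this unconditionally when the user abandons a search.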


Synchronous Export Leaked Accounts
Use this function to query leaked accounts and return them immediately.

Note: You should use the asynchronous function /accounts/csv as this one might miss results that are not
available within the given timeout. Searching for leaked accounts may take minutes, especially when
searching for domains that have thousands of results. Internally the API must fetch the entire data for each
individual result which often results internally in Gigabytes of traffic and potentially causes delays.

Request   GET /accounts/1

          selector=[term]    Selector to search for.
          limit=[limit]      Max amount of records per bucket to return. Default 20000.
          bucket=[bucket]    Optional bucket filter. Default empty.
          timeout=[timeout]  Timeout in seconds. If omitted or set to 0, the default is used.
          datefrom=[date]    Optional: Date from/to (both required).
          dateto=[date]      Format is “YYYY-MM-DD HH:MM:SS”.

Response  200  JSON array of CSVRecord
          400  Invalid input

The default timeout is 10 minutes. The client must make sure to allow for such high HTTP timeouts on the
client side. The timeout must not be higher than 1 hour, which is the HTTP server write timeout.

The result is a JSON array of these records:

type CSVRecord struct {
    User         string    `json:"user"`         // Username
    Password     string    `json:"password"`     // Password
    PasswordType string    `json:"passwordtype"` // Password Type
    Bucket       string    `json:"bucket"`       // Bucket
    Date         time.Time `json:"date"`         // Date
    SourceShort  string    `json:"sourceshort"`  // Source Short
    SourceLong   string    `json:"sourcelong"`   // Source Long
    SystemID     uuid.UUID `json:"systemid"`     // System ID of the item that contains this result
}

Example:

GET https://3.intelx.io/accounts/1?selector=itsv.at&timeout=10&k=[KEY]
Appendix 1: Buckets
This is the list of currently available buckets. Please note that access to certain buckets is restricted
according to your obtained license. Buckets may be added or removed at any time without prior notice.

Bucket                   Info
darknet.tor              Tor hidden services (.onion domains)
darknet.i2p              I2P eepsites (.i2p domains)
documents.public.scihub  Public documents from Sci-Hub
dumpster                 Any data potentially relevant but does not fit into any other category
dumpster.web.1           Dumpster: Interesting other websites with high-value information
dumpster.web.ssn         Dumpster: SSN related websites
leaks.private.general    Private Data Leaks
leaks.public.general     Public Data Leaks
leaks.public.wikileaks   WikiLeaks, Cryptome & Snowden data
pastes                   Pastes from various pastebin sites
web.public.de            Public Web: Germany
web.public.gov           Government US
web.public.kp            North Korean public websites
web.public.peer          Public Web: Decentralized blockchain based TLDs
web.public.ru            Public Web: Russia
web.public.com           Public Web: International TLDs
whois                    Whois data
