# Categorization API

# Working with Categorization API

##### Purpose of the Categorization API

The Categorization API gives developers and third-party systems a quick and easy way to get data from the SafeDNS database of categorized sites. It is designed for integration with other systems that need to verify site categories (filtering systems, advertising systems, etc.). The API uses the standard JSON format to process requests. It is not intended to be accessed directly by end users of the integrated system; requests must be sent from an intermediate server of the integrated system.

The API handles approximately 1,000 requests per second.

The SafeDNS database of categorized sites currently includes more than 106 million unique domains in more than 60 categories.

<p class="callout info">To increase processing speed, the integrated system may cache the data provided by the Categorization API for a period of no more than 12 hours.</p>
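
The caching rule above can be illustrated with a small in-memory cache. This is only a sketch: the class name `CategoryCache`, its methods, and the injectable `clock` parameter are hypothetical and not part of the SafeDNS API. The only value taken from this page is the 12-hour limit.

```python
import time

CACHE_TTL_SECONDS = 12 * 60 * 60  # the 12-hour upper bound stated above


class CategoryCache:
    """In-memory cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}   # domain -> (stored_at, value)

    def get(self, domain):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(domain)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[domain]  # expired: force a fresh API lookup
            return None
        return value

    def put(self, domain, value):
        """Store a value together with the time it was fetched."""
        self._entries[domain] = (self._clock(), value)
```

An intermediate server would call `get()` before querying the API and `put()` after a successful lookup, so repeated checks of the same domain do not count against the request rate.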


---

##### Accessing the API

To access the API, use the following host: **x.api.safedns.com**

---

##### Authorization

Requests to x.api.safedns.com use Basic Authorization. Every request must include an HTTP `Authorization` header that specifies **Basic** as the authorization method, followed by the string `<client_id>:<client_secret>` encoded with **base64**.

Authorization header example:

```JSON
Authorization: Basic Ndc2MDE4N2Q4MWJjNGI3Nzk5NDc2YjQycjUxMDM3MTM6ZjI1YmViZjk5MWZmNDE5ODkzZGIyNTU3MjhlNGUxZGU=
```

CURL request with authorization:

```shell
curl https://x.api.safedns.com/domain/www.website.com -H "Authorization: Basic AUTHORIZATION_HEADER"
```
```

PowerShell example with authorization:

```Powershell
Invoke-WebRequest -Uri https://x.api.safedns.com/domain/www.website.com -Headers @{ Authorization = "Basic "+ [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes("<client_id>:<client_secret>")) }
```
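
The same header can be built in Python. This is a minimal sketch using only the standard library; the helper name `basic_auth_header` and the sample credentials are placeholders, not part of the API.

```python
from base64 import b64encode


def basic_auth_header(client_id, client_secret):
    """Build the value of the Authorization header for Basic authorization."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + b64encode(raw).decode("ascii")


# Placeholder credentials; substitute the client_id and client_secret you were issued.
print(basic_auth_header("my_client_id", "my_client_secret"))
```

The returned string is what goes after `Authorization:` in each request.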

---

##### Getting a list of site categories

Request:

```HTML
GET https://x.api.safedns.com/domain/www.website.com
```

The response is returned in JSON format:

```JSON
{
  "category": [49, 59], 
  "bad": false, 
  "category_name": ["Computers & Internet", "Business"]
}
```
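
A response like this can be turned into ID and name pairs with a few lines of Python. The sketch below assumes that the `category` and `category_name` arrays are aligned by position, which matches the examples on this page but is not stated explicitly. The function name `category_pairs` is hypothetical.

```python
import json


def category_pairs(response_text):
    """Return a list of (category_id, category_name) tuples from a response body."""
    data = json.loads(response_text)
    # Assumption: the two arrays list the same categories in the same order.
    return list(zip(data["category"], data["category_name"]))


sample = '{"category": [49, 59], "bad": false, "category_name": ["Computers & Internet", "Business"]}'
print(category_pairs(sample))
```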

---

##### Getting a list of URL categories

Request:

```HTML
GET https://x.api.safedns.com/url/http://www.website.com/path/to?arg=val
```

The response is returned in JSON format:

```JSON
{
  "category": [49, 59], 
  "bad": false, 
  "category_name": ["Computers & Internet", "Business"]
}
```

---

##### Getting a list of categories

Request:

```HTML
GET https://x.api.safedns.com/catgroups
```

The response is returned in JSON format:

```JSON
[
  {
    "Illegal Activity": {"6": "Drugs",
                         "7": "Tasteless",
                         "8": "Academic Fraud",
                        "10": "Hate & Discrimination",
                        "11": "VPN, Proxies & Anonymizers",
                        "19": "Child Sexual Abuse (IWF)",
                        "31": "German Youth Protection",
                        "65": "Child Sexual Abuse (Arachnid)"}
  },
  {
    "Adult Related": {"13": "Adult Sites",
                     "14": "Alcohol & Tobacco",
                     "15": "Dating",
                     "16": "Pornography & Sexuality",
                     "17": "Astrology",
                     "18": "Gambling"}
  },
  {
    "Bandwidth Hogs": {"24": "Photo Sharing",
                       "20": "Torrents & P2P",
                       "21": "File Storage",
                       "22": "Movies & Video",
                       "23": "Music & Radio"}
  },
  {
    "Time Wasters": {"5": "Online Ads",
                    "26": "Chats & Messengers",
                    "27": "Forums",
                    "28": "Games",
                    "29": "Social Networks",
                    "30": "Entertainment"}
  },
  {
    "General Sites": {"32": "Automotive",
                      "33": "Blogs",
                      "34": "Corporate Sites",
                      "35": "E-commerce",
                      "36": "Education",
                      "37": "Finances",
                      "38": "Government",
                      "39": "Health & Fitness",
                      "40": "Humor",
                      "41": "Jobs & Career",
                      "42": "Weapons",
                      "43": "Politics, Society and Law",
                      "44": "News & Media",
                      "45": "Non-profit",
                      "46": "Portals",
                      "47": "Religious",
                      "48": "Search Engines",
                      "49": "Computers & Internet",
                      "50": "Sports",
                      "51": "Science & Technology",
                      "52": "Travel",
                      "53": "Home & Family",
                      "54": "Shopping",
                      "55": "Arts",
                      "56": "Webmail",
                      "57": "Real Estate",
                      "58": "Classifieds",
                      "59": "Business",
                      "60": "Kids",
                      "63": "Trackers & Analytics",
                      "67": "Online Libraries",
                      "72": "Generative AI",
                      "100": "Contentless Domains"}
  },
  {
    "Security": { "73": "DNS-tunneling",
                   "1": "Newly Registered Domains",
                   "3": "Malware",
                   "4": "Phishing & Typosquatting",
                   "9": "Parked Domains",
                  "12": "Botnets & C2C",
                  "66": "Cryptojacking",
                  "70": "DGA",
                  "71": "Ransomware"}
  }
]
```
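
Because the response is a list of one-key objects, it is often convenient to flatten it into a single lookup table keyed by category ID. The following sketch shows one way to do that; `flatten_catgroups` is a hypothetical helper, and the sample input is a shortened copy of the response above.

```python
def flatten_catgroups(catgroups):
    """Map each category ID (as int) to a (group_name, category_name) tuple."""
    lookup = {}
    for group in catgroups:  # each list item is a dict with one group name as key
        for group_name, categories in group.items():
            for cat_id, cat_name in categories.items():
                lookup[int(cat_id)] = (group_name, cat_name)  # IDs arrive as strings
    return lookup


sample = [
    {"Adult Related": {"15": "Dating", "18": "Gambling"}},
    {"Security": {"3": "Malware", "70": "DGA"}},
]
lookup = flatten_catgroups(sample)
print(lookup[3])
```

The integer keys match the numeric IDs returned in the `category` array of the domain and URL responses.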

---

##### Getting usage stats

The **total\_count** method returns summary stats for the requested period.

<p class="callout warning">USERNAME and TOKEN are generated separately for the stats service by SafeDNS Manager or Tech Support.</p>

Request:

```shell
curl --location --request POST 'https://sdk.safedns.com/stats_x/total_count' \
--header 'Authorization: Bearer TOKEN' \
--header 'Content-Type: application/json' \
--data-raw '{
    "user": "USERNAME",
    "start_date": "2023-01-01",
    "end_date": "2023-05-31"
}'
```

The response is returned in JSON format:

```JSON
{
  "total_requests": 10600,
  "billed_requests": 9600,
  "categorized_domains": 5600,
  "nx_domains": 4000,
  "unknown_domains": 910,
  "bad_requests": 90
}
```
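
The curl call can be reproduced in Python with the standard library. The sketch below only prepares the request and does not send it. The helper name `build_stats_request` is hypothetical, and USERNAME and TOKEN are the same placeholders used in the curl example.

```python
import json
import urllib.request

STATS_BASE_URL = "https://sdk.safedns.com/stats_x/"


def build_stats_request(method, user, token, start_date, end_date):
    """Prepare (but do not send) a POST request for a stats method."""
    payload = {"user": user, "start_date": start_date, "end_date": end_date}
    return urllib.request.Request(
        STATS_BASE_URL + method,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_stats_request("total_count", "USERNAME", "TOKEN", "2023-01-01", "2023-05-31")
# To send it, pass the object to urllib.request.urlopen() and parse the body with json.load().
print(request.full_url)
```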

The **detailed\_count** method returns detailed stats for each day of the requested period.  
USERNAME and TOKEN are generated separately for the stats service by SafeDNS Manager or Tech Support.

Request:

```shell
curl --location --request POST 'https://sdk.safedns.com/stats_x/detailed_count' \
--header 'Authorization: Bearer TOKEN' \
--header 'Content-Type: application/json' \
--data-raw '{
    "user": "USERNAME",
    "start_date": "2023-01-01",
    "end_date": "2023-05-31"
}'
```

The response is returned in JSON format:

```JSON
{
  "2023-01-01": {
    "total_requests": 10600,
    "billed_requests": 9600,
    "categorized_domains": 5600,
    "nx_domains": 4000,
    "unknown_domains": 910,
    "bad_requests": 90
  },
  "2023-01-02": {
    "total_requests": 22600,
    "billed_requests": 21600,
    "categorized_domains": 15000,
    "nx_domains": 6600,
    "unknown_domains": 130,
    "bad_requests": 870
  },
  "2023-01-03": {...}
}
```
```
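
The per-day counters can be added up locally, for example to cross-check a period total. This is a sketch only: `sum_daily_stats` is a hypothetical helper, and the sample input is a reduced version of the response above.

```python
from collections import Counter


def sum_daily_stats(detailed):
    """Add up the per-day counters of a detailed_count response."""
    totals = Counter()
    for day_stats in detailed.values():
        totals.update(day_stats)  # Counter.update adds values key by key
    return dict(totals)


sample = {
    "2023-01-01": {"total_requests": 10600, "billed_requests": 9600},
    "2023-01-02": {"total_requests": 22600, "billed_requests": 21600},
}
print(sum_daily_stats(sample))
```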

---

##### Responses

If a domain is categorized, the API returns status code 200 together with the category IDs and category names.

```shell
curl --user <client_id>:<client_secret> https://x.api.safedns.com/domain/safedns.com
```

```JSON
StatusCode        : 200
StatusDescription : OK
Content           : {"category": [49], "bad": false, "category_name": ["Computers & Internet"]}
```

If a domain does not exist, the API returns status code 206 with category 0 and the category name "Non-Existing Domain".

```shell
curl --user <client_id>:<client_secret> https://x.api.safedns.com/domain/does.not.exist
```

```JSON
StatusCode        : 206
StatusDescription : Partial Content
Content           : {"category": [0], "bad": false, "category_name": ["Non-Existing Domain"]}
```

If a domain is not categorized, the API returns status code 404 with no JSON body.

```shell
curl --user <client_id>:<client_secret> https://x.api.safedns.com/domain/com
```

```
curl : The remote server returned an error: (404) Not Found.
```
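
The three cases above can be handled in one place on the client side. The following sketch maps a status code and body to a simple result; the function name `interpret_response` and the state labels are hypothetical, and any status code not described in this section is treated as an error.

```python
import json


def interpret_response(status_code, body=""):
    """Translate an API status code and body into a (state, categories) tuple."""
    if status_code == 200:
        data = json.loads(body)
        return "categorized", list(zip(data["category"], data["category_name"]))
    if status_code == 206:
        return "non_existing", []   # category 0, "Non-Existing Domain"
    if status_code == 404:
        return "uncategorized", []  # no JSON body to parse
    return "error", []              # any other code, e.g. access denied or rate limit


body = '{"category": [49], "bad": false, "category_name": ["Computers & Internet"]}'
print(interpret_response(200, body))
```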

# Example of a simple Python project

Below is an example of a simple Python project that gets categories for a list of domains from the file **domains.txt**. To use the code, create a text file named domains.txt, add the domains to categorize one per line, and save it in the same folder as the code file.

To access the API, you must use the following host (line 6): **x.api.safedns.com**

```Python
#!/usr/bin/env python3

import requests
from base64 import b64encode

url_src = "https://x.api.safedns.com/domain/"

# Replace username:password with your credentials
credentials = b64encode(b"username:password").decode("ascii")

headers = {
    "Authorization": f"Basic {credentials}",
    "Content-Type": "application/json",
}

total_time = 0.0
with open("domains.txt", "r") as domain_src:
    for line in domain_src:
        domain = line.strip()  # drop the trailing newline and surrounding spaces
        if not domain:
            continue  # skip blank lines
        response = requests.get(url=url_src + domain, headers=headers, timeout=10)
        if response.status_code == 200:
            print(domain, response.json())
            print(f"Request processing time {response.elapsed.total_seconds()} sec")
            total_time += response.elapsed.total_seconds()
        elif response.status_code == 206:
            print(f"{domain} does not exist.")
        elif response.status_code == 404:
            print(f"According to our database, {domain} doesn't belong to any category.")
        elif response.status_code == 403:
            print("Wrong username or password, access denied")
            break
        elif response.status_code == 429:
            print("You have run out of queries to x.api, wait for 1 minute")
            break
        else:
            print(f"Unexpected status code {response.status_code} for {domain}")
    else:
        # Runs only when the whole file was processed without a break
        print(f"All requests were processed in {total_time} sec")
        print("ENDofFILE")
```