Categorization API
This document describes the Categorization API and interaction with it.
Working with Categorization API
Purpose of the Categorization API
The Categorization API gives developers and third-party systems a quick and easy way to get data from the SafeDNS database of categorized sites. It is intended for integration with systems that need to verify site categories (filtering systems, advertising systems, etc.). The API exchanges data in standard JSON format. It is not intended to be accessed directly by end users of the integrated system: requests must come from an intermediate server of that system.
The API handles approximately 1,000 requests per second.
The SafeDNS database of categorized sites currently includes more than 106 million unique domains in more than 60 categories.
To speed up processing, the integrated system may cache data returned by the Categorization API for no more than 12 hours.
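The 12-hour caching limit can be enforced on the integrating side with a small time-to-live cache. The following is only a minimal in-memory sketch: the class name CategoryCache, its methods, and the injectable clock are hypothetical and not part of the SafeDNS API.

```python
import time

CACHE_TTL_SECONDS = 12 * 60 * 60  # upper bound stated above: 12 hours


class CategoryCache:
    """Hypothetical in-memory cache whose entries expire after the TTL."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be checked without waiting
        self._entries = {}   # domain -> (stored_at, categories)

    def put(self, domain, categories):
        self._entries[domain] = (self._clock(), categories)

    def get(self, domain):
        """Return cached categories, or None if absent or expired."""
        entry = self._entries.get(domain)
        if entry is None:
            return None
        stored_at, categories = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[domain]  # expired: the caller should query the API again
            return None
        return categories
```

A caller would check get() first and query the API only when it returns None, then store the fresh result with put().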
Accessing the API
To access the API, use the following host: x.api.safedns.com
Authorization
Requests to x.api.safedns.com must use Basic authorization. Pass an HTTP Authorization header in each request. The header value is the string <client_id>:<client_secret> encoded with base64, with Basic specified as the authorization method.
Header example:
Authorization: Basic Ndc2MDE4N2Q4MWJjNGI3Nzk5NDc2YjQycjUxMDM3MTM6ZjI1YmViZjk5MWZmNDE5ODkzZGIyNTU3MjhlNGUxZGU=
CURL request with authorization:
curl --user <client_id>:<client_secret> https://x.api.safedns.com/domain/www.website.com
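For clients that do not use curl, the header can be built by hand as described above. This is a minimal sketch using only the Python standard library; the function name basic_auth_header and the credentials are placeholders for illustration.

```python
from base64 import b64decode, b64encode


def basic_auth_header(client_id, client_secret):
    """Build the Authorization header for Basic authorization."""
    pair = f"{client_id}:{client_secret}".encode("utf-8")
    token = b64encode(pair).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, for illustration only.
headers = basic_auth_header("my_client_id", "my_client_secret")
print(headers["Authorization"])

# Decoding the token gives back the original <client_id>:<client_secret> pair.
token = headers["Authorization"].split(" ", 1)[1]
print(b64decode(token).decode("utf-8"))
```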
Getting a list of site categories
Request:
GET https://x.api.safedns.com/domain/www.website.com
will return an answer in JSON format:
{
"category": [49, 59],
"bad": false,
"category_name": ["Computers & Internet", "Business"]
}
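The response can be turned into a number-to-name mapping on the client side. The sketch below assumes that the category and category_name arrays are listed in the same order, which matches the sample but is not stated explicitly; the function name categories_by_id is hypothetical.

```python
import json

# Sample body copied from the response shown above.
sample_body = (
    '{"category": [49, 59], "bad": false, '
    '"category_name": ["Computers & Internet", "Business"]}'
)


def categories_by_id(body):
    """Pair each category number with its name (assumes parallel arrays)."""
    data = json.loads(body)
    return dict(zip(data["category"], data["category_name"]))


print(categories_by_id(sample_body))
```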
Getting a list of URL categories
Request:
GET https://x.api.safedns.com/url/http://www.website.com/path/to?arg=val
will return an answer in JSON format:
{
"category": [49, 59],
"bad": false,
"category_name": ["Computers & Internet", "Business"]
}
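A client that handles both bare domains and full URLs needs to pick between the /domain/ and /url/ endpoints. The helper below is a hypothetical sketch: it treats any target with an http or https scheme as a URL and appends it unmodified, as in the request shown above. Whether the API expects the URL to be percent-encoded is not stated here, so no encoding is applied.

```python
API_BASE = "https://x.api.safedns.com"


def lookup_endpoint(target):
    """Return the request URL for a bare domain or a full URL."""
    # Assumption of this sketch: an explicit scheme marks a full URL.
    if target.startswith(("http://", "https://")):
        return f"{API_BASE}/url/{target}"
    return f"{API_BASE}/domain/{target}"


print(lookup_endpoint("www.website.com"))
print(lookup_endpoint("http://www.website.com/path/to?arg=val"))
```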
Getting a list of categories
Request:
GET https://x.api.safedns.com/catgroups
will return an answer in JSON format:
[
{
"Illegal Activity": {"65": "Child Sexual Abuse (Arachnid)",
"66": "Crypto Mining", "6": "Drugs",
"7": "Tasteless", "8": "Academic Fraud",
"9": "Parked Domains", "10": "Hate & Discrimination",
"11": "Proxies & Anonymizers",
"19": "Child Sexual Abuse (IWF)",
"31": "German Youth Protection"}
},
{
"Adult Related": {"13": "Adult Sites",
"14": "Alcohol & Tobacco",
"15": "Dating",
"16": "Pornography & Sexuality",
"17": "Astrology",
"18": "Gambling"}
},
{
"Bandwidth Hogs": {"24": "Photo Sharing",
"20": "Torrents & P2P",
"21": "File Storage",
"22": "Movies & Video",
"23": "Music & Radio"}
},
{
"Time Wasters": {"5": "Online Ads",
"26": "Chats & Messengers",
"27": "Forums", "28": "Games",
"29": "Social Networks",
"30": "Entertainment"}
},
{
"General Sites": {"32": "Automotive",
"33": "Blogs",
"34": "Corporate Sites",
"35": "E-commerce",
"36": "Education",
"37": "Finances",
"38": "Government",
"39": "Health & Fitness",
"40": "Humor",
"41": "Jobs & Career",
"42": "Weapons",
"43": "Politics, Society and Law",
"44": "News & Media",
"45": "Non-profit",
"46": "Portals",
"47": "Religious",
"48": "Search Engines",
"49": "Computers & Internet",
"50": "Sports",
"51": "Science & Technology",
"52": "Travel",
"53": "Home & Family",
"54": "Shopping",
"55": "Arts",
"56": "Webmail",
"57": "Real Estate",
"58": "Classifieds",
"59": "Business",
"60": "Kids",
"63": "Trackers & Analytics",
"67": "Online Libraries"}
},
{
"Security": {"12": "Botnets",
"3": "Virus Propagation",
"4": "Phishing"}
}
]
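Because the response is a list of single-key objects, it is often convenient to flatten it into one lookup table keyed by category number. The following sketch uses an abbreviated sample in the same shape; the function name flatten_catgroups is hypothetical.

```python
# Abbreviated sample in the shape shown above (only a few entries copied).
catgroups = [
    {"Adult Related": {"15": "Dating", "18": "Gambling"}},
    {"Security": {"12": "Botnets", "4": "Phishing"}},
]


def flatten_catgroups(groups):
    """Map category number -> (group name, category name)."""
    flat = {}
    for group in groups:
        for group_name, categories in group.items():
            for category_id, category_name in categories.items():
                # Keys arrive as strings; lookup responses use integers.
                flat[int(category_id)] = (group_name, category_name)
    return flat


lookup = flatten_catgroups(catgroups)
print(lookup[4])
```

Converting the keys to integers lets the table be indexed directly with the numbers returned in the category array of a lookup response.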
Getting usage stats
The total_count method returns summary stats for the requested period.
USERNAME and TOKEN are generated separately for the stats service by SafeDNS Manager or Tech Support.
Request:
curl --location --request POST 'https://sdk.safedns.com/stats_x/total_count' \
--header 'Authorization: TOKEN' \
--header 'Content-Type: application/json' \
--data-raw '{
"user": "USERNAME",
"start_date": "2023-01-01",
"end_date": "2023-05-31"
}'
will return an answer in JSON format:
{
"total_requests": 10600,
"billed_requests": 9600,
"categorized_domains": 5600,
"nx_domains": 4000,
"unknown_domains": 910,
"bad_requests": 90
}
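The same request can be prepared from Python. This is a sketch using only the standard library; the function name build_total_count_request is hypothetical, and TOKEN and USERNAME remain placeholders as in the curl example. The request is built but not sent.

```python
import json
import urllib.request

STATS_URL = "https://sdk.safedns.com/stats_x/total_count"


def build_total_count_request(token, user, start_date, end_date):
    """Prepare (but do not send) the POST request for total_count."""
    payload = {"user": user, "start_date": start_date, "end_date": end_date}
    return urllib.request.Request(
        STATS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


# TOKEN and USERNAME are placeholders, as in the curl example.
request = build_total_count_request("TOKEN", "USERNAME", "2023-01-01", "2023-05-31")

# To send it:
#   with urllib.request.urlopen(request) as resp:
#       stats = json.load(resp)
```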
The detailed_count method returns detailed stats for each day of the requested period.
USERNAME and TOKEN are generated separately for the stats service by SafeDNS Manager or Tech Support.
Request:
curl --location --request POST 'https://sdk.safedns.com/stats_x/detailed_count' \
--header 'Authorization: TOKEN' \
--header 'Content-Type: application/json' \
--data-raw '{
"user": "USERNAME",
"start_date": "2023-01-01",
"end_date": "2023-05-31"
}'
will return an answer in JSON format:
{
"2023-01-01": {
"total_requests": 10600,
"billed_requests": 9600,
"categorized_domains": 5600,
"nx_domains": 4000,
"unknown_domains": 910,
"bad_requests": 90
},
"2023-01-02": {
"total_requests": 22600,
"billed_requests": 21600,
"categorized_domains": 15000,
"nx_domains": 6600,
"unknown_domains": 130,
"bad_requests": 870
},
"2023-01-03": {...}
}
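Per-day counters can be added up on the client side, for example to cross-check them against total_count. The sketch below uses the two complete days from the sample above; the function name sum_daily_stats is hypothetical.

```python
# Two days of sample data copied from the response shown above.
detailed = {
    "2023-01-01": {"total_requests": 10600, "billed_requests": 9600,
                   "categorized_domains": 5600, "nx_domains": 4000,
                   "unknown_domains": 910, "bad_requests": 90},
    "2023-01-02": {"total_requests": 22600, "billed_requests": 21600,
                   "categorized_domains": 15000, "nx_domains": 6600,
                   "unknown_domains": 130, "bad_requests": 870},
}


def sum_daily_stats(per_day):
    """Add up each counter across all days."""
    totals = {}
    for counters in per_day.values():
        for name, value in counters.items():
            totals[name] = totals.get(name, 0) + value
    return totals


totals = sum_daily_stats(detailed)
print(totals["total_requests"])  # 33200 for the two sample days
```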
Responses
If a domain is categorized, the API returns code 200 with the category numbers and names.
curl --user <client_id>:<client_secret> https://x.api.safedns.com/domain/safedns.com
StatusCode : 200
StatusDescription : OK
Content : {"category": [49], "bad": false, "category_name": ["Computers & Internet"]}
If a domain does not exist, the API returns code 206 with category 0 (Non-Existing Domain).
curl --user <client_id>:<client_secret> https://x.api.safedns.com/domain/does.not.exist
StatusCode : 206
StatusDescription : Partial Content
Content : {"category": [0], "bad": false, "category_name": ["Non-Existing Domain"]}
If a domain is not categorized, the API returns code 404 without a JSON body.
curl --user <client_id>:<client_secret> https://x.api.safedns.com/domain/com
curl : The remote server returned an error: (404) Not Found.
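The three cases above can be handled in one place on the client side. The following is a minimal sketch; the function name interpret_response and the state labels are hypothetical, and any status code not listed in this section is reported as unexpected.

```python
import json


def interpret_response(status_code, body=""):
    """Translate a lookup response into (state, category names)."""
    if status_code == 200:
        return "categorized", json.loads(body)["category_name"]
    if status_code == 206:
        return "non-existing", json.loads(body)["category_name"]
    if status_code == 404:
        return "uncategorized", []  # 404 carries no JSON body
    return "unexpected", []  # code not covered by the cases above


print(interpret_response(
    200, '{"category": [49], "bad": false, "category_name": ["Computers & Internet"]}'
))
print(interpret_response(404))
```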
Example of a simple Python project
Below is an example of a simple Python project that gets categories for a list of domains from a file named domains.txt. To use the code, create a text file domains.txt, add the domains to categorize one per line, and save it in the same folder as the code file.
The API host x.api.safedns.com is set in the url_src variable.
#!/usr/bin/env python3
import requests
from base64 import b64encode

url_src = "https://x.api.safedns.com/domain/"
# Replace username:password with your credentials.
credentials = b64encode(b"username:password").decode("ascii")
headers = {
    'Authorization': 'Basic %s' % credentials,
    'Content-Type': 'application/json'
}

total_time = 0
# Read domains.txt line by line; the file is closed automatically.
with open("domains.txt", "r") as domain_src:
    for line in domain_src:
        domain = line.strip()  # drop the trailing newline before building the URL
        if not domain:
            continue  # skip blank lines
        response = requests.get(url=url_src + domain, headers=headers)
        if response.status_code == 200:
            print(domain, response.json())
            print(f'Request processing time {response.elapsed.total_seconds()} sec')
            total_time = total_time + response.elapsed.total_seconds()
        elif response.status_code == 404:
            print(f"According to our database, {domain} doesn't belong to any category.")
        elif response.status_code == 403:
            print("Wrong username or password, access denied")
            break
        elif response.status_code == 429:
            print("You have run out of queries to x.api, wait for 1 minute")
            break
    else:
        # Reached only when the whole file was read without a break.
        print(f'All requests were processed in {total_time} sec')
        print('ENDofFILE')