API Endpoints
GET
/api/sitemap/:domain
Crawls a website and returns all discovered page paths in JSON format.
GET
/api/sitemap/:domain/json
Returns the sitemap in JSON format (identical to the endpoint above).
GET
/api/sitemap/:domain/xml
Returns the sitemap in XML format (the standard sitemap.xml format used by search engines).
Parameters
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| domain | string | The domain to crawl (with or without http/https) |
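Since the domain is accepted with or without a scheme, a caller or server has to normalize it at some point. The following is a minimal sketch of one way to do that, not the service's actual code. The function name `normalize_domain` is hypothetical, and defaulting to https is an assumption based on the example response, where `example.com` maps to `https://example.com`.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> tuple[str, str]:
    """Return (domain, start_url) for input given with or without a scheme."""
    raw = raw.strip()
    if not raw:
        raise ValueError("Domain parameter is required")
    # Assumption: inputs without a scheme default to https.
    candidate = raw if "://" in raw else f"https://{raw}"
    parts = urlsplit(candidate)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"Not a valid http(s) domain: {raw!r}")
    domain = parts.netloc.lower()
    return domain, f"{parts.scheme}://{domain}"


print(normalize_domain("example.com"))
print(normalize_domain("http://Example.com/"))
```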
Query Parameters
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| depth | integer | 5 | 0-10 | Maximum crawling depth |
| limit | integer | 500 | 1-1000 | Maximum pages to crawl |
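As a rough illustration of the defaults and ranges in the table, the sketch below builds a request URL and rejects out-of-range values on the client side. The helper name `build_sitemap_url` and the base URL are hypothetical. How the server itself treats out-of-range values (clamping or rejecting) is not stated here, so the client-side check is an assumption.

```python
from urllib.parse import quote, urlencode

DEPTH_RANGE = (0, 10)    # documented range for depth
LIMIT_RANGE = (1, 1000)  # documented range for limit


def build_sitemap_url(base: str, domain: str, fmt: str = "json",
                      depth: int = 5, limit: int = 500) -> str:
    """Build a sitemap request URL using the documented defaults."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    # Assumption: validate locally instead of relying on server behavior.
    if not DEPTH_RANGE[0] <= depth <= DEPTH_RANGE[1]:
        raise ValueError(f"depth must be in {DEPTH_RANGE}")
    if not LIMIT_RANGE[0] <= limit <= LIMIT_RANGE[1]:
        raise ValueError(f"limit must be in {LIMIT_RANGE}")
    query = urlencode({"depth": depth, "limit": limit})
    path = f"/api/sitemap/{quote(domain, safe='')}/{fmt}"
    return f"{base.rstrip('/')}{path}?{query}"


print(build_sitemap_url("https://your-api.com", "example.com", "xml"))
```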
Features
Infinite Loop Protection
Tracks visited URLs to prevent crawling the same page twice
Timeout Protection
Automatically aborts after 30 seconds
Same-Domain Only
Only follows links within the target domain
HTML Pages Only
Skips images, PDFs, and other non-HTML content
Smart URL Handling
Handles relative, absolute, and protocol-relative URLs
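The features above can be sketched as a small breadth-first crawl. This is an illustrative sketch under stated assumptions, not the service's implementation. The `crawl` function, the `LinkParser` class, and the `fetch` callback (which returns HTML text, or `None` for non-HTML content) are all hypothetical names. In this sketch the deadline is only checked between fetches.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)


def normalize(url):
    """Drop the fragment and treat an empty path as '/'."""
    parts = urlsplit(url)
    return parts._replace(path=parts.path or "/", fragment="").geturl()


def crawl(start_url, fetch, max_depth=5, max_pages=500, timeout_s=30.0):
    """fetch(url) is a hypothetical callback: HTML string, or None if not HTML."""
    start = normalize(start_url)
    host = urlsplit(start).netloc
    deadline = time.monotonic() + timeout_s  # timeout protection
    visited = {start}                        # infinite loop protection
    queue = deque([(start, 0)])
    paths, pages_crawled = set(), 0
    while queue and pages_crawled < max_pages and time.monotonic() < deadline:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:                     # HTML pages only
            continue
        pages_crawled += 1
        paths.add(urlsplit(url).path)
        if depth >= max_depth:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # urljoin resolves relative, absolute, and protocol-relative URLs.
            absolute = normalize(urldefrag(urljoin(url, href))[0])
            parts = urlsplit(absolute)
            if parts.scheme not in ("http", "https") or parts.netloc != host:
                continue                     # same-domain only
            if absolute not in visited:
                visited.add(absolute)
                queue.append((absolute, depth + 1))
    return sorted(paths)
```

Passing the fetcher in as a callback keeps the sketch independent of any particular HTTP library and makes it easy to exercise against an in-memory dictionary of pages.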
Example Requests
JSON Format
GET /api/sitemap/example.com/json?depth=5&limit=500
XML Format
GET /api/sitemap/example.com/xml?depth=5&limit=500
Example Response (JSON)
{
  "domain": "example.com",
  "url": "https://example.com",
  "maxDepth": 5,
  "maxPages": 500,
  "pagesCrawled": 23,
  "totalPaths": 23,
  "paths": [
    "/",
    "/about",
    "/blog",
    "/blog/post-1",
    "/blog/post-2",
    "/contact",
    "/products",
    "/services"
  ]
}
Example Response (XML)
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
  <url>
    <loc>https://example.com/blog</loc>
  </url>
  <url>
    <loc>https://example.com/contact</loc>
  </url>
</urlset>
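To consume the XML response in code, the `<loc>` values can be pulled out with Python's standard `xml.etree.ElementTree`. The sketch below assumes the response uses the sitemaps.org namespace shown above; the function name `sitemap_locations` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <urlset> element in the example response.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locations(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    if isinstance(xml_text, str):
        xml_text = xml_text.encode("utf-8")  # parse bytes so the XML declaration is honored
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(sitemap_locations(SAMPLE))
```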
Response Fields
| Field | Type | Description |
|---|---|---|
| domain | string | The domain that was crawled |
| url | string | The full URL of the starting page |
| maxDepth | integer | Maximum depth used for crawling |
| maxPages | integer | Maximum pages limit |
| pagesCrawled | integer | Number of pages actually crawled |
| totalPaths | integer | Total unique paths found |
| paths | array | Sorted array of all discovered paths |
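For callers who prefer typed access over raw dictionaries, the fields above map naturally onto a small dataclass. This is only a sketch: the class name `SitemapResult` and its snake_case attribute names are hypothetical, while the JSON keys are the ones listed in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SitemapResult:
    domain: str
    url: str
    max_depth: int
    max_pages: int
    pages_crawled: int
    total_paths: int
    paths: list[str]

    @classmethod
    def from_dict(cls, data: dict) -> "SitemapResult":
        """Build a result from the decoded JSON response body."""
        return cls(
            domain=data["domain"],
            url=data["url"],
            max_depth=data["maxDepth"],
            max_pages=data["maxPages"],
            pages_crawled=data["pagesCrawled"],
            total_paths=data["totalPaths"],
            paths=list(data["paths"]),
        )


result = SitemapResult.from_dict({
    "domain": "example.com", "url": "https://example.com",
    "maxDepth": 5, "maxPages": 500, "pagesCrawled": 2,
    "totalPaths": 2, "paths": ["/", "/about"],
})
print(result.paths)
```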
Error Responses
400 Bad Request
{
  "error": "Domain parameter is required"
}
502 Bad Gateway
{
  "error": "Failed to fetch https://example.com: Not Found"
}
500 Internal Server Error
{
  "error": "Failed to generate sitemap",
  "message": "Error details..."
}
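One way a client might surface these error bodies is to turn them into a single exception type. The sketch below is hypothetical (`SitemapApiError` and `check_response` are illustrative names) and assumes every error body carries an `error` key and, optionally, a `message` key, as in the examples above.

```python
class SitemapApiError(Exception):
    """Raised when the API answers with a 4xx or 5xx status."""

    def __init__(self, status, error, detail=None):
        self.status = status
        self.error = error
        self.detail = detail
        text = f"{status}: {error}"
        if detail:
            text += f" ({detail})"
        super().__init__(text)


def check_response(status: int, payload: dict) -> dict:
    """Return the payload on success, otherwise raise SitemapApiError."""
    if status < 400:
        return payload
    # Assumption: error bodies follow the {"error": ..., "message": ...} shape.
    raise SitemapApiError(status, payload.get("error", "Unknown error"),
                          payload.get("message"))


try:
    check_response(400, {"error": "Domain parameter is required"})
except SitemapApiError as exc:
    print(exc)
```

With `requests`, this could be called as `check_response(response.status_code, response.json())`.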
Usage Examples
cURL
curl "https://your-api.com/api/sitemap/example.com?depth=5&limit=500"
JavaScript (fetch)
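const response = await fetch('/api/sitemap/example.com?depth=5&limit=500');
if (!response.ok) {
  throw new Error(`Sitemap request failed: ${response.status}`);
}
const data = await response.json();
console.log(data.paths); // array of discovered paths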
Python (requests)
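import requests

response = requests.get(
    'https://your-api.com/api/sitemap/example.com',
    params={'depth': 5, 'limit': 500},
    timeout=60,  # longer than the crawl's 30-second limit
)
response.raise_for_status()  # raise on 4xx/5xx error responses
data = response.json()
print(data['paths'])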