Sitemap API - Documentation

API Endpoints

GET /api/sitemap/:domain

Crawls a website and returns all discovered page paths in JSON format.

GET /api/sitemap/:domain/json

Returns the sitemap as JSON. Equivalent to GET /api/sitemap/:domain.

GET /api/sitemap/:domain/xml

Returns the sitemap as XML, in the standard sitemap.xml format used by search engines.

Parameters

Path Parameters

Parameter	Type	Description
`domain`	string	The domain to crawl, with or without the `http://` or `https://` scheme

Query Parameters

Parameter	Type	Default	Range	Description
`depth`	integer	5	0-10	Maximum crawling depth
`limit`	integer	500	1-1000	Maximum pages to crawl
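
As a rough sketch of how a client might apply the documented ranges before sending a request, the helper below builds a request URL and rejects out-of-range values. The function name `build_sitemap_url` and the base URL are assumptions for illustration. Whether the API itself rejects or clamps out-of-range values is not stated here, so the client-side check is only a precaution.

```python
from urllib.parse import quote, urlencode


def build_sitemap_url(base_url, domain, fmt="json", depth=5, limit=500):
    """Build a sitemap request URL (hypothetical helper, not part of the API)."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    if not 0 <= depth <= 10:
        raise ValueError("depth must be between 0 and 10")
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    query = urlencode({"depth": depth, "limit": limit})
    # Assumption: the domain is passed as a single percent-encoded path segment.
    return f"{base_url.rstrip('/')}/api/sitemap/{quote(domain, safe='')}/{fmt}?{query}"
```

Calling `build_sitemap_url("https://your-api.com", "example.com", fmt="xml")` produces the XML endpoint URL with the default `depth` and `limit`.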

Features

🛡️ Infinite Loop Protection

Tracks visited URLs to prevent crawling the same page twice
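
The idea can be illustrated with a minimal breadth-first crawl that keeps a set of visited paths. This is a sketch, not the service's actual implementation: the `LINKS` graph is a made-up stand-in for real HTTP fetches, and `crawl` is a hypothetical name.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for real page fetches.
LINKS = {
    "/": ["/about", "/blog"],
    "/about": ["/"],                      # links back to the start page (a cycle)
    "/blog": ["/blog/post-1", "/about"],
    "/blog/post-1": ["/blog"],            # another cycle
}


def crawl(start, max_depth=5, max_pages=500):
    """Breadth-first crawl that queues each path at most once."""
    visited = {start}
    queue = deque([(start, 0)])
    crawled = 0
    while queue and crawled < max_pages:
        path, depth = queue.popleft()
        crawled += 1
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in LINKS.get(path, []):
            if link not in visited:       # the visited set breaks cycles
                visited.add(link)
                queue.append((link, depth + 1))
    return sorted(visited)
```

Even though `/about` and `/blog/post-1` link back to pages already seen, each path enters the queue only once, so the loop always terminates.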

⏱️ Timeout Protection

Automatically aborts after 30 seconds
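
One common way to enforce a time budget like this is to compute a deadline up front and check it before each unit of work. The sketch below assumes that approach. The name `run_with_deadline` is hypothetical, and returning partial results with a `timed_out` flag is an assumption rather than documented API behavior.

```python
import time


def run_with_deadline(work_items, handle, timeout_seconds=30.0, clock=time.monotonic):
    """Process items until finished or until the time budget is spent.

    Returns a (results, timed_out) tuple. The clock is injectable so the
    behavior can be exercised without actually waiting.
    """
    deadline = clock() + timeout_seconds
    results = []
    for item in work_items:
        if clock() >= deadline:
            return results, True   # budget exhausted: stop early
        results.append(handle(item))
    return results, False
```

Using `time.monotonic` rather than wall-clock time keeps the deadline unaffected by system clock adjustments.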

🎯 Same-Domain Only

Only follows links within the target domain
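
A same-domain check can be sketched by comparing hostnames. The helper name `is_same_domain` is hypothetical, and the exact-match rule is an assumption: how the API treats subdomains such as `www.` is not documented here.

```python
from urllib.parse import urlsplit


def is_same_domain(url, target_domain):
    """Return True when the URL's hostname matches the target domain exactly."""
    host = urlsplit(url).hostname or ""
    # Assumption: exact, case-insensitive match; subdomains count as external.
    return host.lower() == target_domain.lower()
```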

📄 HTML Pages Only

Skips images, PDFs, and other non-HTML content
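
Two typical ways to filter out non-HTML content are checking the file extension before fetching and checking the `Content-Type` header after fetching. The sketch below shows both. The function names and the extension list are illustrative assumptions, and which approach the service uses is not stated.

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Illustrative, non-exhaustive list of extensions to skip.
SKIPPED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".svg", ".pdf", ".zip", ".css", ".js"}


def looks_like_html_page(url):
    """Cheap pre-fetch check based on the path's file extension."""
    _, ext = splitext(urlsplit(url).path)
    return ext.lower() not in SKIPPED_EXTENSIONS


def is_html_content_type(content_type):
    """Post-fetch check on a Content-Type header value."""
    return content_type.split(";")[0].strip().lower() == "text/html"
```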

🔗 Smart URL Handling

Handles relative, absolute, and protocol-relative URLs
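
The three link forms can be resolved against the current page URL with Python's standard `urljoin`. This sketch is only an illustration of the technique: `resolve_link` is a hypothetical helper, and stripping the `#fragment` is an assumption about how duplicates might be avoided.

```python
from urllib.parse import urldefrag, urljoin


def resolve_link(page_url, href):
    """Resolve an href found on page_url to an absolute URL without fragment."""
    absolute = urljoin(page_url, href)
    return urldefrag(absolute).url


page = "https://example.com/blog/"
print(resolve_link(page, "post-1"))                      # relative
print(resolve_link(page, "/contact"))                    # root-relative
print(resolve_link(page, "https://example.com/about"))   # absolute
print(resolve_link(page, "//example.com/products"))      # protocol-relative
```

A protocol-relative link inherits the scheme of the page it was found on, so `//example.com/products` resolves to an `https://` URL here.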

Example Requests

JSON Format

GET /api/sitemap/example.com/json?depth=5&limit=500

XML Format

GET /api/sitemap/example.com/xml?depth=5&limit=500

Example Response (JSON)

{
  "domain": "example.com",
  "url": "https://example.com",
  "maxDepth": 5,
  "maxPages": 500,
  "pagesCrawled": 8,
  "totalPaths": 8,
  "paths": [
    "/",
    "/about",
    "/blog",
    "/blog/post-1",
    "/blog/post-2",
    "/contact",
    "/products",
    "/services"
  ]
}

Example Response (XML)

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
  <url>
    <loc>https://example.com/blog</loc>
  </url>
  <url>
    <loc>https://example.com/contact</loc>
  </url>
</urlset>
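
A response in this shape can be read with Python's standard XML parser. The sketch below extracts every `<loc>` value. Note that the elements live in the sitemaps.org namespace, so lookups must be namespace-qualified. The name `extract_locations` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <urlset> root element.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locations(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
```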

Response Fields

Field	Type	Description
`domain`	string	The domain that was crawled
`url`	string	The full URL of the starting page
`maxDepth`	integer	Maximum crawl depth applied (the `depth` query parameter)
`maxPages`	integer	Maximum number of pages to crawl (the `limit` query parameter)
`pagesCrawled`	integer	Number of pages actually crawled
`totalPaths`	integer	Total number of unique paths found
`paths`	array	Sorted array of all discovered paths
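
For clients that prefer typed access over raw dictionaries, the fields above could be mapped onto a small dataclass. This is a sketch only: `SitemapResult` and its snake_case attribute names are assumptions, not part of the API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SitemapResult:
    """Typed view of the JSON response (hypothetical client-side model)."""
    domain: str
    url: str
    max_depth: int
    max_pages: int
    pages_crawled: int
    total_paths: int
    paths: List[str]

    @classmethod
    def from_json(cls, payload: dict) -> "SitemapResult":
        # Map the API's camelCase keys onto snake_case attributes.
        return cls(
            domain=payload["domain"],
            url=payload["url"],
            max_depth=payload["maxDepth"],
            max_pages=payload["maxPages"],
            pages_crawled=payload["pagesCrawled"],
            total_paths=payload["totalPaths"],
            paths=list(payload["paths"]),
        )
```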


Error Responses

400 Bad Request

{
  "error": "Domain parameter is required"
}

502 Bad Gateway

{
  "error": "Failed to fetch https://example.com: Not Found"
}

500 Internal Server Error

{
  "error": "Failed to generate sitemap",
  "message": "Error details..."
}
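
A client might turn these error bodies into exceptions along the following lines. This is a minimal sketch: `SitemapApiError` and `raise_for_api_error` are hypothetical names, and the code assumes only the `error` and optional `message` fields shown above.

```python
class SitemapApiError(Exception):
    """Raised when the API responds with a 4xx or 5xx status (hypothetical)."""

    def __init__(self, status_code, error, message=None):
        self.status_code = status_code
        self.error = error
        self.message = message
        detail = f"{error}: {message}" if message else error
        super().__init__(f"HTTP {status_code}: {detail}")


def raise_for_api_error(status_code, payload):
    """Raise SitemapApiError for error statuses; do nothing otherwise."""
    if status_code < 400:
        return
    raise SitemapApiError(
        status_code,
        payload.get("error", "Unknown error"),
        payload.get("message"),  # only present on some errors, such as 500
    )
```

With the `requests` library, this could be called as `raise_for_api_error(response.status_code, response.json())` before reading `paths`.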

Usage Examples

cURL

curl "https://your-api.com/api/sitemap/example.com?depth=5&limit=500"

JavaScript (fetch)

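const response = await fetch('/api/sitemap/example.com?depth=5&limit=500');
// Error responses (400, 500, 502) also return JSON, so check the status first
if (!response.ok) throw new Error(`Sitemap request failed: ${response.status}`);
const data = await response.json();
console.log(data.paths); // array of discovered paths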

Python (requests)

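import requests

response = requests.get(
    'https://your-api.com/api/sitemap/example.com',
    params={'depth': 5, 'limit': 500},
    timeout=60,  # client-side timeout; the crawl itself aborts after 30 seconds
)
response.raise_for_status()  # raise on 400, 500, and 502 responses
data = response.json()
print(data['paths'])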