Crawl API

Identify the technologies used on any public website in real time. Results are always up to date and delivered within minutes.

This API is asynchronous by default: results are sent to a callback URL after the domain has been indexed.

Endpoint

GET https://api.wappalyzer.com/crawl/v2/

Properties

Property         Description
Execution        Synchronous / Asynchronous (when crawling or indexing multiple domains)
Request timeout  30s
Rate limit       1 request / second (up to ten domains per request)
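
The limits above (ten domains per request, one request per second) suggest batching and pacing on the client side. The following is a minimal sketch of one way to do that; the helper names `batch_urls` and `throttled` are hypothetical and not part of the API.

```python
import time
from typing import Callable, Iterable, Iterator, List

MAX_URLS_PER_REQUEST = 10           # "up to ten domains per request"
MIN_SECONDS_BETWEEN_REQUESTS = 1.0  # "1 request / second"


def batch_urls(urls: Iterable[str], size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` URLs."""
    batch: List[str] = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def throttled(
    batches: Iterable[List[str]],
    interval: float = MIN_SECONDS_BETWEEN_REQUESTS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List[str]]:
    """Yield batches, waiting so that consecutive yields are at least `interval` seconds apart."""
    last = None
    for batch in batches:
        if last is not None:
            wait = interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield batch
```

A caller would send one request per yielded batch, for example `for batch in throttled(batch_urls(all_urls)): ...`.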

Parameters

Name             Description
urls (required)  Between one and ten comma-separated website URLs (e.g. https://example.com,https://example.org).
callback_url     A POST request will be made to the callback URL upon completion of the request (e.g. https://yourdomain.com). Required when recursive is true (default) or multiple urls are specified.
recursive        Follow links to analyse up to 25 pages: true (default) or false.
sets             Comma-separated list of additional attribute sets to include in the results (e.g. meta,social). See Attribute sets.
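
As an illustration of how these parameters combine, here is a minimal sketch that builds the request URL and enforces the rules stated in the table before anything is sent. The function name `build_crawl_url` is hypothetical, and leaving `:`, `/` and `,` unescaped is an assumption based on the curl examples in this document.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

ENDPOINT = "https://api.wappalyzer.com/crawl/v2/"


def build_crawl_url(
    urls: Sequence[str],
    callback_url: Optional[str] = None,
    recursive: bool = True,
    sets: Optional[Sequence[str]] = None,
) -> str:
    """Return the GET URL for a crawl request, validating the documented constraints."""
    if not 1 <= len(urls) <= 10:
        raise ValueError("urls must contain between one and ten website URLs")
    # callback_url is required when recursive is true (the default)
    # or when multiple urls are specified.
    if callback_url is None and (recursive or len(urls) > 1):
        raise ValueError("callback_url is required for recursive or multi-URL requests")

    params = {"urls": ",".join(urls)}
    if callback_url is not None:
        params["callback_url"] = callback_url
    if not recursive:
        params["recursive"] = "false"  # true is the default, so it is omitted
    if sets:
        params["sets"] = ",".join(sets)
    # Assumption: keep ':', '/' and ',' readable, as in the curl examples.
    return ENDPOINT + "?" + urlencode(params, safe=":/,")
```

Validating locally avoids spending a rate-limited request on a call that lacks a required callback URL.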

Callback

A callback URL is a public endpoint hosted on your own server. If you request a domain that we haven't seen before, you'll initially get an empty response while the website is being indexed. Minutes later, your callback URL will receive a POST request with the final results. Without a callback URL, you will not get these results unless you make another request.
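
To make the idea of a callback endpoint concrete, the following is a minimal sketch of a receiver built on Python's standard library. It assumes the POST body is JSON, which this page does not state explicitly, and it makes no assumption about the fields inside. The names `decode_callback`, `CallbackHandler` and `serve`, the port and the response status codes are all illustrative choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, List, Optional


def decode_callback(body: bytes) -> Optional[Any]:
    """Decode a callback body as JSON; return None if it is not valid UTF-8 JSON."""
    try:
        return json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None


class CallbackHandler(BaseHTTPRequestHandler):
    # Payloads received so far (shared across requests; fine for a sketch).
    received: List[Any] = []

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        payload = decode_callback(self.rfile.read(length))
        if payload is None:
            self.send_response(400)  # body was not JSON (assumed format)
        else:
            self.received.append(payload)
            self.send_response(204)  # accepted, nothing to return
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Listen for callback POST requests until interrupted."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Calling `serve()` starts the listener. In practice the endpoint must be reachable from the public internet, so it would normally sit behind your existing web server.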

Examples

Example request

By default, websites are crawled recursively by following links on the page and a callback URL is required.

curl -H "x-api-key: <your api key>" "https://api.wappalyzer.com/crawl/v2/?urls=https://example.com,https://example.org&callback_url=https://yourdomain.com"
Example response

The response below confirms that a crawl has been initiated for each URL. The callback URL will receive a POST request when results become available, and it is invoked separately for each requested URL.

[
  {
    "url": "https://example.com",
    "crawl": true // Crawl initiated
  },
  {
    "url": "https://example.org",
    "crawl": true // Crawl initiated
  }
]
Example request

In this example we pass a single URL, set recursive to false and omit callback_url. Only a single page is indexed and the results are returned directly in the response. This method is faster and requires less setup, but the results are less comprehensive.

curl -H "x-api-key: <your api key>" "https://api.wappalyzer.com/crawl/v2/?urls=https://example.com&recursive=false"
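
For reference, the same single-page request might look like this in Python using only the standard library. This is a sketch: the function names are hypothetical, and the 30-second client timeout is simply chosen to match the request timeout listed under Properties.

```python
import json
import urllib.request
from urllib.parse import quote

ENDPOINT = "https://api.wappalyzer.com/crawl/v2/"


def make_single_page_request(api_key: str, url: str) -> urllib.request.Request:
    """Build (but do not send) a non-recursive request for one URL."""
    query = "urls=" + quote(url, safe=":/") + "&recursive=false"
    return urllib.request.Request(
        ENDPOINT + "?" + query,
        headers={"x-api-key": api_key},
        method="GET",
    )


def fetch_single_page(api_key: str, url: str, timeout: float = 30.0):
    """Send the request and return the decoded JSON response body."""
    request = make_single_page_request(api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```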
Example response (success)
[
  {
    "url": "https://example.com",
    "technologies": [
      {
        "slug": "craft-cms",
        "name": "Craft CMS",
        "versions": [
          "3.0.0"
        ],
        "categories": [
          {
            "id": 1,
            "slug": "cms",
            "name": "CMS"
          }
        ]
      }
    ]
  }
]
Example response (error)
[
  {
    "url": "https://example.com",
    "errors": [
      "No response from server"
    ]
  }
]
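
The three response shapes shown above (crawl initiated, success, error) can be told apart by which keys each item carries. The sketch below sorts items accordingly; the function name `summarize` and the shape of its return value are illustrative, and it relies only on the fields that appear in the examples on this page.

```python
from typing import Any, Dict, List


def summarize(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Group response items into pending crawls, detected technologies and errors."""
    summary: Dict[str, Any] = {"pending": [], "technologies": {}, "errors": {}}
    for item in results:
        url = item.get("url")
        if item.get("errors"):
            # Error response: a list of human-readable messages.
            summary["errors"][url] = list(item["errors"])
        elif "technologies" in item:
            # Success response: keep just the technology names.
            summary["technologies"][url] = [t["name"] for t in item["technologies"]]
        elif item.get("crawl"):
            # Crawl initiated: results will arrive at the callback URL.
            summary["pending"].append(url)
    return summary
```

Checking `errors` first means an item that carries both keys is reported as a failure rather than silently treated as a success.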
