To add sites to Sherlock is relatively straight forward. But, there is some background that will help
you in your task.Sherlock relies on the site’s designers providing a unique URL for a registered username. To determine
if a username is available, Sherlock queries that URL, and uses to response to understand if there is
a claimed username already there.The data.json file contains all of the
information about which sites Sherlock will query usernames for, and how that query should be
carried out.Here is an illustrative entry for one site in the
data.json file. Following this, is a
more detailed description of each entry.
Copy
"Some Cool Site That Everyone Loves": { "errorType": "status_code", "regexCheck": "^[a-zA-Z][a-zA-Z0-9_-]*$", "url": "https://somecoolsitethateveryoneloves.com/members/{}", "urlMain": "https://somecoolsitethateveryoneloves.com", "username_claimed": "blue", "username_unclaimed": "noonewouldeverusethis7" },
errorTypeRequired field that defines the detection method.
See the Detection Algorithms section below for more details about the options here.
regexCheckOptional field that specifies a regular expression that only matches on a valid username.
This is used to flag usernames that may not be allowable on a given site. Note that
a username that is valid on one site may not be allowed on another site.
urlRequired field that defines the format of a member’s URL on the site.
Use the curly brackets (i.e. "") to record where the username appears in the URL.
urlMainRequired field that defines the site’s URL.
username_claimedOptional field that defines a username that is claimed. This is used for tests.
Note that even though this field is optional for Sherlock functionality, the tests
will fail if a proper value is not supplied.
username_unclaimedOptional field that defines a username that has not been claimed. This is used for tests.
Note that even though this field is optional for Sherlock functionality, the tests
will fail if a proper value is not supplied.
urlProbeOptional field that defines a special URL which should be used to determine if a username is supported (this is normally an API call). If this is not given, then url will be used to check for the existence of a username. This field is useful if detecting the username using url is not reliable. Note that only unauthenticated API calls are supported.
With the large variety of websites, there are many different ways that detection of the availability of a username can go off the rails. So, it is important when a new site is added to configure the detection as robustly as possible. Likewise, it is crucial that tests are written for the site so that we can understand if the detection starts failing for certain sites.There are three detection algorithms supported at the current time.
This is the most reliably detection method. In this detection method, Sherlock does a request of the specific site URL which should exist if the desired username has been taken. If the username is unclaimed, then the site returns an HTTP Status code (typically a 404 Not Found).This is by far the most reliable method to detect if a username is available or not. Unfortunately, it is not supported by all sites.Here are some examples of what a site using this detection method would be configured:
This is the least reliable detection method. In this detection method, Sherlock does a request of the specific site URL which should exist if the desired username has been taken. If the username is unclaimed, then the site does not return any discernible HTTP Status code. Sites with this design will display some error message, or perhaps directly offer the user a chance to sign-up for the site.For this detection method, Sherlock looks for specific error text. Since the error message might change with minor updates on the site, this mechanism is the most fragile.Here are some examples of what a site using this detection method would be configured: