Connect custom website to Rovo

This connector lets Rovo crawl and index your website so that its content appears in Rovo Search results and can be used in Rovo Chat and Agents.

What is indexed?

The custom website connector indexes these objects:

  • Web pages (mime type: text/html)

  • Text files (mime type: text/plain)

For each object, it indexes these attributes:

  • Name

  • URL

  • Created date

  • Last updated date

  • Description

  • Page content

Before you begin

  • This connector can only crawl secure (https) websites that you own.

  • To verify that you own the domain or subdomain, you must be able to edit the robots.txt file on the website you’d like to crawl.

  • We encourage you to review which pages are available on your website. All Rovo users will have access to all content available to the crawler (including content behind any configured authentication).
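The https requirement above can be checked before you open the setup form. The following is a minimal sketch in Python. The function name is_crawlable_url is hypothetical and not part of any Atlassian tooling, and it only checks the shape of the URL, not whether you own the domain.

```python
from urllib.parse import urlsplit


def is_crawlable_url(url: str) -> bool:
    """Return True if the URL uses the https scheme and names a host.

    Hypothetical helper: it mirrors the "secure (https) websites" rule
    only and does not verify domain ownership.
    """
    parts = urlsplit(url)
    return parts.scheme == "https" and bool(parts.hostname)
```

For example, is_crawlable_url("https://support.vitafleet.com/") passes, while a plain http URL or a URL without a protocol does not.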

Editing your robots.txt

You need to be able to edit the robots.txt file on your website. If you’re not sure what a robots.txt file is, see How to write a robots.txt file, or talk to your website administrator.

At minimum, you’ll need to add the following line to the existing robots.txt file on your website:

User-agent: atlassian-bot

Note that User-agent: * alone will not permit crawling; the atlassian-bot line must be included.
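To confirm that your robots.txt names atlassian-bot explicitly rather than relying on the wildcard, you could scan the file for a matching User-agent line. This is a rough sketch, not Atlassian's own check: the helper name has_explicit_agent is made up for illustration, and treating the agent name as case-insensitive is an assumption.

```python
def has_explicit_agent(robots_txt: str, agent: str = "atlassian-bot") -> bool:
    """Return True if robots.txt has a User-agent line naming `agent`.

    A wildcard line (User-agent: *) does not count as a match.
    Assumption: agent names are compared case-insensitively.
    """
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            if value.strip().lower() == agent.lower():
                return True
    return False
```

Pass the text of your robots.txt file to the function. A file that contains only User-agent: * returns False.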

If the website you’d like to crawl is a subdomain (for example, https://support.vitafleet.com/), the robots.txt file must be available at that subdomain (https://support.vitafleet.com/robots.txt), not at the parent domain. Editing https://www.vitafleet.com/robots.txt will not work.
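If you’re unsure which robots.txt applies to the site you plan to crawl, you can derive its location from the site URL. The sketch below assumes the file always sits at the root of the exact host being crawled, as described above. The function name robots_txt_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(site_url: str) -> str:
    """Build the robots.txt URL for the exact host in `site_url`.

    The host is kept as-is, so a subdomain maps to its own robots.txt
    and never to the parent domain's file.
    """
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))
```

With this helper, https://support.vitafleet.com/ maps to https://support.vitafleet.com/robots.txt, regardless of any path or query string in the input.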

Be aware that your robots.txt file, including this atlassian-bot edit, is always visible to the public (unless your site requires authentication).

You can also add specific Allow or Disallow rules to robots.txt, and the connector will follow them. For example:

User-agent: atlassian-bot
Disallow: /not-useful/

This rule would allow Rovo to index every public page on your site except the content under /not-useful/.
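You can preview how a rule like this is likely to be read by using Python's standard urllib.robotparser module. This is only an approximation: the connector's own rule matching may differ from Python's parser, and the is_allowed helper and the page URLs are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example rule from this article.
ROBOTS_TXT = """\
User-agent: atlassian-bot
Disallow: /not-useful/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "atlassian-bot") -> bool:
    """Return True if `agent` may fetch `url` under the given rules.

    Uses Python's parser as a stand-in for the connector's behavior.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

With the example rule, a URL under /not-useful/ should come back as disallowed, and other paths on the same site as allowed.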

Authentication options

The connector currently supports crawling sites with:

Basic authentication

Crawling a site with basic authentication means that Rovo will index all content with the username and password provided in the setup form.

This is suitable when your organization has sites that aren’t public but also don’t require individual permissions (for example, some intranets or internal knowledge bases).

You will still need to edit the robots.txt file on your authenticated site.

Broad content access

Connecting a site with basic authentication means that every Rovo user on your site will have access to all content available to the provided username and password.

Rovo will not respect individual permissions for this site.
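To see why access is this broad, it helps to look at how HTTP basic authentication works in general: one username and password pair is encoded into a single header that accompanies each request. The sketch below shows the standard scheme (RFC 7617). How Rovo sends credentials internally is not described in this article, and the function name basic_auth_header is hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic auth.

    The same value is sent with every request, so the server cannot
    tell individual Rovo users apart.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"
```

Because the header is only Base64-encoded and not encrypted, basic authentication relies on https to protect the credentials in transit.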

Connecting and crawling your website

To get to the setup screen for your custom website in Atlassian Admin:

  1. Go to Atlassian Administration. Select your organization if you have more than one.

  2. Select Settings > Rovo.

  3. Under Sites, next to the site you want to connect, select Add connector.

  4. Select Custom website and press Next.

To set up your crawl:

  1. Enter a website name for the site you’d like to crawl.

  2. Add the full URL of the domain. Include the protocol (https://).

  3. Choose how often Rovo should index your site.

  4. Choose your authentication method and fill in any applicable fields.

  5. Review and agree to the data usage information.

  6. Select Connect.

Next steps

After you’ve finished setting up the crawl:

  1. The crawling and indexing of your site will start immediately.

  2. Pages will start to show in Search incrementally for you and your team over the next few hours.

  3. Depending on the number of pages on your website, it may take some time for all your website’s content to be indexed and appear in Search.
