Getting Started

Onna's wiki crawler was created to index wiki pages.  

The wiki crawler does not currently support password-protected websites.  If you need to crawl a password-protected site, please contact us and we will put you in touch with one of our partners.

Integration Features

What is collected?

The wiki crawler follows links one level deep: it syncs the page at each URL you provide and every page linked from that page. It crawls HTML pages for:

  • Headings
  • Subheadings
  • Paragraphs
  • Metadata
  • Pictures
  • Links
  • Text links
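To make the list above concrete, here is a minimal sketch of how these elements could be pulled out of an HTML page using only Python's standard library. It is an illustration, not Onna's implementation: the class name `PageContentParser`, the helper `extract_content`, and the sample HTML are all assumptions made for this example.

```python
from html.parser import HTMLParser


class PageContentParser(HTMLParser):
    """Collects headings, paragraphs, metadata, images, and links (illustrative only)."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []    # (tag, text) pairs, e.g. ("h1", "Home")
        self.paragraphs = []  # paragraph text, including the text of links inside it
        self.metadata = {}    # <meta name="..." content="..."> pairs
        self.images = []      # raw src attribute values
        self.links = []       # raw href attribute values
        self._current = None  # heading or paragraph tag currently being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.HEADING_TAGS or tag == "p":
            self._current = tag
            self._buffer = []
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.metadata[attrs["name"]] = attrs["content"]

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = "".join(self._buffer).strip()
            if text:
                if tag == "p":
                    self.paragraphs.append(text)
                else:
                    self.headings.append((tag, text))
            self._current = None


def extract_content(html):
    """Parse an HTML string and return the populated parser."""
    parser = PageContentParser()
    parser.feed(html)
    parser.close()
    return parser


# Made-up sample page used only to exercise the sketch.
SAMPLE_HTML = """
<html><head><meta name="description" content="Team wiki home"></head>
<body>
<h1>Home</h1>
<h2>Onboarding</h2>
<p>Start with the <a href="/wiki/setup">setup guide</a>.</p>
<img src="/img/logo.png">
</body></html>
"""

page = extract_content(SAMPLE_HTML)
```

In this sketch, headings and subheadings share one list and are distinguished by their tag, and a text link contributes both to the paragraph text and to the list of links.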

What are your sync modes?

We currently support one sync mode: one-time.

  • One-time is a one-way sync that collects information only once.

The sync scope currently covers the URL to be synced and all links on the initial page. Only one depth level is supported for HTTP links.
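The one-level scope can be sketched as follows. This is a hedged illustration rather than the crawler's actual logic: the function name `sync_scope` is hypothetical, and dropping fragments, removing duplicates, and skipping non-HTTP schemes such as `mailto:` are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Gathers the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def sync_scope(seed_url, seed_html):
    """Return the seed URL followed by the HTTP(S) links found on that page.

    Links found on the linked pages are NOT followed: depth stops at one.
    """
    collector = _LinkCollector()
    collector.feed(seed_html)
    collector.close()

    scope = [seed_url]
    for href in collector.hrefs:
        # Resolve relative links against the seed URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(seed_url, href))
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # assumption: mailto:, javascript:, etc. are out of scope
        if absolute not in scope:
            scope.append(absolute)
    return scope
```

For example, given the seed `https://wiki.example.com/home` and a page that links to `/wiki/setup` and `faq`, the scope would be the seed plus `https://wiki.example.com/wiki/setup` and `https://wiki.example.com/faq`. Anything those two pages link to would be left out.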

Can you export the data?

Yes, you can export data and metadata in an eDiscovery-ready format. Load files are available as DAT, CSV, or custom text files.
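If you want to inspect a CSV load file after exporting it, a short script along these lines may help. It is only a sketch: the column names (`DocID`, `Title`, `SourceURL`) and the sample rows are hypothetical, and your export's actual columns will differ.

```python
import csv
import io

# Hypothetical load file content; real column names depend on your export.
SAMPLE_LOAD_FILE = (
    "DocID,Title,SourceURL\n"
    "DOC-0001,Home,https://wiki.example.com/home\n"
    'DOC-0002,"Setup, step by step",https://wiki.example.com/wiki/setup\n'
)


def read_load_file(text):
    """Parse CSV load file text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_load_file(SAMPLE_LOAD_FILE)
for record in records:
    print(record["DocID"], record["Title"])
```

Using the `csv` module rather than splitting on commas by hand keeps quoted values that contain commas, such as the second title above, in one field.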

How-to Guide

Click on "Add a Data Source" and select Wiki Crawler.

Name your data source and enter the URL. Separate the URLs with a comma if entering more than one.
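If you are preparing a long list of URLs before pasting it into the field, a quick check like the one below can catch malformed entries. This is a sketch under assumptions: `parse_url_field` is a hypothetical helper, and the rule that each entry must be a full `http` or `https` URL is an assumption, not documented product behavior.

```python
from urllib.parse import urlparse


def parse_url_field(raw):
    """Split a comma-separated URL field and validate each entry."""
    urls = []
    for part in raw.split(","):
        candidate = part.strip()
        if not candidate:
            continue  # ignore empty entries such as a trailing comma
        parsed = urlparse(candidate)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid http(s) URL: {candidate!r}")
        urls.append(candidate)
    return urls


urls = parse_url_field("https://wiki.example.com/a, https://wiki.example.com/b")
```

Because the comma is the separator, a URL that itself contains a comma would be split in two by this sketch, so such URLs are best percent-encoded (`%2C`) first.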

Once you have clicked "Sync", the integration appears on the Your Data Sources page. Indexing starts immediately, and the status shows as "Uploading" while it runs. Once the data source is fully synced, you will see a green cloud with the last sync date.

When you click on the Wiki Crawler data source, you will see results as they are populated.

From this screen, you can filter results by date range, categories, and/or extensions using the menu on the left.

From the same screen, you can view the audit logs and manage the privacy settings for this data source. To view the audit logs, click the information icon in the top right.

To edit the privacy settings, click the lock icon in the top right.

Once you have clicked the icon, you can add and customize the privacy settings for each user added to the data source. For each user, you can choose whether they can manage this resource, view it only, and/or download it. You can also remove a user from the data source here.
