Confluence is team content collaboration software. Onna supports Confluence Cloud and Confluence Server version 5.7 and up. Onna connects directly to the Confluence API to collect all information in its native format. The integration collects all data and metadata from an entire Confluence site or from individual spaces.
All files are synced, including, but not limited to:
- HTML content of the page
- Comments on pages
- Attachments for the page
- Labels for attachments and pages
- Ancestors for the page/attachments
- Historical information and related metadata, including:
  - Author of the page
  - Created by/on
  - Last updated by/on
  - Previous version created by/on
Types of Sync Available
We currently support three sync modes: one-time, archive, and auto.
- One-time is a one-way sync that collects information only once.
- Archive means that Onna will perform a full sync first and will then continuously add any new files generated at the data source. This sync type does not remove files from Onna when they are deleted from the data source.
- Auto-sync means that Onna will perform a full sync first and will then keep the data source and Onna in mirrored sync. Any files deleted from the data source will also be deleted in Onna.*
The synchronization scope currently encompasses entire Confluence sites, specific Confluence spaces, and specific Confluence pages.
*This sync mode is only supported for Confluence pages and files attached to a space.
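To make the difference between the three modes concrete, here is a minimal sketch in Python. It is only a model of the behavior described above, not Onna's implementation: `SyncMode`, `apply_sync`, and the item IDs are hypothetical names chosen for illustration.

```python
from enum import Enum


class SyncMode(Enum):
    ONE_TIME = "one-time"
    ARCHIVE = "archive"
    AUTO = "auto"


def apply_sync(mode, collected, source_now, first_run):
    """Return the set of item IDs held after one sync pass.

    collected  -- item IDs already held from earlier passes
    source_now -- item IDs currently present at the data source
    first_run  -- True for the initial full sync
    """
    if first_run:
        # Every mode starts with a full collection.
        return set(source_now)
    if mode is SyncMode.ONE_TIME:
        # Collected once; later changes at the source are ignored.
        return set(collected)
    if mode is SyncMode.ARCHIVE:
        # New items are added; deletions at the source are not propagated.
        return set(collected) | set(source_now)
    # AUTO: mirror the source, so deletions are propagated too.
    return set(source_now)


# Item "a" is deleted at the source and "c" is created between two passes.
held = apply_sync(SyncMode.ARCHIVE, set(), {"a", "b"}, first_run=True)
held = apply_sync(SyncMode.ARCHIVE, held, {"b", "c"}, first_run=False)
print(sorted(held))  # ['a', 'b', 'c']
```

With `SyncMode.AUTO` the same two passes would leave only `b` and `c`, and with `SyncMode.ONE_TIME` only `a` and `b`.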
All files and metadata can be exported in an eDiscovery-ready format. Load files are available as DAT, CSV, or custom text files.
The following metadata fields are exported:
- Space Name
- Space ID (text key field to identify space in Confluence)
- Confluence Space Type
- Ancestors for a file
- List of Labels
- All date-related metadata
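If you plan to post-process a CSV load file, the sketch below shows one way to group rows by space using Python's standard `csv` module. The column headers, the `;` separator for multi-value fields, and the sample rows are assumptions made for illustration; check them against the headers in your own export before relying on this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: the real column headers and multi-value separator
# in an export may differ.
SAMPLE = """Space Name,Space ID,Confluence Space Type,Ancestors,Labels
Engineering,ENG,global,Home;Runbooks,oncall;howto
Engineering,ENG,global,Home,
Jane Doe,~jdoe,personal,,draft
"""


def group_by_space(load_file, multi_sep=";"):
    """Group load-file rows by Space ID, splitting multi-value fields."""
    spaces = defaultdict(list)
    for row in csv.DictReader(load_file):
        for field in ("Ancestors", "Labels"):
            value = row.get(field) or ""
            # Turn "a;b" into ["a", "b"] and an empty cell into [].
            row[field] = [v for v in value.split(multi_sep) if v]
        spaces[row["Space ID"]].append(row)
    return dict(spaces)


spaces = group_by_space(io.StringIO(SAMPLE))
for space_id, rows in spaces.items():
    print(space_id, len(rows))
```

To read a real export, pass an open file object (for example `open(path, newline="", encoding="utf-8")`) instead of the in-memory sample.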
What does the export look like?
We've compiled sample load files for our different integrations. Click on the link below to download a sample Confluence export.
How to Guide
Click on "Add source" and select Confluence.
That will lead you to the following page:
First, name your source. This is the source's title on the platform. If you're naming for eDiscovery purposes, a common convention is to name it after the company.
Enter the Confluence site's URL as the host. You will also need to enter your credentials: the full email address of your user account and an API token. To generate an Atlassian API token, follow the instructions here.
If the site is not hosted on Atlassian, then select the option to log in with user and password.
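Before adding the source, you may want to confirm that the email address and API token are accepted by your site. The sketch below builds such a check with Python's standard library. It is not how Onna connects internally. It assumes a Confluence Cloud site where the REST API is served under `/wiki/rest/api` and accepts Basic authentication with `email:api_token`; the path may differ on your site or API version (Server sites typically omit the `/wiki` prefix). The function name `build_space_request` and all values passed to it are placeholders.

```python
import base64
import urllib.request


def build_space_request(host, email, api_token, limit=25):
    """Build (but do not send) a request that lists spaces on a Cloud site."""
    url = f"{host.rstrip('/')}/wiki/rest/api/space?limit={limit}"
    credentials = f"{email}:{api_token}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
        },
    )


# Placeholder values; substitute your own site URL and credentials.
request = build_space_request(
    "https://example.atlassian.net", "user@example.com", "API_TOKEN"
)
# Uncomment to send the request; a 200 status suggests the credentials work.
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The request is built without being sent so that you can inspect the URL and headers first. If the call succeeds, the spaces it returns are the ones visible to that account.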
You can either Confirm, which will auto-sync all of the spaces, or Configure the synchronization settings.
If you choose to configure your collection, you can filter on the spaces as indicated below. Select the space(s) you would like to sync. You can use the letters to browse spaces alphabetically or use the text input box to search for a space by typing its name. To sync all, click 'Select all' in the top right-hand corner.
The final step is to select the sync mode that you'd like for the source. We describe the different sync types above.
Note: If you'd like to use the date range feature, select One-time Sync.
Once you have clicked 'Confirm,' you will see this integration on the 'My Sources' page. Onna will begin to interact with Confluence's API and sync files. Files will be processed and indexed so that all content is searchable. The source will indicate that it's syncing during this process.
When you click on the Confluence data source, you will start seeing results being populated.
From this screen, you are able to filter results by date range, categories, and/or extensions using the menu on the left.
Confluence pages in Onna
Onna displays each Confluence page as a PDF representation of the page.
Accessing audit logs
Clicking on the information icon on the top right will take you to the source details, where you can see how many files the source contains and its total size.
Click on Audit Logs to see logs from collection and processing.
For Confluence on-premise collections, is it necessary to install anything on a server?
Yes. You need to install an application on a Windows machine that is always on and has constant connectivity to both the Confluence server and the internet.
Check out our guide to collecting from Confluence on-premise.
What type of login is needed - database or user?
A user login is needed: a Confluence user account with full access to the space(s) that need to be collected.
Can Onna sync Confluence "sub-spaces"?
Confluence sub-spaces do not actually exist. What people may refer to as sub-spaces are the top-level pages within a Confluence space. They might think of them internally as sub-spaces, but the API just sees them as pages. As such, Onna is unable to sync just a sub-space.
Here is an article for reference: https://confluence.atlassian.com/confeval/confluence-evaluator-resources/confluence-can-i-create-subspaces
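Since a so-called sub-space is really a top-level page plus everything beneath it, the ancestors recorded for each page are what tie that group together. The sketch below illustrates the idea on hypothetical in-memory data; the `PAGES` list, its field names, and both helper functions are assumptions for illustration and do not reflect Onna's or Confluence's actual data model.

```python
# Hypothetical in-memory pages: each has an ID, a title, and the IDs of its
# ancestors ordered from the top of the space down to its direct parent.
PAGES = [
    {"id": "1", "title": "Team A", "ancestors": []},
    {"id": "2", "title": "Team A roadmap", "ancestors": ["1"]},
    {"id": "3", "title": "Q3 goals", "ancestors": ["1", "2"]},
    {"id": "4", "title": "Team B", "ancestors": []},
]


def top_level_pages(pages):
    """Pages with no ancestors: what people often call sub-spaces."""
    return [p for p in pages if not p["ancestors"]]


def pages_under(pages, root_id):
    """The root page itself plus every page that lists it as an ancestor."""
    return [p for p in pages if p["id"] == root_id or root_id in p["ancestors"]]


print([p["title"] for p in top_level_pages(PAGES)])
print([p["title"] for p in pages_under(PAGES, "1")])
```

In this model, "Team A" and "Team B" look like sub-spaces to their users, but they are ordinary pages that happen to sit at the top of the tree.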
Can Onna collect Jira data from embedded Jira links within Confluence pages?
No. Even if the credentials are the same for both the Confluence and Jira accounts, Onna will not sync the data within the embedded Jira links. To sync Jira data, a separate source needs to be created. More information on setting up a Jira source can be found here.
Why is Admin access required for Onna to pull data from Confluence?
We request admin access to ensure the collection is complete. A regular user may not have access to every space that needs to be collected, or to every page within a space. By authenticating with an admin user, we can be sure that all of the available spaces and pages are returned.