# Activity Monitoring Introduction

Kloudless Activity Monitoring is provided by the Activity API, which allows your application to monitor for changes in connected cloud accounts. Activity monitoring enables your app to keep track of account activity to support features such as email and calendar sync.

For example requests and responses, descriptions of available endpoints, and a list of supported services, see our Activity API documentation.

This guide will explain how to enable and use activity monitoring in your app.

# Enabling activity monitoring

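Subscriptions are required to use the Activity API endpoints. There are two ways to enable activity monitoring: enable activity tracking for your project (Option 1), or manually create a subscription (Option 2).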

By default, each account can have only one subscription. If your app requires multiple subscriptions for an account, feel free to contact us, and we'd be glad to help you find a solution.

# Option 1: Enable activity tracking

Enabling activity tracking automatically creates a subscription for any account that connects to your application after you configure this setting. Activity is tracked beginning from the time the subscription is created.

Follow these steps to enable activity tracking for your Kloudless project:

  1. Log in to your Kloudless account and navigate to the Webhooks and Activity Monitoring page.

  2. Under the Activity Monitoring section, check Track Activity and click Save.

For accounts that have already been connected, simply re-connect the account, or manually create a subscription with an empty request body.

# Option 2: Manually create a subscription

To manually enable activity monitoring, create a subscription by making a POST request to the create a subscription endpoint with the relevant account_id parameter. By default, the subscription's subscription_type will be set to changes; this subscription type monitors the account for new activity.
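
The request described above can be sketched in Python as follows. This is a minimal illustration, not the official client: the URL path is an assumption inferred from the cURL examples in this guide, and the function name `build_create_subscription_request` is hypothetical. The request is only built, not sent.

```python
import json
import urllib.request

# Assumed base URL, taken from the cURL examples in this guide.
API_BASE = "https://api.kloudless.com/v1"


def build_create_subscription_request(token, account_id="me", body=None):
    """Build (but do not send) a POST request that creates a subscription."""
    # Assumed path: the subscriptions collection of the given account.
    url = f"{API_BASE}/accounts/{account_id}/subscriptions"
    # An empty JSON object leaves every subscription attribute at its default.
    payload = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Check the exact path and response format against the Activity API documentation before relying on this.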

# Monitoring individual resources

For a subset of connectors, you have the option to monitor specific resources by specifying a list of resource IDs in the monitored_resources attribute when creating a subscription. For example, you can include a list of drive IDs in the monitored_resources attribute to monitor activity from multiple shared drives in a Google Drive account.
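
As a rough sketch, the request body for such a subscription could be assembled like this. The helper name `subscription_body` and the drive IDs are hypothetical; only the `monitored_resources` attribute name comes from this guide, and the accepted ID format depends on the connector.

```python
import json


def subscription_body(monitored_resources=None):
    """Return the JSON body for a create-subscription request."""
    body = {}
    if monitored_resources:
        # IDs are passed through unchanged; valid values depend on the connector.
        body["monitored_resources"] = list(monitored_resources)
    return json.dumps(body)


# Hypothetical shared drive IDs, for illustration only.
print(subscription_body(["drive_id_1", "drive_id_2"]))
```

Calling `subscription_body()` with no arguments yields an empty JSON object, which corresponds to the default account-wide subscription.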

For the services that support monitoring individual resources, the types of resources you can monitor, and the limits on the number of monitored resources, see the monitored_resources attribute in the Subscription object documentation.

Contact us to have limits raised on the number of resources each of your application's connected accounts can monitor.

# Using activity monitoring

There are two methods of obtaining activity data:

  1. Querying the list activity endpoint for new activity data
  2. Receiving new activity data via a notification queue, such as Amazon EventBridge, Google Cloud Pub/Sub, or Azure Service Bus

# Querying the list activity endpoint

Use the list activity endpoint to retrieve activity data. The alias default can be used for the subscription_id parameter.

Your application also needs to keep track of the cursor attribute, which is part of the Activity API's response. Make an initial request to the list activity endpoint and use the returned value of cursor as your initial cursor. You can also make a request without a cursor to retrieve all activity data since the account's subscription was created.

Here is an example of what a cURL request to retrieve the default subscription's metadata looks like:

curl -X GET --header 'Authorization: Bearer TOKEN' \
    'https://api.kloudless.com/v1/accounts/me/subscriptions/default'

Once your application has saved the previous or initial value of cursor, pass it in to the next request to the list activity endpoint. In each subsequent request, pass in the value of cursor returned from the previous response to avoid seeing the same activity in the next response.

An example request to the list activity endpoint specifying a cursor value is shown here:

curl -X GET --header 'Authorization: Bearer TOKEN' \
    'https://api.kloudless.com/v1/accounts/me/subscriptions/default/activity' \
    -G -d cursor='123456789'
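
The cursor handling described above can be sketched as a small helper. This is an illustration under assumptions: `fetch_page` is a hypothetical callable standing in for the GET request to the list activity endpoint, and the `objects` key for the event list is assumed rather than taken from this guide.

```python
def fetch_new_activity(fetch_page, cursor=None):
    """Fetch one page of activity and return (events, cursor_to_save).

    fetch_page(cursor) stands in for a GET to the list activity endpoint and
    must return the decoded JSON response as a dict.
    """
    response = fetch_page(cursor)
    # Assumed response key for the list of activity events.
    events = response.get("objects", [])
    # Keep the previous cursor if the response does not include a new one.
    next_cursor = response.get("cursor", cursor)
    return events, next_cursor
```

Store the returned cursor durably (for example, in your database) and pass it back on the next call so the same activity is not returned twice.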

You can continue to monitor connected accounts for new activity by using webhooks, or by polling.

# Kloudless webhooks

Kloudless allows you to create webhooks that notify your application when activity has occurred in the resource you are subscribed to. Kloudless webhooks are much more efficient than polling because your application is told when to query the list activity endpoint. For this reason, we recommend configuring webhooks for your Kloudless project.

You can set up a webhook on the app's Webhooks and Activity Monitoring page.

For more specific details about webhooks, see the webhooks section of the Activity API docs.

# Polling

The simplest way to use activity monitoring with Kloudless is to regularly poll the list activity endpoint and check the response for new activity.

Polling has performance drawbacks as an application grows. This method also requires your app to send requests at an arbitrary time interval with no guarantee that new information will be available. Using webhooks eliminates these issues.
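
For comparison, a bounded polling loop might look like the sketch below. The names `poll` and `check_for_activity` are hypothetical, and the `sleep` parameter exists only so the loop can be exercised without waiting.

```python
import time


def poll(check_for_activity, interval_seconds, max_polls, sleep=time.sleep):
    """Call check_for_activity() max_polls times, pausing between calls.

    check_for_activity() should return a list of new events, which may be empty.
    Polls that find nothing still cost a request.
    """
    collected = []
    for attempt in range(max_polls):
        collected.extend(check_for_activity())
        if attempt < max_polls - 1:
            sleep(interval_seconds)  # wait before the next poll
    return collected
```

Every call is made whether or not new activity exists, which is the overhead that webhooks avoid.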

# Receiving activity data via a notification queue

If you are using a platform such as Amazon EventBridge, Google Cloud Pub/Sub, or Azure Service Bus, use that platform's corresponding notification channel to receive activity data directly.

# Time-based monitoring (admin accounts only)

Rather than triggering queries for new activity by polling the endpoint for new cursors or by webhook notifications, you can check for new activity by specifying a time frame. This method gives you the ability to manage batches of activity data. Currently, this approach requires admin cloud account access. Kloudless supports the following cloud services with this feature:

  • Box (box)
  • SharePoint Online (sharepoint)
  • OneDrive for Business (onedrivebiz)
  • Dropbox (dropbox)
  • Google Drive (gdrive)

Time-based activity monitoring still returns a cursor, but the cursor is used only for pagination when a time range contains a large quantity of activity. The query does not need the cursor to determine which activity results to return, because you determine the time frame. Your application can manage the time ranges it queries and advance them as it finishes retrieving activity for past intervals. Here is an example of an API request to retrieve activity using a start and end time:

curl -X GET --header 'Authorization: Bearer TOKEN' \
    'https://api.kloudless.com/v2/accounts/me/subscriptions/default/activity' \
    -G -d from='ISO-8601 string' \
    -d until='ISO-8601 string'
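
A sketch of building the same request URL in Python is shown below. The `from`, `until`, and `cursor` parameter names and the URL come from the cURL examples in this guide; the helper name `time_range_url` is hypothetical, and the exact ISO-8601 variant the API accepts is an assumption.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

ACTIVITY_URL = (
    "https://api.kloudless.com/v2/accounts/me/subscriptions/default/activity"
)


def time_range_url(start, end, cursor=None):
    """Return the list activity URL for the window between start and end."""
    # isoformat() yields ISO-8601 strings; urlencode escapes ":" and "+".
    params = {"from": start.isoformat(), "until": end.isoformat()}
    if cursor is not None:
        params["cursor"] = cursor  # only needed when paginating
    return f"{ACTIVITY_URL}?{urlencode(params)}"


window_start = datetime(2023, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2023, 1, 2, tzinfo=timezone.utc)
print(time_range_url(window_start, window_end))
```

Using timezone-aware datetimes keeps the offset explicit in the query string, so the window does not depend on the server's or client's local time.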

# Parallel tracking

When working with a large quantity of activity, time-based activity monitoring lets you break the data into smaller subsets and hand each subset to a separate thread. This ‘parallelization’ spreads the load so that your application works through the activity results more efficiently. We recommend a few steps that help the application process these loads effectively and avoid some common issues.

When the application makes a query for a large number of activity results, check the modified attribute of the last activity object retrieved. If the account was just connected or has not previously had activity, you can use the time the account was connected or any time in the past.

Next, the application needs to calculate the difference between the current time (Kloudless uses an ISO-8601 string format) and the timestamp from the modified attribute mentioned above. The calculated value represents the time period for which the application needs to collect activity results. You can then split this time range among all the workers you have and set the start and end (from/until) times for each worker’s requests. Each worker can then paginate through its time range, using the cursor to advance to the next set of activity results until none remain. When a query for time-based activity returns an empty result, there is no more activity to report.
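
The splitting step above can be sketched as follows. The function name `split_time_range` is hypothetical, and equal-width slices are just one reasonable choice. This guide does not state whether the from and until bounds are inclusive, so consider de-duplicating events that fall exactly on a slice boundary.

```python
from datetime import datetime, timedelta, timezone


def split_time_range(start, end, workers):
    """Split the range from start to end into `workers` contiguous slices."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if end <= start:
        raise ValueError("end must be later than start")
    step = (end - start) / workers
    # Use `end` itself as the final bound so rounding never shortens the range.
    bounds = [start + step * i for i in range(workers)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


# Example: the last known activity was one hour ago, and there are 4 workers.
now = datetime.now(timezone.utc)
for slice_start, slice_end in split_time_range(now - timedelta(hours=1), now, 4):
    print(slice_start.isoformat(), slice_end.isoformat())
```

Each printed pair can be used directly as one worker's from and until values.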

Here is an example cURL request that paginates through a time range using a cursor returned in an earlier response:

curl -X GET --header 'Authorization: Bearer TOKEN' \
    'https://api.kloudless.com/v2/accounts/me/subscriptions/default/activity' \
    -G -d from='ISO-8601 string' \
    -d until='ISO-8601 string' \
    -d cursor='123456789' # Optional, for pagination of results
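
A worker's pagination loop might look like the following sketch. `fetch_page` is a hypothetical callable that performs the request above for a given from, until, and cursor and returns the decoded JSON; the `objects` response key is an assumption.

```python
def collect_slice(fetch_page, start, end):
    """Return all activity between start and end, following cursors.

    fetch_page(start, end, cursor) stands in for the time-based request and
    must return the decoded JSON response as a dict.
    """
    events = []
    cursor = None  # the first request for a slice carries no cursor
    while True:
        response = fetch_page(start, end, cursor)
        page = response.get("objects", [])  # assumed key for the event list
        if not page:
            return events  # empty result: nothing left in this time range
        events.extend(page)
        cursor = response.get("cursor")
        if cursor is None:
            return events  # no cursor to continue with
```

The loop stops on the first empty page, matching the rule that an empty result means there is no more activity to report for that range.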

# Race conditions when parallelizing Activity Monitoring

When first gathering activity data for distribution to the application’s workers, it is best to use the timestamp of the most recent activity you've collected as the start of the time range for the next run, as mentioned earlier. Even if you query activity for the past 5 minutes, activity may be available only for the first minute of that range if the upstream provider is delayed by 4 minutes. Therefore, the next run should use the last known activity's time as its start time when it retrieves later activity.
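
One way to choose the next start time from the last known activity is sketched below. The `modified` attribute is the one named in this guide; the helper names are hypothetical, and timestamps are assumed to carry a UTC offset or a trailing "Z".

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO-8601 string into a timezone-aware datetime."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def next_start_time(events, previous_start):
    """Return the start of the next time range to query.

    Uses the latest `modified` timestamp seen rather than the end of the
    window just queried, so late-arriving upstream activity is not skipped.
    """
    stamps = [parse_timestamp(event["modified"]) for event in events]
    # With no events at all, query again from the same start time.
    return max(stamps, default=previous_start)
```

If a run returns no events, the start time does not advance, so delayed activity is still picked up by the next run.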

Another common issue with parallelization occurs when older time slices finish before newer ones and your app moves on before it has caught up. This can create a race condition in which you miss activity from slices that are still in progress when the app advances to a newer time range. To ensure all activity results are collected, you can schedule the time slices to execute in reverse order and confirm that every slice has completed before moving on to the next time range.
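
A minimal sketch of this scheduling approach, using Python's standard thread pool, is shown below. It assumes "reverse order" means the newest slice is submitted first, and it returns only after every slice has finished. The names `process_time_range` and `process_slice` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_time_range(slices, process_slice, max_workers=4):
    """Process every slice, newest first, and return only when all are done.

    `slices` is a list of hashable (from, until) tuples ordered oldest to
    newest. Results are returned in the same order as `slices`.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit in reverse order so the newest slice starts first.
        futures = {pool.submit(process_slice, s): s for s in reversed(slices)}
        for future, time_slice in futures.items():
            # result() blocks until the slice finishes and re-raises errors.
            results[time_slice] = future.result()
    # Reaching this point means no slice is still running.
    return [results[s] for s in slices]
```

Because the function blocks until all slices complete, the caller can safely compute the next time range only after it returns.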