# Activity Monitoring Introduction
Kloudless Activity Monitoring is provided by the Activity API, which allows your application to monitor for changes in connected cloud accounts. Activity monitoring enables your app to keep track of account activity to support features such as email and calendar sync.
For example requests and responses, descriptions of available endpoints, and a list of supported services, see our Activity API documentation.
This guide will explain how to enable and use activity monitoring in your app.
# Enabling activity monitoring
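Subscriptions are required to use the Activity API endpoints. There are two ways to enable activity monitoring: enable activity tracking for your project (Option 1), or manually create a subscription for an account (Option 2).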
By default, each account can have only one subscription. If your app requires multiple subscriptions for an account, feel free to contact us, and we'd be glad to help you find a solution.
# Option 1: Enable activity tracking
Enabling activity tracking automatically creates a subscription for any account that connects to your application after configuring this setting. Activity is tracked beginning from the time the subscription is created.
Follow these steps to enable activity tracking for your Kloudless project:
1. Log in to your Kloudless account and navigate to the Webhooks and Activity Monitoring page.
2. Under the Activity Monitoring section, check Track Activity and click Save.
For accounts that have already been connected, simply re-connect the account, or manually create a subscription with an empty request body.
# Option 2: Manually create a subscription
To manually enable activity monitoring, create a subscription by making a `POST` request to the create a subscription endpoint with the relevant `account_id` parameter. By default, the subscription's `subscription_type` is set to `changes`; this subscription type monitors the account for new activity.
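As a rough illustration, the request described above could be assembled in Python as follows. This is a minimal sketch, not the official client: the base URL and the `/accounts/{account_id}/subscriptions` path are assumptions modeled on the cURL examples in this guide, and `build_create_subscription_request` is a hypothetical helper. Confirm the exact endpoint in the Activity API documentation before relying on it.

```python
import json
import urllib.request

# Assumed base URL, modeled on the cURL examples in this guide.
API_BASE = "https://api.kloudless.com/v1"


def build_create_subscription_request(account_id, token, body=None):
    """Build (but do not send) a POST request that creates a subscription."""
    # Hypothetical endpoint path; check the Activity API documentation.
    url = f"{API_BASE}/accounts/{account_id}/subscriptions"
    # An empty JSON object leaves subscription_type at its default.
    data = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_create_subscription_request("me", "TOKEN")
```

Passing `request` to `urllib.request.urlopen` would send it. The sketch stops short of that so it can be inspected without network access.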
# Monitoring individual resources
For a subset of connectors, you have the option to monitor specific resources by specifying a list of resource IDs in the `monitored_resources` attribute when creating a subscription. For example, you can include a list of drive IDs in the `monitored_resources` attribute to monitor activity from multiple shared drives in a Google Drive account.
For the services that support monitoring individual resources, the types of resources you can monitor, and the limits on the number of monitored resources, see the `monitored_resources` attribute in the Subscription object documentation.
Contact us to have limits raised on the number of resources each of your application's connected accounts can monitor.
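The request body for such a subscription might be built like this. It is a sketch under assumptions: the guide only states that `monitored_resources` takes a list of resource IDs, so the surrounding JSON shape, the hypothetical `build_subscription_body` helper, and the placeholder drive IDs are illustrative only.

```python
import json


def build_subscription_body(resource_ids):
    """Return a JSON body that limits monitoring to the given resource IDs."""
    # Drop duplicate IDs while keeping the caller's order.
    ids = list(dict.fromkeys(resource_ids))
    if not ids:
        raise ValueError("provide at least one resource ID")
    return json.dumps({"monitored_resources": ids})


# Placeholder IDs; real values come from the connected account.
body = build_subscription_body(["drive_id_1", "drive_id_2", "drive_id_1"])
```

De-duplicating before sending keeps repeated IDs from counting against the per-account limit on monitored resources.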
# Using activity monitoring
There are two methods of obtaining activity data:
- Querying the list activity endpoint for new activity data
- Receiving new activity data via a notification queue, such as Amazon EventBridge, Google Cloud Pub/Sub, or Azure Service Bus
# Querying the list activity endpoint
Use the list activity endpoint to retrieve activity data. The alias `default` can be used for the `subscription_id` parameter.
Your application also needs to keep track of the `cursor` attribute, which is part of the Activity API's response. Make an initial request to the list activity endpoint and use the returned value of `cursor` as your initial cursor. You can also make a request without any cursor to retrieve all activity data since the account's subscription was created.
Here is an example of what a cURL request to retrieve the default subscription's metadata looks like:
curl -X GET --header 'Authorization: Bearer TOKEN' \
'https://api.kloudless.com/v1/accounts/me/subscriptions/default'
Once your application has saved the previous or initial value of `cursor`, pass it in to the next request to the list activity endpoint. In each subsequent request, pass in the value of `cursor` returned from the previous response to avoid seeing the same activity in the next response.
An example request to the list activity endpoint specifying a `cursor` value is shown here:
curl -X GET --header 'Authorization: Bearer TOKEN' \
'https://api.kloudless.com/v1/accounts/me/subscriptions/default/activity' \
-G -d cursor='123456789'
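The cursor bookkeeping described above can be sketched in Python. The HTTP call is deliberately left out: `fetch_page` is a hypothetical callable that requests the list activity endpoint with the given cursor and returns the decoded JSON response. The `cursor` field comes from this guide; the `objects` key for the list of activity items is an assumption.

```python
def collect_new_activity(fetch_page, cursor=None):
    """Follow cursors until a page comes back empty.

    fetch_page(cursor) is a hypothetical callable that returns the decoded
    JSON response of the list activity endpoint as a dict.
    """
    events = []
    while True:
        page = fetch_page(cursor)
        objects = page.get("objects", [])  # assumed name of the result list
        events.extend(objects)
        next_cursor = page.get("cursor", cursor)
        # Stop on an empty page, or if the cursor did not advance.
        if not objects or next_cursor == cursor:
            return events, next_cursor
        cursor = next_cursor


# Canned responses stand in for the API in this example.
pages = {
    None: {"objects": [{"id": "a"}, {"id": "b"}], "cursor": "2"},
    "2": {"objects": [{"id": "c"}], "cursor": "3"},
    "3": {"objects": [], "cursor": "3"},
}
events, saved_cursor = collect_new_activity(lambda c: pages[c])
```

Persist `saved_cursor` so the next run resumes from it instead of replaying activity that was already processed.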
You can continue to monitor connected accounts for new activity by using webhooks, or by polling.
# Kloudless webhooks
Kloudless allows you to create webhooks to notify your application when activity has occurred in the resource you are subscribed to. Kloudless webhooks are much more efficient than polling since your application is notified of when to query the list activity endpoint. For this reason, we recommend configuring webhooks for your Kloudless project.
You can set up a webhook on the app's Webhooks and Activity Monitoring page.
For more specific details about webhooks, see the webhooks section of the Activity API docs.
# Polling
The simplest way to use activity monitoring with Kloudless is to regularly poll the list activity endpoint and check the response for new activity.
Polling has performance drawbacks as an application grows. This method also requires your app to send requests at an arbitrary time interval with no guarantee that new information will be available. Using webhooks eliminates these issues.
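A polling loop along these lines illustrates the trade-off. It is a sketch only: `fetch_page` is a hypothetical callable wrapping the list activity endpoint, the `objects` key is an assumed response field, and the 60-second default interval is arbitrary.

```python
import time


def poll_activity(fetch_page, handle_event, cursor=None,
                  interval_seconds=60, max_polls=None, sleep=time.sleep):
    """Poll at a fixed interval, passing each new event to handle_event."""
    polls = 0
    while max_polls is None or polls < max_polls:
        page = fetch_page(cursor)
        for event in page.get("objects", []):  # assumed result list name
            handle_event(event)
        cursor = page.get("cursor", cursor)
        polls += 1
        # Wait before the next poll, but not after the final one.
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)
    return cursor


# Canned responses; the second poll finds nothing new.
responses = iter([
    {"objects": [{"id": "a"}], "cursor": "1"},
    {"objects": [], "cursor": "1"},
    {"objects": [{"id": "b"}], "cursor": "2"},
])
seen, naps = [], []
last_cursor = poll_activity(lambda c: next(responses), seen.append,
                            max_polls=3, sleep=naps.append)
```

The second poll in the example returns no events, which is the wasted request that webhooks avoid.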
# Receiving activity data via a notification queue
If you are using a supported platform, such as Amazon EventBridge, Google Cloud Pub/Sub, or Azure Service Bus, use that platform's corresponding notification channel to receive activity data directly.
# Time-based monitoring (admin accounts only)
Rather than triggering queries for new activity by polling the endpoint for new cursors or by webhook notifications, you can check for new activity by specifying a time frame. This method gives you the ability to manage batches of activity data. Currently, this approach requires admin cloud account access. Kloudless supports the following cloud services with this feature:
- Box (`box`)
- SharePoint Online (`sharepoint`)
- OneDrive for Business (`onedrivebiz`)
- Dropbox (`dropbox`)
- Google Drive (`gdrive`)
Time-based activity monitoring still generates a `cursor`, but in this mode the `cursor` is used only for pagination when working with a large quantity of activity. The query does not need the `cursor` to determine which activity results to return, because you specify the time frame. Your application can manage the time ranges to query and advance them as it finishes retrieving activity for past time intervals. Here is an example of an API request to retrieve activity using a start and end time:
curl -X GET --header 'Authorization: Bearer TOKEN' \
'https://api.kloudless.com/v2/accounts/me/subscriptions/default/activity' \
-G -d from='ISO-8601 string' \
-d until='ISO-8601 string'
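The same query can be assembled in Python. This is a sketch: the URL mirrors the cURL example above, `build_time_range_url` is a hypothetical helper, and the exact ISO-8601 variant the API accepts (for example `+00:00` versus `Z`) is an assumption to verify against the Activity API documentation.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Copied from the cURL example above.
ACTIVITY_URL = (
    "https://api.kloudless.com/v2/accounts/me/subscriptions/default/activity"
)


def build_time_range_url(start, end, cursor=None):
    """Return the list activity URL for the window from start to end."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    params = {"from": start.isoformat(), "until": end.isoformat()}
    if cursor is not None:
        params["cursor"] = cursor  # only needed to paginate within the window
    # urlencode percent-encodes ':' and '+' in the timestamps.
    return f"{ACTIVITY_URL}?{urlencode(params)}"


url = build_time_range_url(
    datetime(2021, 1, 1, tzinfo=timezone.utc),
    datetime(2021, 1, 2, tzinfo=timezone.utc),
)
```

Building the query string with `urlencode` avoids malformed URLs when a timestamp contains a `+` offset.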
# Parallel tracking
When working with a large quantity of activity, time-based activity monitoring lets you break the data into smaller time ranges and hand each one to a separate thread or worker, spreading the load so that results are processed more efficiently. We recommend a few steps that help the application process these loads effectively and avoid some common issues.
When the application makes a query for a large number of activity results, check the `modified` attribute of the last activity object retrieved. If the account was just connected or has not previously had activity, you can use the time the account was connected or any time in the past.
Next, the application needs to calculate the difference between the current time (Kloudless uses an ISO-8601 string format) and the timestamp from the `modified` attribute mentioned above. The calculated value represents the time period for which the application needs to collect activity results. You can then split this time range between all the workers you have, and set the start and end (`from`/`until`) times for each worker's requests. Each worker can then paginate through the time range provided to it, using the `cursor` to advance to the next set of activity results until none remain. When a query for time-based activity returns an empty result, there is no more activity to report.
Here is an example cURL request that paginates through a time range using a cursor returned in an earlier response:
curl -X GET --header 'Authorization: Bearer TOKEN' \
'https://api.kloudless.com/v2/accounts/me/subscriptions/default/activity/' \
-G -d from='ISO-8601 string' \
-d until='ISO-8601 string' \
-d cursor='123456789' # Optional, for pagination of results
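Splitting the time range between workers, as described above, could look like the following Python sketch. The hypothetical `split_time_range` helper divides the range evenly, which is only one possible policy. Whether the `from` and `until` bounds are inclusive is not stated in this guide, so an event that falls exactly on a boundary may need de-duplication.

```python
from datetime import datetime, timezone


def split_time_range(start, end, workers):
    """Split the range into contiguous (from, until) slices, one per worker."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if start >= end:
        raise ValueError("start must be earlier than end")
    step = (end - start) / workers
    # Use the exact end for the last bound so rounding cannot leave a gap.
    bounds = [start + step * i for i in range(workers)] + [end]
    return [(bounds[i], bounds[i + 1]) for i in range(workers)]


slices = split_time_range(
    datetime(2021, 1, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2021, 1, 1, 4, 0, tzinfo=timezone.utc),
    workers=4,
)
```

Each worker then formats its pair with `isoformat()` for the `from` and `until` parameters and paginates with `cursor` until it receives an empty result.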
# Race conditions when parallelizing Activity Monitoring
When first gathering activity data for distribution to the application's workers, it is best to use the timestamp of the most recent activity you've collected as the start of the time range for the next run, as mentioned earlier. Even if you query activity for the past 5 minutes, only the first minute of that range may be available if the upstream provider is delayed by 4 minutes. Therefore, the next run should use the last known activity's time as the start time when it tries to retrieve newer activity.
Another common issue with parallelization occurs when time slices finish out of order. If an older time slice completes before newer ones, your app can move on to the next time range while other slices are still running, and miss their activity. To ensure all activity results are collected, schedule the time slices to execute in reverse order and wait for every slice to complete before moving on to the next time range.
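Both safeguards can be sketched in Python. The `modified` attribute comes from this guide; the helper names are hypothetical, and treating "reverse order" as newest slice first is one interpretation of the recommendation above.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO-8601 string, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def next_start_time(activities, previous_start):
    """Return the latest `modified` time seen, or previous_start if none."""
    times = [parse_timestamp(a["modified"]) for a in activities]
    return max(times, default=previous_start)


def newest_first(slices):
    """Order (from, until) slices so the most recent slice runs first."""
    return sorted(slices, key=lambda s: s[0], reverse=True)


start = datetime(2021, 1, 1, 0, 0, tzinfo=timezone.utc)
batch = [
    {"modified": "2021-01-01T00:00:30Z"},
    {"modified": "2021-01-01T00:01:00Z"},
]
resume_from = next_start_time(batch, start)
```

Resuming from `resume_from` rather than from the end of the queried window means activity that the upstream provider reports late is still picked up on the next run.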