Added
- Introduced a skipCache parameter to the internal _getOrCreateVisitor method, allowing for explicit cache bypass.
Fixed
- Improved session creation reliability by implementing retry logic that clears the visitor cache and re-fetches visitor data if initial creation fails due to an invalid visitor reference.
- Added a validation check to ensure a valid visitor object is present before attempting session creation.
- Enhanced error handling and logging during visitor and session creation processes.
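The retry behaviour described in these fixes could look roughly like the sketch below. It is an illustration only, not the SDK's code: the `Deps` interface, `createSessionWithRetry`, and the `isInvalidVisitorRef` predicate are hypothetical stand-ins, and only the idea of a `skipCache` flag comes from the entries above.

```typescript
// Hypothetical shapes for illustration; the SDK's real types may differ.
interface VisitorRecord { id: string }
interface SessionRecord { id: string; visitorId: string }

interface Deps {
  // The second argument mirrors the skipCache idea: true bypasses the visitor cache.
  getVisitor(visitorId: string, skipCache: boolean): Promise<VisitorRecord | null>;
  createSession(visitor: VisitorRecord): Promise<SessionRecord>;
  clearVisitorCache(visitorId: string): void;
  // Assumed predicate: decides whether an error means "the visitor reference is invalid".
  isInvalidVisitorRef(err: unknown): boolean;
}

async function createSessionWithRetry(deps: Deps, visitorId: string): Promise<SessionRecord> {
  const visitor = await deps.getVisitor(visitorId, false);
  // Validation check: never attempt session creation without a visitor object.
  if (!visitor) throw new Error("No valid visitor; refusing to create a session");
  try {
    return await deps.createSession(visitor);
  } catch (err) {
    if (!deps.isInvalidVisitorRef(err)) throw err;
    // The cached visitor was stale: drop it, re-fetch past the cache, retry once.
    deps.clearVisitorCache(visitorId);
    const fresh = await deps.getVisitor(visitorId, true);
    if (!fresh) throw new Error("Visitor could not be re-fetched after clearing the cache");
    return deps.createSession(fresh);
  }
}
```

Retrying only once, and only for the invalid-reference case, keeps unrelated failures visible instead of masking them behind repeated attempts.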
Changed
- Adjusted IP geolocation process to attempt lookup for all IP addresses, removing previous exclusions for private, loopback, and link-local IPs.
Added
- New IP geolocation feature using Chapybara for country and state detection.
- Added chapybara package dependency for IP intelligence services.
Changed
- The SDK now defaults country and state to "Unknown" if no Chapybara API key is provided or if geolocation lookup fails for an IP.
Removed
- The geoip-lite package and its associated memory overhead, as it has been replaced by Chapybara for IP geolocation.
Removed
- Removed GeoIP lookup functionality for visitor/session handling, meaning geo-IP enrichment is no longer performed.
- Removed the 'geoip-lite' dependency from the project.
Added
- New maximum queue size limits for JavaScript errors and events.
- Automatic flushing of event and JavaScript error queues when their capacity is reached.
- An exported clearBotCache utility function for managing bot detection cache, integrated into the shutdown process.
Changed
- Reduced Time-To-Live (TTL) and maximum size for visitor and bot detection caches to improve memory management and data freshness.
- Optimized visitor and session caching mechanisms to store only essential data (e.g., record IDs) instead of full objects, significantly reducing memory footprint.
- Refined cache cleanup thresholds for session and visitor caches for more efficient resource management.
- Improved event queue flushing logic for better performance.
- Truncated JavaScript error stack traces to a maximum length of 2048 characters to prevent storing overly large strings.
Fixed
- Enhanced the SDK shutdown process for more robust resource cleanup by explicitly clearing all internal caches and queues, and nullifying timers.
- Addressed potential memory issues by implementing maximum size limits and forced flushing for event and JavaScript error queues.
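A bounded queue that flushes itself at capacity, plus the stack trace cap, might be sketched as follows. The 2048-character limit is taken from the entry above; `BoundedQueue`, `truncateStackTrace`, and the queue size of 3 are hypothetical and chosen only to keep the example small.

```typescript
const MAX_QUEUE_SIZE = 3; // hypothetical value for illustration
const MAX_STACK_TRACE_LENGTH = 2048; // cap named in the changelog entry

interface JsError { errorMessage: string; stackTrace: string }

class BoundedQueue<T> {
  private items: T[] = [];
  private readonly maxSize: number;
  private readonly onFlush: (batch: T[]) => void;

  constructor(maxSize: number, onFlush: (batch: T[]) => void) {
    this.maxSize = maxSize;
    this.onFlush = onFlush;
  }

  push(item: T): void {
    this.items.push(item);
    // Flush as soon as capacity is reached so the queue cannot grow without bound.
    if (this.items.length >= this.maxSize) this.flush();
  }

  flush(): void {
    if (this.items.length === 0) return;
    const batch = this.items;
    this.items = []; // swap first, so items pushed during onFlush land in a new batch
    this.onFlush(batch);
  }

  get size(): number {
    return this.items.length;
  }
}

function truncateStackTrace(stack: string): string {
  return stack.length > MAX_STACK_TRACE_LENGTH ? stack.slice(0, MAX_STACK_TRACE_LENGTH) : stack;
}
```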
Changed
- Updated session management to more actively track user activity, including recording the current path on each session interaction. This also means sessions are verified against the database more frequently to ensure validity.
Added
- Implemented time-to-live (TTL) and maximum size limits for the visitor cache to prevent unbounded memory growth.
- Introduced a maximum size limit for the session cache to prevent unbounded memory growth.
- Added throttled and time-aware cleanup for the bot detection cache, including a time-to-live (TTL) for cached entries.
- Integrated constants for common regex patterns and validation sets to enhance code readability and maintainability.
- Implemented batch processing for JavaScript error flushing, improving performance and resilience for large error queues.
Changed
- Updated internal string sanitization for user profiles and API event payloads to use consistent regex patterns for control character removal and validation.
- Improved the visitor creation lock mechanism to release locks immediately, reducing potential contention.
- Refactored bot detection and user agent parsing to utilize a shared UAParser instance and dedicated constants for efficiency.
- Optimized _processEvent to prevent assigning an empty eventData object when no customData is provided.
Fixed
- Resolved a race condition during visitor creation under high concurrency, ensuring consistent visitor record management.
- Ensured comprehensive cleanup of visitor-related caches and locks during SDK shutdown.
- Corrected the bot detection cache to properly handle expiration with a time-to-live, preventing stale bot classifications.
- Addressed potential memory issues by implementing eviction policies for both visitor and session caches.
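One way to combine a TTL with a maximum size and oldest-first eviction is sketched below. This is an assumption about the general technique, not the SDK's implementation: `TtlCache`, its parameters, and the injectable clock are hypothetical names.

```typescript
interface CacheEntry<V> { value: V; expiresAt: number }

class TtlCache<K, V> {
  private readonly entries = new Map<K, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly maxSize: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without real waiting.
  constructor(ttlMs: number, maxSize: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxSize = maxSize;
    this.now = now;
  }

  get(key: K): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key); // re-insert so an updated key counts as newest
    if (this.entries.size >= this.maxSize) {
      // A Map iterates in insertion order, so the first key is the oldest entry.
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Evicting by insertion order is the simplest policy; a least-recently-used policy would also need to reorder entries on reads.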
Added
- Implemented a proactive administrative token refresh mechanism to prevent API request delays due to expired authentication.
- Introduced in-memory caching for visitor records, significantly reducing database lookups for repeat visitors and improving performance.
- Added in-memory caching for bot detection results to optimize the bot scoring process.
- Enhanced IP blacklist lookups by utilizing a Set data structure for faster checks.
Changed
- Improved event flushing efficiency by implementing parallel writes with smaller batches when sending events to the database.
- Optimized session verification by performing periodic database updates instead of on every event, reducing database load.
- Enhanced bot detection logic with consolidated bot patterns and early exit conditions for high-certainty bot user agents.
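Parallel writes with smaller batches can be illustrated by splitting the queue into chunks and writing them concurrently. This is a sketch under assumptions: `chunk`, `flushInParallel`, and the `writeBatch` callback are hypothetical names, and the real batch size is not stated in the changelog.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// writeBatch is a stand-in for whatever persists one batch to the database.
async function flushInParallel<T>(
  events: readonly T[],
  batchSize: number,
  writeBatch: (batch: T[]) => Promise<void>,
): Promise<void> {
  // All batches are started at once; Promise.all rejects if any write fails.
  await Promise.all(chunk(events, batchSize).map((batch) => writeBatch(batch)));
}
```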
Removed
- Removed the discardShortSessions feature, including its configuration and logic for automatically deleting sessions shorter than one second. This functionality is no longer available in the SDK, only in the dashboard.
Removed
- Removed all dashboard summary generation and flushing logic from the SDK. This includes the _updateDashboardSummary and _flushSummaries methods, associated queues, timers, and constants. This functionality is now handled externally or is deprecated.
- Removed the summaryUpdated flag and related logic from session handling.
Changed
- Enhanced the discardShortSessions feature to prevent premature reporting of short, unengaged sessions to the dashboard. Sessions are now only reported if they last at least 1 second or contain 2 or more events, preventing very short, unengaged sessions from skewing analytics data.
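The reporting rule stated above (at least 1 second, or 2 or more events) reduces to a small predicate. The function and constant names here are hypothetical; only the two thresholds come from the entry.

```typescript
const MIN_DURATION_MS = 1000; // "at least 1 second"
const MIN_EVENTS = 2; // "2 or more events"

// Timestamps are milliseconds since the epoch, as returned by Date.now().
function shouldReportSession(startTime: number, now: number, eventCount: number): boolean {
  return now - startTime >= MIN_DURATION_MS || eventCount >= MIN_EVENTS;
}
```

For example, a session with a single event that is 400 ms old would be held back, while the same session with a second event would be reported immediately.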
Fixed
- Improved the reliability of deleting discarded short sessions by ensuring proper administrative authentication prior to deletion, reducing instances of failed session clean-up.
Added
- Added a new configuration option discardShortSessions to automatically filter out very short sessions (less than 1 second duration) from analytics data.
- Introduced tracking of session startTime to enable accurate session duration calculation and support the new discardShortSessions feature.
Changed
- Enhanced JS error logging to include the website ID, providing better context for error debugging and management.
Added
- Implemented GeoIP lookup to capture visitor state/region information, enhancing data tracking and dashboard summaries.
Added
- Implemented automatic SDK version reporting to the dashboard, providing visibility into the deployed SDK version for each website.
- Introduced an optional privacy setting for IP address storage. When enabled in dashboard settings, the SDK will store full IP addresses for sessions; otherwise, only hashed visitor IDs are stored by default.
Fixed
- Improved session tracking robustness by verifying cached sessions against the database. If a cached session is no longer found in the database, it's invalidated, and a new session is created, preventing issues with stale session IDs.
Fixed
- Fixed a race condition that could cause duplicate visitor records to be created when identify() is called simultaneously with tracking events for the same visitor. Implemented a promise-based locking mechanism (_getOrCreateVisitor) that ensures only one visitor creation happens at a time per visitorId, with concurrent requests waiting for the same promise to resolve.
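Promise-based locking of this kind is commonly built around a map of in-flight promises keyed by visitorId. The sketch below shows that general pattern, not the SDK's actual `_getOrCreateVisitor`; `inFlight`, `getOrCreateVisitor`, and the `fetchOrCreate` callback are hypothetical.

```typescript
interface VisitorRecord { id: string }

// One pending promise per visitorId; concurrent callers share it.
const inFlight = new Map<string, Promise<VisitorRecord>>();

// fetchOrCreate is a stand-in for the database lookup-or-insert.
function getOrCreateVisitor(
  visitorId: string,
  fetchOrCreate: (id: string) => Promise<VisitorRecord>,
): Promise<VisitorRecord> {
  const pending = inFlight.get(visitorId);
  if (pending) return pending; // a creation is already running: wait on the same promise

  const created = fetchOrCreate(visitorId).then(
    (visitor) => {
      inFlight.delete(visitorId); // release the lock on success
      return visitor;
    },
    (err) => {
      inFlight.delete(visitorId); // release the lock on failure too
      throw err;
    },
  );
  inFlight.set(visitorId, created);
  return created;
}
```

The function is deliberately not declared `async`: an `async` wrapper would return a fresh promise on each call, whereas here concurrent callers receive the very same promise object.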
Added
- New private method _getOrCreateVisitor() that centralizes visitor fetching/creation logic with proper concurrency control.
Fixed
- Fixed a race condition that could cause duplicate visitor records to be created when identify() is called simultaneously with tracking events for the same visitor. Both methods now properly handle concurrent visitor creation attempts by catching the error and re-fetching the existing record. This fix was later reworked in 0.13.2.
Added
- Introduced a new identify method (skopos.identify(req, userId, userData?)) to link anonymous visitors with authenticated users. This includes new type definitions for IdentifyData and a robust implementation with extensive data validation and sanitization for user fields like name, email, phone, and metadata.
- Added a new private method _validateAndSanitizeIdentifyData to ensure the integrity and security of user identification data.
Changed
- Significantly enhanced bot detection capabilities by expanding the calculateBotScore function. This includes new rules for detecting longer user agents, additional known bot and automated client user agents (e.g., Playwright, social media bots), security scanning tools (e.g., sqlmap, nmap), older browser versions, and suspicious header patterns (e.g., missing accept-language, x-selenium headers, 'Headless' platform indications). The bot score is now capped at 100.
- Improved and hardened API payload validation and sanitization in validateAndSanitizeApiPayload. Stricter length limits were applied to url, event name, errorMessage, referrer, language, and stackTrace. URL validation now includes protocol checks, and customData validation includes checks for dangerous keys (e.g., __proto__) to prevent prototype pollution attacks.
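The protocol check and the dangerous-key check could be sketched as below. This is not the body of validateAndSanitizeApiPayload: `isSafeUrl`, `hasDangerousKey`, the allowed protocol list, and every key other than `__proto__` are assumptions made for illustration.

```typescript
// Only __proto__ is named in the changelog; the other two keys are assumed.
const DANGEROUS_KEYS = new Set(["__proto__", "constructor", "prototype"]);
// Assumed allow-list; the changelog only says that protocols are checked.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

function isSafeUrl(raw: string): boolean {
  try {
    return ALLOWED_PROTOCOLS.has(new URL(raw).protocol);
  } catch {
    return false; // not parseable as an absolute URL
  }
}

// Recursively looks for dangerous keys in JSON-like data (no cycles expected).
function hasDangerousKey(value: unknown): boolean {
  if (value === null || typeof value !== "object") return false;
  const record = value as Record<string, unknown>;
  // Object.keys lists own keys, including a literal "__proto__" key created by JSON.parse.
  for (const key of Object.keys(record)) {
    if (DANGEROUS_KEYS.has(key)) return true;
    if (hasDangerousKey(record[key])) return true;
  }
  return false;
}
```

Rejecting a payload outright when `hasDangerousKey` returns true is simpler to reason about than trying to strip the offending keys.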
Changed
- Integrated Prettier into the development workflow, adding it as a development dependency and introducing a 'format' script for automated code formatting.
Changed
- Optimized bot detection logic by streamlining user agent regex evaluations in calculateBotScore.
- Reduced verbose debug logging when event, summary, and JavaScript error queues are empty.
Added
- Added a new debug option to SkoposSDKOptions to enable verbose logging for debugging purposes.
- Introduced an internal, timestamped logging utility to provide clearer, level-based insight into SDK operations.
Changed
- Significantly enhanced the SDK's internal logging capabilities, replacing direct console calls with a structured, level-based system.
- Standardized error messages across the SDK for improved consistency.
Changed
- Improved the admin authentication mechanism by replacing the periodic token refresh with an on-demand re-authentication process, ensuring a valid session before all PocketBase interactions. This simplifies the SDK's internal lifecycle management.
Removed
- The adminAuthRefreshTimer and the associated _refreshAdminAuth method, as periodic token refreshing is no longer required with the new on-demand authentication strategy.
Added
- Implemented automatic periodic refreshing of the admin authentication token to prevent session expiration.
Added
- Introduced new utility functions: validateAndSanitizeApiPayload for event data processing and getSanitizedDomain for domain extraction.
- Added dynamic domain tracking functionality based on the website's configuration.
Changed
- Enhanced the trackApiEvent method to include comprehensive validation, sanitization, and clamping of incoming API event payloads. This ensures data integrity and rejects invalid or untrusted data at runtime.
- Implemented domain-specific tracking for trackApiEvent, preventing events from being processed if their URL hostname does not match the configured website domain or its subdomains.
- The SDK now dynamically retrieves and updates the website's domain configuration.
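The hostname check for domain-specific tracking might look like the following. `hostnameMatchesDomain` is a hypothetical helper, not an SDK function, and the exact normalization the SDK applies is not described in the changelog.

```typescript
function hostnameMatchesDomain(eventUrl: string, configuredDomain: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(eventUrl).hostname.toLowerCase();
  } catch {
    return false; // malformed URL: nothing to match against
  }
  const domain = configuredDomain.trim().toLowerCase();
  if (domain === "") return false;
  // Exact match, or a subdomain. The leading dot stops "evil-example.com"
  // from matching "example.com".
  return hostname === domain || hostname.endsWith("." + domain);
}
```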
Added
- Introduced a new score-based system for bot detection, leveraging multiple indicators from user-agent strings and request headers to identify bots more accurately.
Changed
- Enhanced bot detection logic to utilize full request headers in addition to the User-Agent string, allowing for more comprehensive bot identification.
- The SDK now extracts and passes all incoming request headers to internal processing for improved context and bot detection.
- Improved the robustness of visitor ID generation by providing default "unknown" values when IP address or User-Agent are unavailable, ensuring consistent ID creation.
Fixed
- Ensured graceful handling of undefined User-Agent strings in parsing utilities to prevent potential errors.
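A score-based approach adds up weighted indicators from the User-Agent and the request headers. The sketch below is only meant to convey the shape of such a function: every pattern and weight in it is invented for illustration, `calculateBotScoreSketch` is not the SDK's calculateBotScore, and the cap at 100 follows a later entry in this changelog.

```typescript
type RequestHeaders = Record<string, string | undefined>;

// All weights and patterns are hypothetical.
function calculateBotScoreSketch(userAgent: string | undefined, headers: RequestHeaders): number {
  let score = 0;
  const ua = (userAgent ?? "").toLowerCase(); // tolerate an undefined User-Agent

  if (ua === "") score += 50; // no User-Agent at all
  if (/bot|crawler|spider|headless/.test(ua)) score += 60; // self-identified automation
  if (/curl|wget|python-requests/.test(ua)) score += 40; // common HTTP clients
  if (!headers["accept-language"]) score += 20; // browsers normally send this
  if (!headers["accept"]) score += 10;

  return Math.min(score, 100);
}
```

A caller would then compare the result against a threshold of its own choosing, which is also an assumption here since the changelog does not state one.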
Added
- Implemented real-time updates for website configuration, allowing immediate synchronization of settings such as IP blacklists, localhost tracking preferences, and archival status.
- Added eventsource dependency to support real-time subscriptions.
Removed
- The configRefreshIntervalMs SDK option has been removed.
- The periodic website configuration refresh mechanism (polling) has been removed, as it has been replaced by a real-time subscription system.
Added
- Added a new isArchived property to the SDK, enabling the system to identify and respect the archival status of a website.
- Implemented logic to automatically halt the processing of all tracking events for websites that are marked as archived.
Removed
- Removed all UTM parameter tracking and processing features from the SDK, including fields in the API event payload, internal data structures, and summary generation.
Added
- New configRefreshIntervalMs option to customize the interval for refreshing website configurations.
- Automatic synchronization of website settings (e.g., IP blacklists, localhost tracking preferences) from the Skopos Dashboard.
- IP blacklisting functionality to prevent tracking from specified IP addresses.
- Ability to disable tracking of events originating from localhost.
- Added ipaddr.js as a new dependency.
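The blacklist and localhost rules amount to a gate in front of event processing. The sketch below is a simplification under stated assumptions: `WebsiteSettings`, `shouldTrack`, and the field names are hypothetical, and the loopback check uses a fixed set of literal addresses rather than the range-aware parsing a library such as ipaddr.js would allow.

```typescript
// Hypothetical settings shape; the SDK's real field names may differ.
interface WebsiteSettings {
  ipBlacklist: Set<string>;
  disableLocalhostTracking: boolean;
}

// Simplified: a few literal loopback forms only, no range or CIDR handling.
const LOOPBACK_ADDRESSES = new Set(["127.0.0.1", "::1", "::ffff:127.0.0.1"]);

function shouldTrack(ip: string, settings: WebsiteSettings): boolean {
  if (settings.ipBlacklist.has(ip)) return false;
  if (settings.disableLocalhostTracking && LOOPBACK_ADDRESSES.has(ip)) return false;
  return true;
}
```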
Changed
- Improved the skopos.shutdown() method to clear all internal timers for a more robust application shutdown.
- Modified the recommended trackApiEvent endpoint response to 204 No Content for a non-blocking, immediate client acknowledgment.
Removed
- The standalone "Advanced Configuration (Batching)" section from the README.md (batching options are now integrated into the SkoposSDK.init options table).
Added
- Support for JavaScript error tracking.
- A new jsError event type, including errorMessage and stackTrace fields for detailed error reporting.
- A new configuration option, jsErrorBatchInterval, to control the frequency (in milliseconds) at which batched JavaScript error reports are sent.
- Dedicated collection and flushing mechanisms for processing and storing batched JavaScript errors.
- JavaScript errors count and top JavaScript errors breakdown to dashboard summaries.
Changed
- The close method now ensures all pending JavaScript error reports are flushed before shutting down.
Fixed
- Improved robustness of URL path extraction from event payloads, preventing issues with malformed URLs.
Added
- Introduced an in-memory queue for aggregating dashboard summary data, significantly improving performance by reducing direct database writes for daily statistics.
- Added dedicated background processes for flushing aggregated dashboard summaries and proactively cleaning up the session cache.
- Implemented functionality to track entry and exit pages for user sessions.
- Enhanced session tracking to differentiate between new and returning visitors.
- Added detection for engaged sessions based on event count or custom duration data.
- New configuration constants: SUMMARY_FLUSH_INTERVAL_MS for summary flushing and SESSION_CACHE_CLEANUP_INTERVAL_MS for session cache cleanup.
Changed
- Refactored the dashboard summary update mechanism from immediate database writes to an optimized, in-memory aggregation with periodic batched flushes, resulting in improved performance and reduced database load.
- Updated the SDK's shutdown process to ensure all pending events and aggregated summary data are flushed to the database.
- Renamed the flush() method to flushEvents() for clearer distinction between event and summary flushing.
- Improved internal session cache management to store lastPath and eventCount, enabling more accurate engaged session detection and exit page tracking.
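In-memory aggregation with periodic batched flushes can be pictured as an accumulator that is drained on a timer. The sketch below assumes a much smaller summary shape than the real one; `SummaryAggregator`, `DailySummary`, and the date-string key are hypothetical.

```typescript
interface DailySummary {
  pageViews: number;
  topPages: Map<string, number>; // path -> view count
}

class SummaryAggregator {
  // Keyed by day, e.g. "2024-01-31" (an assumed key format).
  private pending = new Map<string, DailySummary>();

  recordPageView(date: string, path: string): void {
    let summary = this.pending.get(date);
    if (!summary) {
      summary = { pageViews: 0, topPages: new Map() };
      this.pending.set(date, summary);
    }
    summary.pageViews += 1;
    summary.topPages.set(path, (summary.topPages.get(path) ?? 0) + 1);
  }

  // Hand the accumulated summaries to the caller and start a fresh window.
  // A periodic timer would call this and write the result in one batch.
  drain(): Map<string, DailySummary> {
    const batch = this.pending;
    this.pending = new Map();
    return batch;
  }
}
```

Each recorded event then costs a couple of map operations, and the database sees one write per day-key per flush interval instead of one per event.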
Added
- Implemented IP-based country detection for visitor geographical analytics.
- Added a new "countryBreakdown" metric to dashboard summaries to track visitor origins.
- Visitor session data now includes the detected country.
Added
- Introduced a new dashboard summary collection (dash_sum) for daily analytics.
- Implemented a new _updateDashboardSummary method to calculate and persist daily aggregates (pageviews, visitors, top pages, referrers, device/browser/language/UTM breakdowns, custom events).
- Integrated _updateDashboardSummary to process data from both new sessions and subsequent events.
- Enhanced SDK initialization (SkoposSDK.init) to validate siteId against backend website records and store the internal websiteRecordId.
- Improved JSDoc comments for clarity, examples, return types, and exception handling for various methods.
Changed
- Refactored internal data storage for visitors and sessions to consistently use the websiteRecordId instead of siteId directly.
Fixed
- Addressed potential initialization issues by ensuring the provided siteId corresponds to a valid website in the backend, throwing an error if not found.
- Implemented robust handling for race conditions during the creation of daily summary records.
- Initial release of the Skopos Node.js SDK.