How Caching Works In Data Virtualization Environments

Managing Performance and SLAs

I am often asked how to manage query performance against frequently accessed data sources, whether to minimize the impact on operational systems or to meet service level agreements.

While this can be a challenge in large-scale data virtualization environments, caching (also known as materialized views) is an excellent complement to query optimization.

Caching Flexibly Persists Data to Meet Service Level Needs

Mature data virtualization platforms provide multiple caching options and techniques.

These let you flexibly persist queried data to meet data delivery service level agreements and protect source system performance.

  • Any View, Any Service, Any Procedure – Any view, service or procedure may be cached for future use, and all caches may be periodically and automatically refreshed to stay synchronized with their systems of record. Queries are processed against caches just as if you were querying the original data source.
  • Multiple Cache Repository Options – It’s a good idea to cache data alongside other frequently accessed sources. Composite, for instance, can cache on DB2, Greenplum, Microsoft SQL Server, MySQL, Netezza, Oracle, Sybase, Teradata and Vertica.
  • Event-driven Refresh – Updating a cache according to defined business rules lets refreshes follow business events and activities rather than the clock.
  • Scheduled Refresh – Updating a cache based on set times is useful in more schedule-driven environments.
  • Manual Refresh – Updating a cache on demand, for example when a report is run, provides an additional option.
  • Incremental Refresh – Updating only the changed portion of a cache, based on triggered changes, is useful for large data sets with frequent refreshes.
  • Native Data Source Load – Using the target repository’s native load functions to load and refresh the cache accelerates loading times by 10x or more over a typical SQL insert.
  • Parallel Load – Using multiple threads to load a cache in parallel also accelerates loads.
  • Centralized Caching – In centralized mode, all cached data is stored in a single cache repository. Centralized cache refreshes are fully configurable including timed refresh, event-based refresh (CJM or JMS message), incremental refresh and forced refresh.
  • Distributed Caching – In distributed mode, users dedicate one or more data virtualization servers as edge servers and configure edge cache policies. Edge cache policies let you control which cached data is replicated from the central cache to the edge location, along with the refresh rules. Refresh can be time-based, event-based or incremental.
  • Clustered Deployment – For clustered deployments, a centralized cache reduces the need for each cluster node to re-fetch the data from the source, which significantly reduces the impact on production data sources.
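
To make the refresh options above a little more concrete, here is a minimal sketch in Python of how scheduled, manual and incremental refresh relate to one another. It is an illustration only, not Composite's API: the `CachedView` class, the `fetch_all` and `fetch_since` callables, the `ttl_seconds` setting and the `(id, payload, modified_at)` row layout are all hypothetical names I have assumed for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Assumed row layout for the sketch: (id, payload, modified_at)
Row = Tuple[int, str, float]


@dataclass
class CachedView:
    """Hypothetical cache in front of a source query (not a vendor API)."""
    fetch_all: Callable[[], List[Row]]         # full query against the source
    fetch_since: Callable[[float], List[Row]]  # rows modified after a timestamp
    ttl_seconds: float                         # scheduled-refresh interval
    clock: Callable[[], float] = time.time     # injectable for testing
    _rows: Dict[int, Row] = field(default_factory=dict)
    _loaded_at: Optional[float] = None
    _high_water: float = 0.0                   # newest modified_at seen so far

    def full_refresh(self) -> None:
        """Manual (on-demand) refresh: reload everything from the source."""
        rows = self.fetch_all()
        self._rows = {r[0]: r for r in rows}
        self._high_water = max((r[2] for r in rows), default=0.0)
        self._loaded_at = self.clock()

    def incremental_refresh(self) -> int:
        """Apply only rows changed since the last load.

        Note: this simple version picks up inserts and updates but does
        not detect rows deleted at the source.
        """
        changed = self.fetch_since(self._high_water)
        for r in changed:
            self._rows[r[0]] = r
            self._high_water = max(self._high_water, r[2])
        self._loaded_at = self.clock()
        return len(changed)

    def query(self) -> List[Row]:
        """Serve from the cache, reloading first if the interval has elapsed."""
        expired = (
            self._loaded_at is None
            or self.clock() - self._loaded_at >= self.ttl_seconds
        )
        if expired:
            self.full_refresh()
        return sorted(self._rows.values())
```

In this sketch, callers always go through `query()`, so the source system is touched only when the cache is empty, when the interval has elapsed, or when someone explicitly calls one of the refresh methods. An event-driven refresh would simply call `incremental_refresh()` or `full_refresh()` from whatever message handler receives the event.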

Enjoy the Flexibility

As you can see, caching’s many options give architects and developers significant flexibility to address nearly any performance or SLA challenge.

Caching is easy to implement, and easy to change as conditions change. Take advantage.

3 Responses to How Caching Works In Data Virtualization Environments

  1. Tim Cassidy says:

    Brilliant list, this is a question I get asked about a lot!!

    Do any of these cover “in-memory caching” on the Composite servers to act like some sort of low-latency data store?

  2. Robert Eve says:

    Composite provides high performance caching (using bulk loads vs inserts, parallel loading) for a number of databases including Greenplum, IBM DB2, Microsoft SQL Server, MySQL, Netezza, Oracle, Sybase, Teradata and Vertica.

    • These write-ups are very high level. Are there any technical white papers with real-world examples? We know the benefits, but I have noticed that the site is short on implementation docs that show how it’s really done. Are these types of docs available?
