ARTÍCULO VII ASUNTOS LEGALES
VII- a) OFICIO AL-145-2016SJ ANÁLISIS Y RECOMENDACIONES OFICIO OB-086-.2016
Caching in a clustered environment is difficult because each machine will update the database directly, but not update the other machine's caches, so each machines cache can become out of date. This does not mean that caching cannot be used in a cluster, but you must be careful in how it is configured.
For read-only objects caching can still be used. For read mostly objects, caching can be used, but some mechanism should be used to avoid stale data. If stale data is only an issue for writes, then using optimistic locking will avoid writes occurring on stale data. When an optimistic lock exception occurs, some JPA providers will automatically refresh or invalidate the object in the cache, so if the user or application retries the transaction the next write will succeed. Your application could also catch the lock exception and refresh or invalidate the object, and potentially retry the transaction if the user does not need to be notified of the lock error (be careful doing this though, as normally the user should be aware of the lock error). Cache
invalidation can also be used to decrease the likelyhood of stale data by setting a time to live on the cache. The size of the cache can also affect the occurrence of stale data.
Although returning stale data to a user may be an issue, normally returning stale data to a user that just updated the data is a bigger issue. This can normally be solved through session
infinitely, but ensuring the user interacts with the same machine in the cluster for the duration of their session. This can also improve cache usage, as the same user will typically access the same data. It is normally also useful to add a refresh button to the UI, this will allow the user to refresh their data if they think their data is stale, or they wish to ensure they have data that is up to date. The application can also choose the refresh the objects in places where up to date data is
important, such as using the cache for read-only queries, but refreshing when entering a transaction to update an object.
For write mostly objects, the best solution may be to disable the cache for those objects. Caching provides no benefit to inserts, and the cost of avoiding stale data on updates may mean there is no benefit to caching objects that are always updated. Caching will add some overhead to writes, as the cache must be updated, having a large cache also affects garbage collection, so if the cache is not providing any benefit it should be turned off to avoid this overhead. This can depend on the complexity of the object though, if the object has a lot of complex relationships, and only part of the object is updated, then caching may still be worth it.
[edit] Cache Coordination
One solution to caching in a clustered environment is to use a messaging framework to coordinate the caches between the machines in the cluster. JMS or JGroups can be used in combination with JPA or application events to broadcast messages to invalidate the caches on other machines when an update occurs. Some JPA and cache providers support cache
coordination in a clustered environment.
TopLink / EclipseLink : Support cache coordination in a clustered environment using JMS or RMI. Cache coordination is configured through the @Cache annotation or <cache> orm.xml element, and using the persistence unit property
eclipselink.cache.coordination.protocol.
[edit] Distributed Caching
A distributed cache is one where the cache is distributed across each of the machines in the cluster. Each object will only live on one or a set number of the machines. This avoids stale data, because when the cache is accessed or updated the object is always retrieved from the same location, so is always up to date. The draw back to this solution is that cache access now
potentially requires a network access. This solution works best when the machines in the cluster are connected together on the same high speed network, and when the database machine is not as well connected, or is under load. A distributed cache reduces database access, so allows the application to be scaled to a larger cluster without the database becoming a bottleneck. Some distributed cache providers also provide a local cache, and offer cache coordination between the caches.
TopLink : Supports integration with the Oracle Coherence distributed cache.
[edit] Cache Transaction Isolation
When caching is used, the consistency and isolation of the cache becomes as important as the database transaction isolation. For basic cache isolation it is important that changes are not committed into the cache until after the database transaction has been committed, otherwise uncommitted data could be accessed by other users.
Caches can be either transactional, or non-transactional. In a transactional cache, the changes from a transaction are committed to the cache as a single atomic unit. This means the
objects/data are first locked in the cache (preventing other threads/users from accessing the objects/data), then updated in the cache, then the locks are released. Ideally the locks are
obtained before committing the database transaction, to ensure consistency with the database. In a non-transactional cache the objects/data are updated one by one without any locking. This means there will be a brief period where the data in the cache is not consistent with the database. This may or may not be an issue, and is a complex issue to think about and discuss, and gets into the issue of locking, and the application's isolation requirements.
Optimistic locking is another important consideration in cache isolation. If optimistic locking is used, the cache should avoid replacing new data, with older data. This is important when reading, and when updating the cache.
Some JPA providers may allow for configuration of their cache isolation, or different caches may define different levels of isolation. Although the defaults should normally be used, it can be important to understand how the usage of caching is affecting your transaction isolation, as well as your performance and concurrency.
[edit] Common Problems
[edit] I can't see changes made directly on the database, or from another server
This means you have either enabled caching in your JPA configuration, or your JPA provider caches by default. You can either disable the 2nd level cache in your JPA configuration, or refresh the object or invalidate the cache after changing the database directly. See, Stale Data
TopLink / EclipseLink : Caching is enabled by default. To disable caching set the persistence property "eclipselink.cache.shared.default" to false in your
persistence.xml or persistence properties. You can also configure this on a per class basis if you want to allow caching in some classes and not in others. See, EclipseLink FAQ.
[edit] Spring
Spring is an application framework for Java. Spring is an IoC container that allows for a different programming model. Spring is similar to a JEE server, in that it provides a transaction service, XML deployment, annotation processing, byte-code weaving, and JPA integration.
[edit] Persistence
Persistence in Spring in normally done through a DAO (Data Access Object) layer. The Spring DAO layer is meant to encapsulate the persistence mechanism, so the same application data access API would be given no matter if JDBC, JPA or a native API were used.
Spring also defines a transaction manager implementation that is similar to JTA. Spring also supports transactional annotations and beans similar to SessionBeans.
[edit] JPA
Spring has specific support for JPA and can emulate some of the functionality of a JEE container with respect to JPA. Spring allows a JPA persistence unit to be deployed in container managed mode. If the spring-agent is used to start the JVM, Spring can deploy a JPA persistence unit with weaving similar to a JEE server. Spring can also pass a Spring DataSource and integrate its
transaction service with JPA. Spring allows the @PersistenceUnit and @PersistenceContext annotations to be used in any Spring bean class, to have an EntityManager or
EntityManagerFactory injected. Spring supports a managed transactional EntityManager similar to JEE, where the EntityManager binds itself as a new persistence context to each new transaction and commits as part of the transaction. Spring supports both JTA integration, and its own transaction manager.