
Thursday, August 9, 2012

JCR project - part 2

In this post I'll show how to instantiate a repository using a repository home directory and a repository.xml file. To focus on the actual code and project configuration, I am using the XML generated by Jackrabbit when instantiating an automatic, transient repository as shown in the previous post.

These are the steps:

1) Create a src/repository directory in your project folder
2) Create repository.xml in the directory from step 1
3) Paste the content below into the file from step 2


<?xml version="1.0"?>
<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
license agreements. See the NOTICE file distributed with this work for additional
information regarding copyright ownership. The ASF licenses this file to
You under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License. You may obtain a copy of
the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
by applicable law or agreed to in writing, software distributed under the
License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
OF ANY KIND, either express or implied. See the License for the specific
language governing permissions and limitations under the License. -->

<!DOCTYPE Repository
          PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN"
          "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">

<!-- Example repository configuration file, as used by
     org.apache.jackrabbit.core.config.RepositoryConfigTest.java -->
<Repository>
  <!-- virtual file system where the repository stores global state
       (e.g. registered namespaces, custom node types, etc.) -->
  <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository" />
  </FileSystem>

  <!-- data store configuration -->
  <DataStore class="org.apache.jackrabbit.core.data.FileDataStore" />

  <!-- security configuration -->
  <Security appName="Jackrabbit">
    <!-- security manager: class: FQN of class implementing the
         JackrabbitSecurityManager interface -->
    <SecurityManager class="org.apache.jackrabbit.core.DefaultSecurityManager"
                     workspaceName="security">
      <!-- workspace access: class: FQN of class implementing the
           WorkspaceAccessManager interface -->
      <!-- <WorkspaceAccessManager class="..."/> -->
      <!-- <param name="config" value="${rep.home}/security.xml"/> -->
    </SecurityManager>

    <!-- access manager: class: FQN of class implementing the
         AccessManager interface -->
    <AccessManager class="org.apache.jackrabbit.core.security.DefaultAccessManager">
      <!-- <param name="config" value="${rep.home}/access.xml"/> -->
    </AccessManager>

    <LoginModule class="org.apache.jackrabbit.core.security.authentication.DefaultLoginModule">
      <!-- anonymous user name ('anonymous' is the default value) -->
      <param name="anonymousId" value="anonymous" />
      <!-- administrator user id (default value if param is missing is 'admin') -->
      <param name="adminId" value="admin" />
    </LoginModule>
  </Security>

  <!-- location of workspaces root directory and name of default workspace -->
  <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" />

  <!-- workspace configuration template: used to create the initial workspace
       if there's no workspace yet -->
  <Workspace name="${wsp.name}">
    <!-- virtual file system of the workspace: class: FQN of class
         implementing the FileSystem interface -->
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${wsp.home}" />
    </FileSystem>
    <!-- persistence manager of the workspace: class: FQN of class
         implementing the PersistenceManager interface -->
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.DerbyPersistenceManager">
      <param name="url" value="jdbc:derby:${wsp.home}/db;create=true" />
      <param name="schemaObjectPrefix" value="${wsp.name}_" />
    </PersistenceManager>
    <!-- search index and the file system it uses: class: FQN of class
         implementing the QueryHandler interface -->
    <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
      <param name="path" value="${wsp.home}/index" />
      <param name="supportHighlighting" value="true" />
    </SearchIndex>
  </Workspace>

  <!-- configures versioning -->
  <Versioning rootPath="${rep.home}/version">
    <!-- configures the file system to use for versioning for the
         respective persistence manager -->
    <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/version" />
    </FileSystem>

    <!-- configures the persistence manager to be used for persisting version
         state. Please note that the current versioning implementation is based
         on a 'normal' persistence manager, but this could change in future
         implementations. -->
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.DerbyPersistenceManager">
      <param name="url" value="jdbc:derby:${rep.home}/version/db;create=true" />
      <param name="schemaObjectPrefix" value="version_" />
    </PersistenceManager>
  </Versioning>

  <!-- search index for content that is shared repository wide
       (/jcr:system tree, contains mainly versions) -->
  <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${rep.home}/repository/index" />
    <param name="supportHighlighting" value="true" />
  </SearchIndex>

  <!-- run with a cluster journal -->
  <Cluster id="node1">
    <Journal class="org.apache.jackrabbit.core.journal.MemoryJournal" />
  </Cluster>
</Repository>


4) Edit your pom.xml to instruct Maven to copy the repository directory and its content into target/classes. You simply need to declare the resource in the build section of the pom.


<build>
  <resources>
    <resource>
      <directory>src/repository</directory>
      <includes>
        <include>*/**</include>
      </includes>
      <targetPath>repository</targetPath>
    </resource>
  </resources>
</build>


If you now run mvn clean test-compile you should see these two entries:

  • target/classes/repository
  • target/classes/repository/repository.xml


5) If you have an interface that your repository initializer implements, change the initializeRepository method so that it takes two parameters: (String configFile, String repHomeDir).
configFile points to the repository.xml file (a path relative to your project home is enough for test purposes), while repHomeDir points to the repository home directory.
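To make the relative-path remark concrete, here is a minimal, self-contained sketch of how those two parameters could be held and resolved. The class and constant names are mine, not from the project; the constant values simply mirror the directory layout described in this post. In the real initializer the two strings would be forwarded to RegistryHelper as shown in the next step.

```java
import java.io.File;

// Sketch only: holds the two new parameters of initializeRepository.
// CONFIG_FILE and REP_HOME_DIR assume the layout used in this post.
class InitializerPaths {

    static final String CONFIG_FILE = "src/repository/repository.xml";
    static final String REP_HOME_DIR = "target/repository";

    // Relative paths are resolved against the working directory, which for
    // Maven-run tests is the project home - hence "relative is enough".
    static String resolve(String relativePath) {
        return new File(relativePath).getAbsolutePath();
    }
}
```

Under Maven, `resolve(CONFIG_FILE)` ends in src/repository/repository.xml inside your project home, which is why no absolute path is needed in the tests.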

6) The last step is creating the repository using the RegistryHelper. Note that the code leverages the Java Naming and Directory Interface (JNDI).
Rather than creating a repository automatically with the TransientRepository, as follows:

repository = new TransientRepository(new File("target"));

You need to execute this code:

Hashtable<String, String> env = new Hashtable<String, String>();
env.put(Context.INITIAL_CONTEXT_FACTORY,
        "org.apache.jackrabbit.core.jndi"
        + ".provider.DummyInitialContextFactory");
env.put(Context.PROVIDER_URL, "localhost");

InitialContext ctx = new InitialContext(env);

RegistryHelper.registerRepository(ctx, "repo", configFile, repHomeDir, true);

repository = (Repository) ctx.lookup("repo");

Note that the automatic approach instead takes as a parameter the directory in which the repository.xml file and the repository directory will be created.

Run your tests: everything should work without any problem. This is one of the big advantages of using JCR: changing the underlying storage implementation does not require any change to your business logic!

You can now try to play with the repository DataStore for the binary files, or with the PersistenceManager to use a different implementation. This link may be handy.

The source code is here (branch: manual_transient)

JCR project - part 1

This is a brief tutorial on how to start playing with JCR and Jackrabbit. At the end of it you will have some code that creates a repository and allows you to interact with it by adding and searching content.

To keep the configuration simple you are going to use the automatic configuration for a transient repository. This means that you don't need to create a repository descriptor to define how your data will be stored. Moreover, since the repository is transient, all the data will be lost when you shut down the program.

The technologies involved are:

  • JCR 2.0
  • Jackrabbit 2.5.0
  • Derby 10.9.1.0
  • SLF4J 1.6.6

The source code of the project can be found here on GitHub (branch: automatic_transient), but I will show you how to implement it yourself.


Let's go step by step:

1) Create a quickstart project using Maven (follow the link for details)

You can use these as the project parameters:


<groupId>com.acme</groupId>
<artifactId>jcr-poc</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>jcr-poc</name>


1.1) If you want to import the project into Eclipse, simply run mvn eclipse:eclipse in the project directory

2) Add the following dependencies to your pom:


<dependencies>
  <dependency>
    <groupId>org.apache.jackrabbit</groupId>
    <artifactId>jackrabbit-core</artifactId>
    <version>2.5.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.derby</groupId>
    <artifactId>derby</artifactId>
    <version>10.9.1.0</version>
  </dependency>
  <dependency>
    <groupId>javax.jcr</groupId>
    <artifactId>jcr</artifactId>
    <version>2.0</version>
  </dependency>
  <dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.6.6</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.6.6</version>
  </dependency>
  <!-- Test -->
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.9</version>
    <scope>test</scope>
  </dependency>
</dependencies>

3) Now that you are all set, it's time to create (initialize) your repository. Create a RepositoryInitializerImpl class that implements the following method:

  • Session initializeRepository() throws Exception
The goal of this method is to create:
  • the repository
  • a session
  • a workspace
  • a namespace 
that will be used to manipulate the content.

public Session initializeRepository() throws Exception {

    repository = new TransientRepository(new File("target"));
    log.debug("Transient repository created");

    session = repository.login(new SimpleCredentials("admin",
            "admin".toCharArray()));
    log.debug("Session created");

    workspace = session.getWorkspace();
    log.debug("Workspace created");

    Node root = session.getRootNode();
    log.debug("Root node created");

    try {
        workspace.getNamespaceRegistry().registerNamespace(namespace, url);
        log.debug("Namespace registered as: " + namespace + ", " + url);
    } catch (NamespaceException e) {
        log.warn(e.getMessage());
    }

    repoMainNode = root.addNode(namespace + REPOSITORY);

    log.debug("Saving the session");
    session.save();

    return session;
}

Notice that a session is created by logging in to the repository. When you run the tests (you will!), open the repository.xml under the target directory and take a look at the LoginModule element. Logging in as an anonymous user does not require a password, but it does not allow the creation of any namespace.

Within this method you can actually add content to the root node or to repoMainNode and then use the session to commit your operations. In addition, I implemented a DataHandlerImpl class that performs addition and search. The design is flawed, so feel free to improve it!

4) Create Article and Author beans

The schema that I used is something like this:

<repository>
     <article doi='1'>
          <authors>
               <author firstname='io' lastname='me'/>
               <author firstname='you' lastname='yourself'/>
          </authors>
     </article>
     <article doi='2'>
          <authors>
               <author firstname='io' lastname='me'/>
               <author firstname='he' lastname='himself'/>
          </authors>
     </article>
</repository>
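The post doesn't list the beans themselves, so here is a minimal sketch of what Article and Author could look like. The field names follow the schema above (doi, firstname, lastname) plus the title property used later when adding articles; everything else (constructors, an addAuthor helper) is my own choice, not taken from the project.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal POJO sketches matching the schema above; assumed, not copied
// from the project's source.
class Author {
    private final String firstName;
    private final String lastName;

    Author(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFirstName() { return firstName; }
    String getLastName()  { return lastName; }
}

class Article {
    private final String doi;
    private final String title;
    private final List<Author> authors = new ArrayList<Author>();

    Article(String doi, String title) {
        this.doi = doi;
        this.title = title;
    }

    String getDoi()   { return doi; }
    String getTitle() { return title; }
    List<Author> getAuthors() { return authors; }

    void addAuthor(Author author) { authors.add(author); }
}
```

These getters are exactly what the node-creation snippet below relies on (article.getDoi(), article.getTitle(), article.getAuthors()).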


You can think of JCR as an n-ary tree where each node has one or more types associated with it, and any node can be linked to any other by using an id. In order to retrieve nodes from the repository you can use:

  • XPath (deprecated in version 2.0)
  • SQL2
  • SQL
  • JQOM (Java Query Object Model)
I actually used XPath to implement the search but I will soon rewrite (and post) the operations using JQOM and SQL2. The operations are:
  • add a new article
  • add an author to an article
  • retrieve articles for an author
Adding a new article is straightforward, as shown below:

Node articleNode = repoMainNode.addNode(namespace + ARTICLE_PREFIX);

articleNode.setProperty(namespace + DOI_PREFIX, article.getDoi());
articleNode.setProperty(namespace + TITLE_PREFIX, article.getTitle());

Node authorsNode = articleNode.addNode(namespace + AUTHORS_LIST_PREFIX);

for (Author author : article.getAuthors()) {
    Node authorNode = authorsNode.addNode(namespace + AUTHOR_PREFIX);
    authorNode.setProperty(namespace + AUTHOR_FIRST_NAME_PREFIX,
            author.getFirstName());
    authorNode.setProperty(namespace + AUTHOR_LAST_NAME_PREFIX,
            author.getLastName());
}

session.save();

Saving the session is the equivalent of a commit in a database transaction, so you need to do it in order to persist your data.

Searching is less straightforward, but it is still pretty easy. You need to use the QueryManager object to create and execute a query in any of the supported languages. Here is a snippet of the code that retrieves an article based on its doi:

QueryManager queryManager = workspace.getQueryManager();
Query query = queryManager.createQuery("//" + namespace + ":article[@"
        + namespace + ":doi = '" + article.getDoi() + "']", Query.XPATH);

log.debug(query.getStatement());
QueryResult results = query.execute();

When you run the code you should see something similar to this query statement: //test:article[@test:doi = '10.1038/2012.11109']

Notice that I used test as the namespace for my content.
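Since the operations are due to be rewritten in SQL2, here is a hedged sketch of how the same doi lookup could look as a JCR-SQL2 statement. Only the statement text is built here, so the snippet stays self-contained; in the real code the string would be handed to queryManager.createQuery(sql2, Query.JCR_SQL2). The nt:unstructured node type is Jackrabbit's default for nodes created without an explicit type, and the helper name is mine.

```java
// Sketch only: builds the JCR-SQL2 text equivalent to the XPath doi lookup.
// NAME(article) compares the node name, mirroring //test:article in XPath.
class Sql2Queries {
    static String byDoi(String namespace, String doi) {
        return "SELECT * FROM [nt:unstructured] AS article"
             + " WHERE NAME(article) = '" + namespace + ":article'"
             + " AND article.[" + namespace + ":doi] = '" + doi + "'";
    }
}
```

For the doi in the log line above, byDoi("test", "10.1038/2012.11109") produces a statement that selects the same article node as the XPath query.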

A slightly more complex query is needed to get the articles for a given author; at runtime you'll see this entry somewhere in your log: //element(test:article, nt:unstructured)[test:authors/test:author/test:firstName='John' and test:authors/test:author/test:lastName='Doe']
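That statement can be assembled with the same string-building style as the doi query. A sketch (the helper and parameter names are mine; the test prefix mirrors the namespace used throughout this post):

```java
// Sketch only: builds the XPath statement shown in the log entry above.
// In the real code the result would go to queryManager.createQuery(..., Query.XPATH).
class XPathQueries {
    static String byAuthor(String ns, String firstName, String lastName) {
        return "//element(" + ns + ":article, nt:unstructured)"
             + "[" + ns + ":authors/" + ns + ":author/" + ns + ":firstName='" + firstName + "'"
             + " and " + ns + ":authors/" + ns + ":author/" + ns + ":lastName='" + lastName + "']";
    }
}
```

byAuthor("test", "John", "Doe") reproduces the log entry above character for character.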

Take a look at this link for some more information about XPath and JCR.

Last but not least, you can write some tests to see how your repository works. I used the JUnit @Before and @After annotations. As a matter of fact, before running each test I create my DataHandlerImpl object, setting the repository parameters (here is the flaw), while after each test I execute session.logout() in order to close the session.

If you haven't done it before, I suggest you take a look at the code on GitHub (branch: automatic_transient). I also strongly encourage you to look at the long and verbose log that your tests will produce and, more importantly, at the XML files in the repository that Jackrabbit created for you.


Monday, May 7, 2012

Karaf and Graylog2 (log4j appenders in general)

In order to use Graylog2 properly within ServiceMix and Karaf, some configuration steps are required. According to the Karaf documentation, these are the steps:
  1. add a new appender in <servicemix_home>/etc/org.ops4j.pax.logging.cfg
  2. generate an OSGi bundle using the Fragment-Host element in the Maven Felix plugin
  3. add the bundle to the ServiceMix installation
  4. update the DS component's log4j.properties file

Step 1.

On your ServiceMix server, go to the servicemix_home directory and open the file etc/org.ops4j.pax.logging.cfg in any text editor. Add the following lines at the bottom:
#Graylogger
log4j.logger.my.package.logging.to.graylog=INFO, graylog2
log4j.appender.graylog2=org.graylog2.log.GelfAppender
log4j.appender.graylog2.graylogHost=<graylog_server_address>
log4j.appender.graylog2.facility=<application_name>
log4j.appender.graylog2.Threshold=INFO
The log4j.logger line (the first one above) sets which package's logging goes through the graylog2 appender for the agent. Let's say we want DSS to use Graylog2: then we need to add a similar line there.

Step 2.

Clone the repository (branch: osgi-bundle-enable) and generate the bundle using the command: mvn clean install
Note that the pom.xml already contains all the configuration for the Felix plugin as defined in the Karaf documentation:
            <plugin>
                <groupId>org.apache.felix</groupId>
                <artifactId>maven-bundle-plugin</artifactId>
                <version>2.3.5</version>
                <extensions>true</extensions>
                <configuration>
                    <instructions>
                        <Bundle-Name>...</Bundle-Name>
                        <Bundle-SymbolicName>...</Bundle-SymbolicName>
                        <Import-Package>!*</Import-Package>
                        <Embed-Dependency>*;scope=compile|runtime;inline=true</Embed-Dependency>
                        <Fragment-Host>org.ops4j.pax.logging.pax-logging-service</Fragment-Host>
                        <Implementation-Version>...</Implementation-Version>
                    </instructions>
                </configuration>
            </plugin>

Step 3.

Copy the generated file gelfj-0.9.1-SNAPSHOT.jar on test-integral to <servicemix_home>/system/org/ops4j/logging/gelfj/0.9.1-SNAPSHOT/gelfj-0.9.1-SNAPSHOT.jar, creating the missing directories if needed.
The etc/startup.properties file must be updated as well, adding the gelfj line (the middle one below) between the two existing entries:
org/ops4j/pax/url/pax-url-wrap/1.2.4/pax-url-wrap-1.2.4.jar=5
org/ops4j/logging/gelfj/0.9.1-SNAPSHOT/gelfj-0.9.1-SNAPSHOT.jar=7
org/ops4j/pax/logging/pax-logging-api/1.5.3/pax-logging-api-1.5.3.jar=8
Then restart ServiceMix: sudo /etc/init.d/servicemix restart

Step 4.

To set up the bundle that is going to log through Graylog, its log4j.properties must be updated as listed below:
log4j.logger.com.nature.dsm.agent.bean.graylog=INFO, graylog2
log4j.appender.graylog2=org.graylog2.log.GelfAppender
log4j.appender.graylog2.graylogHost=<graylog_server_address>
log4j.appender.graylog2.facility=<application_name>
log4j.appender.graylog2.Threshold=INFO
Redeploy the component on your ServiceMix instance and have it write some entries to Graylog2 (i.e., exercise the point where your business logic is logging), then check that those entries are saved in Graylog.

Appenders in servicemix

This approach can be applied to any other appender that you want to use in any bundle, whether it uses Camel in Karaf or not. As a matter of fact, you generally want some log information written to a separate appender (e.g. a file) for debugging purposes.
This can be easily configured by adding the appender in etc/org.ops4j.pax.logging.cfg like:

log4j.logger.my.package.logging.to.the.file=DEBUG, F

log4j.appender.F=org.apache.log4j.DailyRollingFileAppender
log4j.appender.F.File=<servicemix_home>/data/log/app_log.log
log4j.appender.F.layout=org.apache.log4j.PatternLayout
log4j.appender.F.layout.ConversionPattern=%d{ABSOLUTE} %d{ABSOLUTE} [%5p - %t] %c: %m%n

Then add the same configuration to your bundle and redeploy. The nice thing is that you don't need to restart Karaf or ServiceMix for the new logging configuration to take effect.

Note

Check out this page if you want to install a local instance of Graylog:

http://blog.milford.io/2012/03/installing-graylog2-0-9-6-elasticsearch-0-18-7-mongodb-2-0-3-on-centos-5-with-rvm/