Find activate-immediately services that lack an activate method

Immediate service activation in OSGi can be tricky but there are some basic rules to consider. Another point to think about is when a service doesn’t contain an activate method. The code base I work in uses Felix’s SCR annotations which makes this search pretty concise. I also assume that the code is in git. If your code isn’t, you should be able to replace git with find <dir> -type f -exec <grep-fu here>.

Find activate-immediately services

Using a little git and grep-fu, we can find our services that activate immediately.

git grep -l 'immediate[[:space:]]\?=[[:space:]]\?true'

The command below makes a few assumptions:

  1. We use SCR annotations that allow immediate = (true|false)
  2. We use immediate = true almost never except in the @Component annotation.

Find services that have an activate method

For our next trick, we’ll find the set of files that have an activate method as noted by the use of an @Activate annotation.

git grep -l '@Activate'

Find the intersection of activate-immediately services that lack an activate method

Now to the fun. With the information we’ve grepped above, we need the things that are in set 1 but not in set 2. There’s a great little tool in linux called comm that can help us with the set operations. It needs sorted data to work correctly, so we’ll dress up our previous commands and send them in.

comm -23 <(git grep -l 'immediate[[:space:]]\?=[[:space:]]\?true' | sort) <(git grep -l '@Activate' | sort)

The -23 argument tells comm that we want to suppress unique items from the second set (activate method list) and to suppress items that exist in both sets (immediate + activate). This leaves us with services that don’t have an activate method. If you want to see services that are immediate and have an activate method, change the argument to -12.

Advertisements

Understanding the ‘unresolved constraint’, ‘missing requirement’ message from Apache Felix Pt. 2

We previously took a look at Felix’s unresolved constraint message. I started testing with Felix 4.0.2 today and realized the output for an unresolved constraint has changed a bit.

ERROR: Bundle org.sakaiproject.nakamura.world [76] Error starting file:bundles/org.sakaiproject.nakamura.world_1.4.0.SNAPSHOT.jar (org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.world [76]: Unable to resolve 76.0: missing requirement [76.0] osgi.wiring.package; (&(osgi.wiring.package=javax.servlet)(version>=3.0.0)))
11.07.2012 17:01:43.297 *ERROR* [FelixDispatchQueue] org.sakaiproject.nakamura.world FrameworkEvent ERROR (org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.world [76]: Unable to resolve 76.0: missing requirement [76.0] osgi.wiring.package; (&(osgi.wiring.package=javax.servlet)(version>=3.0.0))) org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.world [76]: Unable to resolve 76.0: missing requirement [76.0] osgi.wiring.package; (&(osgi.wiring.package=javax.servlet)(version>=3.0.0))
 at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3826)
 at org.apache.felix.framework.Felix.startBundle(Felix.java:1868)
 at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1191)
 at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:295)
 at java.lang.Thread.run(Thread.java:679)

We have the same unresolved constraint..missing requirement as before but the part we’re interested in has changed a bit. Let’s break apart that first message.

(org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.world [76]: Unable to resolve 76.0: missing requirement [76.0] osgi.wiring.package; (&(osgi.wiring.package=javax.servlet)(version>=3.0.0)))

Given what we know from last time, the interesting bits above are:

Unresolved constraint in bundle org.sakaiproject.nakamura.world [76]

This tells you what bundle had an issue trying to resolve a constraint. The next part is a bit less obvious but can be broken up.

osgi.wiring.package; (&(osgi.wiring.package=javax.servlet)(version>=3.0.0)))

osgi.wiring.package looks pretty foreign, hu? Disregard that and you see javax.servlet and version>=3.0.0. Tada! That’s the good stuff. So, figure out where that bundle is that exports javax.servlet>=3.0.0 and you’re on your way. (Hint: maybe, javax.servlet:javax.servlet-api:3.0.0 or org.ops4j.pax.web:pax-web-jetty-bundle:2.0.1 is what you’re looking for.)

Getting started with Pax Runner

After fighting through a Maven assembly for a small project, I just couldn’t take that headache again. I’ve used Apache Sling’s Maven Launchpad Plugin to put together a standalone OSGi server but Launchpad doesn’t allow you to pick which OSGi container you deploy to or what version of Felix gets used. I’ve started working with Pax Runner because a) it looks pretty nifty, b) those Pax folks are doing great stuff for the OSGi deployers out there.

Pax Runner has a few sweet features I’m really digging right now.

Different OSGi platforms and versions

I generally develop on Apache Felix, but if I should be able to run in any other OSGi container, right? Well, to test that theory I can ask Pax Runner to load up my profile using a different platform (--platform; defaults to ‘felix‘) and the version of that platform (--version).

If I want to test my setup with Equinox I just run:

pax-run --platform=equinox awesome-profile.composite

Knopflerfish you say?

pax-run --platform=knopflerfish awesome-profile.composite

What about an older version of Felix? Easy!

pax-run --platform=felix --version=3.0.8 awesome-profile.composite

Deploy multiple profiles and build composite profiles

I plan to run my project with a single profile, but if you find the need to include other profiles it’s just another command line switch:

pax-run --profiles=config,log awesome-profile.composite

“But I want a specific version of a profile.” And you should! So use this:

pax-run --profiles=config/1.0.0,log/1.2.0 awesome-profile.composite

Profiles? What’s this crazy talk you speak?!

So, the root of all this chatter is Pax Runner Profiles. I can barely speak to the topic, but will attempt to anyway. (Don’t trust me; read the documentation)

A short glimpse into my profile file shows that I just include bundles that I know to live in a Maven repository:

scan-bundle:mvn:commons-io/commons-io/1.4@1
scan-bundle:mvn:commons-fileupload/commons-fileupload/1.2.2@1
scan-bundle:mvn:commons-collections/commons-collections/3.2.1@1
scan-bundle:mvn:commons-lang/commons-lang/2.6@1
scan-bundle:mvn:commons-pool/commons-pool/1.5.6@1
scan-bundle:mvn:commons-codec/commons-codec/1.5@1

A quick explanation of these lines is simply:
scan-bundle:mvn:<groupId>/<artifactId>/<version>@<startLevel>

For those looking for more OSGi goodness, Pax Web has really stepped up with the 2.0.0 release. You can configure your Jetty server by deploying a bundle fragment with the appropriate jetty.xml file. Awesome!

Understanding the ‘unresolved constraint’, ‘missing requirement’ message from Apache Felix

It’s pretty common while developing an OSGi bundle that your imports and exports won’t quite match what you need or what exists in the server you’re deploying to. This can show up as NoClassDefFoundError, ClassNotFoundException or as log output in a stacktrace from bundle resolution. Hall, Pauls, McCullough and Savage did a great job of covering NCDFE and CNFE in “OSGi In Action” (chapter 8), let’s take a look at figuring out what the bundle resolution stacktrace is telling us. (I make nothing from the sales of “OSGi In Action” and suggest it to anyone interested in OSGi.)

Just like learning to read the stacktrace from an exception in Java is key to debugging, so is true about the dependency resolution messages from an OSGi container. Below is the output from Apache Felix when it encountered a missing dependency required by a bundle:

ERROR: Bundle org.sakaiproject.nakamura.webconsole.solr [124]: Error starting slinginstall:org.sakaiproject.nakamura.webconsole.solr-1.2-SNAPSHOT.jar (org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.webconsole.solr [124]: Unable to resolve 124.0: missing requirement [124.0] package; (package=org.apache.solr.client.solrj) [caused by: Unable to resolve 84.0: missing requirement [84.0] package; (package=org.sakaiproject.nakamura.api.lite) [caused by: Unable to resolve 86.0: missing requirement [86.0] package; (&(package=com.google.common.collect)(version>=9.0.0)(!(version>=10.0.0)))]])
org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.webconsole.solr [124]: Unable to resolve 124.0: missing requirement [124.0] package; (package=org.apache.solr.client.solrj) [caused by: Unable to resolve 84.0: missing requirement [84.0] package; (package=org.sakaiproject.nakamura.api.lite) [caused by: Unable to resolve 86.0: missing requirement [86.0] package; (&(package=com.google.common.collect)(version>=9.0.0)(!(version>=10.0.0)))]]
    at org.apache.felix.framework.Felix.resolveBundle(Felix.java:3443)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:1727)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1156)
    at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)
    at java.lang.Thread.run(Thread.java:619)

What you have here is a stacktrace with a lengthy message. The important part of the stacktrace for us is the message.

ERROR: Bundle org.sakaiproject.nakamura.webconsole.solr [124]: Error starting slinginstall:org.sakaiproject.nakamura.webconsole.solr-1.2-SNAPSHOT.jar (org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.webconsole.solr [124]: Unable to resolve 124.0: missing requirement [124.0] package; (package=org.apache.solr.client.solrj) [caused by: Unable to resolve 84.0: missing requirement [84.0] package; (package=org.sakaiproject.nakamura.api.lite) [caused by: Unable to resolve 86.0: missing requirement [86.0] package; (&(package=com.google.common.collect)(version>=9.0.0)(!(version>=10.0.0)))]])

This message is pretty simple but the structure is common for nastier messages (i.e. deeper resolution paths before failure). Let’s pull it apart to see what’s happening in there.

ERROR: Bundle org.sakaiproject.nakamura.webconsole.solr [124]: Error starting slinginstall:org.sakaiproject.nakamura.webconsole.solr-1.2-SNAPSHOT.jar

This very first part tells us that an error occurred while trying to load the org.sakaiproject.nakamura.webconsole.solrbundle. Nice start, but not quite the crux of the matter. Let’s keep reading.

org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.webconsole.solr [124]: Unable to resolve 124.0: missing requirement [124.0] package; (package=org.apache.solr.client.solrj) [caused by: Unable to resolve 84.0: missing requirement [84.0] package; (package=org.sakaiproject.nakamura.api.lite) [caused by: Unable to resolve 86.0: missing requirement [86.0] package; (&(package=com.google.common.collect)(version>=9.0.0)(!(version>=10.0.0)))]])

Phew, that’s a lot of text! This is the heart of what we need though, so let’s break it down to make more sense of it.

(
    org.osgi.framework.BundleException: Unresolved constraint in bundle org.sakaiproject.nakamura.webconsole.solr [124]: Unable to resolve 124.0: missing requirement [124.0] package; (package=org.apache.solr.client.solrj)
        [
            caused by: Unable to resolve 84.0: missing requirement [84.0] package; (package=org.sakaiproject.nakamura.api.lite)
            [
                 caused by: Unable to resolve 86.0: missing requirement [86.0] package; (&(package=com.google.common.collect)(version>=9.0.0)(!(version>=10.0.0)))
            ]
        ]
)

What are those [number]s in the message?

The numbers in the message tell us the bundle ID on the server.

Unresolved Package Name Bundle ID Where Resolution Failed
org.apache.solr.client.solrj 124
org.sakaiproject.nakamura.api.lite 84
com.google.common.collect 86

Once you pull apart the message it becomes more obvious that it has structure and meaning! The structure of the message tells us that bundle 124 depends on a package from bundle 84 which depends on a package from bundle 86 which is unable to resolve com.google.common.collect;version=[9.0.0, 10.0.0). The innermost/very last message tells us the root of the problem; the dependency resolver was unable to find com.google.common.collect at version=[9.0.0, 10.0.0). Now we have somewhere to start digging.

How To Fix This

I suggest one of the following steps:

  1. Add a bundle that exports the missing package with a version that matches the required version
  2. Change the version to match an exported package already on the server

In this particular environment, com.google.common.collect;version=10.0.0 is what our server has deployed. The descriptor above specifically blocks any version not in the 9.x.x range. We generate the OSGi manifest by using the Maven Bundle Plugin which uses the BND tool to generate the manifest. In BND version > 2.1.0, the macro for versions was changed. Our solution has ranged from rolling back to bnd version=2.1.0 OR define the macro differently. The results are the same; the version segment in the manifest header becomes com.google.common.collect;version>=9.0.0 which finds our bundle of com.google.common.collect;version=10.0.0.


Notes about environment

The above message and stacktrace originated from a Sakai OAE environment which is built on Apache Sling and thusly Apache Felix. We use an artifact ID that is the root package of the bundle (org.sakaiproject.nakamura.webconsole.solr). This has the side effect that our bundle names look like package names in the message but gives a very clear naming convention.

Using pygraphviz to plot OSGi bundle dependencies

I’ve been working on an OSGi project for the last few years. As with any project, evolutionary changes will eventually require some cleanup. As new bundles have been added over time, the graph of dependencies is starting to get unwieldy in places. Even with good management of these dependencies, a nice visual layout of things can really help you see how your bundles are interconnected and give you the power to start separating some connections if you graph starts to hit cyclical dependencies.

I drove around a few different visual tools in Eclipse (PDE visualization tool, m2eclipse dependency graph) but needed something that would narrow the view to just my project. I not only wanted to see what things my bundles depend on in the project but I wanted to see what depends on a given bundle.

This was my first project with pygraphviz but after I figured out which way the grain runs, I was able to give it a basic graph of my project and generate the diagrams I wanted. I finally let pygraphviz handle the graph traversing and things got better (less code, faster results).

Since I wanted to analyze the runtime dependencies of my server, I installed the Felix Remote Shell bundle along with the Felix Shell bundle to allow remote connections to management functionality. With these in place, I was able to connect via telnet and query for package level information to build my graph.

Using the code below, I was able to generate a graph of the entire project (successors) and a graph for each bundle in the graph (predecessors and successors). Some sample images are below the source code. Eventually I’ll add saving and opening of dot files to allow for analysis without a running server.

#! /usr/bin/env python
import re
from sets import Set
import os
import pygraphviz as pgv
import sys
import telnetlib

# [   1] [Active     ] [   15] org.sakaiproject.nakamura.messaging (0.11.0.SNAPSHOT)
bundle_from_ps = re.compile('^\[\s*(?P<bundle_id>\d+)\]\s\[.+\]\s(?P<bundle_name>.+)\s')

# org.sakaiproject.nakamura.api.solr; version=0.0.0 -> org.sakaiproject.nakamura.solr [11]
bundle_from_req = re.compile('^.*-\> (?P<bundle_name>.*) \[(?P<bundle_id>.*)\]$')

def get_sakai_bundles():
    """Get a list of bundles that are create as part of Sakai OAE.
    Returns a dictionary of dict[bundle_name] = bundle_id.
    """
    tn = telnetlib.Telnet('localhost', '6666')
    tn.write('ps -s\nexit\n')
    lines = [line for line in tn.read_all().split('\r\n') if 'org.sakaiproject' in line]

    bundles = {}
    for line in lines:
        m = bundle_from_ps.match(line)
        bundles[m.group('bundle_name')] = m.group('bundle_id')
    return bundles

def get_package_reqs(bundle_id):
    """Gets the requirements (imports) for a given bundle.
    Returns a dictionary of dict[bundle_name] = bundle_id.

    Keyword arguments:
    bundle_id -- Bundle ID returned by the server in the output of
                 get_sakai_bundles()
    """
    tn = telnetlib.Telnet('localhost', '6666')
    tn.write('inspect package requirement %s\nexit\n' % (bundle_id))
    lines = [line for line in tn.read_all().split('\r\n') if line.startswith('org.sakaiproject') and line.endswith(']')]

    reqs = {}
    for line in lines:
        m = bundle_from_req.match(line)
        reqs[m.group('bundle_name')] = m.group('bundle_id')
    return reqs

def build_bundle_graph():
    """Build a graph_attr (nodes, edges) representing the connectivity
    within Sakai bundles
    """
    sakai_bundles = get_sakai_bundles()
    bundles = {}
    for b_name, b_id in sakai_bundles.items():
        reqs = get_package_reqs(b_id)
        bundles[b_name] = reqs.keys()
    return bundles

def draw_subgraph(name, graph, filename, successors = True):
    """Draw (write to disk) a subgraph that starts with or ends with
    the specified node.

    Keyword arguments:
    name -- name of the node to focus on
    graph -- graph of paths between bundles
    filename -- filename to write subgraph to
    successors -- whether to lookup successors or predecessors
    """
    if successors:
        nbunch = graph.successors(name)
    else:
        nbunch = graph.predecessors(name)

    if nbunch:
        nbunch.append(name)
        subgraph = graph.subgraph(nbunch)
        subgraph.layout(prog = 'dot')
        subgraph.draw('graphviz/%s' % filename)

def main():
    if not os.path.isdir('graphviz'):
        os.mkdir('graphviz')

    bundles_graph = build_bundle_graph()

    # print the whole graph
    graph = pgv.AGraph(data = bundles_graph, directed = True)
    graph.layout(prog = 'dot')
    graph.draw('graphviz/org.sakaiproject.nakamura.png')

    for b_name, reqs in bundles_graph.items():
        draw_subgraph(b_name, graph, '%s.png' % b_name)
        draw_subgraph(b_name, graph, '%s-pred.png' % b_name, False)

if __name__ == '__main__':
    main()
Sakai OAE Bundle Dependency Graph
Sakai OAE Bundle Dependency Graph
Sakai OAE Solr Bundle Predecessors
Sakai OAE Solr Bundle Predecessors
Sakai OAE Presence Bundle Successors
Sakai OAE Presence Bundle Successors

When To Immediately Activate An OSGi Component

OSGi has a fantastic feature for immediate components [1] and delayed components [2]. This allows components to delay their possibly expensive activation until the component is first accessed. At the very least it allows the OSGi platform to consume resources as needed. No sense is sucking up those server resources for something that can wait.

As developers start their adventure into OSGi, I’ve noticed a pattern creeping up that I’m not sure folks are aware of: just because you like your component doesn’t mean it needs to start immediately. Unless your component has something it needs to do before other components can do their work, there’s little need to start a component immediately. A lot of times, the work can be moved into the bundle activator or is fine being delayed.

People often see that a component does something in its activation method and assume it should start immediately. The questions to ask yourself are:

  • Will the things that need to use this component have a reference to it?
    Yes? Delayed activation.
  • Before activation, will functionality be missing that should be available as soon as possible?
    Yes? Immediate activation.
  • Can something else do the job of activating the component at a later time?
    Yes? Delayed activation.

Registering with an external service, such as JMS, is a perfect scenario for an immediate component. If the JMS service were in OSGi, declarative services could do the work of making sure service trackers know your component exists, but OSGi has no jurisdiction over external resources. Binding to a JMS topic requires manual interaction and therefore should happen by way of immediate component activation.

Getting and setting of properties is not a case for immediate component activation. Neither is because the service is referenced by something else.

When using declarative services, a reference to your not-yet-started component will be given to anything that has said it wants to collect an interface you implement. This requires no effort on the part of the service being referenced and thusly does not require the referenced service to be activated immediately. Fret not! On first access of this service by any consumer the service will get activated.

OSGi has not been the fastest thing for me to learn but understanding the nuances of little things like this really helps me understand why it is a good approach at decoupled services and modular software construction.

1. OSGi compendium spec version 4.2, section 112.2.2
2. OSGi compendium spec version 4.2, section 112.2.3

JavaMail in OSGi

UnsupportedDataTypeException

Sun decided ages ago that JavaMail and the Java Activation Framework should be released as separate artifacts and no one really cared. If you need JavaMail, you knew to also include JAF as Sun’s JavaMail page also told what version of JAF to use. Using this same approach, I created and deployed separate bundles for javax.activation and javax.mail. As soon as I tried to send the first test email, I found trouble. The most basic of email content types, text/plain, could not be sent.

javax.activation.UnsupportedDataTypeException: no object DCH for MIME type text/plain at javax.activation.ObjectDataContentHandler.writeTo(DataHandler.java:885)

This sent me digging into the source of javax.activation and javax.mail to find the fix.
The problem here is that when the Java Activation Framework starts up, it loads META-INF/mailcap.defaults then looks for META-INF/mailcap files in other libraries in the same classloader. These mailcap files define handlers for specific types of content (eg. text/plain). Since JAF and JavaMail were in separate bundles and thusly classloaders, the mailcap file in JavaMail couldn’t be seen by JAF.
The fix I went with was to merge the 2 bundles into 1. I didn’t want to do this at first but after I saw that JAF uses an internal class named MailcapCommandMap, I decided that if the Sun developers felt it fine to give JAF explicit knowledge of JavaMail, it was fine with me to marry them to the same bundle.
If you’re using Equinox, you can use buddy classloaders without merging the bundles together, but I have a couple of problems with buddy classloaders.

  1. They are not part of the OSGi spec and break your ability to move to another container.
  2. They create a situation where a parent bundle has to have knowledge of children or “buddy” bundles. This goes against the basic whiteboard approach of OSGi.

Unable to resolve sun.security.util

Another fun bit to note is that starting with JavaMail 1.4.1, there is use of the sun.security.util package. If your OSGi container exports sun.* classes, you won’t notice this. If your OSGi container doesn’t export this or you run on a non-Sun JVM, this seems like a blocking issue. Fear not! After digging through the JavaMail code I find a comment in the SocketFetcher class that reads:

/*
 * First, try to use sun.security.util.HostnameChecker,
 * which exists in Sun's JDK starting with 1.4.1.
 * We use reflection to access it in case it's not available
 * in the JDK we're running on.
 */

This let me know that I can set this package to optional in my import list and all would be well. In other words, your bundle manifest should have something like this:

Import-Package: sun.security.util;resolution:=optional