Friday, July 6, 2012

Modular services with OpenJDK Jigsaw and Guice

This blog entry describes an experiment exploring the connection between Java modules and services in OpenJDK Jigsaw and the dependency injection framework Guice.

Modular services in Jigsaw define a very simple way to bind a (service) interface to implementations (service provider classes) and, using java.util.ServiceLoader, a way to iterate through all (service) instances of implementations bound to an interface.
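As a rough illustration of the iteration side, here is a minimal sketch using java.util.ServiceLoader on a stock JDK. The class name ServiceLookupSketch and the helper providerNames are made up for this example, and Runnable is only a stand-in service interface; how providers get registered (a Jigsaw module declaration versus a META-INF/services file) is outside what the sketch shows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class ServiceLookupSketch {

    // Collects the class names of all providers that ServiceLoader can find
    // for the given service interface (hypothetical helper for illustration).
    static <T> List<String> providerNames(Class<T> service) {
        List<String> names = new ArrayList<>();
        for (T provider : ServiceLoader.load(service)) {
            names.add(provider.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Runnable is a stand-in: normally nothing is registered for it,
        // so the loop above simply finds no providers.
        System.out.println(providerNames(Runnable.class));
    }
}
```

The caller only ever names the interface; which implementations show up is decided entirely by what has been registered.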

Guice supports a rich binding model and scoping model. Guice can support a JDK services style approach using multibindings, where service instances are obtained by referencing an injection type Set<T>, where T is the service interface. (Note that when an instance of Set<T> is injected, all members are instantiated according to the scoping rules; the set is not lazy.)

Can Jigsaw and Guice be combined for richer modular services? Perhaps :-)

Before we can start playing around with Jigsaw and Guice we first need to modularize Guice, so this is also an experiment in taking a popular framework and modularizing it to work in the modular OpenJDK.

The complete set of code can be found here on GitHub.

A modular Guice

The Guice jar files and their runtime dependencies need to be converted into modular jar files so that they can be installed into a library.

A modular jar file is a jar file that contains a compiled module description, module-info.class, at the top level. The module-info.class is the result of compiling the module declaration source, module-info.java. The module declaration declares important information such as the module name, version, dependent modules, and what types are exported.
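The "module-info.class at the top level" rule is easy to check mechanically. The following is a minimal sketch, assuming nothing beyond java.util.jar; ModularJarCheck, hasModuleInfo and writeJar are names invented for this example, and the jars it writes hold placeholder bytes rather than real class files.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;

class ModularJarCheck {

    // True if the jar has a module-info.class entry at its top level.
    static boolean hasModuleInfo(Path jar) {
        try (JarFile jf = new JarFile(jar.toFile())) {
            return jf.getEntry("module-info.class") != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes a throwaway jar with the given entry names
    // so that the sketch is self-contained.
    static Path writeJar(String... entryNames) {
        try {
            Path jar = Files.createTempFile("sketch", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                for (String name : entryNames) {
                    jos.putNextEntry(new ZipEntry(name));
                    jos.write(new byte[] {0}); // placeholder, not a real class file
                    jos.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasModuleInfo(writeJar("module-info.class", "a/B.class")));
        System.out.println(hasModuleInfo(writeJar("a/B.class")));
    }
}
```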

To work out that set of jar files, I manually analyzed the runtime dependencies of the Guice modules on Maven Central (com.google.inject:guice:3.0 and com.google.inject.extensions:guice-multibindings:3.0) to obtain the complete set of jar files required for Guice to run.

For each jar a corresponding source module was created that contained just a module-info.java.  The Java compiler, javac, was used to compile all the source modules to obtain corresponding module-info.class files.

Finally each jar file was downloaded from Maven Central and updated to include the corresponding module-info.class, thus transforming each jar into a modular jar.
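The "update the jar" step is done by the ant script in my case, but the same idea can be sketched in plain Java: copy every entry of the original jar and append a top-level module-info.class. This is only a sketch under assumptions; JarModularizer and its helper methods are hypothetical names, and the descriptor bytes in the usage line are placeholders rather than real javac output.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class JarModularizer {

    // Copies all entries of 'source' into a new jar and appends the given
    // bytes as a top-level module-info.class entry.
    static Path addModuleInfo(Path source, byte[] moduleInfo) {
        try {
            Path target = Files.createTempFile("modular", ".jar");
            try (ZipFile in = new ZipFile(source.toFile());
                 ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
                byte[] buf = new byte[8192];
                Enumeration<? extends ZipEntry> entries = in.entries();
                while (entries.hasMoreElements()) {
                    ZipEntry e = entries.nextElement();
                    if (e.getName().equals("module-info.class")) {
                        continue; // replaced by the new descriptor below
                    }
                    out.putNextEntry(new ZipEntry(e.getName()));
                    try (InputStream is = in.getInputStream(e)) {
                        for (int n; (n = is.read(buf)) != -1; ) {
                            out.write(buf, 0, n);
                        }
                    }
                    out.closeEntry();
                }
                out.putNextEntry(new ZipEntry("module-info.class"));
                out.write(moduleInfo);
                out.closeEntry();
            }
            return target;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Lists the entry names of a jar (helper for inspecting the result).
    static List<String> entryNames(Path jar) {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    // Hypothetical stand-in for a jar downloaded from a repository.
    static Path sampleJar() {
        try {
            Path jar = Files.createTempFile("plain", ".jar");
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(jar))) {
                out.putNextEntry(new ZipEntry("a/B.class"));
                out.write(new byte[] {0}); // placeholder bytes
                out.closeEntry();
            }
            return jar;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        Path modular = addModuleInfo(sampleJar(), new byte[] {1, 2, 3});
        System.out.println(entryNames(modular));
    }
}
```

Writing to a fresh file rather than patching in place keeps the downloaded jar untouched, which is handy while the module declaration is still being iterated on.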

The ant script which automates all this can be found here.

I have no doubt this can be further automated. Alan Bateman showed off a demo at JavaOne last year that processed Maven pom files, slurped stuff from Maven Central, and automatically created modular jars by analyzing the dependencies.

This black-box approach does not work in my case as I need to tweak the module declarations (as explained later in this section). Some sort of automated white-box approach is required, in which an initial module-info.java is generated and can then be modified.

There are also some particularly thorny issues lurking here. Maven and Jigsaw have different dependency resolution algorithms. Maven, using dependency management, allows a pom to override the versions of its transitive dependencies. Jigsaw currently does not support this (it's an open question as to whether it will). Thus anything produced from such automation may not be reusable in other contexts since the dependency graphs might be different.

A total of four jars were identified:
  1. aopalliance-1.0.jar
  2. javax.inject-1.jar
  3. guice-3.0.jar
  4. guice-multibindings-3.0.jar
And the following modular source was created:
$ find msrc
msrc
msrc/aopalliance
msrc/aopalliance/module-info.java
msrc/guice
msrc/guice/module-info.java
msrc/guice.multibindings
msrc/guice.multibindings/module-info.java
msrc/javax.inject
msrc/javax.inject/module-info.java
The module declaration for the aopalliance module is:

module aopalliance@1.0 {
    exports org.aopalliance.aop;
    exports org.aopalliance.intercept;
}

The module declaration for the javax.inject module is:

module javax.inject@1 {
    exports javax.inject;
}

Neither of the above modules has any dependencies; each just exports the types contained in the declared packages.

The module declaration for the guice module is:

module guice@3.0 {
    requires public javax.inject@1;
    requires public aopalliance@1.0;
    requires jdk.logging;

    exports com.google.inject;
    exports com.google.inject.binder;
    exports com.google.inject.matcher;
    exports com.google.inject.name;
    exports com.google.inject.spi;
    exports com.google.inject.util;

    view guice.internal {
        permits guice.multibindings;

        exports com.google.inject.internal;
        exports com.google.inject.internal.util;
    }
}

Now it gets a little bit more interesting. First the guice module requires (or depends on) the javax.inject and aopalliance modules. Those dependencies are qualified as public so that any module requiring the guice module will automatically have access to the types exported by the other two modules; the types in those modules are re-exported.

Note that I am not sure whether types in the aopalliance module are referenced in the Guice API; if not, there is no need for the types in that module to be re-exported. Certainly types in the javax.inject module are referenced in the Guice API.

Guice requires the jdk.logging module. Although Guice binds some types from this module by default, no such types are (AFAICT) exposed in the public API, hence the types of this module are not re-exported.

Guice takes particular care to place all implementation classes underneath the package com.google.inject.internal. The expectation is that developers should not directly reference any such classes that are publicly accessible. Where possible such implementation classes are package private. This made it easy to produce a list of exports clauses for the public API. More importantly, with Java modularity developers need not know anything about the internal classes, and the module developer can be assured that at compilation and runtime such non-exported types are neither visible nor accessible.

It might be reasonable to ask: why not allow wildcards in exports clauses, for example "exports com.google.inject.*;"? In this case wildcards would not be a good idea because types in the internal packages would be exported. In general the use of wildcards makes it all too easy to unintentionally expose more than necessary, for example if packages are renamed.
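To make the over-exposure concrete, here is a small sketch that applies a wildcard to the packages of the guice module listed in the declaration above. The wildcard semantics are an assumption made up for this example ("p.*" matches p and every package below it), since no such feature exists; WildcardExportSketch and matches are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WildcardExportSketch {

    // The packages named in the guice module declaration: six public API
    // packages plus the two internal ones.
    static final List<String> GUICE_PACKAGES = Arrays.asList(
            "com.google.inject",
            "com.google.inject.binder",
            "com.google.inject.matcher",
            "com.google.inject.name",
            "com.google.inject.spi",
            "com.google.inject.util",
            "com.google.inject.internal",
            "com.google.inject.internal.util");

    // Assumed semantics: "p.*" matches p itself and every package below it.
    static List<String> matches(String wildcard) {
        String prefix = wildcard.substring(0, wildcard.length() - 2); // drop ".*"
        List<String> result = new ArrayList<>();
        for (String pkg : GUICE_PACKAGES) {
            if (pkg.equals(prefix) || pkg.startsWith(prefix + ".")) {
                result.add(pkg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Both internal packages are swept up together with the public API.
        for (String pkg : matches("com.google.inject.*")) {
            System.out.println(pkg);
        }
    }
}
```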

However, there is a little problem. The exported types are not sufficient for other Guice modules to function, such as the guice.multibindings module. The Guice multibinding code imports types from the packages com.google.inject.internal and com.google.inject.internal.util:
$ grep import *.java | grep internal
MapBinder.java:import com.google.inject.internal.util.ImmutableList;
MapBinder.java:import com.google.inject.internal.util.ImmutableMap;
MapBinder.java:import com.google.inject.internal.util.ImmutableSet;
MapBinder.java:import com.google.inject.internal.util.Lists;
Multibinder.java:import com.google.inject.internal.Annotations;
Multibinder.java:import com.google.inject.internal.Errors;
Multibinder.java:import com.google.inject.internal.util.ImmutableList;
Multibinder.java:import com.google.inject.internal.util.ImmutableSet;
Multibinder.java:import com.google.inject.internal.util.Lists;
So there is some tighter coupling between the Guice modules. It is necessary to create a non-default module view, guice.internal (the default view is the module guice), that exports types from the internal packages but only permits the guice.multibindings module to require this module view. Non-default views are never versioned and inherit the version, export and requires clauses from the default view.
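The visibility rule described above can be modelled in a few lines of plain Java. To be clear, this is a toy model of the rule as I have described it, not how Jigsaw implements views; ViewPermitsSketch, mayRequire and visible are names invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ViewPermitsSketch {

    // Exports of the default view (module guice).
    static final Set<String> DEFAULT_EXPORTS = set(
            "com.google.inject", "com.google.inject.binder",
            "com.google.inject.matcher", "com.google.inject.name",
            "com.google.inject.spi", "com.google.inject.util");

    // Additional exports of the guice.internal view.
    static final Set<String> INTERNAL_EXPORTS = set(
            "com.google.inject.internal", "com.google.inject.internal.util");

    // Modules named in the permits clause of the guice.internal view.
    static final Set<String> INTERNAL_PERMITS = set("guice.multibindings");

    static Set<String> set(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // May 'requester' require the given view at all?
    static boolean mayRequire(String requester, String view) {
        if (view.equals("guice")) {
            return true; // default view has no permits clause
        }
        return view.equals("guice.internal") && INTERNAL_PERMITS.contains(requester);
    }

    // Is 'pkg' visible to 'requester' when it requires 'view'?
    static boolean visible(String requester, String view, String pkg) {
        if (!mayRequire(requester, view)) {
            return false;
        }
        if (DEFAULT_EXPORTS.contains(pkg)) {
            return true; // exports of the default view are inherited by every view
        }
        return view.equals("guice.internal") && INTERNAL_EXPORTS.contains(pkg);
    }

    public static void main(String[] args) {
        System.out.println(visible("guice.multibindings", "guice.internal",
                "com.google.inject.internal.util"));
        System.out.println(visible("mapp", "guice.internal",
                "com.google.inject.internal.util"));
    }
}
```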

The module declaration for the guice.multibindings module is:

module guice.multibindings@3.0 {
    requires public guice@3.0;
    requires guice.internal@3.0;

    exports com.google.inject.multibindings;
}

It has a dependency on the guice.internal view of module guice. For ease of use this module also re-exports all the types in the guice module via "requires public"; otherwise the dependency on guice would be redundant, since the guice.internal view inherits from the guice module (or default view).

I had to go through a couple of iterations of editing the module declarations, producing the modular jars, installing those into a library, and running a test application. The perils of modularizing from jar files rather than compiling the source code!

Without the view and permits clauses a dreaded NoClassDefFoundError was thrown:

Exception in thread "main" java.lang.NoClassDefFoundError: Lcom/google/inject/internal/util/$ImmutableList;
        at java.lang.Class.getDeclaredFields0(Native Method) 
        at java.lang.Class.privateGetDeclaredFields(Class.java:2318)
        at java.lang.Class.getDeclaredFields(Class.java:1771)
        at com.google.inject.spi.InjectionPoint.getInjectionPoints(InjectionPoint.java:649)
        ...
        at com.google.inject.multibindings.Multibinder$RealMultibinder.configure(Multibinder.java:269)
        ...
        at com.google.inject.Guice.createInjector(Guice.java:95)
        ...
        at mapp.App.main(App.java:9)
Caused by: java.lang.ClassNotFoundException: com.google.inject.internal.util.$ImmutableList : requested by +guice.multibindings
        at org.openjdk.jigsaw.Loader.loadClass(Loader.java:116)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

So Guice is doing some reflection accessing fields of some type in the guice.multibindings module that has a field whose type is in the internal package of the guice module, see here. Note that the Guice build process re-names certain class files e.g. ...util.ImmutableList to ...util.$ImmutableList. There is enough information here to guess what is going on but the indirection can throw one slightly off course.
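The reason a private field matters here is that Class.getDeclaredFields returns Field objects that carry the field's type as a Class, so every declared field type has to be loadable by the class loader of the class being inspected. A minimal sketch follows; Holder is only a stand-in for whichever multibindings class is being reflected on, and FieldTypeSketch and declaredFieldTypes are made-up names.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class FieldTypeSketch {

    // Stand-in class: its fields are private, yet reflection still has to
    // resolve their types.
    static class Holder {
        private List<String> items;
        private int count;
    }

    // Returns the type names of all fields declared by the given class.
    static List<String> declaredFieldTypes(Class<?> c) {
        List<String> types = new ArrayList<>();
        for (Field f : c.getDeclaredFields()) {
            types.add(f.getType().getName());
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(declaredFieldTypes(Holder.class));
    }
}
```

If one of those types cannot be found by the loader, the failure surfaces inside getDeclaredFields, as in the stack trace above, rather than at the point where the field is actually used.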

Tooling analyzing class files and presenting class dependencies would be most helpful.

In the example on GitHub the modular jars are installed into a guice-specific library. The application library (containing the installed application modules using Guice) is hooked up to the guice-specific library. This is primarily to avoid doing too much work when building the application library but it does show that libraries can be reusable (although not necessarily portable) components.

Services and Guice

The project that experiments with services and Guice can be loaded into NetBeans just like the modular services example.

There are five modules that are compiled and installed into the application library:

  • The guice.service module that exports the GuiceInjectorService service interface and the GuiceServiceLoader for obtaining service instances bound using Guice.
  • The stringer module that exports the StringTransformer service interface.
  • The hasher and rotter modules that both provide a service provider class for the GuiceInjectorService service interface and implement a Guice module that binds the StringTransformer service interface to one or more implementations.
  • The mapp module that provides the main entry point that looks up StringTransformer instances using GuiceServiceLoader.

Essentially, Jigsaw modular services are used to bootstrap Guice services.

Each module that participates in Guice services provides an implementation of the GuiceInjectorService service interface:

import com.google.inject.Injector;

public interface GuiceInjectorService {

    Injector getInjector();
}

which supports the getting of a Guice Injector instance. java.util.ServiceLoader can be used to obtain all Injector instances, and then the bindings of those instances can be introspected.

An alternative solution might be to obtain Guice Module instances then create one Injector for all those instances. However, since the Java modules are decoupled from each other it cannot be guaranteed that no binding conflicts will arise.

The GuiceInjectorService implementation and the Guice module for the rotter module are as follows:

public class RotterInjectorProvider implements GuiceInjectorService {

    @Override
    public Injector getInjector() {
        return Guice.createInjector(new RotterModule());
    }
}

public class RotterModule extends AbstractModule {

    @Override
    protected void configure() {
        Multibinder<StringTransformer> uriBinder = Multibinder.
                newSetBinder(binder(), StringTransformer.class);
        uriBinder.addBinding().to(RotterStringTransformer.class).
                in(Scopes.SINGLETON);
    }
}

The Guice module is using the multibinding support to bind RotterStringTransformer to a singleton instance. Multibinding makes it easier to bind two or more implementations to the same interface (internally it will create unique keys, containing unique annotations so that each individual binding is unique).

The GuiceInjectorService implementation and the Guice module for the hasher module are as follows:

public class HasherInjectorProvider implements GuiceInjectorService {

    @Override
    public Injector getInjector() {
        return Guice.createInjector(new HasherModule());
    }
}

public class HasherModule extends AbstractModule {

    @Override
    protected void configure() {
        bind(StringTransformer.class).
                to(HasherStringTransformer.class).
                in(Scopes.NO_SCOPE);
    }
    
    @Provides
    @Named("ToLowerCaseHasher")
    @Singleton
    protected StringTransformer toLowerCase(
                final StringTransformer st) {
        return new StringTransformer() {

            @Override
            public String description() {
                return "LowerCase: " + st.description();
            }

            @Override
            public String transform(String s) {
                return st.transform(s).toLowerCase();
            }
        };
    }
}

This Guice module registers two bindings of the StringTransformer service interface. The first uses programmatic binding. The second uses annotation-based binding to adapt the first and convert the String to lower case. To differentiate the latter from the former so that it can be bound, the latter is named.

The main class in the mapp module is as follows:

public class App {

    public static void main(String[] args) throws Exception {
        for (StringTransformer s : GuiceServiceLoader.
                load(StringTransformer.class)) {

            System.out.println(s + " " + s.description());
        }

        for (StringTransformer s : GuiceServiceLoader.
                load(StringTransformer.class)) {

            System.out.println(s + " " + s.description());
        }
    }
}

It just iterates through the StringTransformer service instances using the GuiceServiceLoader much like one can do with ServiceLoader.

So how does GuiceServiceLoader work? The Injector instances are lazily cached as follows:

public class GuiceServiceLoader<T> implements Iterable<T> {

    static class Initializer {

        private static Set<Injector> injectors = load();

        private static Set<Injector> load() {
            final Set<Injector> giss = new LinkedHashSet<>();
            for (GuiceInjectorService g : ServiceLoader.
                    load(GuiceInjectorService.class)) {
                giss.add(g.getInjector());
            }
            return giss;
        }
    }

    static Set<Injector> getInjectors() {
        return new LinkedHashSet<>(Initializer.injectors);
    }

    ...
}
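The laziness comes from the nested Initializer class: a nested class is not initialized until one of its static members is first touched. Here is a minimal, Guice-free sketch of that holder idiom; LazyHolderSketch, Holder and the LOADS counter are hypothetical names, with the counter standing in for the expensive ServiceLoader pass.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyHolderSketch {

    // Counts how often the "expensive" load runs.
    static final AtomicInteger LOADS = new AtomicInteger();

    // Not initialized until something first reads Holder.VALUE.
    static class Holder {
        static final String VALUE = load();

        private static String load() {
            LOADS.incrementAndGet();
            return "loaded";
        }
    }

    static String get() {
        return Holder.VALUE; // first call triggers Holder's static initializer
    }

    public static void main(String[] args) {
        System.out.println(get());
        System.out.println(get());
        System.out.println("loads: " + LOADS.get());
    }
}
```

Class initialization is guarded by the JVM, so the load runs once even with concurrent callers, without any explicit locking.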

When an instance of GuiceServiceLoader is created for a given service interface, the bindings of each Injector are introspected to find the list of all Guice Key instances whose type literal is equal to the service interface.

When the GuiceServiceLoader instance is iterated over, the Keys are used to get the instances from the corresponding Injector. It's reasonably straightforward, although I complicated the implementation with some additional caching.

Observations

I have used javac to compile modular source for module declarations that export types in packages that don't exist in that source. This may seem a little odd but package names are sort of an artificial concept when it comes to compilation and runtime. So to me that seems reasonable.

It was an iterative process to produce the right module declarations. Better tooling could help in this regard by generating initial module declaration source from, say, Maven, with further analysis of class file dependencies. Guice is a logically and physically well-structured project, which helped make this process easier.

Guice has a much richer binding model than Jigsaw modular services, as demonstrated by the hasher and rotter modules. However, the resolver knows nothing about Guice-based service interfaces and bindings. Ideally only the Java modules that bind required service interfaces should be resolved and included in the configuration of the application. To achieve that requires either the expression of a richer binding model in the module declaration or a more general way to express requirements and capabilities.

Finally the Guice modularization effort uncovered a bug in the jdk.logging module. The mapp module needs to require the jdk.logging module otherwise an annoying exception is logged:

Can't load log handler "java.util.logging.ConsoleHandler"
java.lang.ClassNotFoundException: java.util.logging.ConsoleHandler : requested by +mapp
java.lang.ClassNotFoundException: java.util.logging.ConsoleHandler : requested by +mapp
        at org.openjdk.jigsaw.Loader.loadClass(Loader.java:116)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.util.logging.LogManager$3.run(LogManager.java:418)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.util.logging.LogManager.loadLoggerHandlers(LogManager.java:405)
        ...
        at com.google.inject.internal.util.$FinalizableReferenceQueue.<init>(FinalizableReferenceQueue.java:131)
        ...
        at com.google.inject.Guice.createInjector(Guice.java:62)
        at hasher.HasherInjectorProvider.getInjector(HasherInjectorProvider.java:11)
        ...
        at guice.service.GuiceServiceLoader.<init>(GuiceServiceLoader.java:72)
        at guice.service.GuiceServiceLoader.load(GuiceServiceLoader.java:118)
        at mapp.App.main(App.java:9)

The culprit is at line 418 of java.util.logging.LogManager:

Class<?> clz = ClassLoader.getSystemClassLoader()
    .loadClass(word);

The system class loader, which is the class loader of the mapp module, is being used to load a class, java.util.logging.ConsoleHandler, in the jdk.logging module, but that class is not visible to the mapp module, hence the ClassNotFoundException is thrown and logged. This area is a strong candidate to be modified to support modular services.
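To illustrate the visibility problem, here is a sketch of one possible direction: try the system class loader first and, if that fails, fall back to the loader that defined a class known to live alongside the handler. This is purely an assumption for illustration, not what LogManager does and not a proposed patch; HandlerLoadingSketch and its load method are invented names.

```java
import java.util.logging.LogManager;

class HandlerLoadingSketch {

    // Tries the system class loader first (the behaviour shown above), then
    // falls back to the loader that defined 'anchor'.
    static Class<?> load(String className, Class<?> anchor) {
        try {
            return ClassLoader.getSystemClassLoader().loadClass(className);
        } catch (ClassNotFoundException notVisible) {
            try {
                // A null loader here means the bootstrap class loader.
                return Class.forName(className, false, anchor.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("Cannot load " + className, e);
            }
        }
    }

    public static void main(String[] args) {
        Class<?> handler = load("java.util.logging.ConsoleHandler", LogManager.class);
        System.out.println(handler.getName());
    }
}
```

On a stock JDK the first attempt already succeeds; the fallback only matters in a setup like the one above, where the caller's loader cannot see the logging classes.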

2 comments:

  1. Did I miss something or the "ToLowerCaseHasher" should call "st.tranform(s)... " instead of "transform(..." ???
    Thanks

    Replies
    1. No, you found a bug, well spotted! updated blog and code. Thanks.
