My git aliases

Just a short note about the useful git aliases I’ve come up with during the course of my daily work with git and github.

Although it’s been done by different people several times, I also came up with some shortcuts to work with github’s pull requests from the commandline. I tend to use github’s website only for commenting on the pull requests but I like to do the rest locally in my repository (including the merge of the pull request).

Check out this gist where I keep my aliases.

Phantom reads and data paging

Paging through the results is easy, right?

The client only needs to supply the number of rows to skip and the maximum number of rows it wants returned (aka the page number and the page size). The server then returns the data along with the information about the total number of results available. Et voila you have all the information you need. The number of rows to skip together with the page size give you the information about what page you’re showing and the page size with the total number of rows gives you the total number of pages available. Nothing too difficult or complex.
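To make the arithmetic concrete, here is a minimal sketch of how the page number and the page count fall out of those three values. The class and method names are made up for illustration and are not part of any RHQ API; the page number is assumed to be zero-based.

```java
// Hypothetical helper, names chosen only for this illustration.
final class PageInfo {
    /** Zero-based index of the page that starts after skipping {@code skip} rows. */
    static int pageNumber(int skip, int pageSize) {
        return skip / pageSize;
    }

    /** Number of pages needed to show {@code total} rows, rounded up. */
    static int totalPages(int total, int pageSize) {
        return (total + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        // Skipping 20 rows with 10 rows per page, out of 45 rows in total.
        int page = pageNumber(20, 10) + 1; // +1 to show a human-friendly page number
        System.out.println("page " + page + " of " + totalPages(45, 10));
    }
}
```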

But there’s a catch. On the server, one needs to perform (at least) two queries – one query to get the data for the requested page and a second query to fetch the total number of rows. Most databases set the default transaction isolation level to READ_COMMITTED, and for very good reasons. But this transaction isolation level allows for phantom reads, i.e. two queries in the same transaction might “see” a different number of rows if another transaction commits in between and adds or deletes rows that the queries would return.
So, it may happen that you will:

  • return “rows 5 to 10 out of 2 total”,
  • say “there are no available results on the first page, while the total number of rows is 5”,
  • etc.

All that info acquired within one transaction.

What can you do about such situations? The obvious solution is to just admit that these things can happen ;) Another option is to try to detect whether such a situation might have occurred and retry.

I’ve come up with the following rules for consistency of the results:

N is the actual number of elements on the page, P is the maximum number of elements on the page (i.e. the page size), I is the number of rows to skip and T is total number of results.

  • T < I && N == 0. This means we’re trying to show a page that is past the total number of results. We therefore expect the collection to be empty.
  • T - I > P && N == P. If we are not showing the last page, the number of elements in the collection should be equal to the page size.
  • T - I <= P && N == T - I. If showing the last page, the collection should have all the remaining elements in it.
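The three rules can be folded into a single check. This is only a sketch of the idea, not the code we have in RHQ; the class name is hypothetical and the parameters follow the N, P, I and T naming used above.

```java
// Hypothetical helper illustrating the three consistency rules.
final class PageConsistency {
    /**
     * @param n rows actually returned for the page (N)
     * @param p page size (P)
     * @param i number of rows skipped (I)
     * @param t total number of rows reported by the count query (T)
     * @return true if the page contents and the total agree with each other
     */
    static boolean isConsistent(int n, int p, int i, int t) {
        if (t < i) {
            // Past the end of the results: the page must be empty.
            return n == 0;
        }
        int remaining = t - i;
        if (remaining > p) {
            // Not the last page: the page must be full.
            return n == p;
        }
        // Last page: it must hold exactly the remaining rows.
        return n == remaining;
    }
}
```

With this, the “no results on the first page while the total is 5” case from above is isConsistent(0, 10, 0, 5), which comes out as false.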

These are kind of obvious assumptions, but a phantom read can easily break them, and therefore one should check them if one wants to be serious about returning meaningful results to the user.
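The detect-and-retry option could look roughly like the sketch below. It assumes the two queries are wrapped in some function that returns a result object and that a consistency predicate is available; both are passed in as parameters here, and all the names are hypothetical. If the result is still inconsistent after the last attempt, the sketch simply returns it, which is the “admit that these things can happen” fallback.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical retry helper, not RHQ code.
final class ConsistentFetch {
    /**
     * Runs the query up to maxAttempts times and returns the first consistent
     * result, or the last result if none of the attempts was consistent.
     */
    static <R> R fetchWithRetry(Supplier<R> query, Predicate<R> consistent, int maxAttempts) {
        R result = query.get();
        for (int attempt = 1; attempt < maxAttempts && !consistent.test(result); attempt++) {
            result = query.get();
        }
        return result;
    }
}
```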

So while paging is simple in principle, there are a couple of interesting corner cases that one needs to handle if one reads data out of a dynamic data set. It took us a good couple of years in RHQ to get to the bottom of this but hopefully now our paging is more robust than it was before.

Scripting News in RHQ 4.5.0

RHQ 4.5.0 (which we released today) contains a great deal of scripting enhancements that I think are worth talking about in more detail. In my eyes, the changes make the scripting in RHQ ready for serious use.

CommonJS support

This, I think, is huge. In the previous versions of RHQ, the only way of reusing functionality from another script was to use the exec -f command in the interactive CLI shell (in other words, this was NOT available in the batch mode, which is how the majority of people are using the CLI). So if you needed to implement something bigger and needed to split your code into several files (as any sane person would do), you only had one option – before executing the “script”, you needed to concatenate all the scripts together.

This sucked big time and we knew it ;) But we didn’t want to just add functionality to “include files” – that would be too easy ;) At the same time it wouldn’t solve the problem, really. The problem with just “including” the files into the current “scope” of the script is that this would mean that each and every variable or function in those files would have to be uniquely named because javascript lacks any sort of namespace/package resolution. Fortunately, the CommonJS spec solves this problem.

Here’s how you use a module. Notice that you assign the module to a variable and that’s how you prevent the “pollution” of your scope. The loaded module can have methods and variables with the same name as your script and they won’t influence each other:

var myfuncs = require("modules:/myfuncs");
myfuncs.helloworld();

You may wonder what that "modules:/myfuncs" identifier means. It is a URI that the underlying library uses to locate the script to load. This “sourcing” mechanism is pluggable and I will talk about it more in the following section. To see some examples of the modules, you can look at the samples/modules directory of your CLI deployment and you can also read some documentation about this on our wiki.

Locating the scripts

With the ability to load the scripts comes the problem of locating them. For the standalone CLI, the obvious location for them is the filesystem, but what about alert notification scripts on the RHQ server? These scripts are stored in RHQ repositories, which don’t have a filesystem location. The solution is not to tie the scripts to the filesystem but to have a generic way of locating them using URIs and a pluggable way of resolving those URIs and loading the scripts from various locations. This means that you can, for example, load a script from an RHQ repository in your standalone CLI installation, or define one central location for your local CLI scripts and use the “modules” URIs to refer to them. Or you can easily implement your own “source provider” and, for example, load the scripts from your internal git repo or ftp or whatnot. RHQ comes with a small set of predefined source providers, documented here.
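To illustrate the idea of pluggable resolution, here is a small self-contained sketch. The SourceProvider interface, the in-memory provider and the locator are all invented for this example and are not RHQ’s actual provider API; the only assumption carried over from the text is that a script is identified by a URI such as modules:/myfuncs and that each provider decides whether it can serve it.

```java
import java.io.Reader;
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical interface for this sketch; the real RHQ API may look different. */
interface SourceProvider {
    /** Returns a reader for the script, or null if this provider cannot serve the URI. */
    Reader open(URI location);
}

/** Serves scripts from memory for URIs with one particular scheme. */
final class InMemoryProvider implements SourceProvider {
    private final String scheme;
    private final Map<String, String> scriptsByPath;

    InMemoryProvider(String scheme, Map<String, String> scriptsByPath) {
        this.scheme = scheme;
        this.scriptsByPath = scriptsByPath;
    }

    @Override
    public Reader open(URI location) {
        if (!scheme.equals(location.getScheme())) {
            return null; // not our kind of URI, let another provider try
        }
        String source = scriptsByPath.get(location.getPath());
        return source == null ? null : new StringReader(source);
    }
}

/** Asks each registered provider in turn until one of them can open the URI. */
final class SourceLocator {
    private final List<SourceProvider> providers = new ArrayList<>();

    void register(SourceProvider provider) {
        providers.add(provider);
    }

    Reader locate(URI location) {
        for (SourceProvider provider : providers) {
            Reader reader = provider.open(location);
            if (reader != null) {
                return reader;
            }
        }
        return null; // nobody knows how to load this script
    }
}
```

A provider for a git repo or an ftp server would plug into the same locator without the calling code having to change.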

With this ability at hand, you can make an RHQ repository a central place for your scripts that you will then be able to use universally – both in the standalone CLI installations and also in the alert notification scripts.

Python

In previous versions, our scripting was tied to Javascript. Thanks to quite a bit of refactoring, the RHQ scripting integration is now language independent and language support is pluggable (see my previous post where I detail how this was done in case of Python).

What this all means is that you can now write your CLI scripts in Python and still use the same API as you were able to use before from Javascript only. I.e. you will find the ResourceManager, AlertManager and all the other predefined variables that define the RHQ API available in Python, too. The only thing that this initial implementation doesn’t support is code-completion in the interactive CLI shell.

Last but not least, the ability to load the scripts from various locations is available in Python, too, using an automatically installed path_hook. You can read about how to use it on our wiki. This also means that you can now write your alert notification scripts in Python, too.

When running an alert notification script (i.e. an alert notification of the type “CLI Script”), the language of the script is determined from the script file extension – “.py” for python and “.js” for javascript. When you start the CLI shell, you pass your language of choice using the --language commandline parameter – “javascript” or “python” are the obvious possible values for it.
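The extension-based detection boils down to something like the following sketch. It is not the actual RHQ code; the class name is hypothetical, and throwing an exception for an unknown extension is an assumption made just for the example.

```java
// Hypothetical helper, not RHQ code.
final class ScriptLanguage {
    /** Maps a script file name to the language name used by the --language parameter. */
    static String fromFileName(String fileName) {
        if (fileName.endsWith(".py")) {
            return "python";
        }
        if (fileName.endsWith(".js")) {
            return "javascript";
        }
        // Assumption for this sketch: anything else is rejected.
        throw new IllegalArgumentException("Unsupported script extension: " + fileName);
    }
}
```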

Conclusion

In my opinion, these changes are great and will allow our users to really start building useful tools using our scripting support. If you feel like you’ve come up with a script module you would like to share with the RHQ community, why don’t you just send a pull request to our github repo with sample scripts?

The Dark Powers of PowerMock

Recently, we’ve started using Mockito and PowerMock in our testing. I won’t explain mocking and why or why not you should use it, but I want to share my experience with using PowerMock.

PowerMock comes with a very strong promise: “PowerMock uses a custom classloader and bytecode manipulation to enable mocking of static methods, constructors, final classes and methods, private methods, removal of static initializers and more.”

That is seriously cool, right? I thought so, too, but I stumbled upon several problems the very first time I tried to use it. Frankly, those problems, as always, stemmed from my lack of experience with the tool, but hey – everyone’s a novice at first. Let me share my experience with you.

The Problem

 
public class ClassUnderTest { 
    public InputStream method(boolean param, URI uri) throws Exception { 
        String scheme = param ? "https" : "http"; 
        URI replacedUri = new URI(scheme, uri.getAuthority(), uri.getPath(), uri.getQuery(), uri.getFragment()); 
        return replacedUri.toURL().openStream(); 
    } 
}

The above fabricated example expresses the essence of the testing challenge I faced (the real class was this). The method I wanted to test obtains a URI and transforms it based on some parameters. Then it tries to open a stream on the URI so that the caller can download the contents.
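To see what the transformation does on its own, here is the URI rewriting step pulled out into a standalone sketch that never touches the network. The class name is made up, and the checked exception is wrapped only to keep the example short.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper that isolates the scheme replacement from the download.
final class SchemeRewrite {
    /** Rebuilds the URI with an http or https scheme, keeping all other components. */
    static URI withScheme(boolean secure, URI uri) {
        String scheme = secure ? "https" : "http";
        try {
            return new URI(scheme, uri.getAuthority(), uri.getPath(), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withScheme(false, URI.create("blah://localhost/file.txt?x=1")));
    }
}
```

The part that is hard to test is only the openStream() call that follows this step in the real method.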

Because the URI that the method tries to download from is by design either an http or an https URL, it is kind of hard to test without actually standing up an HTTP server to serve the file during the test. This is of course not impossible and possibly would not be that hard, but I thought PowerMock could come to the rescue here. I should be able to mock those calls out in my tests.

Attempt #1 – mocking system classes

 
@Test
@PrepareForTest(ClassUnderTest.class)
public class MyTest { 
    @ObjectFactory 
    public IObjectFactory getObjectFactory() { 
        return new PowerMockObjectFactory(); 
    } 

    public void testMethod() throws Exception { 
        URI uriMock = PowerMockito.mock(URI.class); 
        URL urlMock = PowerMockito.mock(URL.class); 
        
        PowerMockito.whenNew(URI.class).withArguments("http", "localhost", null, null, null).thenReturn(uriMock); 
        Mockito.when(uriMock.toURL()).thenReturn(urlMock); 
        Mockito.when(urlMock.openStream()).thenReturn(new FileInputStream(new File(".", "existing.file"))); 
        
        ClassUnderTest testObject = new ClassUnderTest(); 
        testObject.method(false, new URI("blah://localhost")); 
    } 
} 

This should be fairly easy to understand for anyone who has used a mocking framework. I’m creating two mocks: one for the URI class and one for the URL class. Then I’m using PowerMock to capture the construction of a new URI (see the code of the ClassUnderTest) and returning my uriMock. The uriMock is set up to return the urlMock when its toURL() method is called. When the openStream() method is called on my urlMock, I’m returning an input stream of a local file.

Nice and easy, right? Except it doesn’t work. I get the following stacktrace as soon as I try to mock the URI class:

 org.mockito.exceptions.base.MockitoException: Mockito cannot mock this class: class replica.java.net.URI$$PowerMock0 Mockito can only mock visible & non-final classes. 

After a bit of googling, the cause is apparent – PowerMock cannot mock the system classes (unless the PowerMock Java agent is used). Ok, let’s try another approach, this time trying to avoid using mocks.

Attempt #2 – PowerMockito.whenNew(URL.class)

The idea behind this attempt is that PowerMockito can capture and override constructor calls. Because URI.toURL() constructs a new URL instance with a single string argument, we should theoretically be able to intercept that, right?

 
public void testMethod() throws Exception { 
    URL realUrl = new File(".", "existing.file").toURI().toURL(); 

    PowerMockito.whenNew(URL.class).withArguments("http://localhost").thenReturn(realUrl); 
    
    ClassUnderTest testObject = new ClassUnderTest(); 
    testObject.method(false, new URI("blah://localhost")); 
} 

As you might have guessed, this doesn’t work either. And frankly if it did, I’d have some serious questions about how it could. The constructor of URL is only called inside the toURL() of the URI which is a system class that PowerMock can’t touch. So, the third attempt.

Attempt #3 – PowerMockito.whenNew(URI.class)

What is the difference between this one and the previous attempt? Well, it took me a while to decipher the javadoc for the @PrepareForTest annotation, but it boils down to this: if you need to use the PowerMockito.whenNew method, you need to tell PowerMock to do bytecode manipulation on the class that (in some method) directly calls the given constructor. This is kinda understandable when you know what PowerMock is doing – it actually changes the bytecode of the “prepared” class so that any constructor calls (and other things) check for the rules defined using whenNew and other methods. You realize this for real when you try to debug the class under test (that has been prepared by PowerMock) – you can no longer be sure that what you see in the code is actually what is happening, because the bytecode of the class no longer exactly corresponds to what you see in the source code.

So to sum it up, here’s the code that works:

 
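import java.io.File;
import java.net.URI;

import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.testng.PowerMockObjectFactory;
import org.testng.IObjectFactory;
import org.testng.annotations.ObjectFactory;
import org.testng.annotations.Test;

@Test
@PrepareForTest(ClassUnderTest.class)
public class MyTest {
    // Lets PowerMock instantiate the test class so that the prepared classes are used
    @ObjectFactory
    public IObjectFactory getObjectFactory() {
        return new PowerMockObjectFactory();
    }

    public void testMethod() throws Exception {
        // A real URI pointing to a local file, returned instead of the http one
        URI realUri = new File(".", "existing.file").toURI();
        PowerMockito.whenNew(URI.class).withArguments("http", "localhost", null, null, null).thenReturn(realUri);

        ClassUnderTest testObject = new ClassUnderTest();
        testObject.method(false, new URI("blah://localhost"));
    }
}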

The constructor of the URI is intercepted and we return a “realUri”, i.e. a different instance of otherwise “normal” URI class. This works, because exactly that constructor with those arguments is called in the class under test that has been manipulated by PowerMock (as instructed by the @PrepareForTest annotation). From that point on, we don’t need any special behavior on either the URI or URL classes and so the code can stay untouched.

Conclusion

The conclusion is basically the famous 4 letters – RTFM :-) I just wanted to detail my journey through the dark corners of the PowerMock forest just in case some of you were as confused as I was when I first entered it.

Posted in Java, RHQ.

RHQ speaks Python

In the past few weeks I was quite busy refactoring RHQ’s CLI and scripting integration. Funnily enough, it all started because we wanted to add support for CommonJS modules to our javascript interface. During the course of the refactoring, I found out that I was actually heading in the direction of completely separating the “language” support from the rest of RHQ, which then only speaks to it through Java’s standard scripting APIs, which are language independent.

RHQ’s CLI was originally only implemented for and tightly coupled with javascript for which the JRE has support by default. The problem we had was that the version of Rhino (i.e. the Javascript implementation Java uses) that is bundled with the JRE does not support CommonJS modules while the newer versions do.
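For those who haven’t met the standard scripting API (javax.script) yet, here is a tiny sketch of the language-independent lookup it offers. The helper class is made up for this post; which engines are actually found depends on your JDK version and on what is on the classpath (Jython, for example, has to be added for “python” to resolve).

```java
import java.util.Optional;

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// Hypothetical helper around the standard javax.script lookup.
final class EngineLookup {
    /** Returns the engine registered under the given name, if there is one. */
    static Optional<ScriptEngine> byLanguage(String name) {
        // getEngineByName returns null when no engine is registered under that name
        return Optional.ofNullable(new ScriptEngineManager().getEngineByName(name));
    }

    public static void main(String[] args) {
        for (String language : new String[] {"javascript", "python"}) {
            System.out.println(language + " available: " + byLanguage(language).isPresent());
        }
    }
}
```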

But this is about Python, right? So once I saw that we have a nice little API that one can implement to add support for another language, I thought why not try bringing another language to RHQ? The obvious choice was Python – the most popular language among the ones that can integrate with Java. So I grabbed Jython and started looking into whether it would be possible to do everything we needed with it to implement our API. And it turned out it was – a mere 200 lines of Java code and RHQ can speak Python :)

Let’s look at what the API we needed to implement looks like:


public class PythonScriptEngineProvider implements ScriptEngineProvider {

    @Override
    public String getSupportedLanguage() {
        return "python";
    }

    @Override
    public ScriptEngineInitializer getInitializer() {
        return new PythonScriptEngineInitializer();
    }

    @Override
    public CodeCompletion getCodeCompletion() {
        // XXX are we gonna support code completion for multiple langs in the CLI?
        return null;
    }
}

Now that’s quite trivial, isn’t it? :) Of course, this is the basic interface which just delegates the real work to other classes. So let’s look at the ScriptEngineInitializer – the class that really does all the important work:


public class PythonScriptEngineInitializer implements ScriptEngineInitializer {

    private static final Log LOG = LogFactory.getLog(PythonScriptEngineInitializer.class);

    static {
        Properties props = new Properties();
        props.put("python.packages.paths", "java.class.path,sun.boot.class.path");
        props.put("python.packages.directories", "java.ext.dirs");
        props.put("python.cachedir.skip", false);
        PythonInterpreter.initialize(System.getProperties(), props, null);
    }

    private ScriptEngineManager engineManager = new ScriptEngineManager();

    @Override
    public ScriptEngine instantiate(Set<String> packages, PermissionCollection permissions) throws ScriptException {

        ScriptEngine eng = engineManager.getEngineByName("python");

        //XXX this might not work perfectly in jython
        //but we can't make it work perfectly either, so let's just
        //keep our fingers crossed..
        //http://www.jython.org/jythonbook/en/1.0/ModulesPackages.html#from-import-statements
        for (String pkg : packages) {
            try {
                eng.eval("from " + pkg + " import *\n");
            } catch (ScriptException e) {
                //well, let's just keep things going, this is not fatal...
                LOG.info("Python script engine could not pre-import members of package '" + pkg + "'.");
            }
        }

        //fingers crossed we can secure jython like this
        return permissions == null ? eng : new SandboxedScriptEngine(eng, permissions);
    }

    @Override
    public void installScriptSourceProvider(ScriptEngine scriptEngine, ScriptSourceProvider provider) {
        PySystemState sys = Py.getSystemState();
        if (sys != null) {
            sys.path_hooks.append(new PythonSourceProvider(provider));
        }
    }

    @Override
    public Set<String> generateIndirectionMethods(String boundObjectName, Set<Method> overloadedMethods) {
        if (overloadedMethods == null || overloadedMethods.isEmpty()) {
            return Collections.emptySet();
        }

        Set<Integer> argCnts = new HashSet<Integer>();
        for (Method m : overloadedMethods) {
            argCnts.add(m.getParameterTypes().length);
        }

        String methodName = overloadedMethods.iterator().next().getName();
        StringBuilder functionBody = new StringBuilder();

        functionBody.append("def ").append(methodName).append("(*args, **kwargs):\n");
        functionBody.append("\t").append("if len(kwargs) > 0:\n");
        functionBody.append("\t\t").append("raise ValueError(\"Named arguments not supported for Java methods\")\n");
        functionBody.append("\t").append("argCnt = len(args)\n");

        for (Integer argCnt : argCnts) {
            functionBody.append("\t").append("if argCnt == ").append(argCnt).append(":\n");
            functionBody.append("\t\treturn ").append(boundObjectName).append(".").append(methodName).append("(");
            int last = argCnt - 1;
            for (int i = 0; i < argCnt; ++i) {
                functionBody.append("args[").append(i).append("]");
                if (i < last) {
                    functionBody.append(", ");
                }
            }
            functionBody.append(")\n");
        }

        return Collections.singleton(functionBody.toString());
    }

    @Override
    public String extractUserFriendlyErrorMessage(ScriptException e) {
        return e.getMessage();
    }
}

The most important task of the initializer is to instantiate the script engine of the language it supports and initialize it – pre-import the java packages of RHQ’s classes and apply java security to the script engine. The other tasks it has are to install a “script source provider” into the engine (the script source provider is a class that is able to locate a script “somewhere”), to extract a user-friendly error message from the script exception and finally to generate “indirection methods” – basically define top level functions that delegate to a method on a certain object. All these methods are there so that RHQ can correctly set up the bindings that the scripts then can use to access and manipulate RHQ data.

I won’t be listing the source of the class that integrates the source providers with Python; you can take a look at it here. But I’ll show you how, in your local CLI session, you can import a python script stored in some repository on the RHQ server:


import sys

sys.path.append("__rhq__:rhq://repositories/my_repo/")

import my_script as foo

...

RHQ has a path_hook in Python that looks for paths prefixed with __rhq__:. After the prefix you can specify a root URL that RHQ’s source provider understands. The import statement then looks for a module under that URL. In the example above, you will import the script called my_script.py that is stored on the RHQ server in the repository called my_repo.

So that’s it. You can see that adding support for another scripting language is not that hard. What language will you add? ;-) You can read more about the language support on the RHQ wiki.

Posted in Java, RHQ.

RHQ meets Arquillian

Historically, RHQ has had a little bit of a problem with test coverage of its various (agent) plugins. There is a multitude of problems with testing these but the following two are, IMHO, the main ones:

Managed Resources

You somehow need to have the managed resource available for the plugin to connect to (i.e. you need to have the JBoss AS, Postgres or whatever your plugin manages). This is always a problem for a clean quick unit test. You either somehow need to mock the managed resource (try that with Postgres) or you need to have a way of configuring your test to get at or start the managed resource. This is where Arquillian certainly can come to the rescue with its ability to manage the lifecycle of its “containers” (for managed resources that have an Arquillian extension, like JBoss AS) but generally this needs to be in the “hands” of the tests for each plugin. There are a million ways the plugins talk to their managed resources and so trying to come up with a generic solution to start, stop and configure them would IMHO create more problems than it would solve.

Setting up Agent Environment

While not too hard, running your test in RHQ’s plugin container requires a little bit of setup. It is important to realize that if you want your tests to be run inside a real plugin container (i.e. “almost an RHQ agent”), it is not enough to have your dependencies on your test classpath. The thing is that the plugin container is a container of its own – it has its own deployment requirements and classloading policies. It is best to think about deploying a plugin into the RHQ agent as deploying a webapp into Tomcat – you wouldn’t expect to be able to test the webapp in Tomcat just by virtue of having them both on the classpath and starting Tomcat.

So to put it straight, you need to jump through some maven and antrun hoops to package your plugin (and any other plugin it depends on) and put them in defined locations, where the plugin container can then pick them from. Also, if you want to take advantage of our native APIs to obtain running processes, etc., you need to use another bucket of antrun incantations in your POM to set that up.

Previous Attempts

The two problems outlined above are the reason why the test coverage of our plugins is rather low. We always knew this sucked and there have been attempts to change that.

A ComponentTest class used in some of our plugins is an attempt at testing the plugins out-of-container, bootstrapping them with some required input. The advantage of this approach is that you don’t need to care about the plugin container and its intricacies; the disadvantage is that you don’t get to test your plugin in the environment it will be deployed to. Also, you need to implement support for bootstrapping the parameters for any plugin facet your plugin implements – in the end you’d end up reimplementing large parts of the plugin container just for the testing needs.

Another attempt was the @PluginContainerSetup annotation that took care of the configuration and lifecycle of the plugin container. The advantage was that you got access to a real plugin container running with your plugins, disadvantage being that you still were required to perform some maven and antrun artistry so that the plugin container could find all the plugins and libraries you’d need.

Enter Arquillian

As I already hinted at above, the RHQ agent shares a lot of similarities with EE/Servlet containers from the deployment point of view. Arquillian was therefore an obvious choice to try and solve our plugin test problems once and for all (well, this is a lie – the problem with having to have a managed resource available for the test is a problem that cannot be reasonably solved using a single solution).

So what is this integration about? It certainly won’t help you, as the plugin developer, with connecting to a managed resource you’re creating your plugin for. But it does bring you a lot of convenience over the previous state of things if you want to test your plugin in container.

Most importantly, no maven and/or antrun setup is required any more to test your plugin in-container. You just define your plugin in the Arquillian way using the @Deployment annotation (and you can “attach” to it any other plugins it depends on by instructing Arquillian to use the maven resolver). Using arquillian.xml (yes, a configuration file, but an order of magnitude shorter and much more focused and simple than pom.xml), you can configure your container to use RHQ’s native APIs by flipping one config property to true. You can declaratively say you want to run discovery of managed resources (using, surprise, a @RunDiscovery annotation) and you get the results of such discovery injected into a field in your test class. You can even set the container up so that it thinks it is connected to an RHQ server and you can provide your ServerServices implementation (i.e. the RHQ server facade interface); there is a default implementation ready that uses Mockito to mock your serverside. There’s still more; you can read all about the supported features and see some examples on this wiki page.

Conclusion

While not a panacea for all the problems that testing RHQ plugins brings about, using Arquillian we were able to cut the setup needed to run a plugin in-container by 90% and we were able to introduce a number of convenience annotations with which you can get a variety of data injected into your unit tests. This is still just a beginning though; the next step is to start actually using this integration and come up with other useful annotations and/or helper methods/classes that will make working with and retrieving information from the plugin container as easy as possible.

RHQ CLI over XMPP

I watched the great demo of the XMPP server plugin for RHQ from Rafael Chies. Rafael is using a custom DSL to query the RHQ server for information but I thought that that really shouldn’t be necessary – it should be possible to use an ordinary CLI session behind this. Granted – the “query language” of our remote API is more complicated than the simple DSL Rafael is using but at the same time, the API we use in the CLI is much more feature rich and I wouldn’t have to reimplement any of it if I was able to “blend” the CLI session with the XMPP chat.

So I forked Rafael’s code on github and went off to work. During the course of reimplementing Rafael’s code I discovered 2 bugs in RHQ itself (BZ 786106 and BZ 786194) which I fixed immediately (well, it took me a couple of hours to figure out what the hell was going on there ;) ). After that, it wasn’t really that hard to integrate XMPP and the CLI’s script engine and here’s a short video to prove that it actually works :-) :

RHQ CLI over XMPP on Vimeo.

For the interested, all the important code is included in this class.
