Sunday, December 30, 2012

Logging Key Events on Mac OS X with JNA

In the previous post I introduced the APIs available in Mac OS X to get access to key events. The task was to write a small Eclipse plugin to log all keystrokes including hot-keys issued by the user while using the IDE. To my knowledge, this is not possible just using Java. The basic idea was to use JNA in order to call the native Objective-C APIs to log the key events.

Road Blocks

The first thing I tried was to write a simple call to Cocoa's NSEvent API (as outlined last time) via JNA. As a reminder, the goal was to implement this one-liner in JNA:

    [NSEvent addLocalMonitorForEventsMatchingMask:NSKeyDownMask 
                                          handler:
      ^NSEvent *(NSEvent * event) {
       /**
        * Here goes your logging...
        */
        return event;
    } ];

Easy, right?

Not quite, as I found out. First, it's Objective-C, and you can't call it from JNA out of the box. As you may know, a message in Objective-C is sent to a class (in Java terms: a static method is called) like this:

   [classname messagename:argument]

The compiler translates this into something that could also be expressed as:

   Class cls = objc_getClass(classname);
   SEL selector = sel_registerName(messagename);
   objc_msgSend(cls, selector, argument);

It starts by looking up the class in the Objective-C runtime (line 1: objc_getClass). Then (line 2), sel_registerName registers a message name with the runtime. In our case, the message has already been registered (remember, we are trying to call an existing API), so this function just returns the so-called selector identifying the message. The last call actually sends the message to the class as the receiver.

This API exposes quite clearly the dynamic nature of Objective-C, which is based on message passing to receivers determined at runtime instead of just calling methods on entities defined at compile time.
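To make the lookup-then-send flow concrete, here is a toy model in plain Java. It is not the real runtime and does not call libobjc; the class ToyObjCRuntime, its methods, and the "Greeter" example are all made up for illustration. It only mimics two properties described above: registering a message name twice yields the same selector, and the implementation is looked up at the moment the message is sent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only: mimics the lookup-then-send flow, not the real libobjc.
final class ToyObjCRuntime {
    // Selector table: one canonical selector object per message name.
    private final Map<String, Object> selectors = new HashMap<>();
    // Class table: class name -> (selector -> implementation).
    private final Map<String, Map<Object, Function<Object, Object>>> classes = new HashMap<>();

    // Like sel_registerName: registers on first use, afterwards returns the same selector.
    Object registerName(String name) {
        return selectors.computeIfAbsent(name, n -> new Object());
    }

    // Hypothetical helper that populates the toy class table.
    void addMethod(String className, String messageName, Function<Object, Object> impl) {
        classes.computeIfAbsent(className, c -> new HashMap<>())
               .put(registerName(messageName), impl);
    }

    // Like objc_getClass: returns the method table, or null for an unknown class.
    Map<Object, Function<Object, Object>> lookUpClass(String className) {
        return classes.get(className);
    }

    // Like objc_msgSend: the implementation is resolved when the message is sent.
    Object msgSend(Map<Object, Function<Object, Object>> cls, Object selector, Object argument) {
        Function<Object, Object> impl = cls.get(selector);
        if (impl == null) {
            throw new IllegalStateException("receiver does not respond to selector");
        }
        return impl.apply(argument);
    }

    public static void main(String[] args) {
        ToyObjCRuntime rt = new ToyObjCRuntime();
        rt.addMethod("Greeter", "greet:", name -> "Hello, " + name);
        Object selector = rt.registerName("greet:");
        System.out.println(rt.msgSend(rt.lookUpClass("Greeter"), selector, "world"));
    }
}
```

Because the selector is resolved by name and the method table is consulted on every send, nothing about the receiver has to be known at compile time, which is exactly what makes bridging from another language possible.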

You might wonder why these low-level functions are exposed in the Objective-C runtime at all. The reason is that they allow you to write all sorts of bridging layers between Objective-C and other languages. As a matter of fact, there used to be an Apple-provided Java-Objective-C bridge.

Well, not any more. Java on the desktop seems to be considered legacy technology by Apple. So we have to build all this by hand in JNA. Still no big deal, as we only need the three runtime functions mentioned above.

import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Pointer;

public interface ObjC extends Library {

    // Loads libobjc. Newer JNA releases offer Native.load as the replacement for loadLibrary.
    ObjC INSTANCE = (ObjC) Native.loadLibrary("objc", ObjC.class);

    // id objc_msgSend(id self, SEL op, ...): the result comes back as an opaque pointer
    Pointer objc_msgSend(Pointer receiver, Pointer selector, Object... args);

    // Class objc_getClass(const char *name)
    Pointer objc_getClass(String name);

    // SEL sel_registerName(const char *str)
    Pointer sel_registerName(String str);

}

All that is left to do now is to call [NSEvent addLocalMonitorForEventsMatchingMask:handler:] from Java using the functions now available via JNA.

  ObjC objc = ObjC.INSTANCE;
  Pointer cls = objc.objc_getClass("NSEvent");
  objc.objc_msgSend(cls, objc.sel_registerName("addLocalMonitorForEventsMatchingMask:handler:"), mask.getMask(), block);

Brilliant. Except it does not work. Why? Well, if you scroll to the far right of the code snippet (ignoring the mask.getMask() call that defines the event mask), you will notice that the last argument is something I clumsily named block, because it is an Objective-C block acting as the callback when a key event is received. So surely this must work just like a function pointer, I thought. Not quite. Time to look at blocks in more detail.

Objective-C Blocks and JNA

Blocks are a C-level language feature built into the compilers that have shipped with Mac OS X since 10.6. They are closures that capture the enclosing lexical scope. So when calling from Java, you cannot just pass a com.sun.jna.Callback instead and pray that it works.

One of the reasons why this won't work is that you would pass in a function pointer where actually a pointer to a block literal would be expected.

A block literal is a C struct of the following form:

struct Block_literal_1 {
    void *isa; 
    int flags;
    int reserved; 
    void (*invoke)(void *, ...); //function pointer
    struct __block_descriptor_1 *descriptor;//omitted for brevity
    // imported variables go here
};

The important part for now is that it contains a pointer to a function with the logic you put into your block. You will find that this function pointer is located at offset 16 (assuming x86_64). The struct contains an isa pointer (8 bytes, so it's structurally also an Objective-C object!) and two integer fields (4 bytes each) followed by the function pointer.
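The offset arithmetic can be checked with a few lines of plain Java. This is only a sketch that recomputes the layout by hand, assuming natural alignment and 4-byte ints; the class and method names are made up for illustration, and the pointer size is a parameter so that the 32-bit case can be compared.

```java
// Sketch only: recomputes the field offsets of Block_literal_1 by hand.
// Assumes natural alignment; pointer size is a parameter (8 on x86_64, 4 on i386).
final class BlockLiteralLayout {
    static final int INT_SIZE = 4;

    // Rounds offset up to the next multiple of alignment.
    static int align(int offset, int alignment) {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Offset of the invoke function pointer: after isa, flags and reserved.
    static int invokeOffset(int pointerSize) {
        int flags = pointerSize;          // isa occupies [0, pointerSize)
        int reserved = flags + INT_SIZE;  // flags is a 4-byte int
        return align(reserved + INT_SIZE, pointerSize);
    }

    // Offset of the descriptor pointer, directly after invoke.
    static int descriptorOffset(int pointerSize) {
        return invokeOffset(pointerSize) + pointerSize;
    }

    public static void main(String[] args) {
        System.out.println("invoke offset on x86_64: " + invokeOffset(8));
        System.out.println("descriptor offset on x86_64: " + descriptorOffset(8));
    }
}
```

With 8-byte pointers the two ints fill exactly one pointer-sized slot, so no padding is needed and invoke lands at 16. With 4-byte pointers the same computation gives 12.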

If we then look at what the call of the block looks like in assembly, we find another difference from a regular function pointer. We assume that a pointer to the block literal is stored in %rax and a pointer to the argument for the block (in our case an instance of NSEvent) is held on the stack at -32(%rbp). An invocation of the block could look like this:

0x100001452:  movq   %rax, %rdx //copy the block literal pointer to %rdx
0x100001455:  movq   -32(%rbp), %rsi //copy the argument to %rsi
0x100001459:  movq   %rdx, %rdi // copy the block literal pointer to %rdi
0x10000145c:  callq  *16(%rax) //call the function contained in the block literal (offset 16 remember)

The block function is thus invoked with the block literal as the implicit first argument and all other explicit arguments follow shifted by one place. In our case it's just the pointer to the NSEvent instance that comes in %rsi instead of %rdi as you would expect for a regular function call.
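The same idea can be modelled in plain Java. The following is a toy, not a way to create real blocks: ToyBlock and its fields are invented for illustration. The invoke field stands in for the function pointer inside the literal, and the caller hands the block itself over as the first argument, which is how the block function reaches its captured variables.

```java
import java.util.function.BiFunction;

// Toy model only: a block as a struct-like object whose function takes the
// block itself as its first argument, followed by the explicit argument.
final class ToyBlock {
    // Stands in for the invoke pointer stored inside the block literal.
    final BiFunction<ToyBlock, Object, Object> invoke;
    // Stands in for the "imported variables" captured from the enclosing scope.
    final String capturedPrefix;

    ToyBlock(String capturedPrefix, BiFunction<ToyBlock, Object, Object> invoke) {
        this.capturedPrefix = capturedPrefix;
        this.invoke = invoke;
    }

    // What the calling code does: fetch invoke from the literal, then pass the
    // literal as argument zero and shift the explicit argument by one place.
    static Object call(ToyBlock block, Object argument) {
        return block.invoke.apply(block, argument);
    }

    public static void main(String[] args) {
        ToyBlock logger = new ToyBlock("key: ", (self, event) -> self.capturedPrefix + event);
        System.out.println(ToyBlock.call(logger, "a"));
    }
}
```

A plain callback with the signature (event) would be handed the block where it expects the event, which is the mismatch described above.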

Even if you managed to get your function pointer in there, it would never be executed. The calling code expects the actual function pointer at 16(%rax), and it passes the block literal as an additional first argument, so every explicit argument arrives one position later than a plain function would expect.

So in order to make this work from JNA, you would need to fabricate a structure isomorphic to the ones created by the compiler for Objective-C blocks and pass that instead. This would require a change to JNA itself.

Two Solutions

If you are free to choose your dependencies, you could just give up on JNA and either use JNI or give BridJ a try. BridJ seems relatively immature compared to JNA, but it implements support for Objective-C blocks. It does so by creating an empty block as a template and then manipulating the function pointer inside the struct.

But I did not want to introduce another dependency just for the Mac OS version of the plugin or resort to JNI. Consequently, my solution was to implement a Quartz event tap using JNA. As explained in the last post, this is a low-level C API intended for the implementation of assistive devices and such. It works with JNA because it does not use blocks. If you are interested in the code, it can be found here.
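An event tap is told which events to deliver through a bit mask, the Quartz counterpart of NSKeyDownMask. As a small illustration, here is how such a mask could be computed on the Java side. The event type numbers mirror the CGEventType constants from CGEventTypes.h and maskBit mirrors the CGEventMaskBit macro. The class name and the choice of events in keyLoggingMask are assumptions made for this sketch and are not taken from the linked code.

```java
// Sketch only: builds the kind of event mask an event tap is created with.
// The numeric event types mirror the CGEventType constants in CGEventTypes.h.
final class EventMasks {
    static final int KEY_DOWN = 10;       // kCGEventKeyDown
    static final int KEY_UP = 11;         // kCGEventKeyUp
    static final int FLAGS_CHANGED = 12;  // kCGEventFlagsChanged

    // Java counterpart of the CGEventMaskBit macro: one bit per event type.
    static long maskBit(int eventType) {
        return 1L << eventType;
    }

    // Hypothetical choice: key presses plus modifier changes.
    static long keyLoggingMask() {
        return maskBit(KEY_DOWN) | maskBit(FLAGS_CHANGED);
    }

    public static void main(String[] args) {
        System.out.println("event mask: 0x" + Long.toHexString(keyLoggingMask()));
    }
}
```

The mask is a long because CGEventMask is a 64-bit unsigned integer on the native side.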
