Friday, December 31, 2010

Optimizing concurrency with Freescale’s Deadlock Preventer

The overall concurrency of a program, given it already uses an optimal number of threads, will depend on how much lock contention is occurring.

To avoid both deadlocks and lock contention, synchronization primitives should be the most granular possible given a fixed number of primitives.

For example, in the following snippet:

public synchronized void setProperty(Object newValue) {
    value = newValue;
    notifyPropertyChange();
}

private void notifyPropertyChange() {
    synchronized(listeners) {
        for (IListener listener : listeners)
            listener.changed();
    }
}

The setProperty() method is synchronized in order to protect the “value” data from concurrent data access.  Inadvertently, the synchronized block will cause all the code path executed by the listener.changed() invocation to needlessly have the synchronized(this) lock as a precedent.

This will not only cause the code to have a sub-optimal concurrent execution, but also introduce a lock order dependency between ‘this’, ‘listeners’, and whatever lock is acquired by the listener’s changed() implementation, possibly eventually causing a deadlock (in the Deadlock' Preventer’s terminology, the ‘listeners’ lock is then a ‘follower’ of the ‘this’ lock, and the ‘this’ lock a precedent of the ‘listeners’ lock).

The code could be re-written as follows:

public void setProperty(Object newValue) {
    synchronized(this) {
        value = newValue;
    }
    notifyPropertyChange();
}

By simply reducing the granularity of the ‘this’ lock acquisition, we improve possible concurrent execution, eliminate the lock dependency, and avoid lock ordering acquisition problems between ‘this’ and ‘listeners’, and ‘this’ and all listener’s changed() implementations, without any negative effect whatsoever.

We could also re-write the notifyPropertyChange() implementation as follows:

private void notifyPropertyChange() {
    IListener[] tmp;
    synchronized(listeners) {
        tmp = listeners.clone();
    }
    for (IListener listener : tmp)
        listener.changed();
}

At the cost of the clone() statement (something probably insignificant compared to the code executed in the listeners), the lock dependency is completely removed between ‘this’, ‘listeners’, and all lock acquisitions in the listener’s changed() implementations.

In a java program, the locks with the most number of followers are not only the most likely to cause deadlocks, but also the locks that are likely to reduce overall system concurrency.

It is unfortunately very easy to induce unnecessary dependencies between synchronization primitives, and hard to discover which locks have effectively a large number of followers.

Without a tool such as the Deadlock Preventer, which helps identifying which locks have the most followers, developers would be reduced to half hazardly trying to improve lock granularity without knowing where to start.

This is very similar to performance optimization without using a profiler:  without knowing what takes the most time, developer’s time is easily wasted, and so is concurrency optimization.

Help with the Deadlock Preventer

With the latest release of the Deadlock Preventer, it is now possible to use the Deadlock Preventer view to download lock statistics for a java program.

image

By right-clicking on the process name in the “Conflicts” tab, and selecting the “Statistics…” menu, the following dialog appears:

image

The table lists all locks in the program, the number of followers and precedents, and their acquisition locations.  More information can be obtained for each lock by selecting its row and doing copy, which puts the following information in the clipboard (for the first lock in the table above):

lock: java.lang.Object (id=2), precedents(1) followers(1)
  zzz.Main.test(Main.java:14)
  zzz.Main.main(Main.java:6)
  precedents:
    java.lang.Object (id=1), thread id(1 (main))
     zzz.Main.test(Main.java:13)
     zzz.Main.main(Main.java:6)
  followers:
    java.lang.Object (id=1), thread id(13 (Thread-2))
     zzz.Main$1.run(Main.java:22)
     java.lang.Thread.run(Thread.java:619)

The text contains detailed information about each precedents and followers, along with the stack trace in which their relationship to the lock were recorded.

The information for all locks can also be exported to a single text file by using the “Export…” button.

One very useful way to use the statistics generated by the Deadlock Preventer is to sort all the locks by their follower count.

To give an example of what large programs lock statistics look like, the following table shows what the Eclipse IDE synchronization primitives statistics just after opening the workbench window:

image

Listing the top followers’ count locks out of a total of 21 830 locks.

Running Freescale’s Deadlock Preventer from the command line

One of the best ways to detect potential deadlocks is to integrate the Deadlock Preventer in a release engineering process, so it can run with the java program executing the automated tests, and have an excellent code coverage to analyze the system’s synchronization primitives.

To integrate the Deadlock Preventer in a release engineering process, although, means being able to configure the test scripts to run the Deadlock Preventer.

Adding the Deadlock preventer is as simple as adding the following two arguments to the “java” executable:

-javaagent:path-to\com.freescale.deadlockPreventer.wrapper_1.0.0\com.freescale.deadlockpreventer.jar

-Xbootclasspath/a:path-to\com.freescale.deadlockPreventer.wrapper_1.0.0\com.freescale.deadlockpreventer.jar;path-to\javassist.wrapper_1.0.0\javassist.jar

Where the two plugins “com.freescale.deadlockPreventer.wrapper” and “javassist.wrapper” can be obtained from the following link(in the build/eclipse/plugins sub-folder):

https://github.com/sbeauchamp/Deadlock-Preventer/zipball/master

The following java property can be specified to write the Deadlock Preventer output to a file:

-Dcom.freescale.deadlockpreventer.logToFile=path-to\deadlock_preventer_output.log

Alternatively, the output can be re-directed to the standard output with the following property:

-Dcom.freescale.deadlockpreventer.logToStream=out

or

-Dcom.freescale.deadlockpreventer.logToStream=err

Memory Requirements

An important factor to keep in mind, is that running a java program with the Deadlock Preventer will cause all locks ever acquired to be referenced in memory by the analysis engine, so they will not be garbage collected for the life time of the program.

This can cause some significant memory consumption increase, and may require the –Xmx flag to be passed to java to be increased.

For example, running an eclipse.exe instance with the the Deadlock Preventer can increase its memory usage by several hundred MBs.

Running the Deadlock Preventer interactively

Another alternative to either running the Deadlock Preventer in the Eclipse workbench with the UI or running it in an automated build, is to use the built-in command line server/client.

To start the server, the following command can be used:

java –cp path-to\com.freescale.deadlockPreventer.wrapper_1.0.0\com.freescale.deadlockpreventer.jar com.freescale.deadlockpreventer.ConsoleNetworkServer 43537

The last argument is the port number of the server, which can be omitted, in which case the default port will be used, 43537.

The following output will then appear to the command line:

Server started on port: 43537
services:
  report: report.2
  query: query.1
Type 'help' for a list of commands.

This information can then be used to pass the following two properties to the java instance that is running the Deadlock Preventer (the one that is getting the –javagent flag):

-Dcom.freescale.deadlockpreventer.reportService=localhost:43537:report.2

-Dcom.freescale.deadlockpreventer.queryService=localhost:43537:query.1

Note that if the server is started on a different machine, the “localhost” string must be replaced by the IP address of the server.

When the Deadlock Preventer is connected to the server, any inconsistent locking detected will be displayed on the server’s console interactively, so that the user can decide to continue, or abort the program.  The server console can also be used to query lock statistics with the following commands:

getLockCount

      Returns the number of locks in the program.
getDetails index end

      Return detailed information on the lock(s) from index to end (being numbers between 0 and the number of locks in the program)
getAll

      Return all lock information as output on the standard out stream
writeAll file-path

      Download all lock information and write the complete output in a file

Note that downloading lock statistics can require quite a lot of memory and bandwidth (several hundred MBs for an eclipse workbench, for example).

Monday, December 6, 2010

Freescale's Deadlock Preventer is now released!

To try it out:
  1. Be sure to have an Eclipse 3.6.1 SDK layout, and using JDK 1.6 as your default JRE (jdk 1.7 doesn't work yet)
  2. Install Git
  3. Set your user and email in Git
  4. Type in the shell (cygwin on windows):
git clone git@github.com:sbeauchamp/Deadlock-Preventer.git
This will create a directory on your file system containing all the deadlock preventer plugins.
If you simply want to run the deadlock preventer tool, get the 4 plugins under
./Deadlock-Preventer/build/eclipse/plugins/
And copy them in the eclipse/dropins of your eclipse directory:
You will see that a new view is available from "Other/Deadlock Preventer".

alt
The easiest way to try is out is to create a new java project, with the following sample code in Main.java:
public class Main {
public static void main(String[] args) {
new Main().test();
}
Object lock = new Object();
Object lock2 = new Object();
private void test() {
synchronized(lock) {
synchronized(lock2) {
code();
}
}
Thread thread = new Thread(new Runnable() {
public void run() {
synchronized(lock2) {
synchronized(lock) {
code();
}
}
}
});
thread.start();
try {
thread.join();
} catch (InterruptedException e1) {
}
}
private void code() {
System.out.println("test");
}
}
Then create a new launch configuration to run the program. The new configuration will automatically appear in the combo box under the tab "Eclipse JDT Debugging" of the "Deadlock Preventer" view.
Once you click "Debug" from that tab (using the "Run/Debug" menu does not have the same effect), the program runs with the deadlock preventer automatically instrumenting the byte code.
You will then see the following error being reported:

altAnd if double-clicked the full information is displayed:

alt
By setting a breakpoint on "RuntimeException"s and using the 'Interactive' or “throw exception" mode, you are able to break in the debugger at the exact point where the inconsistent locking order is detected.
This should be enough to get your started with using the deadlock preventer, and investigating potential deadlocks in your programs.
In a next post, I will post the different options and runtime settings of the deadlock preventer including those useful to integrate it in a release engineering process.
P.S. I submitted a talk to EclipseCon 2011 on the deadlock preventer and using it to prevent deadlocks in Eclipse plugins. Please drop a comment in the submission if you think it would be an interesting talk.

Tuesday, November 30, 2010

How to Debug and Detect Deadlocks

Deadlocks in 2 minutes

Deadlocks are software bugs where multiple threads (typically one of them being the main thread) wait indefinitely on common synchronization primitives (locks) because their resolution is mutually interdependent.

Thread A waits on lock L1 to be released, but it held by thread B, who in turns waits on lock L2 to be released while it is held by thread A.

The scenario above is by far the most common source of deadlocks. The fundamental cause is that the locks are acquired in a different order by two different threads, and depending on an unpredictable race condition, can deadlock at runtime.

For example, the following snippet can cause a deadlock when Property.get() and Property.set() are accessed by two different threads:

class Property {
    static final public String DEFAULT_VALUE = "...";
    HashMap map = new HashMap();
    ArrayList listeners = new ArrayList();

    synchronized void set(String key, String value) {
        synchronized (map) {
            map.put(key, value);
        }
        for (Listener listener : listeners)
            listener.notifyChanged();
    }

    String get(String key) {
        synchronized (map) {
            String value = map.get(key);
            if (value == null) {
                value = DEFAULT_VALUE;
                set(key, value);
            }
            return value;
        }
    }
}

When the method Property.set() is called, the lock for "this" and "map" are acquired (using the synchronized keyword), while when Property.get() is called, the locks "map" and "this" are acquired, in the opposite order than Property.set().

This can cause a random deadlock when Property.get() and Property.set() are called from two different threads.

The example illustrate that:

  1. It is easy to write code that has incorrect lock acquisition order.
  2. It is hard to detect such errors, until a deadlock occurs randomly at runtime, and it might not be consistently reproducible.

Considering large source base where thousand of line of code acquire locks and have a very complex code path, it is often nearly impossible to prevent deadlocks from occurring at runtime, at the developer are faced with reducing the concurrency of the software, leading to poor usability, and sub-optimal performance.

Here is where the deadlock preventer tool comes to help.


The Deadlock Preventer

The deadlock preventer is a tool that dynamically instruments the java bytecode of an application when running in a java virtual machine in order to analyze the lock ordering and detect potential deadlocks without the need to rely on unpredictable and unreliable race conditions.

By running the deadlock preventer while executing the java program (the Eclipse IDE in our case) with a good code coverage (basically covering all features that the customers are susceptible to use), we can ensure that the code does not contain incorrect lock order, and is prone to deadlocks at runtime.

When we run the Property class in the example above with the following junit test code:

@Test
public void testProperty() {
    final Property property = new Property();
    Thread thread = new Thread(new Runnable() {
        @Override
        public void run() {
            property.get("foo");
        }
    });
    thread.start();
    try {
        thread.join();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    try {
        property.set("bar", "value");
        fail("should throw an exception");
    } catch (RuntimeException e) {
        assertTrue(e.getMessage().contains("ERROR"));
    }
}

We obtain a RuntimeException (as validated by the junit test case - something optional for the deadlock preventer to issue), but also the following error report on the console:

***DEADLOCK PREVENTER*** ERROR: Inconsistent locking order while acquiring lock: java.util.HashMap (id=0) in thread: 1 (main)
Property.set(Main.java:812)
Main.testProperty(Main.java:38)
...
(junit stack)
with predecent : Property (id=1)
Property.set(Main.java:811)
Main.testProperty(Main.java:38)
...
(junit stack)
Previously acquired lock: java.util.HashMap (id=0) in thread: 9 (Thread-0)
Property.get(Main.java:818)
Main$1.run(Main.java:28)
java.lang.Thread.run(Thread.java:619)
Previously acquired precedent: Property (id=1)
Property.set(Main.java:811)
Property.get(Main.java:822)
Main$1.run(Main.java:28)
java.lang.Thread.run(Thread.java:619)

The important aspect is that we receive this error report deterministically, each time the code runs, whether or not the code actually deadlocks at runtime given a race condition (something we made impossible with the thread.join() call).

Once the error is detected, it can then be corrected by the developer.

Note also that the deadlock preventer will report the error whether or not the code is actually susceptible to run concurrently or not, which in our case was impossible because we made the main thread block until the second thread finished execution (something that defeats the purpose to have a separate thread to begin with).

The deadlock preventer can also issue warnings when an inverse lock acquisition is detected in the same thread, but never run in different thread - it is always a good programming practice to have consistent lock order.

Using the deadlock preventer agent in a java program does not change any of the application logic and code path, but it does change its performance characteristics. Specifically, acquiring and releasing locks become computationally expensive. Ricks of actual deadlocks are effectively increased too, since the time spent while an incorrect lock acquisition order is progress is longer, especially if the 'Interactive' setting is used (see below).

Therefor, if the java program deadlocks, the last conflict reported by the deadlock preventer agent will be the out that refers to the deadlock obtained.

Using the Deadlock Preventer Agent

The deadlock preventer agent is available in two forms for two different development role.

  1. As a set of eclipse plugins for developers to test existing code, and investigate potential deadlocks.
  2. As a standalone launcher for testers, to run on a test machine in the background while the Eclipse Workbench is performing automated tests.

The Deadlock Preventer Eclipse integration.

agent_view

The deadlock preventer Eclipse integration consist of 3 plugins that contribute a new view in eclipse, as shown above (the view is accessible from the menu "Menu / Show View/Other..." under the "Other/Deadlock Preventer" section).

The Deadlock Preventer view allows the developer to select an existing launch configuration (only launch configurations for "Eclipse Application" as listed), configure settings, and click "Debug". The debugged Eclipse session will be automatically instrumented, and report any error in the view, under the "Conflicts" tab.

The best way to use the tool for an Eclipse developer, is first to add a breakpoint on the RuntimeException, then use the 'Interactive' mode, which allows to skip over some errors that are not of interest, and throw an exception for the relevant one.

Important: Note that if the 'throw exception' mode is selected, it will cause the code path to divert from the normal execution path, so it could cause sub-sequent errors, and bugs that would not otherwise happen.

Correcting incorrect lock acquisition order

Generally speaking, when two locks (A, B) are acquired in an inverse order by two different code paths susceptible to be called concurrently (i.e. by two different threads), a deadlock can occur.

For example, if thread 1 acquires the locks A and then B, while thread 2 acquires the locks B and then A, a deadlock is possible.

Thread 1: A -> B

Thread 2: B -> A

There are only 2 ways to avoid the problem:

  1. Re-order the lock acquisition, so that both threads acquire the locks in the same order

or

  1. Introduce a third lock, C, and make both threads acquire it before acquiring either A or B.

In the last case, the pattern would become:

Thread 1: C-> A –> B

Thread 2: C-> B -> A

The lock "C" would be in effect a "guard", preventing the locks A and B to be acquired concurrently, and avoiding the deadlock. The deadlock preventer agent recognize such patterns, and do not report any errors.

Obviously, the first solution is optimal, since it does not reduce concurrency as the second does, but it is sometimes impractical to re-order lock acquisition order, especially when access or modification of existing code is not possible.

Needless to say, removing (even partially) the locks is probably not a good solution, since the deadlock preventer agent does not report data corruption due to concurrent data access.

Miscellaneous comments

Running a full instrumented Eclipse workbench while the deadlock preventer agent analyzes the lock acquisition order, especially through the debugger, is demanding in computational resources. A recent (preferably 4 core or more) CPU is highly recommended.


Not all causes of deadlocks are detected at the moment by the deadlock preventer engine, such as Semaphore release leaks, or async execution from a background thread to the main thread.

(source code of the deadlock preventer to be posted within a few days)

Friday, April 16, 2010

Use the new Resource Filters in 3.6M7 and JDK 1.7 to simulate snapshots

Suppose you are working in a project in Eclipse, but after updating from your source control server, new files were populated in your workspace that cause some build errors.

How to continue to work in your project without being temporarily distracted by the newly created files?

In Eclipse 3.6, Helios, some new powerful features were added to the Eclipse project management that allows the users to configure exactly what files get to populate a project hierarchy.

By using the new Resource Filter feature (improved in M7), the user can either include only or exclude all files based on a rich set of configurable and extensible filters.

When running on JDK 1.7, another condition is available for filtering, the file creation date.

By creating a resource filter that excludes all files having a creation date newer than a given date, you can prevent those files from even appearing in the workspace, so they not be seen by JDT and generate build errors.

The condition for filtering files and folders are broad, including file name, location, workspace path, last modified, file length, etc... The user can also group those condition in logical prepositions with and, or, and not operators.

3rd party plugins can even contribute new matchers through a core.resources extension point to further extend the filtering ability available to the user (think filtering by CVS status, etc...).

Wednesday, March 31, 2010

Tip: How to Debug SWT components in Modal Dialogs

SWT and JFace components, and especially layouts can be quite time consuming to debug when some unexpected behavior is observed at runtime.

Some tools (such as the YARI tool set) can be used to debug SWT components and layouts at runtime, but when modal dialogs are displayed, such tools can't be used because the developer can't setup the inspector window and inspect the components in the list while the dialog is up.

To resolve this issue, one can

  1. Detach the inspector view so it displays in a floating window outside of the workbench.
  2. Put a breakpoint in the modal dialog source code, few lines after the setShellStyle() call, with a breakpoint condition so that it removes the MODAL SWT flag.

For example, for making the WizardDialog non modal, a breakpoint with the following condition:

   setShellStyle(SWT.CLOSE | SWT.MAX | SWT.TITLE | SWT.BORDER       
      | SWT.RESIZE | getDefaultOrientation());    
   return false; 

needs to be added to the "setWizard(newWizard)" line below:

   public WizardDialog(Shell parentShell, IWizard newWizard) {
       super(parentShell);       
       setShellStyle(SWT.CLOSE | SWT.MAX | SWT.TITLE | SWT.BORDER
               | SWT.APPLICATION_MODAL | SWT.RESIZE | getDefaultOrientation());
       setWizard(newWizard); 

This way, you will be able to debug the SWT components at runtime even in modal dialogs: