Sunday, 11 December 2011

Understanding the JVM, concurrency and GC

A great collection of links and presentations on the Java Virtual Machine, and why understanding how it works is critical.

First, let's start with HotSpot, the default JVM. This is a mandatory intro to the responsibilities of the VM. Next, learn about some of the most common optimizations available to the JIT compiler, the HotSpot FAQs, and method inlining.

Now we are going to go down one abstraction layer and jump into the hardware. A great presentation by Jevgeni Kabanov, by far the simplest to understand: Do you really get Memory? I find this presentation a must-see before jumping into the more hardcore versions that follow. In this talk, Jevgeni presents a simplified Java CPU model, showing the CPU, memory and, most fundamentally, the caches.
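The effect the talk describes is easy to reproduce yourself. Here is a small sketch of my own (not taken from the presentation): summing a 2D array row-by-row walks memory sequentially, while going column-by-column strides across rows, so the row order typically wins by keeping cache lines and the hardware prefetcher busy.

```scala
// Toy demonstration of cache effects: same work, different memory access order.
val n = 2048
val a = Array.ofDim[Int](n, n)

def timed(body: => Long): (Long, Long) = {
  val t0 = System.nanoTime()
  val result = body
  (result, System.nanoTime() - t0)
}

// Row-major: consecutive elements of each inner array, cache-friendly.
val (rowSum, rowNanos) = timed { var s = 0L; for (i <- 0 until n; j <- 0 until n) s += a(i)(j); s }
// Column-major: jumps between inner arrays on every step, cache-hostile.
val (colSum, colNanos) = timed { var s = 0L; for (j <- 0 until n; i <- 0 until n) s += a(i)(j); s }

println(s"row order: ${rowNanos / 1000000} ms, column order: ${colNanos / 1000000} ms")
```

On my runs the column order is several times slower, even though both loops touch exactly the same 4 million ints.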

Armed with this knowledge, we are ready to take on the mammoth paper "What Every Programmer Should Know About Memory" by Ulrich Drepper. This is very low-level at times, so it is worth watching Cliff Click's presentation, A Crash Course on Modern Hardware, which summarises it nicely.

There is also an interesting article on memory barriers and, obviously, it wouldn't be complete without Doug Lea's presentation on concurrency in Java. Always a good read, although really extensive: the Java Language Specification, Third Edition.
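To make the memory-barrier material concrete, here is a minimal sketch (my own, not from the article) of the visibility problem those barriers solve. Without the @volatile annotation, the spinning reader may keep a stale copy of `ready` and never observe the main thread's write; the volatile write/read pair inserts the barriers that guarantee visibility.

```scala
// Visibility across threads: @volatile makes the write to `ready`
// observable by the spinning reader thread.
object Flag { @volatile var ready = false }

val reader = new Thread(() => while (!Flag.ready) {})  // busy-wait until the flag flips
reader.start()
Thread.sleep(10)     // give the reader a moment to start spinning
Flag.ready = true    // volatile store: guaranteed visible to the reader
reader.join()        // terminates promptly because the write is seen
println("reader observed the flag")
```

Try removing `@volatile` and running with the server JIT: on some machines the reader loops forever, which is exactly the hardware behaviour these talks explain.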

Finally, I have reserved for last Gil Tene's excellent Understanding Java Garbage Collection. It brings up lots of points that I didn't know about when writing my previous post, The Art of GC Tuning. In particular, I really enjoyed the discussion of the Application Memory Wall: how our application heaps seem to range between 1 and 4 GB, rarely taking advantage of increasing hardware power, and what the main problems behind this are. He shows how the Azul Zing VM can easily cope with 1/2 TB heaps without pausing.

That's quite a lot for now!

Saturday, 3 December 2011

Rename a GIT project in GitHub

GitHub now supports renaming projects.

Go to project admin, and rename.

Then, in your local project, simply edit your '.git/config' file:

[remote "origin"]
        url = git@github.com:&lt;user&gt;/&lt;old-name&gt;.git
        fetch = +refs/heads/*:refs/remotes/origin/*

Now just change the old name to the new name in the url, save, and push.

First Scala Step - Simple SBT template

I've created a simple project to clone when you need to get up and running really quickly using the latest Scala tools.

It provides:

  •  Scala 2.9.1
  •  SBT 0.11.2
  •  Specs2 1.6
  •  Mockito 1.8.5
  •  Eclipse
  •  IntelliJ IDEA
Clone first-scala-step and start using it! 
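For a feel of what the template's build definition looks like, here is a rough sketch of a build.sbt for SBT 0.11.x assembled from the version list above. The artifact coordinates and project name are my assumptions, not copied from the repo, so check the actual template.

```scala
// Hypothetical build.sbt (SBT 0.11.x required blank lines between settings).
// Versions from the list above; dependency coordinates are a best guess.
name := "first-scala-step"

scalaVersion := "2.9.1"

libraryDependencies ++= Seq(
  "org.specs2"  %% "specs2"      % "1.6"   % "test",
  "org.mockito" %  "mockito-all" % "1.8.5" % "test"
)
```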

Alternatively, just today, coolscala has pushed a similar project to kickstart development with Akka 1.3-RC1 + ScalaTest: start-akka

Sunday, 6 November 2011

Exploring Scala Collections

This weekend I've been trying to write some microbenchmarks and discovered large differences between different implementations of the same problem: create a list of 10 million objects and print only the ones that are multiples of a million.

This is the code in question. It runs in 141 milliseconds, requiring no more than a 27 MB JVM heap (the smallest I could create).
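The embedded snippet didn't survive in this archive, but the approach described is a range followed by a filter; a sketch along those lines:

```scala
// A Range stores only its bounds and step, so no 10-million-element
// collection is ever materialized before filtering — only the 10
// matching elements are ever allocated.
val result = (1 to 10000000).filter(_ % 1000000 == 0)
result.foreach(println)   // 1000000, 2000000, ..., 10000000
```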

As shown in this StackOverflow question, a range followed by a filter operation can even beat the performance of an array of primitives in Java. The space required is smaller and the execution time is comparable, whereas every other approach was between one and two orders of magnitude worse, depending on the case.

I learnt several things from this simple exercise, about Ranges, Arrays (which compile to primitives, with no need for wrappers), and performance monitoring.

Sunday, 23 October 2011

The costs of concurrency and the actor model

I was interested to understand more about how actors are implemented in Erlang. In an old but still relevant article, an argument is raised:
These days every third grader knows that processes are expensive. Ok, so we're not using real processes, we're using threads, but if you constantly create, destroy, and switch between thousands of threads you'll bring any system to a crawl. Threads are expensive for a number of reasons. They take a long time and a lot of memory to create because the operating system needs to set up many things (the stack, internal data structures, etc.) for them to work. They take a long time to destroy because all the resources they consumed need to be freed. And they take a long time to switch between because unloading registers, storing them, loading them back, and flipping the stacks is complicated business.
So how expensive is it really to create a Thread in Java these days?
Peter Lawrey, writer of the excellent Vanilla Java blog, tells us that creating a Thread takes roughly 70μs (microseconds), which is fairly expensive, but not prohibitive. The size of the stack can be configured with a command-line argument to the JVM, with a default of 512 KB. According to some Mac OS X 10.5 docs, a thread takes about 90μs to create, which kind of validates his results.

As was pointed out in the StackOverflow discussion:
Creating threads is expensive if you're planning on firing 2000 threads per second, every second of your runtime. The JVM is not designed to handle that. If you'll have a couple of stable workers that won't be fired and killed over and over, relax.
This is slightly different in Erlang, where everything is a process, so the model leads you straight into that situation.

Essentially, Erlang uses lightweight processes, akin to green threads (VM-scheduled pseudo-threads, as opposed to real threads). Since they consist of little more than a mailbox implemented as a queue, it takes about 300 words of memory to build one, meaning we can create thousands, even millions of them; the only constraint is memory.
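As a toy model of why such a process is so cheap (this is my illustration, not how BEAM is actually implemented), a process can be reduced to a mailbox queue plus a behaviour that drains it — a few hundred words of state rather than a 512 KB thread stack:

```scala
import scala.collection.mutable

// Toy "process": asynchronous sends enqueue messages; drain() runs the
// behaviour over the mailbox. No OS thread, no dedicated stack.
final class ToyProcess[A](behaviour: A => Unit) {
  private val mailbox = mutable.Queue.empty[A]
  def send(msg: A): Unit = mailbox.enqueue(msg)   // asynchronous message passing
  def drain(): Unit = while (mailbox.nonEmpty) behaviour(mailbox.dequeue())
}

val log = mutable.Buffer.empty[String]
val proc = new ToyProcess[String](msg => log += s"received: $msg")
proc.send("hello")
proc.send("world")
proc.drain()
```

A real scheduler would multiplex millions of these over a handful of OS threads, which is exactly the trade the Erlang VM makes.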

I read a great post on Erlang's lightweight-process actor concurrency and its inherent problems in terms of message-passing costs: the entire message has to be serialized between processes, with no zero-copy messaging between actors even within the same node, which can really hurt performance. It also talks a great deal about how the BEAM VM and the HiPE JIT compiler aren't anywhere near as good as the industry-leading JVM (or his favoured Azul Systems "pauseless" VM).

On that basis, after years of hardcore work on Erlang, he opts for Erjang (using Java's Kilim actors) or Clojure, possibly due to Clojure being more functional and having a Lisp syntax. I wonder why Scala wasn't mentioned? Akka offers the same principles as Erlang actors, but implemented in a much more robust way and running on top of the JVM.

Scala actors, and particularly Akka actors, use configurable executors that efficiently reuse threads (and in Akka 2.0, this will be declaratively configurable). Thus, I can see Akka superseding Erlang as the heir to the throne of actor concurrency.

Saturday, 22 October 2011

The art of GC Tuning

Great presentation discussing best practices for tuning garbage collection on the JVM.

(Slides shown: the different GC collectors, and an illustration of a fragmented heap.)

Also, a comparison of all the JVM GC implementations (particularly valuable given that Java 8 will incorporate JRockit's features).

I came across these links while reading the blog post on the new features of Cassandra 1.0. Some very interesting JVM tuning allowed them to increase performance manyfold.

The JVM's JIT compiler at runtime, and even the Java and particularly the Scala compiler at compile time, love small methods that can be analysed in isolation. It is very important to consider the effect that the compiler can have when reading microbenchmarks.
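A quick sketch of the kind of code the JIT rewards (my own example; the HotSpot flags mentioned in the comment are real diagnostic flags, but inlining decisions vary by JVM version):

```scala
// Small, monomorphic methods like `square` are prime inlining candidates.
// Run with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to watch
// HotSpot's actual decisions on a hot loop like this one.
def square(x: Int): Int = x * x

def sumOfSquares(xs: Array[Int]): Int = {
  var sum = 0
  var i = 0
  while (i < xs.length) {
    sum += square(xs(i))   // after inlining, effectively xs(i) * xs(i)
    i += 1
  }
  sum
}

println(sumOfSquares(Array(1, 2, 3)))   // 14
```

This is also why microbenchmarks mislead: once the call is inlined and the loop optimized, you may be measuring something very different from what the source suggests.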

Quoting Brian Goetz:

"Often, the way to write fast code in Java applications is to write dumb code -- code that is straightforward, clean, and follows the most obvious object-oriented principles."
Brian Goetz
Technology Evangelist, Sun Microsystems

For huge collections, it is better to use a framework that can handle billions of items, potentially mapped outside of the heap, thus avoiding full GCs completely.
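The core off-heap idea can be sketched with nothing more than a direct ByteBuffer (this is a minimal illustration, not any specific framework's API): the buffer's contents live outside the Java heap, so the GC never scans them — only the small buffer handle is on-heap.

```scala
import java.nio.ByteBuffer

// Store a million longs off-heap in a direct buffer.
val count = 1000000
val buf = ByteBuffer.allocateDirect(count * 8)   // 8 bytes per Long
for (i <- 0 until count) buf.putLong(i.toLong * 2)
buf.flip()                                        // switch from writing to reading
val first = buf.getLong()
val second = buf.getLong()
println(s"first=$first second=$second")           // first=0 second=2
```

Real off-heap collection frameworks layer indexing and serialization on top of exactly this mechanism (or memory-mapped files) to reach billions of entries without GC pauses.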

Friday, 21 October 2011

Scala Options: Type and Null Safety

Options are great because they allow you to circumvent the billion-dollar mistake (Hoare): null references.
Several posts discuss the great value of Options. It was Graham Tackley who really made it click for me when he presented Options as collections with zero or one elements in them.
Yesterday I was trying to make good use of this idea, and came across the problem of how to retain only the elements that actually contain a value. Since you cannot call get on a None, for obvious reasons, there must be an easy way to retrieve only the values.
Here's what I found:
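The original embedded snippet didn't survive in this archive; a reconstruction of the transformation described, with example data of my own:

```scala
// Treat each Option as a collection of zero or one elements:
// flatMap concatenates those mini-collections, dropping the Nones.
val xs: List[Option[String]] = List(Some("a"), None, Some("b"))
val values: List[String] = xs.flatMap(_.toList)
println(values)   // List(a, b)
```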

I've transformed a List[Option[String]], with possible Nones in it, into a List[String].

How about this then?
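The second snippet is also missing here; most likely it was flatten, sketched with my own example data:

```scala
// flatten collapses a collection of Options directly,
// discarding the Nones and unwrapping the Somes.
val xs: List[Option[String]] = List(Some("a"), None, Some("b"))
val values: List[String] = xs.flatten
println(values)   // List(a, b)
```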

This basically performs the same operation I've just done before: under the hood it is a call to flatMap(_.toList).

This makes passing around options a really valuable thing, because you have safety in your operations with minimal effort.

There is also a very useful Scala Option cheat sheet by Tony Morris.

Friday, 7 October 2011

Supporting different paradigms of Parallelism

Doug Lea's keynote in Scala Days is a really interesting presentation.

It shows the importance of multicore, Amdahl's Law, and lots more. Well worth it.
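Amdahl's Law is worth spelling out, since it frames the whole talk: if a fraction p of the work is parallelizable, the maximum speedup on n cores is 1 / ((1 - p) + p / n). A quick calculation (my own, not from the slides):

```scala
// Amdahl's Law: serial fraction (1 - p) caps the achievable speedup.
def amdahl(p: Double, n: Int): Double = 1.0 / ((1.0 - p) + p / n)

// Even with 95% of the work parallel, 8 cores give under 6x,
// and the ceiling as n grows is 1 / (1 - p) = 20x.
println(f"${amdahl(0.95, 8)}%.2f")   // 5.93
```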

Monday, 26 September 2011

Attending the Low latency summit conference in London

I will be attending the Low-Latency Summit 2011 - Winning Strategies for Deploying Low-Latency Technologies. I expect to get good insight into vendor technologies used for ultra-low latency, such as InfiniBand networks.

Thanks Eleanna for the invite!

Update: there were some interesting panels during the event. The low-latency website is probably one of the best takeaways. Here's a good summary of the classification of sources of latency, with suggestions on how to mitigate them.

Also available is this podcast, in which NASDAQ OMX's Bjorn Carlson discusses technologies and approaches in use today at markets around the world.