1. Handle Huge Data With C++ and STXXL

    If you ever have to deal with really huge data that no longer fits in RAM, but you still need a consistent interface and efficient handling - and, as usual, you do not have the time to write it yourself - have a look at STXXL. It has some downsides, but all in all it works well and efficiently. For me it was the fastest way - in both runtime and implementation time - to deal with data in the TB range.

    You may find it here: http://stxxl.sourceforge.net

  2. Technical Quality is an Insurance Policy

  3. Velocity is Killing Agility!

  4. If I had more time I would have written less code

  5. I Regret

    I regret. Before I learned to think and act functionally, I proclaimed X=X+1 for years. Have mercy!

  6. Why Functional: One Reason

    Why is functional better? There are many reasons; here is one: programming languages are not made for computers, they are made for humans, to formulate generic solutions whose details computers can work out faster. The imperative style formulates the steps the computer has to follow; the functional style formulates the generic solution as such. Imperative style wants you to act like a machine, functional style allows you to think like a human.
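
    A minimal sketch of that contrast, written in C++ only because it needs nothing beyond the standard library. The function names and the sum-of-squares task are assumptions made up for illustration. The first version lists the steps, including the X=X+1 bookkeeping; the second states the result as a fold over the input.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Imperative style: spell out every step and mutate state along the way.
int sum_of_squares_imperative(const std::vector<int>& xs) {
    int total = 0;
    for (std::size_t i = 0; i < xs.size(); i = i + 1) {
        total = total + xs[i] * xs[i];
    }
    return total;
}

// Functional style: describe the result as a fold that adds each square
// to an accumulator; no index and no visible mutation.
int sum_of_squares_functional(const std::vector<int>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0,
                           [](int acc, int x) { return acc + x * x; });
}
```

    Both functions return the same value for the same input; the difference is only in what the reader has to keep in mind while reading them.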

  7. Name Dropping Consultants

    I don’t like name-dropping consultants. Name dropping conveys no knowledge at all; it just obscures what is needed to reach a decision. Concept dropping is a little better, but only a little. What I expect of a consultant is that she or he asks a lot of questions about my problem and afterwards presents two or three possible solutions for that particular problem, each with an estimate of the costs.

  8. The Measure Of Agile Is Quality

    In agile, I guess, it’s all about quality, not quantity. Quality of code, quality of user experience, quality of customer satisfaction. It’s about maintaining quality in a world of change. It’s not at all about being as fast as possible, but as good as possible - while everything around us is changing constantly. It’s about being smart, not being “schneller”.

    Remark: “schneller” is German for “faster, now”.

  9. Publish To A Local Maven Repository With SBT

    SBT, the Simple Build Tool for Scala projects, is simple most of the time, but if you need something special it can become hard to configure. The documentation is quite good, but not very detailed. More often than not I found myself googling for recipes and going through several iterations of trial and error before I finally succeeded. Since I use Scala in real-life commercial projects, I’m often confronted with scepticism about its ability to integrate with Java, Maven, Hudson, you name it. Then I have no choice but to make it work in the given environment. Changing the environment is not an option, not in the current phase when Scala is still not widely used in the industry.

    Here is a working recipe for XSBT (tested with 0.11.0) to “publish-local” to a local Maven repository at “~/.m2” instead of the usual local Ivy repository at “~/.ivy2”. Add the following expressions to your “build.sbt”.

    // publish to local ~/.m2
    publishMavenStyle := true

    otherResolvers := Seq(Resolver.file("dotM2", file(Path.userHome + "/.m2/repository")))

    publishLocalConfiguration <<=
      (packagedArtifacts, deliverLocal, ivyLoggingLevel) map { 
        (arts, _, level) => new PublishConfiguration(None, "dotM2", arts, List[String](), level)
      }

  10. Databases Need Predictable Garbage Collection

    After being forced to play with different NoSQL systems in a high-load environment for a while, I eventually had an insight: you should never rely on a database system based on a technology with non-deterministic garbage collection. If you cannot say exactly when your memory is freed - don’t use it to implement a database. Don’t use Java or .NET, don’t even use Erlang, despite its partial use of reference-counting GC. And don’t use a database written by a third party for one of those platforms.

    What counts here is not plain performance or speed but knowledge. It should be predictable exactly when your program pays the cost of freeing memory, at least in a long-running server. A typical database usage scenario is a high number of requests per second from a number of different clients, where each request is processed in a very short time and produces not that much data (with the exception of blobs). This results in a huge amount of garbage, since all that data is no longer used, or has to be recalculated, once the request has been processed. What you really want is for the garbage to be destroyed as soon as it is no longer of any use, so that the application behaves smoothly. It is smoothness that counts if you want to scale horizontally. The opposite is a collection that happens whenever the runtime decides. You need good luck not to hit a load peak that way. The longer the server runs, the higher the chance that your luck runs out.
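
    To make "destroyed as soon as it is no longer of any use" concrete, here is a minimal C++ sketch of scope-bound cleanup. Everything in it is an assumption for illustration: `RequestBuffer`, `handle_request` and the `live_buffers` counter are hypothetical names, not taken from any real database. It only shows that the destructor runs at a known point, the end of the request; it says nothing about how a particular allocator behaves underneath.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical counter of currently live request buffers (illustration only).
static int live_buffers = 0;

// Scratch memory owned by exactly one request.
class RequestBuffer {
public:
    explicit RequestBuffer(std::size_t bytes) : data_(bytes) { ++live_buffers; }
    ~RequestBuffer() { --live_buffers; }

    // Non-copyable, so each buffer has a single, well-defined end of life.
    RequestBuffer(const RequestBuffer&) = delete;
    RequestBuffer& operator=(const RequestBuffer&) = delete;

    std::size_t size() const { return data_.size(); }

private:
    std::vector<char> data_;
};

// Hypothetical request handler: allocates scratch space, does its work,
// and returns the number of scratch bytes it used.
std::size_t handle_request(std::size_t payload_bytes) {
    RequestBuffer scratch(payload_bytes);
    return scratch.size();
}  // scratch is destroyed here, on every call, before control returns
```

    After any number of calls to `handle_request`, `live_buffers` is back at zero: the cleanup cost is paid per request, at a point visible in the code, rather than at a moment chosen by a runtime.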

    At the end of the day I saw all those Java databases run into severe garbage collection problems with those stop-the-world pauses, and even the most sophisticated juggling with even the most secret -XX options could not sweep the fundamental problem out of the way. A database can only be built upon a completely predictable runtime. That’s why people in the know write databases in C/C++. Sure, the cost to produce such a system might be a little higher compared to Java or .NET or Erlang. Whatever: databases are not built for the impatient but for the settled.

Powered by Tumblr; designed by Adam Lloyd and Ingo Schramm.