epoch: 1326360240
If you ever have to deal with really huge data that no longer fits in RAM, but you still need a consistent interface and efficient handling, and (as usual) you do not have the time to write it yourself, have a look at STXXL. It has some downsides, but overall it works well and efficiently. For me it was the fastest way, in both runtime and implementation time, to deal with data in the TB range.
You may find it here: http://stxxl.sourceforge.net
#rabbitmq #amqp #cluster #distributed computing
epoch: 1322214092
Clustering RabbitMQ is very easy if you know how. Unfortunately, the documentation on this topic is good but not good enough (cf. RabbitMQ Clustering). If you try it, you may get lost along the way until you find some insightful posts on the mailing list. This is why I summarize here how I got it to work.
Say you want to create a cluster with two disc nodes and two ram nodes. If you do this on at least two machines, each hosting a disc node and a ram node, you get good fault tolerance and good scalability from a single setup. Your clients may connect to the ram nodes only, either directly or through an additional load balancer.
But, how do I make a node a disc node and another node a ram node?
There is no command like “rabbitmqctl mkdisc” and there is no related configuration option. On the one hand, this is a little counterintuitive; on the other hand, it adds a lot of flexibility, since you may alter the roles of nodes and restructure your cluster on the fly whenever necessary.
The roles are assigned by the way you call the “rabbitmqctl cluster” command. In our scenario, we have multiple nodes on the same host, so we need to wrap the calls to “rabbitmqctl” in shell scripts that set some environment variables (cf. RabbitMQ Configuration). Once this is done, make sure all nodes of the cluster are running. Afterwards, execute the sequence “stop_app”, “reset”, “cluster”, “start_app” for every node. For the “cluster” command, pass a space-separated list of all disc nodes you want to create, and pass the same list for each node. A node whose own name appears in that list becomes a disc node; any other node becomes a ram node. My mnemonic for this is that you copy the current node to all disc nodes. The whole sequence may look like this, with “rbctl.*” being your wrapper scripts:
host-of-disc1$ rbctl.disc1 stop_app
host-of-disc1$ rbctl.disc1 reset
host-of-disc1$ rbctl.disc1 cluster disc1@host-of-disc1 disc2@host-of-disc2
host-of-disc1$ rbctl.disc1 start_app
host-of-ram1$ rbctl.ram1 stop_app
host-of-ram1$ rbctl.ram1 reset
host-of-ram1$ rbctl.ram1 cluster disc1@host-of-disc1 disc2@host-of-disc2
host-of-ram1$ rbctl.ram1 start_app
host-of-ram2$ rbctl.ram2 stop_app
host-of-ram2$ rbctl.ram2 reset
host-of-ram2$ rbctl.ram2 cluster disc1@host-of-disc1 disc2@host-of-disc2
host-of-ram2$ rbctl.ram2 start_app
host-of-disc2$ rbctl.disc2 stop_app
host-of-disc2$ rbctl.disc2 reset
host-of-disc2$ rbctl.disc2 cluster disc1@host-of-disc1 disc2@host-of-disc2
host-of-disc2$ rbctl.disc2 start_app
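The wrapper scripts themselves are not shown here. As a rough sketch of the four-step sequence, the two shell functions below select the node with the “-n” option of “rabbitmqctl” instead of environment variables. The function names “rbctl” and “cluster_node” and the variable “RBCTL_DRY_RUN” are made up for this illustration and are not part of RabbitMQ. With “RBCTL_DRY_RUN=1” the commands are only printed, which is a cheap way to check the sequence before touching a real cluster.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of RabbitMQ.

# rbctl NODE ARGS...: run rabbitmqctl against NODE (e.g. disc1@host-of-disc1).
# With RBCTL_DRY_RUN=1 the command line is printed instead of executed.
rbctl() {
    target=$1; shift
    if [ "${RBCTL_DRY_RUN:-0}" = 1 ]; then
        echo "rabbitmqctl -n $target $*"
    else
        rabbitmqctl -n "$target" "$@"
    fi
}

# cluster_node NODE DISC_NODE...: stop_app, reset, cluster, start_app for NODE.
# The sequence stops at the first command that fails.
cluster_node() {
    node=$1; shift
    rbctl "$node" stop_app &&
    rbctl "$node" reset &&
    rbctl "$node" cluster "$@" &&
    rbctl "$node" start_app
}
```

For example, “RBCTL_DRY_RUN=1 cluster_node ram1@host-of-ram1 disc1@host-of-disc1 disc2@host-of-disc2” prints the four commands for the first ram node. Note that this applies to the “cluster” command as described in this post; RabbitMQ 3.0 and later replaced it with “join_cluster”.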
If you have to add users, vhosts and permissions, you had better do it at the end of this procedure; otherwise the “reset” will delete all of this information. Also, if you want to change the cluster setup later, be careful with “reset” and omit it for at least one disc node.
Another weak point of the whole clustering business is the location of the “.erlang.cookie” file. This file is essential for clustering and must have the same content on all nodes in the cluster. The documentation says RabbitMQ looks at “/var/lib/rabbitmq/.erlang.cookie”, but I found this is not always true. Assuming RABBIT_HOME points to the directory where the rabbit distribution is located, I copied the file to “$RABBIT_HOME/../.erlang.cookie” and RabbitMQ used this one. I’m not quite sure if this is a general rule.
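Since it is not obvious which cookie file a node actually reads, it can help to check that all candidate files have the same content. A minimal sketch, assuming the files are readable from one shell; the function name “same_cookie” is made up for this illustration:

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of RabbitMQ or Erlang.

# same_cookie FILE FILE...: succeed only if every file is byte-identical
# to the first one. Fails as well if one of the files cannot be read.
same_cookie() {
    first=$1; shift
    for other in "$@"; do
        cmp -s "$first" "$other" || return 1
    done
    return 0
}
```

With the two locations mentioned above, a call could look like: same_cookie /var/lib/rabbitmq/.erlang.cookie "$RABBIT_HOME/../.erlang.cookie" || echo "cookies differ". For nodes on different machines, the files would have to be fetched or compared by checksum first.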
epoch: 1321952595
Very interesting paper: The Anatomy of the Facebook Social Graph
Reading this analysis of the structure of the Facebook graph, I come to the conclusion that its main features may, qualitatively, be common to all social network graphs. For example, I’ve seen another social graph, one order of magnitude smaller, with very similar features. It would be really interesting to compare such graphs to other graphs, especially directed social graphs like Twitter or Google+, in terms of different measures. It would also be interesting to compare these graphs to offline social networks. Another interesting question is how such graphs evolve over time, which could be analyzed, for example, with methods from the theory of random graphs.
#Functional #programming #erlang #scala
epoch: 1321520074
I repent. Before I learned to think and act functionally, I proclaimed X=X+1 for years. Have mercy!
#programming #Functional #scala #erlang
epoch: 1321520002
Why is functional better? There are many reasons; here is one: programming languages are not made for computers, they are made for humans to formulate general solutions whose details computers can work out faster. The imperative style formulates the steps the computer has to follow; the functional style formulates the general solution as such. Imperative style wants you to act like a machine; functional style allows you to think like a human.
#agile #project management #programming
epoch: 1321519931
I don’t like name-dropping consultants. Name dropping conveys no knowledge at all; it just obscures what is needed to reach a decision. Concept dropping is a little better, but only a little. What I expect of a consultant is that she or he asks a lot of questions about my problem and afterwards presents two or three possible solutions for that particular problem, together with an estimate of the costs.
#agile #programming #project management
epoch: 1321519870
In agile, I guess, it’s all about quality, not quantity. Quality of code, quality of user experience, quality of customer satisfaction. It’s about maintaining quality in a world of change. It’s not at all about being as fast as possible, but as good as possible - while everything around us is changing constantly. It’s about being smart, not being “schneller”.
Remark: “schneller” is German for “faster, now”.
Powered by Tumblr; designed by Adam Lloyd and Ingo Schramm.