I made a map of my followers on Twitter. This is not entirely straightforward, as most Twitter users don’t attach geo-coordinates to their tweets or profiles. Luckily, many people leave something sensible in the location field of their profile (e.g. ‘Amsterdam’ or ‘London, UK’). You can match this field against a Lucene index of all the cities in the world, which I happen to have. I was able to place 15 of my grand total of 19 followers on the map.
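To give a rough idea of the matching step, here is a minimal sketch in Python. It is not the code behind the map: a small in-memory dictionary stands in for the Lucene index, and the names `CITIES`, `normalize` and `locate` as well as the rounded coordinates are assumptions made for illustration.

```python
import re

# Hypothetical, tiny stand-in for the city index: normalized name -> (lat, lon).
# A real index would hold every city in the world and support fuzzy lookups.
CITIES = {
    "amsterdam": (52.37, 4.89),
    "london": (51.51, -0.13),
    "new york": (40.71, -74.01),
}


def normalize(text):
    """Lowercase, turn anything that is not a-z or a space into a space, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z ]", " ", text.lower()).split())


def locate(location_field):
    """Return (city, (lat, lon)) for the first candidate found in CITIES, or None."""
    # Try the whole field first, then each comma-separated part,
    # so that 'London, UK' still matches on 'London'.
    candidates = [location_field] + location_field.split(",")
    for candidate in candidates:
        key = normalize(candidate)
        if key in CITIES:
            return key, CITIES[key]
    return None


if __name__ == "__main__":
    for field in ["Amsterdam", "London, UK", "somewhere over the rainbow"]:
        print(repr(field), "->", locate(field))
```

An exact dictionary lookup is the crudest possible matcher. A full-text index earns its keep on misspellings, alternative city names and ambiguous names, none of which this sketch handles.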
Followers of @fzk:
Why is this important? Read on! Also, somewhere down the line I will explain how to make such a map for your own account.
Note: this is a cross post. You can see the original here: http://waredingen.nl/twitter-data-fun.
Filed under Uncategorized | 1 Comment »
Today, Xebia published a white paper on NoSQL and Big Data crunching. The white paper presents an introduction to NoSQL and big data crunching, along with a case study that was carried out at one of Xebia’s customers. Read more for the outline and full text…
Filed under NoSQL | 2 Comments »
Q: “Where in the world was the situation in Egypt the hottest talk of the town?”
A: “People in the UK / London were all over it; the Middle East and US East Coast cities also showed more interest than the rest of the world.”
Q: “How do you know?”
A: “Just take a couple hundred thousand Twitter messages containing ‘egypt’ and run them through a MapReduce job that counts the number of messages per location and plot that on a map like this:”

(http://geocommons.com/maps/49541)
The map shows the number of Twitter messages containing the word ‘egypt’ that originated from locations around the world. A larger circle means more messages from that place. The messages were gathered during a five-hour period on January 28, the day after the Egyptian internet was crippled.
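The counting job described above can be sketched in a few lines of plain Python. This is an in-memory illustration of the map and reduce steps, not the actual Hadoop job; the tweet layout (a dict with `text` and `location` keys), the function names and the sample tweets are all made up for the example.

```python
from collections import defaultdict


def map_tweet(tweet, keyword="egypt"):
    """Map step: emit (location, 1) for a tweet that mentions the keyword."""
    location = (tweet.get("location") or "").strip().lower()
    if keyword in tweet["text"].lower() and location:
        yield location, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per location."""
    totals = defaultdict(int)
    for location, count in pairs:
        totals[location] += count
    return dict(totals)


def count_by_location(tweets, keyword="egypt"):
    """Run the map step over all tweets and feed the pairs to the reduce step."""
    pairs = (pair for tweet in tweets for pair in map_tweet(tweet, keyword))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    sample = [
        {"text": "Watching the news from Egypt", "location": "London"},
        {"text": "egypt is offline", "location": "london "},
        {"text": "Lunch time", "location": "Cairo"},
        {"text": "#Egypt", "location": ""},
    ]
    print(count_by_location(sample))
```

In a real MapReduce run the framework groups the emitted pairs by key and spreads the work over many machines; here the grouping happens in a single dictionary. The resulting counts per location are what determine the size of each circle on the map.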
(more…)
Filed under General, NoSQL | No Comments »
At Xebia, we take a keen interest in the developing NoSQL community and all the great software and solutions that result from it. Big data analysis and high-traffic websites and applications are here to stay, and we need solutions capable of dealing with them. The commodity stack of some flavor of relational database with a Java app server on top and the stateful model of server-side sessions just doesn’t cut it in some cases. As data volume and traffic grow, these cases will present themselves increasingly often. In our App Incubator program we see a lot of interest in non-relational databases and stateless server-side setups with more logic on the client side (cleverly coined: NoJSP).

At clients, too, the problem of ever-growing data sets and the lack of options for proper analysis with existing tools and databases is starting to arise. One of these clients is the RIPE NCC. The story is roughly this: about 80 GB of data comes in per day and there are ten years of historical data of the same kind and volume; we need to run queries against this and get sub-second answers. We solve this with the use of Hadoop and HBase.
(more…)
Tags: hadoop, HBase, NoSQL
Filed under Architecture, Java, NoSQL | 2 Comments »