Archive for February, 2010
Reiser4 Performance on Ubuntu 9.10
Since its debut in 2004, there has been a lot of controversy about Reiser4, the game-changing file system developed by Namesys. It has been available for Linux for quite a while and there are great reviews on the Internet, but most distributions, including Ubuntu Karmic 9.10, do not include it as an option by default. Since the [...]
Top Shared Domains on Twitter, February 2010
Almost every magazine, tech blog, and news site (even CNN Money!) announced this week that Twitter receives more than 50 million tweets per day. Great, but how many of those are junk or spam? Twitter does not care because (good or bad) every tweet translates into growth for them, but I am trying to [...]
Why Twitter Clients still lack Classification, Clustering and Ranking?
On Facebook, the average user has about 130 friends, and I believe that the average Twitter user follows a similar number of people. Considering one or two tweets per day from each, plus 10 or more from accounts like CNN or ABC, it’s reasonable to think that you would have to look at 250 [...]
Average Query Length on Major Search Engines, February 2010
With the growing popularity of Internet access, people use the Web for almost everything. Web search engines are used as recipe books, calculators, encyclopedias, how-tos, DIY references, and so on. In recent years users have become better at formulating their queries, and it is kind of funny to think that at the beginning they [...]
Facebook’s Email could really Take Down Gmail Supremacy
According to some statistics from Google, people spend 4x more time surfing the Internet than driving their cars. However, when asked what a browser is, they have no clue. The first Internet users were hackers who spent most of their time on terminals, chatting through IRC, using Pine for their email, and reading a few newsgroups. [...]
Most Used URL Shorteners on Twitter, January 2010
URL shorteners are definitely a hot business right now: Twitter made them popular by restricting tweets to only 140 characters, and while developing a URL shortener is pretty simple, the amount and quality of data that they can collect (e.g., the number of times a URL has been clicked) is amazing. It is easy to [...]
Use shared_clone() to Share Variables among Perl Threads
Sharing variables across threads is generally very annoying in Perl. You have to declare the variable as shared before using it, and pay attention to the values you put in it. Things get especially messy with multi-level hashes, since you are obligated to pre-declare each level as shared. Luckily, there is a way to make [...]
Perl: if you chomp() to split(), skip the first
In Perl it is common to write a while loop around readline() to read a file's content into memory. When the file contains tab-separated data, many use chomp() to remove the newline from each input line and then split(/\t/) to separate the values into an array. Today, trying to improve the performance of one [...]
