<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tech, Research and Life &#187; Internet &amp; Search</title>
	<atom:link href="http://blog.alessiosignorini.com/category/internet/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.alessiosignorini.com</link>
	<description>Cool things I believe the world should know about...</description>
	<lastBuildDate>Tue, 06 Jul 2010 02:12:40 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>How to Integrate Skype and MythTV</title>
		<link>http://blog.alessiosignorini.com/2010/07/how-to-integrate-skype-and-mythtv/</link>
		<comments>http://blog.alessiosignorini.com/2010/07/how-to-integrate-skype-and-mythtv/#comments</comments>
		<pubDate>Tue, 06 Jul 2010 02:12:40 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[mythtv]]></category>
		<category><![CDATA[skype]]></category>
		<category><![CDATA[webcam]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=420</guid>
		<description><![CDATA[One thing I am really confident will happen in the near future is the integration of our TVs and phones. In the past few years Skype and VoIP have improved significantly, yet we still have to see an example of seamless integration between those technologies. In my living room I have a big flat screen [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" src="http://www.gadgetrivia.com/photos/o/450-hook_pc_tv_achieve_thing.jpg" alt="Webcam Over TV" width="300" height="211" />One thing I am really confident will happen in the near future is the integration of our TVs and phones. In the past few years <a title="Skype" href="http://www.skype.com" target="_blank">Skype</a> and <a title="Voice Over IP" href="http://en.wikipedia.org/wiki/Voice_over_IP" target="_blank">VoIP</a> have improved significantly, yet we still have to see an example of seamless integration between those technologies.</p>
<p>In my living room I have a big flat screen TV connected to a Linux <a title="MythTV" href="http://www.mythtv.org" target="_blank">MythTV</a> server, which I use for recording and watching TV and DVDs.</p>
<p>Thanks to the long weekend I had some time to attach a webcam (VF0415 Live! Cam Vid. IM Ultra) to that computer, mount it over the TV, and get it working with Skype (there was nothing to do, really).</p>
<p>It works great, but I still need to get out of MythTV and use the mouse and keyboard to access Skype and make calls. I am sure it would not be that hard to create a proper plugin to make MythTV work with the client-side Skype API, but it probably makes little sense now that they are about to release their <a title="SkypeKit" href="http://developer.skype.com/public/skypekit" target="_blank">SkypeKit</a> platform and I am sure someone will just adapt the good SIP plugin to it.</p>
<p>Here is how you can add an entry in the Main Menu of MythTV to start Skype:</p>
<ul>
<li>Find and save somewhere on your disk a <a title="Skype Logo" href="http://www.testfreaks.com/blog/wp-content/uploads/2009/05/windowslivewriter50waysskypecanconnectyouwiththeworld-f205skype-logo-online-2.png" target="_blank">reasonably sized Skype logo</a> in PNG format</li>
<li>Add to &#8220;<em>/usr/share/mythtv/themes/&lt;your_theme&gt;/menu-ui.xml</em>&#8221; an entry like the following
<pre>&lt;state name="SKYPE"&gt;
&lt;imagetype name="watermark"&gt;
&lt;filename&gt;watermark/skype.png&lt;/filename&gt;
&lt;/imagetype&gt;
&lt;/state&gt;
</pre>
</li>
<li>Add to &#8220;<em>/usr/share/mythtv/themes/defaultmenu/mainmenu.xml</em>&#8221; an entry like the following
<pre>&lt;button&gt;
   &lt;type&gt;SKYPE&lt;/type&gt;
   &lt;text&gt;Skype&lt;/text&gt;
   &lt;description&gt;Launch Skype&lt;/description&gt;
   &lt;action&gt;EXEC /usr/bin/skype&lt;/action&gt;
&lt;/button&gt;</pre>
</li>
</ul>
<p>Clearly you will have to replace <em>&lt;your_theme&gt;</em> with the name of the theme you use (I use &#8220;Retro&#8221;), &#8220;<em>watermark/skype.png</em>&#8221; with the real location of your Skype logo, and &#8220;<em>/usr/bin/skype</em>&#8221; with the location of your Skype executable (try the command &#8220;<em>which skype</em>&#8221; if you do not know it), but everything else should work.</p>
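<p>If you prefer not to edit the menu file by hand, the button entry can also be appended with a few lines of Python. This is only a rough sketch: it assumes that the root element of the menu file directly contains the &lt;button&gt; entries, and the function names <em>add_menu_button</em> and <em>add_skype_entry</em> are made up for this example.</p>

```python
import xml.etree.ElementTree as ET


def add_menu_button(menu_root, button_type, text, description, action):
    """Append a button element with the four child tags used in the post."""
    button = ET.SubElement(menu_root, "button")
    fields = (
        ("type", button_type),
        ("text", text),
        ("description", description),
        ("action", action),
    )
    for tag, value in fields:
        ET.SubElement(button, tag).text = value
    return button


def add_skype_entry(menu_path, skype_path="/usr/bin/skype"):
    """Hypothetical helper: add the Skype button to a menu file in place.

    Assumption: the root element of the file is the menu that holds the
    buttons. Adjust skype_path to the output of "which skype".
    """
    tree = ET.parse(menu_path)
    add_menu_button(
        tree.getroot(), "SKYPE", "Skype", "Launch Skype", "EXEC " + skype_path
    )
    tree.write(menu_path, encoding="utf-8", xml_declaration=True)
```

<p>Make a backup copy of the menu file first: ElementTree drops XML comments when it parses a file, so the rewritten file will not be identical to the original. The watermark entry in the theme file still has to be added as described above.</p>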
<p>Restart MythTV and your shiny new Skype entry should appear at the bottom of the main menu. Clicking on it will stop MythTV and launch Skype. On first launch, maximize the Skype window with the mouse; after that it will open maximized automatically. When you close Skype (for real: right-click on the systray icon and click close) you will go back to MythTV.</p>
<p>I am also fiddling with LIRC to control Skype completely with the remote. It should not be too hard. I will update this post when/if I manage to do it.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/07/how-to-integrate-skype-and-mythtv/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Reader tells Google what you Like and what to Index</title>
		<link>http://blog.alessiosignorini.com/2010/05/google-reader-tells-google-what-you-like-and-what-to-index/</link>
		<comments>http://blog.alessiosignorini.com/2010/05/google-reader-tells-google-what-you-like-and-what-to-index/#comments</comments>
		<pubDate>Mon, 24 May 2010 11:56:15 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[ADs]]></category>
		<category><![CDATA[Aggregator]]></category>
		<category><![CDATA[Crawlers]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[Google Reader]]></category>
		<category><![CDATA[PageRank]]></category>
		<category><![CDATA[Queue]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=202</guid>
		<description><![CDATA[While most of the Facebook-generation kids who populate the Internet nowadays have no idea what RSS feeds are (but likely &#8220;follow&#8221; CNN on Twitter), there is still some percentage of tech-savvy people (me included) who take a look at their favorite feeds every morning before starting their day. Not many RSS feed readers [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" style="margin: 0px 5px 5px 0px" src="https://www.google.com/accounts/reader/screenshot_en.gif" alt="Google Readers" width="255" height="144" />While most of the <a title="Facebook" href="http://www.facebook.com" target="_blank">Facebook</a>-generation kids who populate the Internet nowadays have no idea what <a title="RSS" href="http://en.wikipedia.org/wiki/RSS" target="_blank">RSS feeds</a> are (but likely &#8220;follow&#8221; <a title="CNN" href="http://www.cnn.com" target="_blank">CNN</a> on <a title="Twitter" href="http://www.twitter.com" target="_blank">Twitter</a>), there is still some percentage of tech-savvy people (me included) who take a look at their favorite feeds every morning before starting their day.</p>
<p>Not many RSS feed readers exist, and the few that do are pretty much all the same. The most widely used online <a title="RSS Aggregators" href="http://en.wikipedia.org/wiki/Aggregator" target="_blank">RSS aggregators</a> are probably <a title="Google Reader" href="http://www.google.com/reader" target="_blank">Google Reader</a> and <a title="Bloglines" href="http://www.bloglines.com/" target="_blank">Bloglines</a>.</p>
<p>While Bloglines is clearly supported by online advertising, why do you think Google created its own and offers it for free? Yes, you guessed it: to get your traffic information.</p>
<p>They probably use the number of people subscribed to each RSS feed and the frequency of their visits to Google Reader to tune the refresh frequency (i.e., when and how often they should recrawl it) for that particular feed/domain. Then, they look at how many people open each post/link and use that information to decide its priority in the <a title="Crawling the Web" href="http://en.wikipedia.org/wiki/Web_crawler" target="_blank">crawling</a> queue or the ranking of those pages.</p>
<p>I bet there are a lot of subscribers to the CNN feed and some of them log in pretty often. This probably makes its RSS feed refresh rate very high, and the number of clicks that each article receives indicates its priority in crawling and has some influence on its <a title="PageRank" href="http://en.wikipedia.org/wiki/PageRank" target="_blank">PageRank</a>. After all, if 10,000 people looked at the title/snippet of a piece of news and followed through, it must be interesting, no? Conversely, if everyone skipped it, it must not be.</p>
<p>In addition to this, with every click that you make (or do not make) they learn something more about you and which kind of content you like. Since Google Reader is hosted on the <a title="Google" href="http://www.google.com" target="_blank">same domain</a> as all the other Google products (i.e., www.google.com), the <a title="Cookies" href="http://en.wikipedia.org/wiki/HTTP_cookie" target="_blank">cookies</a> that they set after your login will follow you everywhere there are Google ADs.</p>
<p>They will know even more about you and show &#8220;better and better&#8221; <a title="Contextual Advertising" href="http://en.wikipedia.org/wiki/Contextual_advertising" target="_blank">contextual advertisements</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/05/google-reader-tells-google-what-you-like-and-what-to-index/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Facebook will use the Like Button to Personalize Search and Improve ADs</title>
		<link>http://blog.alessiosignorini.com/2010/05/facebook-will-use-the-like-button-to-personalize-search-and-improve-ads/</link>
		<comments>http://blog.alessiosignorini.com/2010/05/facebook-will-use-the-like-button-to-personalize-search-and-improve-ads/#comments</comments>
		<pubDate>Mon, 17 May 2010 12:19:53 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[ADs]]></category>
		<category><![CDATA[crawling]]></category>
		<category><![CDATA[discovery]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Geolocation]]></category>
		<category><![CDATA[like]]></category>
		<category><![CDATA[like button]]></category>
		<category><![CDATA[pages traffic]]></category>
		<category><![CDATA[personalization]]></category>
		<category><![CDATA[Ranking]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=386</guid>
		<description><![CDATA[There has been a lot of chatter about Facebook&#8217;s new Like Button. Some people believe it will be great for SEO, others that it will increase distribution on Facebook, etc&#8230; But Facebook is smarter than that: they want to create a great search engine, possibly a personalized one, and improve their ADs platform. Traditional search [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" style="margin: 0px 5px 5px 0px;" src="http://frtim.files.wordpress.com/2008/11/facebook-logo.jpg" alt="Facebook Logo" width="215" height="123" />There has been a lot of chatter about <a title="Facebook Like Button" href="http://www.facebook.com/sitetour/connect.php" target="_blank">Facebook&#8217;s new Like Button</a>. Some people believe it will be great for SEO, others that it will increase distribution on Facebook, etc&#8230;</p>
<p>But Facebook is smarter than that: they want to create a great search engine, possibly a personalized one, and improve their ADs platform.</p>
<p>Traditional search engines spend a lot of time <a title="Crawling the Web" href="http://en.wikipedia.org/wiki/Crawl" target="_blank">crawling the web</a>, discovering new pages or updating old ones, and computing the PageRank of each one. What if you could have a graph of the web, updated in real time, with the counts of how many people have been on each page?</p>
<p>This is what Facebook is aiming to do. They do not care if you click that button or not. When the browser renders the page (the button is in an iframe), it sends a <a title="HTTP Request Example" href="http://www.explainth.at/en/misc/httpreq.shtml" target="_blank">request to Facebook&#8217;s server</a> telling them a lot of info about you (e.g., your browser, which page you are on, which languages you understand, your IP, your screen resolution, &#8230;) and possibly even who you are (e.g., because you logged in to Facebook and you still have the cookies around).</p>
<p>Since <a title="Facebook's Like Plugin for WordPress" href="http://wordpress.org/extend/plugins/facebook-like-button-plugin/" target="_blank">plugins for the Like button</a> are already widespread for popular content management software (e.g., <a title="WordPress" href="http://www.wordpress.org" target="_blank">WordPress</a>, <a title="Blogger" href="http://www.blogger.com" target="_blank">Blogger</a>, &#8230;) I am sure there will be a wide adoption by content creators.</p>
<p>Facebook will know in real-time about all the new pages created, how many people are visiting them, and who they are (even if you do not have a Facebook account, since <a title="Browser Information is Pretty Unique" href="https://panopticlick.eff.org/" target="_blank">your browser information is pretty unique</a>). This will allow them to prioritize the crawling and refresh of the pages, compute the ranking based on the popularity (discounting the clicks on their search results) and also personalize your search results (e.g., ranking higher results visited/liked by your friends, neighbors, etc&#8230;).</p>
<p>Augment that with <a title="Facebook Location Sharing" href="http://bits.blogs.nytimes.com/2010/03/09/facebook-will-allow-users-to-share-location/" target="_blank">their geo-location project</a> and you also have a pretty good platform for <a title="Behavioral ADs Targeting" href="http://en.wikipedia.org/wiki/Behavioral_targeting" target="_blank">behavioral ADs targeting</a>. They already have a profile of you (you wrote it!), they are about to know where you are (geolocation), and with this they will know the sites that you and your friends visited. This is heaven for the ADs folks.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/05/facebook-will-use-the-like-button-to-personalize-search-and-improve-ads/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Collected Wifi Data for Geo-location Purposes</title>
		<link>http://blog.alessiosignorini.com/2010/05/google-collected-wifi-data-for-geo-location-purposes/</link>
		<comments>http://blog.alessiosignorini.com/2010/05/google-collected-wifi-data-for-geo-location-purposes/#comments</comments>
		<pubDate>Mon, 17 May 2010 03:07:04 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[coordinates]]></category>
		<category><![CDATA[eye-fi]]></category>
		<category><![CDATA[Geolocation]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[street view]]></category>
		<category><![CDATA[Wi-Fi Positioning System]]></category>
		<category><![CDATA[Wifi]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=379</guid>
		<description><![CDATA[In the past few days there has been a lot of discussion about Google&#8217;s public admission of collecting Wifi data. This has been labeled a &#8220;mistake&#8221;, but do you not wonder why Google was collecting that Wifi data in the first place? They were collecting MAC addresses and network SSIDs for geo-location purposes. Since [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" style="margin: 0px 5px 5px 0px" src="http://www.google.com/intl/en_ALL/images/srpr/logo1w.png" alt="Google Logo" width="220" height="76" />In the past few days there has been a lot of discussion about Google&#8217;s public admission of <a title="Google WiFi data collection" href="http://googleblog.blogspot.com/2010/05/wifi-data-collection-update.html" target="_blank">collecting Wifi data</a>. This has been labeled a &#8220;mistake&#8221;, but do you not wonder why Google was collecting that Wifi data in the first place? They were collecting <a title="MAC Address" href="http://en.wikipedia.org/wiki/MAC_address" target="_blank">MAC addresses</a> and <a title="Service Set ID" href="http://en.wikipedia.org/wiki/Service_set_(802.11_network)" target="_blank">network SSIDs</a> for geo-location purposes.</p>
<p>Since wireless networks are pretty popular, and the MAC/SSID combination is unique, associating those with the Street View car&#8217;s GPS coordinates allowed Google to create a pretty detailed map. This map could then be used to figure out your coordinates given the MAC/SSIDs around you. The technology is generally called <a href="http://en.wikipedia.org/wiki/Skyhook_Wireless" target="_blank">Wi-Fi Positioning System</a> (WPS).</p>
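<p>To give an idea of how such a lookup could work, here is a minimal sketch based on a weighted centroid of the known access points. This is just one simple approach and an assumption on my side, not Google&#8217;s actual method; the name <em>estimate_position</em>, the <em>ap_database</em> dictionary and the use of signal strength as a weight are all hypothetical.</p>

```python
def estimate_position(observed, ap_database):
    """Estimate (latitude, longitude) from the access points in range.

    observed:    dict mapping a MAC address to its signal strength in dBm
                 (e.g. -40 is strong, -90 is weak).
    ap_database: dict mapping a MAC address to the (latitude, longitude)
                 recorded when that access point was seen by the mapping car.

    Returns None when none of the observed access points is in the database.
    """
    total_weight = 0.0
    lat_sum = 0.0
    lon_sum = 0.0
    for mac, rssi_dbm in observed.items():
        if mac not in ap_database:
            continue  # unknown access point, it cannot help us
        # Convert dBm to milliwatts so that stronger signals weigh more.
        weight = 10 ** (rssi_dbm / 10.0)
        lat, lon = ap_database[mac]
        lat_sum += weight * lat
        lon_sum += weight * lon
        total_weight += weight
    if total_weight == 0:
        return None
    # Plain averaging of coordinates is fine for access points that are
    # a few hundred meters apart, which is the case for Wi-Fi ranges.
    return (lat_sum / total_weight, lon_sum / total_weight)
```

<p>With two known access points received at the same strength the estimate falls halfway between them; a stronger signal pulls the estimate toward that access point.</p>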
<p>A possible use of that is the <a title="Google Maps Application" href="http://www.google.com/mobile/maps/" target="_blank">Google Maps</a> application. If the device does not have a GPS, it uses <a title="GSM Cells Triangulation" href="http://en.wikipedia.org/wiki/Mobile_phone_tracking" target="_blank">GSM cell triangulation</a> (cell coordinates have probably been obtained in a similar fashion) to figure out the location. While it generally works, it cannot be very accurate and is often off by as much as 2 miles. However, if some wireless networks are detected in the surroundings, they can be used to produce a much better estimate of the location.</p>
<p>These data are probably also sold/used by the <a title="Eye-Fi Automatic Geo-tagging" href="http://www.eye.fi/how-it-works/features/geotagging" target="_blank">Automatic Geo-Tagging</a> feature of the <a title="Eye-Fi" href="http://www.eye.fi" target="_blank">Eye-Fi</a> memory card. Not surprisingly, those cards are sold in <a title="Google Picasa Eye-Fi" href="http://www.eye.fi/google?postTabs=1" target="_blank">promotion with Google Picasa</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/05/google-collected-wifi-data-for-geo-location-purposes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twitter stopped tracking URLs clicks on Twitter.com</title>
		<link>http://blog.alessiosignorini.com/2010/04/twitter-stopped-tracking-urls-clicks/</link>
		<comments>http://blog.alessiosignorini.com/2010/04/twitter-stopped-tracking-urls-clicks/#comments</comments>
		<pubDate>Thu, 29 Apr 2010 06:36:05 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[clicks]]></category>
		<category><![CDATA[Clicks Tracking]]></category>
		<category><![CDATA[twitter]]></category>
		<category><![CDATA[url shorteners]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=375</guid>
		<description><![CDATA[It looks like Twitter stopped tracking clicks on the URLs shared by the users on its own website. Given the recent rush of URL shorteners towards analytics products and Twitter&#8217;s promises about &#8220;resonance&#8221; on its ADs platform, this seems a very odd move. Maybe they were overwhelmed with data/logs and thought [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.alessiosignorini.com/wp-content/uploads/2010/04/twitter-not-tracking-clicks.png"><img class="size-medium wp-image-374 alignleft" style="margin: 0px 5px 5px 0px;" title="twitter-not-tracking-clicks" src="http://blog.alessiosignorini.com/wp-content/uploads/2010/04/twitter-not-tracking-clicks-300x148.png" alt="Twitter is not Tracking Clicks" width="300" height="148" /></a>It looks like Twitter stopped tracking clicks on the URLs shared by the users on its own website.</p>
<p>Given the recent rush of URL shorteners towards analytics products and Twitter&#8217;s promises about &#8220;<a title="Twitter Resonance" href="http://help.twitter.com/entries/142161-faq-advertisers#cost" target="_blank">resonance</a>&#8221; on its ADs platform, this seems a very odd move.</p>
<p>Maybe they were <a title="Twitter overwhelmed with data" href="http://blog.twitter.com/2010/02/measuring-tweets.html" target="_blank">overwhelmed with data/logs</a> and thought about turning this feature off for now since they were not using it. It is also possible that the traffic on the website (and thus the clicks on the links) is <a title="Twitter.com receives 1M visitors per day" href="http://www.quantcast.com/twitter.com" target="_blank">so small</a> that it was not worth keeping the service up for all the URLs (maybe they will just track clicks on ADs links).</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/04/twitter-stopped-tracking-urls-clicks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Web Pages Language Classification: Bayes, Characters and n-grams</title>
		<link>http://blog.alessiosignorini.com/2010/03/web-pages-language-classification/</link>
		<comments>http://blog.alessiosignorini.com/2010/03/web-pages-language-classification/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 14:23:43 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[characters frequency]]></category>
		<category><![CDATA[language classification]]></category>
		<category><![CDATA[n-grams]]></category>
		<category><![CDATA[naive bayes]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=360</guid>
		<description><![CDATA[Most search engines start by focusing on only one language (e.g., English) because it is simpler, requires almost no character encoding handling, and has a wide audience. Whether you want to index only pages written in English or support all the languages of the world, fast page language classification is one of the first tasks that [...]]]></description>
			<content:encoded><![CDATA[<p>Most search engines start by focusing on only one language (e.g., <a title="English Alphabet" href="http://en.wikipedia.org/wiki/English_alphabet" target="_blank">English</a>) because it is simpler, requires almost <a title="Latin Alphabet" href="http://en.wikipedia.org/wiki/Latin_alphabet" target="_blank">no character encoding handling</a>, and has a wide audience. Whether you want to index only pages written in English or support all the languages of the world, fast page language classification is one of the first tasks that you will have to deal with.</p>
<p>Simple word-based classification techniques like <a title="Naive Bayes Classification" href="http://en.wikipedia.org/wiki/Naive_bayes" target="_blank">Naive Bayes</a> will do the trick but require a very big training set, especially for foreign languages. Even though memory and processing power have become less and less expensive in recent years, they are not free, and especially in a startup you may need to optimize every single function.</p>
<p>For this reason, you may want to consider <a title="Classification of natural language based on character frequency" href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.5806&amp;rep=rep1&amp;type=pdf" target="_blank">character-based classification</a>. In its simplest form, you just compute the <a title="Letter Frequency" href="http://en.wikipedia.org/wiki/Letter_frequency" target="_blank">frequency</a> with which alphabet characters appear in each language and then compute the distance (e.g., geometric distance) between the text and your models. The memory requirements of this solution are very small (i.e., a float for each of the 26 letters per language) and CPU usage can be easily bounded as well (e.g., you can stop after N characters of the input text), trading off precision for speed.</p>
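<p>The idea can be sketched in a few lines of Python. This is a minimal illustration, not a production classifier: I take &#8220;geometric distance&#8221; to mean the Euclidean distance between two 26-letter frequency vectors, and the names <em>letter_profile</em>, <em>distance</em> and <em>classify</em> as well as the default of 1000 characters are made up for this example.</p>

```python
import math
import string
from collections import Counter

LETTERS = string.ascii_lowercase  # the 26 letters a-z


def letter_profile(text):
    """Return the relative frequency of each of the 26 letters in text."""
    counts = Counter(ch for ch in text.lower() if ch in LETTERS)
    total = sum(counts.values())
    if total == 0:
        return {letter: 0.0 for letter in LETTERS}
    return {letter: counts[letter] / total for letter in LETTERS}


def distance(p, q):
    """Euclidean distance between two letter-frequency profiles."""
    return math.sqrt(sum((p[letter] - q[letter]) ** 2 for letter in LETTERS))


def classify(text, models, max_chars=1000):
    """Return the language whose model is closest to the text.

    models maps a language name to a profile built with letter_profile().
    Only the first max_chars characters are examined, which bounds the
    CPU cost at the price of some precision.
    """
    profile = letter_profile(text[:max_chars])
    return min(models, key=lambda lang: distance(profile, models[lang]))
```

<p>The models would be built once by running <em>letter_profile</em> over a large sample of text for each language; at classification time each language costs only 26 subtractions and multiplications.</p>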
<p>If that is not enough, <a title="N-gram based text categorization" href="http://www.dcs.fmph.uniba.sk/diplomovky/obhajene/getfile.php/Ng-based-tc.pdf?id=1&amp;fid=3&amp;type=application%2Fpdf" target="_blank">n-grams of characters</a> (e.g., sequences of 2 or 3 adjacent characters) will probably work even better but require a larger memory footprint.</p>
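<p>A character n-gram variant could look like the sketch below. It compares n-gram counts with cosine similarity, which is a choice of mine for brevity and not necessarily the measure used in the linked paper; the function names are again hypothetical.</p>

```python
import math
from collections import Counter


def ngram_profile(text, n=3):
    """Count the character n-grams of text (lowercased, whitespace collapsed)."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if one is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_ngrams(text, models, n=3):
    """Return the language whose n-gram model is most similar to the text.

    models maps a language name to a profile built with ngram_profile()
    using the same value of n.
    """
    profile = ngram_profile(text, n)
    return max(models, key=lambda lang: cosine_similarity(profile, models[lang]))
```

<p>Each model now stores one counter per distinct n-gram instead of 26 floats, which is where the larger memory footprint comes from; keeping only the most frequent n-grams of each language is a common way to limit it.</p>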
<p>The following graph shows the <a title="Letters Distribution across European Languages" href="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/letters-distribution-across-languages.png" target="_blank">frequency of the alphabet letters</a> across the 5 most common European languages. For some letters the difference in usage is pretty high, e.g., the letter &#8220;A&#8221; is used twice as much in Spanish as in German, while &#8220;H&#8221; is frequently used in German and English but almost never used in the other languages.</p>
<p style="text-align: center;"><a href="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/letters-distribution-across-languages.png"><img class="aligncenter size-full wp-image-359" title="Distribution of Alphabet Letters across European Languages" src="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/letters-distribution-across-languages.png" alt="Distribution of Alphabet Letters across European Languages" width="507" height="281" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/03/web-pages-language-classification/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TurboTax costs $5 less without Cookies</title>
		<link>http://blog.alessiosignorini.com/2010/03/turbotax-costs-5-dollars-less-without-cookies/</link>
		<comments>http://blog.alessiosignorini.com/2010/03/turbotax-costs-5-dollars-less-without-cookies/#comments</comments>
		<pubDate>Tue, 16 Mar 2010 12:22:11 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[cookies]]></category>
		<category><![CDATA[discount]]></category>
		<category><![CDATA[turbotax]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=351</guid>
		<description><![CDATA[With the exception of last year (I changed job and state, so I needed professional help) I have always filed my taxes using TurboTax. It is nice, simple to use, and reasonably priced. This year was no exception, but before finding out that Chase and Bank of America customers have a 35% discount on it, I went on [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" style="margin: 0px 5px 2px 0px;" title="TurboTax Logo" src="http://www.plu.edu/iss/U.S.%20Tax/Turbotax%20logo.jpg" alt="TurboTax Logo" width="175" height="126" /></p>
<p>With the exception of last year (I changed job and state, so I needed professional help) I have always filed my taxes using <a title="TurboTax" href="http://www.turbotax.com" target="_blank">TurboTax</a>. It is nice, simple to use, and reasonably priced.</p>
<p>This year was no exception, but before finding out that <a title="TurboTax discounted for Chase customers" href="http://turbotax.intuit.com/affiliate/chaseret" target="_blank">Chase</a> and <a title="TurboTax discounted for Bank of America customers" href="http://turbotax.intuit.com/affiliate/bofa" target="_blank">Bank of America</a> customers have a 35% discount on it, I went on their website to check out the prices. Oddly, I discovered that visiting <a title="TurboTax" href="http://www.turbotax.com" target="_blank">www.turbotax.com</a> without accepting cookies shows lower prices ($5 less) than the ones offered to those who visit it with cookies enabled.</p>
<p>Glitch in the system? Naaaaa, I say it is an A/B comparison to see what customers are willing to pay.</p>
<p><a href="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/turbotax-site-no-cookies.png"><img class="size-medium wp-image-349 alignnone" title="Turbotax Prices with No Cookies" src="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/turbotax-site-no-cookies-271x300.png" alt="Turbotax Prices with No Cookies" width="271" height="300" /></a><a href="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/turbotax-site-with-cookies.png"><img class="alignnone size-medium wp-image-350" title="TurboTax Prices with Cookies" src="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/turbotax-site-with-cookies-278x300.png" alt="TurboTax Prices with Cookies" width="278" height="300" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/03/turbotax-costs-5-dollars-less-without-cookies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google integrates Profile Results and search links to Social Networks</title>
		<link>http://blog.alessiosignorini.com/2010/03/google-integrates-profile-search/</link>
		<comments>http://blog.alessiosignorini.com/2010/03/google-integrates-profile-search/#comments</comments>
		<pubDate>Mon, 15 Mar 2010 05:56:16 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[profile search]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=341</guid>
		<description><![CDATA[Looking for names on Google now shows a &#8220;Profile Search&#8221; results box with the two best hits, an invite to create your Google Profile (in case you are doing a vanity search) and quick links to MySpace, Facebook, Classmates and LinkedIn search pages.]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/google-profile-search.png"><img class="alignleft size-medium wp-image-342" title="Google Profile Search" src="http://blog.alessiosignorini.com/wp-content/uploads/2010/03/google-profile-search-300x70.png" alt="Google Profile Search" width="300" height="70" /></a>Looking for names on <a title="Google" href="http://www.google.com" target="_blank">Google</a> now shows a &#8220;Profile Search&#8221; results box with the two best hits, an invite to create your Google Profile (in case you are doing a vanity search) and quick links to <a title="MySpace" href="http://www.myspace.com" target="_blank">MySpace</a>, <a title="Facebook" href="http://www.facebook.com" target="_blank">Facebook</a>, <a title="Class Mates" href="http://www.classmates.com" target="_blank">Classmates</a> and <a title="LinkedIn" href="http://www.linkedin.com" target="_blank">LinkedIn</a> search pages.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/03/google-integrates-profile-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Search Suggestions: Men are More Worried about Manhood than IQ</title>
		<link>http://blog.alessiosignorini.com/2010/03/google-search-suggestions-men-are-more-worried-about-manhood-than-iq/</link>
		<comments>http://blog.alessiosignorini.com/2010/03/google-search-suggestions-men-are-more-worried-about-manhood-than-iq/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 11:16:25 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[Height]]></category>
		<category><![CDATA[Manhood]]></category>
		<category><![CDATA[Query Logs]]></category>
		<category><![CDATA[Salary]]></category>
		<category><![CDATA[Search Suggestions]]></category>
		<category><![CDATA[Weight]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=313</guid>
		<description><![CDATA[Analysing query logs is sometimes really amusing. This is a screenshot of Google&#8217;s Search Suggestions for queries that start with the word &#8220;average&#8221;. Here are a few extrapolations from these suggestions: 1) Men are more worried about the length of their penis than their IQ. 2) Height is more important for men, weight for [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.alessiosignorini.com/wp-content/uploads/2010/02/suggestions-for-query-average.png"><img class="size-medium wp-image-314 alignleft" style="margin: 0px 10px 10px 0px;" title="Google Suggestions for query &quot;average&quot;" src="http://blog.alessiosignorini.com/wp-content/uploads/2010/02/suggestions-for-query-average-300x283.png" alt="Google Suggestions for query &quot;average&quot;" width="300" height="283" /></a>Analysing <a title="Web Search Query Logs" href="http://en.wikipedia.org/wiki/Web_search_query" target="_blank">query logs</a> is sometimes really amusing. This is a screenshot of <a title="Google Search Suggestions" href="http://www.google.com/support/websearch/bin/answer.py?hl=en&amp;answer=106230" target="_blank">Google&#8217;s Search Suggestions</a> for queries that start with the word &#8220;average&#8221;.</p>
<p>Here are a few extrapolations from these suggestions:</p>
<p>1) Men are more worried about the length of their penis than their IQ.</p>
<p>2) Height is more important for men, weight for women.</p>
<p>3) Salary becomes a concern only once you have established that you have a big penis and that you are smart and tall.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/03/google-search-suggestions-men-are-more-worried-about-manhood-than-iq/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Want Your Own URL Shortener? Buy a Domain and Use Bit.ly, like TechCrunch and the New York Times</title>
		<link>http://blog.alessiosignorini.com/2010/03/techcrunch-and-new-york-times-exploit-bitly-for-urls-shortening/</link>
		<comments>http://blog.alessiosignorini.com/2010/03/techcrunch-and-new-york-times-exploit-bitly-for-urls-shortening/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 16:58:39 +0000</pubDate>
		<dc:creator>Alessio Signorini</dc:creator>
				<category><![CDATA[Internet & Search]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[Bit.ly]]></category>
		<category><![CDATA[Fox News]]></category>
		<category><![CDATA[fxn.ws]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[New York Times]]></category>
		<category><![CDATA[nyti.ms]]></category>
		<category><![CDATA[tcrn.ch]]></category>
		<category><![CDATA[TechCrunch]]></category>
		<category><![CDATA[URLs Shorteners]]></category>

		<guid isPermaLink="false">http://blog.alessiosignorini.com/?p=332</guid>
		<description><![CDATA[With the spread of Twitter and other social sharing communities there is a lot of buzz over URL shortening. Every major company wants to have its own URL shortening domain: Google (goo.gl), TechCrunch (tcrn.ch), New York Times (nyti.ms), FourSquare (4sq.com), Fox News (fxn.ws), Delicious (icio.us), Bing (fa.il), &#8230; Did they all really set up some highly [...]]]></description>
			<content:encoded><![CDATA[<p>With the spread of Twitter and other social sharing communities there is a <a title="Most Used URLs Shortener on Twitter" href="http://blog.alessiosignorini.com/2010/02/most-used-urls-shortener-on-twitter/" target="_blank">lot of buzz over URL shortening</a>. Every major company wants to have its own URL shortening domain: Google (<a title="Google URL Shortening Service" href="http://goo.gl" target="_blank">goo.gl</a>), TechCrunch (<a title="TechCrunch URL Shortening Service" href="http://tcrn.ch" target="_blank">tcrn.ch</a>), New York Times (<a title="New York Times URL Shortening Service" href="http://nyti.ms" target="_blank">nyti.ms</a>), FourSquare (<a title="FourSquare URL Shortening Service" href="http://4sq.com" target="_blank">4sq.com</a>), Fox News (<a title="Fox News URL Shortening Service" href="http://fxn.ws" target="_blank">fxn.ws</a>), Delicious (<a title="Delicious URL Shortening Service" href="http://icio.us" target="_blank">icio.us</a>), Bing (<a title="Bing URL Shortening Service" href="http://fa.il" target="_blank">fa.il</a>), &#8230;</p>
<p>Did they all really set up highly reliable servers and databases to do that? The answer is no.</p>
<p>Most of them just bought a short domain and <a title="Apache 2.0 Redirects" href="http://httpd.apache.org/docs/2.0/mod/mod_alias.html" target="_blank">set up their web server to redirect all requests</a> to Bit.ly. Some examples:</p>
<div style="margin-left: 50px;">
<p><em>TechCrunch:</em> <a rel="nofollow" href="http://tcrn.ch/cNYWLR" target="_blank">http://tcrn.ch/cNYWLR</a> -&gt; <a rel="nofollow" href="http://bit.ly/cNYWLR" target="_blank">http://bit.ly/cNYWLR</a></p>
<p><em>New York Times:</em> <a href="http://nyti.ms/dzy2b7" target="_blank">http://nyti.ms/dzy2b7</a> -&gt; <a href="http://bit.ly/dzy2b7" target="_blank">http://bit.ly/dzy2b7</a></p>
<p><em>Fox News:</em> <a href="http://fxn.ws/cH1usB" target="_blank">http://fxn.ws/cH1usB</a> -&gt; <a href="http://bit.ly/cH1usB" target="_blank">http://bit.ly/cH1usB</a></p>
</div>
<p>Now that you know the trick, building your own URL shortening service may be easier than you thought. You get statistics too, simply by adding a plus (+) to the end of the URL (e.g., <a href="http://fxn.ws/cH1usB+" target="_blank">http://fxn.ws/cH1usB+</a>) or by using the <a href="http://code.google.com/p/bitly-api/wiki/ApiDocumentation" target="_blank">Bit.ly API</a>.</p>
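<p>The mapping in the examples above can be modelled in a few lines of Python. This is only a sketch of the pattern, not the configuration these sites actually run: it assumes the branded domain forwards every path unchanged to bit.ly, and the function names bitly_target and stats_url are made up for illustration.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the branded domain forwards every path unchanged to bit.ly,
# as the three examples in the post suggest.
BITLY_HOST = "bit.ly"


def bitly_target(short_url):
    """Return the bit.ly URL that a branded short URL would redirect to."""
    parts = urlsplit(short_url)
    # Keep scheme, path, query and fragment; swap only the host name.
    return urlunsplit(
        (parts.scheme, BITLY_HOST, parts.path, parts.query, parts.fragment)
    )


def stats_url(short_url):
    """Return the statistics page URL: the short URL with one trailing plus."""
    return short_url.rstrip("+") + "+"


if __name__ == "__main__":
    print(bitly_target("http://tcrn.ch/cNYWLR"))
    print(stats_url("http://fxn.ws/cH1usB"))
```

<p>On a real server the same effect would come from a single redirect rule in the web server configuration rather than from application code. The sketch just makes explicit that only the host name changes and the short code is passed through as is.</p>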
]]></content:encoded>
			<wfw:commentRss>http://blog.alessiosignorini.com/2010/03/techcrunch-and-new-york-times-exploit-bitly-for-urls-shortening/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
