Annoyed by Google’s helpful improvements? Try the verbatim tool

Posted by Don Arthur on Saturday, December 3, 2011

Sometimes the words I type into Google’s search box are the words I want to appear in the results. For years now I’ve been using the ‘+’ operator to ensure that every result includes a particular term. But recently, without warning, it stopped working. Fortunately Google have introduced a fix with the verbatim tool. According to Google’s search blog:

In most cases, Google’s algorithms make things better for our users – but in some rare cases, we don’t find what you were looking for. In the past, we provided users with the “+” operator to help you search for specific terms. However, we found that users typed the “+” operator in less than half a percent of all searches, and two thirds of the time, it was used incorrectly. A couple of weeks ago we removed the “+” operator, encouraging the use of the double quotes, which are more likely to be used correctly.

Since then, we’ve received a lot of requests for a more deliberate way to tell Google to search using your exact terms. We’ve been listening, and starting today you’ll be able to do just that through verbatim search. With the verbatim tool on, we’ll use the literal words you entered without making normal improvements such as

  • making automatic spelling corrections
  • personalizing your search by using information such as sites you’ve visited before
  • including synonyms of your search terms (matching “car” when you search [automotive])
  • finding results that match similar terms to those in your query (finding results related to “floral delivery” when you search [flower shops])
  • searching for words with the same stem like “running” when you’ve typed [run]
  • making some of your terms optional, like “circa” in [the scarecrow circa 1963]

You can access the verbatim search tool under “More search tools” on the left-hand side.

According to Andy Baio at Wired, Google wouldn’t disclose why they phased out the ‘+’ operator "though it seems obvious that they’re paving the way for Google+ profile searches."

A Toy Model of the Indo-Asia Pacific

Posted by Richard Tsukamasa Green on Friday, November 18, 2011

Like Paul Krugman, part of what originally drew me into Economics was the premise behind Asimov’s Foundation books. This premise was a far future where a discipline had managed to formalise and model human society, shed light on what would happen and create preconditions for a better society to develop. It’s an absurd conceit, and even in the confines of fiction Asimov was compelled to say it could only work in populations numbering in the quadrillions, so that the peculiarities of individuals would be as unimportant as the idiosyncrasies of individual molecules of air in the laws of thermodynamics.

Economics tends to attempt something similar, albeit with agents rather than masses. The results are mixed of course, but it’s a pleasure to create a few simple rules and see how closely they approximate observed behavior in a market. The same is true in non-market settings.

This comes up because of the announced US Marine base in Darwin. Some crude simplifications of security policy might be fun even if the conclusions are not novel. I shouldn’t be posting on International Relations again, but I’m starting a new economics job soon, and I’ll have to feel out how much blogging on the topic of economics policy is appropriate. Until then more IR (or food, political theory, transport etc.).

Here are some simplifications for behaviour.

1/ Let’s start by defining agents as sovereign states.

2/ States are primarily loss averse and seek mainly to preserve the status quo.

  • a) They seek first to limit the chances of being invaded
  • b) and second to preserve existing lines of import and export (particularly goods with inelastic supply such as oil).

3/ Limiting the chance of invasion is pursued thus.

  • a) Identifying the greatest external threat and allying with its greatest threat. This makes invasion less attractive in accordance with 2. My enemy’s enemy is my friend.
  • b) Securing potential platforms for invasion away from the greatest threat (except where this conflicts with 2/) (Continued)
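
Rule 3a can be sketched in a few lines of code. This is only an illustration: the three states and their threat scores below are invented for the example, and the choice to stop a state from "allying with itself" is an assumption of mine that the rules above do not spell out.

```python
# Hypothetical threat scores (not real data): threat[a][b] is how
# threatening state a finds state b, on an arbitrary 0-1 scale.
threat = {
    "A": {"B": 0.9, "C": 0.2},
    "B": {"A": 0.4, "C": 0.7},
    "C": {"A": 0.1, "B": 0.8},
}


def greatest_threat(state, exclude=()):
    """Return the state that `state` fears most, ignoring any in `exclude`."""
    candidates = {s: v for s, v in threat[state].items() if s not in exclude}
    return max(candidates, key=candidates.get)


def preferred_ally(state):
    """Rule 3a: ally with the greatest threat to your own greatest threat."""
    enemy = greatest_threat(state)
    # Assumption: a state cannot ally with itself, so it is excluded here.
    return greatest_threat(enemy, exclude=(state,))


if __name__ == "__main__":
    for s in threat:
        print(s, "fears", greatest_threat(s), "and allies with", preferred_ally(s))
```

With only three states the answer is forced once the enemy is picked, so the sketch gets more interesting as states are added.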

Kaggle brilliantly explained on Catalyst

Posted by Nicholas Gruen on Thursday, August 18, 2011

Well, the ABC, God bless its cotton socks, can’t quite bring itself to mount videos that can be embedded elsewhere – or I can’t see a way to do it – but they did a great story on Kaggle tonight, so I thought I’d post it here. Just click here and all will be revealed.

Update: someone has emailed me some code which enables me to frame the video here.

The cred you get from a bit of technical talk

Posted by Nicholas Gruen on Wednesday, August 3, 2011

Benford’s Law: around 30% of the first digits in many real world data-sets are “1”.

Posted by Nicholas Gruen on Wednesday, August 3, 2011

Yes, folks it’s Benford’s Law – from Kaggle’s website.

One fun aspect of working with real data is that you get to observe real-life phenomenon. For example, Benford’s Law (also known as the “first-digit law”) states:

“in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 about 30% of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than 5% of the time.”

A simple SQL query on the training dataset gives us the raw data with which we can compare the data:

digit  count    actual_probability  benford_expected_probability  abs diff
1      3368866  27.9%               30.1%                         2.3%
2      1912850  15.8%               17.6%                         1.8%
3      1483366  12.3%               12.5%                         0.2%
4      1258157  10.4%               9.7%                          0.7%
5      1109766  9.2%                7.9%                          1.3%
6      933048   7.7%                6.7%                          1.0%
7      787636   6.5%                5.8%                          0.7%
8      668351   5.5%                5.1%                          0.4%
9      573359   4.7%                4.6%                          0.1%

Sure enough, the data from millions of shopping visits demonstrates the validity of this law.

I just thought this was an interesting application of something you hear about all the time in statistics discussions.
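
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It is not the SQL query Kaggle ran, which isn't shown. It takes the counts from the table and compares them with the standard statement of Benford's Law, P(d) = log10(1 + 1/d), which is where the 30.1% for a leading "1" comes from. The function and variable names are my own.

```python
import math

# First-digit counts copied from the table above.
counts = {
    1: 3368866, 2: 1912850, 3: 1483366, 4: 1258157, 5: 1109766,
    6: 933048, 7: 787636, 8: 668351, 9: 573359,
}


def benford_expected(digit):
    """Benford's Law: P(d) = log10(1 + 1/d) for a leading digit d in 1..9."""
    return math.log10(1 + 1 / digit)


def compare(counts):
    """Return (digit, actual share, expected share, absolute difference) rows."""
    total = sum(counts.values())
    rows = []
    for digit in sorted(counts):
        actual = counts[digit] / total
        expected = benford_expected(digit)
        rows.append((digit, actual, expected, abs(actual - expected)))
    return rows


if __name__ == "__main__":
    for digit, actual, expected, diff in compare(counts):
        print(f"{digit}  {actual:6.1%}  {expected:6.1%}  {diff:5.1%}")
```

Small differences from the published "abs diff" column can appear depending on whether the difference is taken before or after rounding.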

A new Big Idea for China

Posted by Richard Tsukamasa Green on Saturday, July 16, 2011

Disclaimer: This ended up roughly 4500 words longer than I expected when I sat down.

A while ago, following the start of the Arab Spring, John Quiggin wrote a post declaring “Fukuyama, F*** Yeah”. Apart from showcasing an appreciation of both late 20th century political thought and early 21st century scatological humour, it also presents an interesting intellectual exercise, or at least diversion.

Let’s let Fukuyama restate his oft-caricatured argument:

“The End of History” is in the end an argument about modernization. What is initially universal is not the desire for liberal democracy but rather the desire to live in a modern — that is, technologically advanced and prosperous — society, which, if satisfied, tends to drive demands for political participation. Liberal democracy is one of the byproducts of this modernization process, something that becomes a universal aspiration only in the course of historical time.”

 

I’ll restate (or misrepresent) in my own terms with an emphasis on political legitimacy. The end of the Cold War left Liberal Democracy, as defined by rule of law (encompassing impartiality, rights and government limited by law) and democratic accountability, as the last standing Big Idea on the ideological battlefield. There are no other options to reach for to legitimise an existing or prospective regime. Even if not universal in practice, even in regimes that formally espoused it, this Big Idea’s lonely status is what represented an end of history in the Kojèvian-Hegelian sense.

The game I want to play is to ask what an alternative Big Idea might look like. I should mention here that I don’t think this is a particularly useful or pertinent exercise. I would place it only slightly higher than asking “Would Batman beat x in a fight?”, and that’s only because a sensible person recognises that Batman always wins. Nonetheless… (Continued)

Back of the envelope demography.

Posted by Richard Tsukamasa Green on Tuesday, May 10, 2011

A warning: this is pretty much a shaggy dog story.

A while ago I had an idle thought about migrant settlement patterns. If there was a slight tendency amongst Chinese Australians to settle in ways that reflected subnational cultures from China (I was prompted by the Sydney suburb of Ashfield which is distinctly Shanghainese, not just Chinese), would the same tendency be visible in Indian Australians? After all, India is also vast and linguistically diverse, but has a far shorter history of unified statehood. Were there Punjabi and Bengali districts to go with the Shanghainese or Cantonese districts? I asked some bemused shopkeepers who did not have this impression. I then asked someone who may have looked at this as a professional (having published work on Indian migrants to Australia), Professor Supriya Singh at RMIT. She kindly replied to my query (and I quote in part):

We have asked the question also but found there is no predominantly Indian suburb, and no  Punjabi, Malyali, Gujerati or Andhra concentration.

In the media there has been comment that Point Cook is developing into a very Indian suburb, with every third house being Indian. But there is no hint that it is concentrated in any one region of India. However when you look at Census distribution maps, there are no areas of Indian concentration in the way that there are Chinese, Italian or Greek cultural precincts or clusters.

This was striking in another way. Not only no clustering of subnational groups, but no clustering at all. Not only did this seem unusual compared to other migrant groups, it also seemed unusual compared to Sydney. After all, my subjective experience would cite suburbs like Parramatta and the adjacent Harris Park, as well as other places, as having a distinct Indian presence – I’d go there to try subcontinental sweets – and they were used as natural sites for cultural events like Parramasala, or an A.R. Rahman concert. Maybe there was a difference between the cities. So I knocked up some maps of people born in India recorded in the 2006 census.

(Continued)

Two updates – Real time bus maps and Filipino restaurants

Posted by Richard Tsukamasa Green on Thursday, March 3, 2011

This post is merely two additions to previous posts, neither of which warranted a post on their own.

The first relates to this post from September where I talked about the idea of realtime mapping of bus services using GPS data.  Better people than I had the same idea and, through the Apps4NSW competition, Flink Labs has produced this prototype for Sydney and Newcastle buses. I think it’s great. I may have anticipated the means by which it would come (Google maps and Government 2.0) but I got the timing way out – I thought it would take years. Hopefully the new government will run with it so it becomes more phone friendly.

The other relates to my speculations on the paucity of Filipino restaurants. One hypothesis I didn’t mention is that Filipino migrants might be less prone than other migrant groups to cluster into certain suburbs (the way we can see suburbs that are notably “Greek” or “Vietnamese” for instance), so that a given restaurant would struggle to have a local returning customer base within its own community. This could be plausible if Filipino migrants have better English skills (due to American colonialism) and are therefore less likely to seek other speakers of their language to live near. Alternatively, the gender imbalance and associated exogamy may mean they are more geographically spread out.

I didn’t feel this hypothesis explained much (hence I didn’t mention it), but I kept it in mind. The other day I was using CData to map 2006 census data on migrant groups for an unrelated question (on which I’ll probably post in future), but this gave me the opportunity to compare Filipino settlement to some other groups. Notably I compared residency in Sydney and Melbourne by people born in the Philippines with those born in two other countries, Korea and India. I chose these two because their periods of migration roughly coincide with Filipino migration, so they’d be facing similar house prices and job opportunities which would alter their choices relative to post war migrants. Additionally, unlike the Vietnamese or Lebanese (or more recently East Africans), there’d be no refugee aspect where settlement would be dictated by government decisions. Furthermore, Korean and Indian restaurants are abundant. The comparison is still flawed of course.

The maps (and some notes) are below the fold. I can see some element of greater concentration amongst Koreans and Indians, at least in Sydney (and in places where you’d find many restaurants in said cuisines), but not nearly enough to explain the disparity. The concentration of Filipinos in the spur of settlement between Blacktown and Penrith is notable – half the Filipino restaurants I know of in Sydney are in Blacktown (i.e. two). Maybe there’s a lack of suitable commercial real estate there?

I don’t think there’s much more to this hypothesis, though you can look for yourself.

(Continued)

Holiday fun times: Define Asia

Posted by Richard Tsukamasa Green on Friday, January 14, 2011

Given it’s still the off-season, I thought we might want to revisit a pastime of an earlier time. When I was a child in the 90s, during the Keating era, there was a fairly pointless question (they never bothered to actually debate it): is Australia part of Asia? Whilst the question did have implications for membership in various diplomatic clubs, here it was usually framed as part of culture wars inanity. For me, finding the implications rather mild, it’s mainly an academic diversion.

And the problem, as I see it, isn’t determining where Australia belongs, or whether belonging in one category precludes belonging in others (like “The West” or “The Anglosphere”). It’s working out what “Asia” is anyway. Can we really come up with a non-arbitrary definition that includes every country we usually call Asia without including Australia?

The most basic definition is geographic. Things within certain bounds are “Asia”. Things outside it are not Asian. This is the basis for the map at right. There are obvious problems here though. Oceans are big, so drawing a border at, say, the Pacific (excluding North America) or the Indian Ocean (excluding Antarctica) is defensible. But if you can jump the Malacca Straits or the Richard Green Sea [fn1] or any of the other innumerable straits and seas that separate islands from the continental mass, why suddenly say that the Timor Sea or Torres Strait is too far, let alone the tiny rivulet of the Suez Canal? And if you can cross the Himalayas, taller than any other range, why balk at the modesty of the Urals, or the Caucasus mountains? If there was something beneath it all, as is literally the case with plate tectonics, we might have something, but there is a mass of plates underneath “Asia”: Australia shares a plate with parts of Indonesia (“Asian” by common consent), and almost all of Europe and all of China are on a single plate.

So geographically there is little case for excluding Australia from Asia, and even less for excluding Europe. To exclude them would be to determine that Asia is defined by whatever boundaries we draw, and on that basis we may as well include Mars.

Even so, the map is too broad for the debate of my childhood. They weren’t asking how Australia related to Tajikistan (where we do not have an embassy) or the “Asia” referred to by the ancient Mediterraneans (which made more sense given the limited geographic knowledge of the times) – now better known as “The Middle East”. What the 90s debates referred to was more likely something called “East and South East Asia”. The “Asia” closest to us. (Continued)

X marks the trust spot

Posted by Julia on Tuesday, November 2, 2010

Here is a story about the internet working the way tech utopians think it should. Technology is as good or as bad as the social conditions of which it is a part, but this is one of the good stories. It can be read either as a perfect example of self interest working well in the aggregate, or less cynically as a kind of altruism when there may not be a payoff.

Some years ago I subscribed to a Firefox addon called Xmarks (which used to be Foxmarks). This program syncs bookmarks not just to the cloud but to other computers, cross platform. Kind of useful, but for me it soon dropped into the background like teapots or car keys; used frequently but not very front of consciousness.

I was surprised therefore to get an email from the company a couple of months ago. Sorrowfully, it informed me that the service would be discontinued as from January next year. Could I move to one of a number of competitor alternatives as they were shutting down the whole service? (Continued)