Statement questions

"Why would anyone do this."

Sometimes people ask a rhetorical question, and deliver it as a sentence, without the upwards inflection at the end that we see in genuine questions. Not all rhetorical questions are delivered in this way, but it's an important enough difference in tone that when writing, the question mark is often replaced by a full stop.

That is how I used to write these statement-questions, but a friend objected, finding the full stop mentally jarring having just processed the words as a question. Did I misread? Did he mistype? And so I started putting a question mark after the full stop in these cases: "Why would anyone do this.?"

It's a solution that's just satisfying enough for me to keep using it – for people who get what I'm doing, it makes the statement-question unambiguous, and should hopefully reduce the mental jarring. But even assuming that people know what I'm doing, it is unsatisfactory: I've tried to internalise the full-stop-question-mark over a period of years, but that question mark still makes me start upwardly inflecting the end of the sentence that I've just typed specifically to not get upwardly inflected.

It occurred to me this evening that a better solution would be to borrow and then modify from Spanish. Upward-inflecting questions would get an inverted question mark (¿) at the start and a regular question mark (?) at the end. As soon as the question starts, the reader would prepare to upwardly inflect at the end.

Statement-questions, on the other hand, would only get the final question mark.

Upwardly inflect: ¿Why would anyone do this?
Don't upwardly inflect: Why would anyone do this?

So, the final question mark merely confirms that grammatically, the words just written are a question. The presence or otherwise of the inverted question mark primes the reader for the appropriate inflection.

As practical suggestions go, this is somewhere beyond the "utterly useless" end of the spectrum that covers everything I've ever suggested before. It would require the internalisation of a different punctuation system by all English speakers, forgetting entirely the cues that come with the single question mark at the end (cues that would live on in all the centuries of books written before my idea becomes standard). But it seems an elegant solution, and I thought it was worth documenting, albeit in a post which I'll sneakily upload to LiveJournal in the dead of night Australian time and won't link to elsewhere.

Popular below-the-line candidates

Following the publicisation of Joe Bullock's speech just prior to the April 5 WA Senate election, I saw many Labor supporters say that they'd vote below the line for Louise Pratt, who had the number-two spot on the Labor ticket behind Bullock.

At the time of writing, we don't have the breakdown of BTL votes from the WA election, so I'm taking this opportunity to see how we'd expect to see a "below-the-line for Pratt" movement, by looking at some results from 2013. The most interesting thing to me is that Pratt was already very popular BTL relative to Bullock last September.

For each of the major parties and Greens (and all the other parties, but I haven't bothered posting them here), I calculated the ratio of BTL votes for the first candidate on the ticket to the votes for each subsequent candidate. For example, the first entry in the numbers below reads "Brown to Bilyk, 5.5": in Tasmania, Carol Brown received 5.5 times as many BTL votes as Catryna Bilyk.
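The ratio calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name lead_ratios and the vote counts in the example are invented for the purpose, not taken from the AEC data, and the input is assumed to be a list of (candidate, BTL first-preference votes) pairs in ticket order.

```python
def lead_ratios(btl_votes):
    """Ratio of the lead candidate's BTL first preferences to each later candidate's.

    btl_votes: list of (candidate_name, votes) pairs in ticket order.
    A candidate with zero votes gets an infinite ratio rather than an error.
    """
    (lead, lead_votes), *others = btl_votes
    return {
        f"{lead} to {name}": (lead_votes / votes if votes else float("inf"))
        for name, votes in others
    }


# Made-up counts, purely to show the shape of the output.
example = [("Lead", 1100), ("Second", 200), ("Third", 50)]
print(lead_ratios(example))  # {'Lead to Second': 5.5, 'Lead to Third': 22.0}
```

A lower ratio means the non-lead candidate is relatively more popular below the line.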

I'm looking at the sample of voters who supported a party and chose to vote below the line; perhaps this was because they didn't like their party's group voting ticket, or perhaps they didn't like the order of candidates their party had chosen. I'm interested in this latter case.

The ratio of BTL votes between first and second (or third, fourth, ...) candidates is not a perfect way to capture supporter dissatisfaction with the leading candidate – if the GVT in a state is unpopular, then that would lead to more of that party's supporters voting below the line, and they would likely be giving the top candidate their first preference, thus inflating the ratio that I calculate. The natural way to capture dissatisfaction with the GVT is to just look at the percentage of votes for the party that were below the line, but this is problematic for the inter-state comparisons that I would like to make – ballot paper lengths and ticket preferences can be significantly different between states, leading to large relative changes in BTL voting rates.

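With those caveats out of the way, here are the ratios of a party's lead candidate to subsequent candidates. Within each party, the states are listed in order of increasing lowest ratio, that is, in order of decreasing relative popularity of the most popular non-lead candidate.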

Labor:

Tas: Brown to Bilyk, 5.5; Brown to Thorp, 1.8; Brown to Dowling, 4.3

NT: Peris to Foley, 2.1

WA: Bullock to Pratt, 2.4; Bullock to Foster, 10.7; Bullock to Ali, 13.8

Vic: Marshall to Collins, 5.7; Marshall to Tillem, 29.0; Marshall to Psaila, 36.5; Marshall to Larkins, 41.4; Marshall to Mileto, 21.1

Qld: Ketter to Moore, 5.7; Ketter to Furner, 21.7; Ketter to Boyd, 9.3

NSW: Carr to Cameron, 8.1; Carr to Stephens, 25.5; Carr to Kolomeitz, 134.2; Carr to Nelmes, 121.1; Carr to Chhibber, 35.0

ACT: Lundy to Sant, 19.0.


Liberals or some sort of LNP:

ACT: Seselja to Nash, 3.5

Tas: Colbeck to Bushby, 4.2; Colbeck to Chandler, 8.9; Colbeck to Courtney, 8.7

Vic: Fifield to Ryan, 8.4; Fifield to Kroger, 4.2; Fifield to Corboy, 6.9

NSW: Payne to Williams, 9.0; Payne to Sinodinos, 4.5; Payne to Hay, 24.0; Payne to C Cameron, 21.6; Payne to A Cameron, 11.0

SA: Bernardi to Birmingham, 5.5; Bernardi to Webb, 16.1; Bernardi to Burgess, 19.8; Bernardi to Cochrane, 67.7; Bernardi to Weaver, 56.1

NT: Scullion to Falzdeen, 6.0

Qld: MacDonald to McGrath, 20.5; MacDonald to Canavan, 31.9; MacDonald to Goodwin, 19.7; MacDonald to Craig, 25.4; MacDonald to Stoker, 13.3

WA: Johnston to Cash, 14.1; Johnston to Reynolds, 15.2; Johnston to Brockman, 29.7; Johnston to Thomas, 22.3; Johnston to Oughton, 14.9


Greens:

NT: Williams to Brand, 6.5

Tas: Whish-Wilson to Burnet, 8.3; Whish-Wilson to Ann, 24.6

WA: Ludlam to Davis, 10.1; Ludlam to Duncan, 38.2

Qld: Stone to Bayley, 10.9; Stone to Yeaman, 47.5

ACT: Sheikh to Esguerra, 22.0

NSW: Faehrmann to Ryan, 58.1; Faehrmann to Blatchford, 36.9; Faehrmann to Ho, 38.4; Faehrmann to Findley, 53.5; Faehrmann to Spies-Butcher, 45.1

Vic: Rice to McCarthy, 41.7; Rice to Truong, 55.1; Rice to Christoe, 124.7; Rice to Sekhon, 132.7; Rice to Humphreys, 40.3

SA: Hanson-Young to Mortier, 74.9; Hanson-Young to Carey, 66.8


A few obvious things spring out to me. As mentioned earlier, Pratt was already popular with BTL Labor voters compared to Bullock. (She also received more BTL votes relative to Labor ticket votes than any other non-lead Labor Senate candidate outside Tasmania and the ACT.) Lin Thorp, formerly a minister at state level in Tasmania, was the most popular non-lead candidate by this metric. On the conservative side, both Arthur Sinodinos and Helen Kroger were high-profile candidates who appear to attract a bit of a personal vote. Greens voters seem pretty happy with their lead candidates. On a lighter note, there's more than a hint that voters disproportionately give their first preference to the last candidate in a large group.

Anyway, the little sociology experiment on WA Labor voters will be looking at the Bullock-Pratt ratio as it is counted in the coming weeks; if there was a decent swing against Bullock to Pratt, then we should see the ratio of their BTL votes fall from 2.4 to something lower. My guess is that it will fall a little below 2; I'd put the under-over at around 1.9. (Update: A few hours after posting, and I see that the first couple of hundred BTL's show Pratt getting more votes than Bullock! At this stage it looks like I'd have been closer with 0.9 rather than 1.9.)

Notes on Flappy Bird and clones

The other week I got (from work) one of these for the first time. It was just after Flappy Bird had been taken down from the App/Play Store, but I'd played a Flash clone of it (then-highest score: 15, with no other scores greater than 10), and so I downloaded Clumsy Bird, then and now the top free game on the Play Store.

Clumsy Bird is a little bit easier than Flappy Bird: in the former, you don't get as much impulse from flapping, relative to the size of the gaps you have to fly through. This leaves more margin for error, since a slightly early tap won't necessarily see you crashing into the roof of one of the gaps. Still, it takes some getting used to.

I played it by feel for a while – I tried to avoid any rages and just mindlessly tap away, almost always dying before I scored 5, hoping that eventually my brain would work things out. And it did, eventually. I passed 10 points, I passed 20 points, and suddenly the game was quite fun – an endless series of scrolling trees to fly my way through in some kind of Zen-like state.

It didn't last. I learned to read the score in my peripheral vision, a disastrous new skill that caused me to choke whenever I approached a new record, followed by the inevitable anger at dying all the times I failed to set that new record. After at least half a dozen scores in the 60's – an absurd deviation from the distribution of scores I should have had, and indeed would have had with some better sports psychology – I eventually broke the 70 barrier, and then without the pressure of a long-standing best score, I soon made over 100, with a score of 140. That's a number that will be satisfyingly large for as long as I don't talk to anyone who's done better.

After all this practice and skill-learning on Clumsy Bird, I wondered what would happen if I tried the Flash Flappy Bird clone again. My early efforts were predictably poor – having built up my feel for the game with the relatively small flaps of Clumsy Bird, the big Flappy Bird flaps saw me crash into the top of the gaps very quickly, typically before I'd scored 4 points.

I thought that this showed something interesting about a lack of transferability of skills in at least one direction between Clumsy Bird and Flappy Bird, but I was wrong. I tried a strategy of, whenever possible, flapping only when the bird had fallen below the bottom of the next gap. The idea was that, to get through the gap itself, I needed to flap near the bottom of it, and it would be easier to do this (greater margin of error in time) if the bottom of the gap was only a little below the peak of the bird's pre-flap trajectory.

And it worked. The skills I'd built up by feel on Clumsy Bird transferred to Flappy Bird once I had a sound playing strategy. I very soon had a score of over 20, and now my highest is 43. I found this transfer of skills quite interesting to experience.

---

Above, I mentioned in passing the distribution of scores I should have had. The simplest way to model Flappy-Bird-type games is to say that, at every point during the game, you have some probability p of dying before you score another point. Over a game, this probability is treated as a constant; your hope as a player is that, with practice and/or strategy, you decrease p so that your skill level rises. Of course, p isn't constant throughout the game – some transitions are harder than others, and the probability of making it through the next gap is a function of the position of the previous flap. But as a toy model, a constant-p looks OK to me.

It immediately follows from constant-p that your scores will be geometrically distributed (with scores of zero allowed). In other words, they will look a lot like a batsman's cricket scores, only you won't have as many scores of zero in Flappy Bird.

But while in cricket we calculate a batsman's average and can directly infer from that the average probability of being dismissed before scoring the next run, in Flappy Bird we usually only keep track of the highest score. If you've played Flappy Bird N times, all at a constant p, what high score "should" you have?

Let $X_1, \dots, X_N$ be N independent geometrically-distributed random variables with failure probability p; these are the scores of each game of Flappy Bird you play. Let $H = \max_i X_i$; this is your highest score. Then

$$P(H \le h) = \prod_{i=1}^{N} P(X_i \le h) = \left[1 - (1-p)^{h+1}\right]^{N}.$$

This gives us the basic relation between a high score h, skill level p, and the number of times N that you've played the game at your current skill level.

For a given p and N, what high score should we "expect"? I think that the natural way to answer this is to ask for the value of h such that P(H ≤ h) = 0.5. We get

$$h = \frac{\ln\left[1 - 0.5^{1/N}\right]}{\ln(1-p)} - 1.$$

Alternatively, given your high score of h, you could impute your skill level from

$$p = 1 - \left(1 - 0.5^{1/N}\right)^{1/(h+1)}.$$

An interesting exercise for a data-minded Flappy Bird player would be to plot a histogram of their scores and see how it compares to some of these theoretical values. Choking will show up as a spike in the probability density just below the highest score.
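For anyone without a log of real scores, here is a rough Python sketch of what the constant-p model itself predicts, so that there is something to compare a real histogram against. The function names, the bin width of 10 and the seed are arbitrary choices of mine for illustration; nothing here models choking, so a real player's histogram may well differ near their high score.

```python
import random
from collections import Counter


def play_game(p, rng):
    """One game under the constant-p model: score points until the first failure."""
    score = 0
    while rng.random() >= p:  # survive the next gap with probability 1 - p
        score += 1
    return score


def simulate_scores(p, n_games, seed=0):
    """Scores of n_games independent games, all played at failure probability p."""
    rng = random.Random(seed)
    return [play_game(p, rng) for _ in range(n_games)]


def text_histogram(scores, bin_width=10):
    """Crude text histogram: one row per bin, one '#' per game in that bin."""
    bins = Counter(s // bin_width for s in scores)
    rows = []
    for b in range(max(bins) + 1):
        lo, hi = b * bin_width, b * bin_width + bin_width - 1
        rows.append(f"{lo:4d}-{hi:<4d} {'#' * bins[b]}")
    return "\n".join(rows)


scores = simulate_scores(p=0.035, n_games=100)
print(text_histogram(scores))
print("high score:", max(scores))
```

Under the model the bars should shrink roughly geometrically from left to right, with no bump just below the high score.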

Assuming that I've played 100 games at my current skill level (this could be wrong by a lot), my Clumsy Bird highest score of 140 suggests p ≈ 0.035, and a cricket-style average = (1-p)/p of around 28. By the model I should have scored three centuries by now, and I've only scored one, so I suppose you can call me the Shane Watson of Clumsy Bird.
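The numbers in the previous paragraph can be reproduced with a short Python sketch of the formulas above. The function names are made up for this illustration, and the figure of 100 games is the same guess as in the text.

```python
import math


def prob_high_score_at_most(h, p, n):
    """P(H <= h): probability that none of n games at failure probability p beats h."""
    return (1 - (1 - p) ** (h + 1)) ** n


def median_high_score(p, n):
    """The h for which P(H <= h) = 0.5 (not rounded to a whole score)."""
    return math.log(1 - 0.5 ** (1 / n)) / math.log(1 - p) - 1


def impute_p(h, n):
    """Failure probability for which h is the median high score after n games."""
    return 1 - (1 - 0.5 ** (1 / n)) ** (1 / (h + 1))


def cricket_average(p):
    """Mean score of a single game under the constant-p model."""
    return (1 - p) / p


def expected_scores_at_least(threshold, p, n):
    """Expected number of games, out of n, scoring at least `threshold`."""
    return n * (1 - p) ** threshold


p = impute_p(140, 100)
print(round(p, 3))                                      # 0.035
print(round(cricket_average(p)))                        # 28
print(round(expected_scores_at_least(100, p, 100), 1))  # 2.9
```

The last figure is where the "three centuries" comes from: each game reaches 100 with probability (1-p)^100, or about 0.029, so 100 games give an expected 2.9 of them.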

What percentage of nouns encountered in French text are feminine/masculine?

The question I ask in the title (an answer to which I didn't find on Google, though I didn't search too hard) arose out of studying the statistics of a corpus of 400,000 French-language tweets*. I noticed that sa was about three times more common than son. Looking at some of the tweets showed that this discrepancy was caused, at least in large part, by people writing ça as sa.

*For some technical details on that, see here.


Notes on GiveWell's recommended charities

GiveWell recently announced the latest update of their top charities list. This post is a collection of my notes on the changes, mostly focusing on the cost-effectiveness estimates.

A year ago I put a lot of hours into understanding the cost-effectiveness calculations for deworming and cash transfers, the results of which I wrote up in this post. I had planned to make this a fairly short post, merely indicating the important changes from last year, but I ended up going into a bit of detail (it's still a lot shorter than last year's).


The part of my brain that stores French is getting eaten by the part of my brain that stores English

The French language has quite a few pairs of vowels which aren't distinguished in English. The easiest (to me) of these pairs is what is written in IPA as /u/ (written in French as ou) and /y/ (written u and pronounced nothing like a 'y' in either English or French).* It's not super-important for communication if you get them mixed up (or, perhaps more likely, substitute the English "oo" vowel for both) – context is usually enough to work out which one you mean, and it'll just sound like a foreign accent. Still, the words vous and vu have totally different meanings, and the difference between nous and nus is even starker, so it's good to keep the two vowels separate.

*Wikipedia tells me that they're the close back rounded vowel and close front rounded vowel respectively. In Australian English (and in most parts of the Commonwealth) we use a close central rounded vowel. I've always been too intimidated by the vocabulary of phonology to understand what all those words mean. For a few moments I thought I'd finally understood at least one part of it – when I said those front, central, and back vowels in order, it really did feel like I was speaking from the front, middle, and back of my mouth respectively. But while I remain satisfied that the central vowel is somehow "in between" the front and the back vowels, I now think that I'd have been just as convinced if the front vowel was called the back vowel and vice versa.

Tina Arena's French is better than mine, but I think I detect a slight centring (?) of the first vowel in Tous les jours on en parle when she sings Entends-tu le monde ?. By contrast, Annie Lennox sings back vowels in the Eurythmics' cover of Tous les garçons et les filles: Tous les garçons et les filles de mon âge se promènent dans la roue deux par deux.

It will perhaps further betray my interest in musical English-French crossovers if I say that these musings were inspired this afternoon when, at the bus stop and apropos of nothing, the bilingual version of Bonnie Tyler's 'It's a Heartache' popped into my head.* Si tu s'arrêtes, my mind began to sing. It may have been more than five years since I spoke French regularly, but I can still pick a grammar error that egregious, and I was very sure that the opening line didn't start Si tu t'arrêtes. It took several minutes of considering various possible pronouns before I hit upon Si tout s'arrête, confirmed as the correct lyrics when I got home and checked YouTube.

*It's a duet with Kareen Antonn, whose singing career success seems to extend only to her duets with Tyler – Si demain... is a bilingual version of 'Total Eclipse of the Heart' – and being eliminated mid-way through this year's season of the French version of The Voice. If only the market for bilingual covers of decades-old pop songs was larger than it seems to be...

This particular manifestation of the decay of my French skills is a curious one. My brain, at least occasionally (and perhaps only in memory), is merging /u/ and /y/, even though the two vowels remain clearly distinct to my ear. But no part of my brain considers tu and tout to be homophones, or in any sense "the same word". It took minutes before I made the connection between the incorrect tu and the correct tout.

A further conclusion I can make (and one that very much surprises me at first glance), implicit in what I've written above, is that my memory is at least sometimes storing sounds rather than words. When I first heard the song Si tout s'arrête, I saw the title before it started, and there was no confusion about what the second word was. But rather than storing tout, my brain seems to have broken it down into phonetic chunks, one of which got mangled by neurones too used to processing English. It would surprise me if I made such a mistake remembering a line of spoken French, so I'll conjecture that this sort of "store sounds rather than words" is more common with remembering songs, but I have no idea if that is correct, and if so how common it is.

Monte Carlo for substitution ciphers

The Markov chain Monte Carlo revolution by Persi Diaconis opens with an example now used as an exercise for undergraduates. The goal is to decrypt some text that's been encrypted by a simple substitution cipher. Al-Kindi was using frequency analysis to break substitution ciphers in the 9th century, so the present application of Monte Carlo is very much in the "toy problem" category, but it's a fun idea and I enjoyed seeing it in action.

We start by gathering frequencies of letter-pairs from a suitably large block of English text; I used the KJV Bible, about 4.3MB worth of text, though I haven't given much thought to how much text we actually need. With the letter-pair frequencies, we can associate a "plausibility" for each map of cipher-alphabet to real-alphabet: use the map to generate the possibly-decrypted text, then find the product of the frequencies of every letter-pair in that text. The procedure is to randomly swap pairs of letters in the map; if it improves the plausibility, then the swap is accepted, otherwise a weighted coin is flipped to decide if the swap is accepted or rejected. (The latter will usually reject the swap of letters, but the occasional acceptances allow the algorithm to escape from a local optimum.)
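The procedure above can be sketched compactly in Python. This is not the R code linked from this post, just an illustration with several assumptions of its own: spaces are left unenciphered, plausibilities are handled as logs with add-one smoothing so that unseen letter-pairs don't zero out the product, and the weighted coin accepts a worse swap with probability equal to the ratio of new to old plausibility (the usual Metropolis rule). All function names are made up.

```python
import math
import random
import string
from collections import Counter

LETTERS = string.ascii_uppercase
ALPHABET = LETTERS + " "


def normalise(text):
    """Upper-case the text and turn every run of non-letters into a single space."""
    chars = [c if c in LETTERS else " " for c in text.upper()]
    return " ".join("".join(chars).split())


def pair_log_freqs(reference):
    """Log relative frequency of every letter-pair, with add-one smoothing."""
    ref = normalise(reference)
    counts = Counter(zip(ref, ref[1:]))
    total = sum(counts.values()) + len(ALPHABET) ** 2
    return {(a, b): math.log((counts[(a, b)] + 1) / total)
            for a in ALPHABET for b in ALPHABET}


def decode(ciphertext, key):
    """Apply a cipher-letter -> plain-letter map; spaces pass through unchanged."""
    return ciphertext.translate(str.maketrans(key))


def log_plausibility(text, log_freqs):
    """Sum of log pair frequencies, i.e. the log of the product described above."""
    return sum(log_freqs[pair] for pair in zip(text, text[1:]))


def metropolis(ciphertext, log_freqs, steps=10000, seed=0):
    """Random-swap search over keys; ciphertext must already be normalised."""
    rng = random.Random(seed)
    key = dict(zip(LETTERS, LETTERS))  # start from the identity map
    score = log_plausibility(decode(ciphertext, key), log_freqs)
    best_key, best_score = dict(key), score
    for _ in range(steps):
        a, b = rng.sample(LETTERS, 2)
        key[a], key[b] = key[b], key[a]  # propose swapping two letters
        new_score = log_plausibility(decode(ciphertext, key), log_freqs)
        # Always accept improvements; otherwise flip the weighted coin.
        if new_score >= score or rng.random() < math.exp(new_score - score):
            score = new_score
            if score > best_score:
                best_key, best_score = dict(key), score
        else:
            key[a], key[b] = key[b], key[a]  # rejected: undo the swap
    return best_key, best_score
```

To use it, build log_freqs from a large reference text, pass the normalised ciphertext to metropolis, and apply decode with the returned key. As with any run of this kind, it can still end up in a wrong local optimum.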

Using single-letter frequencies would defeat the purpose of a Monte Carlo exercise, since we may as well just order the letters by frequency. By using letter-pairs, we're effectively looking at how the letters fit together. While I haven't investigated it, I suspect that the letter-pair Monte Carlo approach will work on smaller enciphered texts than a simple ordering by single-letter frequency.

The text I enciphered and then deciphered is here; that sort of length seems to be necessary to stop the code from getting stuck in wrong local optima (though that might still happen sometimes anyway). When I started with a smaller enciphered text, I sometimes got some quite bad results, such as "D TS R ARINLRNE RIC EIPTDUIMEIO BUD SOROTSOTHRA...". Here is the trajectory of a solution that found the right answer:

R code with fancy syntax-highlighting is here.

PCA on BTL's

In 2010 I discovered that the AEC publish all below-the-line Senate votes, and had a bit of a poke around with them here, here, here, and here. The posts weren't earth-shattering, but I think as curiosities, they stand up pretty well.

Since 2010 I've learned some new statistical words, and since yesterday I've also learned a bit of R. Putting these together, this post shows some cookbook principal-component analysis with graphs made in R.
