I blog about statistics and research design with researchers in bilingualism, multilingualism, and applied linguistics in mind.
Latest blog posts
29 October 2018
In this blog post, I take a closer look at the results of a classic study I sometimes discuss in my classes on second language acquisition. As I’ll show below, the strength of this study’s findings is considerably exaggerated, presumably owing to a mechanical error.
26 September 2018
I’ve put my first R package on GitHub!
It’s called cannonball and contains a couple of functions that I use for teaching;
perhaps others will follow.
12 September 2018
I’ve written a paper titled Checking the assumptions of your statistical model without getting paranoid and I’d like to solicit your feedback. The paper is geared towards beginning analysts, so I’m particularly interested in hearing from readers who don’t consider themselves expert statisticians if there is anything that isn’t entirely clear to them. If you’re a more experienced analyst and you spot an error in the paper or accompanying tutorial, I’d be grateful if you could let me know, too, of course.
27 July 2018
In this follow-up to the blog post Baby steps in Bayes: Piecewise regression, I’m going to try to model the relationship between two continuous variables using a piecewise regression with not one but two breakpoints. (The rights to the movie about the first installment are still up for grabs, incidentally.)
6 July 2018
I’m currently working on a large longitudinal project as a programmer/analyst. Most of the data are collected using paper/pencil tasks and questionnaires and need to be entered into the database by student assistants. In previous projects, this led to some minor irritations since some assistants occasionally entered some words with capitalisation and others without, or they inadvertently added a trailing space to the entry, or used participant IDs that didn’t exist – all small things that cause difficulties during the analysis.
Anyway, you can download a slimmed-down version
of this platform
The comments in the PHP files should tell you what
I try to accomplish; if something’s not clear, there’s
a comment section at the bottom of this page.
You’ll need a webserver that supports PHP,
and you’ll need to change the permissions of the
You can also check out the demo.
To log in, use one of the following
(You can change the accepted e-mail addresses in
The password is
Then enter some data. You can only enter
data for participants you’ve already created an ID for,
though. For this project, the participant IDs
consist of the number 4 or 5 (= the participant’s grade), followed by a dot,
followed by a two digit number between 0 and 39 (= the participant’s class),
followed by a dot and another two digit number between
0 and 99. The entry for
Grade needs to match the
first number in
If you enter task data for a participant for whom
someone has already entered task data at that data collection wave,
you’ll receive an error. You can override this error
by ticking the
Correct existing entry? box at the bottom.
This doesn’t overwrite the existing entry, but adds the
new entry, which is flagged as the accurate one.
During the analysis, you can then filter out data that
was later updated.
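To make the ID format and the correction logic a bit more concrete, here's a minimal sketch in Python. It is not the platform's PHP code. The function names and the record fields (`id`, `wave`, `score`, `is_correction`) are made up for illustration, and the sketch assumes that class and participant numbers are zero-padded to two digits and that the grade has to match the first number of the participant ID.

```python
import re

# Grade (4 or 5), a dot, a two-digit class number (00-39),
# a dot, and a two-digit number (00-99), e.g. "4.07.23".
ID_PATTERN = re.compile(r"[45]\.[0-3][0-9]\.[0-9]{2}")


def is_valid_id(participant_id: str) -> bool:
    """Check whether the whole string follows the ID format."""
    return ID_PATTERN.fullmatch(participant_id) is not None


def grade_matches_id(grade: int, participant_id: str) -> bool:
    """Check that the grade equals the first number of the ID (assumption)."""
    return is_valid_id(participant_id) and participant_id[0] == str(grade)


def current_entries(entries):
    """Keep one entry per participant and wave.

    Entries are assumed to be listed in the order in which they were
    entered. An entry flagged as a correction replaces the earlier one.
    """
    latest = {}
    for entry in entries:
        key = (entry["id"], entry["wave"])
        if key not in latest or entry["is_correction"]:
            latest[key] = entry
    return list(latest.values())


# Hypothetical records: the second one corrects the first.
records = [
    {"id": "4.07.23", "wave": 1, "score": 10, "is_correction": False},
    {"id": "4.07.23", "wave": 1, "score": 12, "is_correction": True},
    {"id": "5.01.02", "wave": 1, "score": 7, "is_correction": False},
]
kept = current_entries(records)
```

Because the pattern has to match the whole string, an ID with a trailing space is rejected as well, which is one of the small irritations mentioned above.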
Hopefully this is of some use to some of you!
4 July 2018
Inspired by Richard McElreath’s excellent book Statistical rethinking: A Bayesian course with examples in R and Stan, I’ve started dabbling in Bayesian statistics. In essence, Bayesian statistics is an approach to statistical inference in which the analyst specifies a generative model for the data (i.e., an equation that describes the factors they suspect gave rise to the data) as well as (possibly vague) relevant information or beliefs that are external to the data proper. This information or these beliefs are then adjusted in light of the data observed.
I’m hardly an expert in Bayesian statistics (or the more commonly encountered ‘orthodox’ or ‘frequentist’ statistics, for that matter), but I’d like to understand it better – not only conceptually, but also in terms of how the statistical model should be specified. While quite a few statisticians and methodologists tout Bayesian statistics for a variety of reasons, my interest is primarily piqued by the prospect of being able to tackle problems that would be impossible or at least awkward to tackle with the tools I’m pretty comfortable with at the moment.
In order to gain some familiarity with Bayesian statistics, I plan to set myself a couple of problems and track my efforts in solving them here in a Dear diary fashion. Perhaps someone else finds them useful, too.
The first problem that I’ll tackle is fitting a regression model in which the relationship between the predictor and the outcome may contain a breakpoint at one unknown predictor value. One domain in which such models are useful is in testing hypotheses that claim that the relationship between the age of onset of second language acquisition (AOA) and the level of ultimate attainment in that second language flattens after a certain age (typically puberty). It’s possible to fit frequentist breakpoint models, but estimating the breakpoint age is a bit cumbersome (see blog post Calibrating p-values in ‘flexible’ piecewise regression models). But in a Bayesian approach, it should be possible to estimate both the regression parameters as well as the breakpoint itself in the same model. That’s what I’ll try here.
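As a rough illustration of what a regression with one breakpoint describes, here's a minimal Python sketch of the expected outcome as a function of the predictor. It is not the model code from the post: the function name, the parameter names and the numbers are all hypothetical, and a Bayesian analysis would estimate these parameters (including the breakpoint) from the data rather than fix them by hand.

```python
def piecewise_mean(aoa, breakpoint, value_at_breakpoint, slope_before, slope_after):
    """Expected outcome for a given age of onset of acquisition (AOA).

    The two line segments meet at the breakpoint, so the function
    is continuous there.
    """
    if aoa < breakpoint:
        slope = slope_before
    else:
        slope = slope_after
    return value_at_breakpoint + slope * (aoa - breakpoint)


# Made-up values: attainment drops by 2 points per year of AOA
# up to age 15 and is flat afterwards.
before = piecewise_mean(10, breakpoint=15, value_at_breakpoint=50,
                        slope_before=-2, slope_after=0)
after = piecewise_mean(20, breakpoint=15, value_at_breakpoint=50,
                       slope_before=-2, slope_after=0)
```

Setting `slope_after` to 0 gives the "flattening" pattern mentioned above; leaving it free lets the data decide whether the relationship actually flattens.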
27 June 2018
All too often, empirical studies in applied linguistics are run in order to garner evidence for a preordained conclusion. In such studies, the true, perhaps unstated, research question is more of a stated aim than a question: “With this study, we want to show that [our theoretical point of view is valuable; this teaching method of ours works pretty well; multilingual kids are incredibly creative; etc.].” The problem with aims such as these is that they take the bit between square brackets for granted, i.e., that the theoretical point of view is indeed valuable; that our teaching method really does work pretty well; or that multilingual kids indeed are incredibly creative – the challenge is merely to convince readers of these assumed facts by demonstrating them empirically. I think that such a mentality leads researchers to disregard evidence contradicting their assumption or explain it away as an artifact of a method that, in hindsight, wasn’t optimal.
A healthier attitude is to formulate research questions as, well, questions: “We carried out this study since we wondered whether [our theory explains the data better than extant theories; our teaching method yields better results than the current one; multilingual kids are more creative than their peers; etc.].” Genuine research questions at least leave open the possibility that the theory doesn’t explain the data better than extant theories; that the new teaching method isn’t any better than the current one; or that multilingual kids aren’t more creative than their peers. I think that consciously phrasing research questions as genuine questions puts the emphasis on evaluating different possibilities rather than on convincing the audience of an assumed fact.
Yes/no questions obviously invite yes/no answers. When the answer to a yes/no question isn’t trivial, this is fine. But when the question boils down to a vague “Are there some differences between these groups?”, it’s often highly likely that the answer will be “yes”. In such cases, it may be more fruitful to phrase the research question as a wh-question instead: “We wondered how/when/under which circumstances/in which respects/to what extent these groups differ.” The answers to questions such as these may still be “very little”, “rarely”, “in hardly any”, etc., but that’s more informative than a trivial “yes”.
25 April 2018
Statistical models come with a set of assumptions, and violations of these assumptions can render irrelevant or even invalid the inferences drawn from these models. It is important, then, to verify that your model’s assumptions are at least approximately tenable for your data. To this end, statisticians commonly recommend that you check the distribution of your model’s residuals (i.e., the difference between your actual data and the model’s fitted values) graphically. An excellent piece of advice that, unfortunately, causes some students to become paranoid and see violated assumptions everywhere they look. This blog post is for them.
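For readers who'd like to see what "residuals" means in practice, here's a minimal Python sketch with made-up data. It fits a straight line by ordinary least squares and subtracts the fitted values from the observed ones. The function names and the numbers are hypothetical, and the plotting step (e.g., residuals against fitted values) is left out.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed values minus the model's fitted values."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_line(x, y)
res = residuals(x, y, intercept, slope)
```

In a least-squares fit with an intercept, the residuals sum to zero by construction, so it's their pattern across the fitted values, not their average, that is worth looking at.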
12 February 2018
A good question to ask yourself when designing a study is, “Who and what are any results likely to generalise to?” Generalisability needn’t always be a priority when planning a study. But by giving the matter some thought before collecting your data, you may still be able to alter your design so that you don’t have to smother your conclusions with ifs and buts if you do want to draw generalisations.
The generalisability question is mostly cast in terms of the study’s participants: Would any results apply just to the participants themselves or to some wider population, and if so, to which one? Important as this question is, this blog post deals with a question that is asked less often but is equally crucial: Would any results apply just to the materials used in the study or might they stand some chance of generalising to different materials?
20 November 2017
In recent years, psychologists have started to run large-scale replications of seminal studies. For a variety of reasons, which I won’t go into, this welcome development hasn’t quite made it to research on language learning and bi- and multilingualism. That said, I think it can be interesting to scrutinise how these large-scale replications are conducted. In this blog post, I take a closer look at a replication attempt by O’Donnell et al. with some 4,500 participants that’s currently in press at Psychological Science and make five suggestions as to how I think similar replications could be designed to be even more informative.