I would recommend downloading the dataset[1] and taking a look at some of the essays. There is a tendency for an educated audience to (justifiably) worry about exactly what you've articulated, but when you read through the essays, you get an appreciation of how valuable a tool like this could be. These are elementary and middle school essays rife with rudimentary mistakes (which are at times extremely charming) and generally not filled with subtle arguments prone to misinterpretation.
If algorithms can help every seventh grader in the country write a solid five-paragraph essay, that's progress.
I participated in the Kaggle competition, but I'm less optimistic about the pedagogic value of these systems. A black box algorithm won't explain to a student what their rudimentary mistakes are.
Great write up! I hadn't seen it, thanks for sharing.
Two things:
1) Agree about the explanation bit. In fairness, that was not part of the competition. But I do believe that, given the way the features were constructed, many of the algorithms could be modified to also provide "explanation" of their scoring. The value is highly dependent on the features though (length, for instance, would probably be a prominent "explanation"), and I haven't actually tried it.
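To make that concrete: if the scoring model is linear over interpretable features, each feature's weight times its value is a per-essay "explanation". This is a minimal sketch, not any competition entry - the features and weights here are invented for illustration:

```python
# Hypothetical sketch: per-feature contributions of a linear scoring
# model as an "explanation". Feature set and weights are made up.

def essay_features(text):
    """Compute a few simple, interpretable features of an essay."""
    words = text.split()
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    return {
        "word_count": len(words),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "sentence_count": len(sentences),
    }

# Invented weights; a real model would learn these from graded essays.
WEIGHTS = {"word_count": 0.01, "avg_word_len": 0.5, "sentence_count": 0.1}

def score_with_explanation(text):
    """Return (score, explanation), where the explanation lists each
    feature's contribution to the score, largest drivers first."""
    feats = essay_features(text)
    contributions = {name: WEIGHTS[name] * value
                     for name, value in feats.items()}
    score = sum(contributions.values())
    explanation = sorted(contributions.items(), key=lambda kv: -abs(kv[1]))
    return score, explanation
```

As expected, length-like features dominate the explanation in a setup like this, which is exactly the caveat above.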
2) The "This would grade Dickinson/Hemingway/DFW/etc poorly" argument is true but, again, read the essays. These are serious edge cases - ones that should be addressed (by, for instance, correlating algorithmic scoring with human scoring), but this argument doesn't diminish the applicability of the algorithm to the vast majority of essays and students.
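On correlating algorithmic with human scoring: the usual agreement metric for ordinal scores like these is quadratic weighted kappa, which (if I recall correctly) was the competition's evaluation metric. A self-contained sketch:

```python
# Quadratic weighted kappa: agreement between two raters on an ordinal
# scale, penalizing disagreements by the square of their distance.
# 1.0 = perfect agreement, 0.0 = chance-level, negative = worse than chance.

def quadratic_weighted_kappa(human, machine, min_rating, max_rating):
    n = max_rating - min_rating + 1
    # Observed confusion matrix.
    observed = [[0] * n for _ in range(n)]
    for h, m in zip(human, machine):
        observed[h - min_rating][m - min_rating] += 1
    # Marginal histograms for the expected (chance) matrix.
    hist_h = [0] * n
    hist_m = [0] * n
    for h in human:
        hist_h[h - min_rating] += 1
    for m in machine:
        hist_m[m - min_rating] += 1
    total = len(human)
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            weight = ((i - j) ** 2) / ((n - 1) ** 2)
            expected = hist_h[i] * hist_m[j] / total
            num += weight * observed[i][j]
            den += weight * expected
    return 1.0 - num / den
```

Running this between the algorithm's scores and a human grader's scores on a held-out set would flag exactly the edge cases you'd want a teacher to look at.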
Will this help the nearly-illiterate masses? I think so. Will it hurt those who are a standard deviation or more above the norm? I think that's a problem, too. The solution, then, is to use the algorithm on the first group and more individualized grading for the second. Effort should go into identifying students in that second group, and that identification shouldn't be restricted to a single course or assignment. This challenges the assumption that all students deserve identical grading, but I don't see another way forward that avoids hurting the best, the mediocre, or the poorest writers.
[1]: http://www.kaggle.com/c/asap-aes/data