I would recommend downloading the dataset[1] and taking a look at some of the essays. There is a tendency for an educated audience to (justifiably) worry about exactly what you've articulated, but when you read through the essays, you get an appreciation of how valuable a tool like this could be. These are elementary and middle school essays rife with rudimentary mistakes (which are at times extremely charming) and generally not filled with subtle arguments prone to misinterpretation.
If algorithms can help every seventh grader in the country write a solid five-paragraph essay, that's progress.
I participated in the Kaggle competition, but I'm less optimistic about the pedagogic value of these systems. A black box algorithm won't explain to a student what their rudimentary mistakes are.
Great write up! I hadn't seen it, thanks for sharing.
Two things:
1) Agree about the explanation bit. In fairness, that was not part of the competition. But I do believe that, given the way the features were constructed, many of the algorithms could be modified to also provide "explanation" of their scoring. The value is highly dependent on the features though (length, for instance, would probably be a prominent "explanation"), and I haven't actually tried it.
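To make that concrete: if the scoring model is linear over interpretable features, each feature's weight times its value is a per-essay "explanation". This is a minimal sketch, not any competition entry - the features and weights here are invented for illustration:

```python
# Hypothetical sketch: per-feature contributions of a linear scoring
# model as an "explanation". Feature set and weights are made up.

def essay_features(text):
    """Compute a few simple, interpretable features of an essay."""
    words = text.split()
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    return {
        "word_count": len(words),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "sentence_count": len(sentences),
    }

# Invented weights; a real model would learn these from graded essays.
WEIGHTS = {"word_count": 0.01, "avg_word_len": 0.5, "sentence_count": 0.1}

def score_with_explanation(text):
    """Return (score, explanation), where the explanation lists each
    feature's contribution to the score, largest drivers first."""
    feats = essay_features(text)
    contributions = {name: WEIGHTS[name] * value
                     for name, value in feats.items()}
    score = sum(contributions.values())
    explanation = sorted(contributions.items(), key=lambda kv: -abs(kv[1]))
    return score, explanation
```

As expected, length-like features dominate the explanation in a setup like this, which is exactly the caveat above.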
2) The "This would grade Dickinson/Hemingway/DFW/etc poorly" argument is true but, again, read the essays. These are serious edge cases - ones that should be addressed (by, for instance, correlating algorithmic scoring with human scoring), but this argument doesn't diminish the applicability of the algorithm to the vast majority of essays and students.
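On correlating algorithmic with human scoring: the usual agreement metric for ordinal scores like these is quadratic weighted kappa, which (if I recall correctly) was the competition's evaluation metric. A self-contained sketch:

```python
# Quadratic weighted kappa: agreement between two raters on an ordinal
# scale, penalizing disagreements by the square of their distance.
# 1.0 = perfect agreement, 0.0 = chance-level, negative = worse than chance.

def quadratic_weighted_kappa(human, machine, min_rating, max_rating):
    n = max_rating - min_rating + 1
    # Observed confusion matrix.
    observed = [[0] * n for _ in range(n)]
    for h, m in zip(human, machine):
        observed[h - min_rating][m - min_rating] += 1
    # Marginal histograms for the expected (chance) matrix.
    hist_h = [0] * n
    hist_m = [0] * n
    for h in human:
        hist_h[h - min_rating] += 1
    for m in machine:
        hist_m[m - min_rating] += 1
    total = len(human)
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            weight = ((i - j) ** 2) / ((n - 1) ** 2)
            expected = hist_h[i] * hist_m[j] / total
            num += weight * observed[i][j]
            den += weight * expected
    return 1.0 - num / den
```

Running this between the algorithm's scores and a human grader's scores on a held-out set would flag exactly the edge cases you'd want a teacher to look at.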
Will this help the nearly-illiterate masses? I think so. Will it hurt those who are a standard deviation or more above the norm? I think that's a problem, too. The solution, then, is to use the algorithm on the first group and more individualized grading for the second. Effort should go into identifying students in that second group, and that identification shouldn't be restricted to a single course or assignment. This challenges the assumption that all students deserve identical grading, but I don't see another way forward that avoids hurting the best, the mediocre, or the poorest writers.
[1]: http://www.kaggle.com/c/asap-aes/data