I’ve been experiencing a persistent urge to data-mine my own blog lately. Without speculating as to why this might be, or whether or not any of all y’alls would be interested in such a thing, here are some notes to that end:

  • Someone ought to WordPressify this weighted word list script; should be easy. What I’d really like is something more akin to Amazon’s Statistically Improbable Phrases, but that would be less easy.
  • I’d also like to modify the Word Statistics plugin to produce a graph of reading level vs. time. I expect it to show that my writing has improved, for some convenient definition of “improved” that I will decide on after seeing the results.
  • I’d also like to see posts-per-comment vs. time.
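
For the curious, the weighted word list idea from the first bullet can be roughed out in a few lines of Python. This is only a sketch under my own assumptions, not the script mentioned above: the `weighted_words` function, the tiny `STOP_WORDS` set, and the linear size scaling are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "i", "that", "than"}


def weighted_words(posts, top_n=10, min_size=1.0, max_size=3.0):
    """Return (word, count, size) tuples for the most frequent words.

    `posts` is an iterable of plain-text strings. `size` scales linearly
    from min_size (least frequent of the top words) to max_size (most frequent).
    """
    counts = Counter()
    for text in posts:
        # Lowercase and pull out runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)

    top = counts.most_common(top_n)
    if not top:
        return []

    highest = top[0][1]
    lowest = top[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match

    return [
        (word, count, min_size + (max_size - min_size) * (count - lowest) / span)
        for word, count in top
    ]


if __name__ == "__main__":
    sample = ["Beer and cheese and beer", "Cheese, chocolate, beer"]
    for word, count, size in weighted_words(sample, top_n=3):
        print(f"{word}: {count} uses, relative size {size:.1f}")
```

The sizes could be fed into font sizes to draw the list as a cloud. Note that this treats every run of letters as one word, so it makes no attempt to split compound words.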

Update: I turned the weighted word list script into the Weighted Words plugin.


  1. wolfangel wrote:

    That would be cool. I think someone should do those, and then I too will take them. I will also — eventually — ask you how you move all your archivey stuff onto a different page. This is sort of asking, I guess.

  2. yami wrote:

    I’m working on the first one. And as for the archivey stuff, basically you create a Page and assign it a special template, but I wrote it up in more detail on the WordPress Codex. I can send you my template, too, if you want.

  3. wolfangel wrote:

    Remarkably, you seem to talk about cheese more than chocolate.

  4. yami wrote:

    There are more kinds of cheese in the supermarket than kinds of chocolate (that are any good), for one. And I talk about beer more than either.

  5. yami wrote:

    Also, I’m amazed that I’ve used the word “science” more than “shit” (though I suspect this result wouldn’t hold if the code were clever enough to separate compound words).

