Researchers at the Illinois Institute of Technology have written a program that identifies the sex of an author from the frequency of the words they use. Apparently women use relationship-related words like "with" and "for", whereas men use more specific and absolute words like "the", "this" and "as", which brings us back to the old rock-logic/water-logic cliché.
The results showed that the words favoured most heavily by men were what grammarians call determinative words such as "the," "a," "as," "that" and "one." Female writers favoured "she" and relationship words such as "for," "with," "in," "and" and "not."
"This is surprising, since, unlike conversation, writing a book or an article does not involve direct social interaction"

Hmmm; if one wrote up such a program and applied it to, say, blogs on the web, I wonder what proportion it would sex accurately.
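
For the curious, a toy version of the idea might look something like the sketch below. To be clear, this is not the researchers' method, which the article doesn't describe in any detail. It only compares how often the two word lists quoted above turn up in a text. The function names (`tokenize`, `marker_rates`, `guess`) and the "whichever rate is higher wins" rule are my own assumptions for illustration, and I make no claim about how accurate it would be.

```python
import re
from collections import Counter

# Word lists taken from the article quoted above; everything else here
# (names, tokenisation, decision rule) is an assumption for illustration.
MALE_WORDS = {"the", "a", "as", "that", "one"}
FEMALE_WORDS = {"she", "for", "with", "in", "and", "not"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def marker_rates(text):
    """Return (male_rate, female_rate): the share of tokens in each list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0, 0.0
    counts = Counter(tokens)
    male = sum(counts[w] for w in MALE_WORDS) / len(tokens)
    female = sum(counts[w] for w in FEMALE_WORDS) / len(tokens)
    return male, female


def guess(text):
    """Crude guess: whichever word list is used more often wins."""
    male, female = marker_rates(text)
    if male > female:
        return "male"
    if female > male:
        return "female"
    return "undecided"


if __name__ == "__main__":
    sample = "She went with him and not for them in time"
    print(marker_rates(sample), guess(sample))
```

A real classifier would presumably learn a weight for each word from a large set of texts of known authorship rather than use a hand-picked list and a simple comparison, so treat this as a way of poking at the idea, nothing more.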

Update: the paper may be found here (though you have to subscribe to get the PDF). However, there is also a copy on the personal page of Prof. Moshe Koppel, one of the authors. And it appears that they're from Israel, not Illinois. (Perhaps the journalist confused the abbreviations?)

Posted by: gjw | http://the-fix.org | Wed May 28 00:53:29 2003

I wish this program were running as a CGI script so I could give it a go - it looks like fun. I wonder if it's capable of discriminating in technical writing, where your choice of words is much more restricted.

Posted by: acb | http://dev.null.org | Wed May 28 09:09:08 2003

It appears that the paper may be found here:

http://www3.oup.co.uk/litlin/current/170401.sgm.abs.html

The abstract also mentions that the same technique may be used to determine whether a text is fiction or non-fiction.
