Some time ago I wrote about sentence counting in Emacs buffers (or regions). I promised a sequel, and here it is.

The real reason I wanted to count sentences is that I’d like to automatically put a signature in my email referring to one of the http://sentenc.es/ webpages.

It turns out, however, that counting sentences in email messages is much more difficult than it seems. First of all, you don’t want to count the headers and the signature – but that is trivial to accomplish. What is definitely non-trivial is how to exclude the quotations.

The difficulty lies in the fact that quotations are line-based and sentences are not. In fact, while an Emacs “sentence” cannot cross paragraph boundaries (i.e., blank lines), Emacs does not consider a line consisting only of the quotation prefix (“>”) to be blank.
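As a rough illustration of this, the sketch below counts how many forward-sentence steps it takes to cross a short piece of quoted text in a temporary buffer. The helper name demo-count-sentence-steps is made up for this example, and the paragraph variables are set explicitly to Emacs's stock defaults, which a major mode may well override.

```elisp
(defun demo-count-sentence-steps (text)
  "Return how many `forward-sentence' calls it takes to cross TEXT.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    ;; Assumption: use the stock paragraph settings, under which only
    ;; whitespace-only lines separate paragraphs.
    (setq-local paragraph-start "\f\\|[ \t]*$")
    (setq-local paragraph-separate "[ \t\f]*$")
    (insert text)
    (goto-char (point-min))
    (let ((steps 0))
      (while (not (eobp))
        (forward-sentence 1)
        (setq steps (1+ steps)))
      steps)))

;; A truly blank line ends the paragraph, and with it the sentence.
(demo-count-sentence-steps "> one\n\n> two.")   ; => 2
;; A line holding only ">" is not blank, so the sentence runs across it.
(demo-count-sentence-steps "> one\n>\n> two.")  ; => 1
```

The only difference between the two inputs is the middle line, yet the number of “sentences” changes.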

There are two possible solutions to that problem. One is to change the rules by which paragraphs are separated. It might work, though I didn’t try it. One problem I’d expect with this approach is this: do we check whether we are in a quotation at the beginning or at the end of each sentence? Either way, some partial sentences would be excluded.
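Since this first option was not tried, the following is only a rough sketch of what it might look like: a line holding nothing but the quotation prefix is declared a paragraph separator by setting paragraph-start and paragraph-separate buffer-locally. The helper name and the separator regex are assumptions made for the example, and the sketch does nothing about the question of partially quoted sentences raised above.

```elisp
(defun demo-count-with-quote-separators (text)
  "Count `forward-sentence' steps in TEXT, treating bare \">\" lines as blank.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    ;; Assumption: a separator is a whitespace-only line, or a line that
    ;; holds just the quotation prefix ">" plus optional whitespace.
    (let ((separator "[ \t]*$\\|>[ \t]*$"))
      (setq-local paragraph-start separator)
      (setq-local paragraph-separate separator))
    (insert text)
    (goto-char (point-min))
    (let ((steps 0))
      (while (not (eobp))
        (forward-sentence 1)
        (setq steps (1+ steps)))
      steps)))

;; The bare ">" line now splits the quoted text into two paragraphs.
(demo-count-with-quote-separators "> one\n>\n> two.")  ; => 2
```

Note that the second step still starts on the “>” line itself, so deciding whether a given sentence belongs to a quotation remains a separate problem.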

I went with another (maybe a bit more complicated) approach. I loop over all parts of the email between quotations and count the sentences in each of them separately. I also take care of all the blank lines and some other things, like the singular/plural issue I mentioned in the previous post. Also, the citation line (which usually says something like “Aunt Milly wrote this on that day:”) is counted among the quotations.

One thing I’m not particularly happy about is that the user has to customize the variable message-quotation-regex. (I define it using a defcustom so that the user can use the Customize feature to set it.) On the other hand, I see no way of generating it programmatically, especially since the citation line may look very different across setups. (Probably I could use a regex matching just the citation line and prepend the "> \\|>$\\|" part myself. Since there are other possible quotation styles, e.g. using indentation instead of the greater-than sign, I decided it’s not worth it.)
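To get a feeling for what such a regex does and does not match, one can try it on a few sample lines. In the sketch below, the regex is the default value this post uses for message-quotation-regex; the helper demo-quotation-line-p and the sample lines are invented for the illustration.

```elisp
(defun demo-quotation-line-p (line &optional regex)
  "Return t if REGEX matches at the beginning of LINE.
REGEX defaults to the value used in this post.  Hypothetical helper,
for illustration only."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    (looking-at-p (or regex "> \\|>$\\|On.*wrote:$"))))

(mapcar #'demo-quotation-line-p
        '("> quoted text"                 ; quotation prefix
          ">"                             ; empty quotation line
          "On Monday, Aunt Milly wrote:"  ; citation line
          "On second thought, no."        ; starts with "On", but no "wrote:"
          "My own reply."))               ; ordinary text
;; => (t t t nil nil)
```

A line of one's own text that happens to start with “On” and end with “wrote:” would also be treated as a citation line, so the regex may need tightening in some setups.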

;; `message-signature-separator' comes from message.el.
(require 'message)

(defcustom message-quotation-regex "> \\|>$\\|On.*wrote:$"
  "A regular expression matching at the beginning of a quotation line.
Most probably should be an alternative of a quotation
prefix (usually \"> \"), an empty quotation line (usually \">$\")
and a citation line (e.g., \"On.*wrote:$\")."
  :type 'regexp
  :group 'message)

(defun message-in-quotation-p ()
  "Return t if the point is within a quotation, including the citation line."
  ;; We can't use `message-yank-prefix', since the quotation line may
  ;; be just the single ">", and the default value of
  ;; `message-yank-prefix' is "> ".
  (save-excursion
    (beginning-of-line)
    (looking-at-p message-quotation-regex)))

(defun message-count-sentences (&optional print-message)
  "Count the sentences in the current message.
Exclude headers, signature and quotation lines.  Print the
resulting number if PRINT-MESSAGE is non-nil."
  (interactive "p")
  (save-excursion
    (save-restriction
      ;; Narrow to the body: after the header separator, before the
      ;; signature (and any whitespace preceding it).
      (narrow-to-region
       (progn
         (goto-char (point-min))
         (search-forward (concat "\n" mail-header-separator "\n") nil t)
         (point))
       (progn
         (goto-char (point-max))
         (re-search-backward message-signature-separator nil t)
         (skip-chars-backward " \t\n")
         (point)))
      (goto-char (point-min))
      (let ((sentences 0))
        (while (not (eobp))
          ;; Narrow to the next run of non-quoted lines and count the
          ;; sentences in it.
          (save-restriction
            (narrow-to-region
             (point)
             (save-excursion
               (while (not (or (message-in-quotation-p) (eobp)))
                 (forward-line 1))
               (skip-chars-backward " \t\n")
               (point)))
            (while (not (eobp))
              (forward-sentence 1)
              (setq sentences (1+ sentences))))
          (unless (eobp) (forward-char))
          ;; Skip the quotation (and blank lines) that follows.
          (while (and (not (eobp))
                      (or (message-in-quotation-p)
                          (looking-at-p "^[ \t]*$")))
            (forward-line 1)))
        (if print-message
            (message "%s sentence%s in this message."
                     sentences
                     (if (= 1 sentences) "" "s"))
          sentences)))))

This is not the end (yet), though. In the future, I’m going to show how I used this function to automatically insert a suitable signature in my email. Also, I’ll most probably define a hydra for insertion of boilerplate stuff like greetings, dividing the quotation into blank-separated paragraphs and moving between them. This is not yet written, but I have a strong incentive to do that: I spend some time reading and writing/responding to email (sometimes on the order of half an hour a day – I know, businesspeople would laugh at this, but I consider it twice as much as I’d like…), and every bit of Elisp which could streamline that is worth quite a bit for me.
