Page 2 of 3 << [1] [2] [3] >>

From: (Anonymous)

2008-01-07 06:06 am (UTC)

In situations like these, I'd prefer line.strip() to (line.split())[0]. The latter doesn't need the extra parentheses anyway. (I don't know the layout of the file; if there is more than one thing on every line, my replacement doesn't work.)

From: ciphergoth

2008-01-07 11:34 pm (UTC)

It doesn't - see the comment in the Perl code; there are other fields in the file. I think this is the cleanest way.

From: mart

2008-01-07 10:19 am (UTC)

I thought it might be fun to do it in C#, just for kicks:

using System;
using System.IO;
using System.Linq;
using System.Collections.Generic;

public class AnagramNames {
    public static void Main(string[] args) {
        var names = from name in File.ReadAllLines("dist.male.first")
                    select name.Split(' ')[0];
        var pairs = from pair in (
                        from name in names
                        group name by SortCharsInString(name)
                    )
                    where pair.Count() > 1
                    select pair;
        foreach (var pair in pairs) {
            Console.WriteLine(String.Join(",", pair.ToArray()));
        }
    }

    public static string SortCharsInString(string s) {
        char[] arr = s.ToArray();
        Array.Sort(arr);
        return new string(arr);
    }
}

It could be even shorter and more memory-efficient if I could find a way in the standard library to:

- Get an IEnumerable<string> for "lines in a file" without reading the whole file into an array (it's easy to write such a thing, so I'm sure it's in the standard library somewhere...)
- Sort the characters in a string in a functional way, rather than in-place

No timings, since I had to write this in Windows; Mono doesn't have stable LINQ support yet.

From: mtbg

2008-01-07 05:42 pm (UTC)





"(it's easy to write such a thing, so I'm sure it's in the standard library somewhere...)"

So, by deduction, it must be really really hard to write a Pair class in Java...

I have very little to say about the Python version that ciphergoth hasn't said. I probably would've implemented it as the list comprehension version. One comment about your version: letters = [letter for letter in name] is better spelled letters = list(name).

From: free_meal

2008-01-07 02:12 pm (UTC)

Hello. I'll be a bit off-topic, sorry. I'm writing a paper for a Russian university called "livejournal.com as unofficial massmedia", and I'd like to know your opinion about this. What do you think of the fact that livejournal.com is becoming, or has already become, the fourth estate? Can you predict LJ's future? Which do you prefer: newspapers and TV news, or LJ? Thank you. Danil.

From: (Anonymous)

2008-01-07 04:00 pm (UTC)



(require 'cl)
(with-current-buffer (find-file-noselect "input")
  (let ((hash (make-hash-table :test #'equal)))
    (while (and (not (eobp)) (re-search-forward "^\\w*" nil t))
      (setf str (match-string 0))
      (sort* str #'<)
      (push (match-string 0) (gethash str hash)))
    (maphash (lambda (k v) (when (cdr v) (print v))) hash)))
;; Emacs Lisp version :D

From: (Anonymous)

2008-01-07 05:19 pm (UTC)

A php version

I know there will be haters, but here is a PHP version.



#!/usr/local/php5/bin/php

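<?php

// The opening of this script was lost when the page was extracted. The two statements below are a reconstruction; the file name is assumed from the rest of the thread.

$anagrams = array();

foreach (file('dist.male.first') as $line) {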

$parts = explode(" ", $line);

$letters = str_split($parts[0]);

sort($letters);

$letters = implode('', $letters);

if (!array_key_exists($letters, $anagrams)) {

$anagrams[$letters] = array();

}

$anagrams[$letters][] = $parts[0];

}



foreach ($anagrams as $key => $value) {

if (count($value) < 2) {

continue;

}

echo implode(", ", $value)."\n";

}

From: (Anonymous)

2008-01-10 04:12 pm (UTC)

Re: A php version

Well, it seems you're a hater of indenting.

Please stop the hate!

(Deleted comment)

From: ciphergoth

2008-01-07 11:46 pm (UTC)

Test this - it will raise a TypeError; in addition, you need to print only the entries with more than one name in the list. "setdefault" is easier. Also, compare my "list comprehensions" version above.

From: crw

2008-01-07 07:21 pm (UTC)

Alice / Celia, sadly missing (from Jeff Noon's "Automated Alice")

From: ext_76681

2008-01-07 08:51 pm (UTC)

Nice post, it's fun to see different approaches. Here's mine in Common Lisp:

(let ((hash (make-hash-table :test #'equal)))
  (with-open-file (stream "names")
    (loop for line = (read-line stream nil nil)
          for key = (sort (copy-seq line) #'char<)
          until (null line)
          do (push line (gethash key hash nil))))
  (loop for v being the hash-value of hash
        do (when (cdr v)
             (format t "~{~A~^, ~}~%" v))))

From: ext_76693

2008-01-08 12:56 am (UTC)

Generalize the solution

This problem is one of many that involve clustering values based on their signatures. If you factor out the signature-computing method as a parameter, what's left is a handy, reusable function that solves a large family of related problems:



ClusterBy: a handy little function for the toolbox (http://blog.moertel.com/articles/2007/09/01/clusterby-a-handy-little-function-for-the-toolbox).



Cheers,

Tom
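
As a rough illustration of the idea in the comment above (this is not the code from the linked article), here is a minimal Python sketch of a clustering function that takes the signature function as a parameter. The names cluster_by and sorted_letters and the sample list of names are assumptions made up for this example.

```python
from collections import defaultdict


def cluster_by(signature, items):
    """Group items that share the same signature(item) value.

    Returns a list of groups, in order of first appearance of each signature.
    """
    clusters = defaultdict(list)
    for item in items:
        clusters[signature(item)].append(item)
    return list(clusters.values())


def sorted_letters(name):
    """Signature for anagrams: the letters of the name in sorted order."""
    return "".join(sorted(name))


# Hypothetical sample input; the thread itself reads names from dist.male.first.
names = ["ARNOLD", "BRAD", "ROLAND", "ALICE", "RONALD", "CELIA"]

# Anagram sets are simply the clusters that contain more than one name.
anagram_groups = [g for g in cluster_by(sorted_letters, names) if len(g) > 1]

for group in anagram_groups:
    print(", ".join(group))
```

Swapping in a different signature function (len, str.lower, a file checksum) reuses the same grouping logic for a different problem, which is the point the comment makes.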

From: stillsheryl

2008-01-08 05:54 am (UTC)

hey, it was great to meet you and I had a great time at the party. I hope I didn't freak you out or anything. My brother was telling me I hugged you and beau goodbye like 8 times. I guess a little absinthe will do that, hey?

From: brad

2008-01-08 07:47 am (UTC)

Heh, it's okay... I was amused. After the 5th hug or so I figured they'd keep coming for awhile. :-)

From: (Anonymous)

2008-01-08 07:23 pm (UTC)

Finding Anagrams with XSLT

See how this is done in pure XSLT here:



http://dnovatchev.spaces.live.com/blog/cns!44B0A32C2CCF7488!357.entry





Cheers,

Dimitre Novatchev

From: (Anonymous)

2008-01-09 10:36 pm (UTC)

naive scala implementation

import scala.io.Source
import scala.collection.mutable.HashMap

var nameset = new HashMap[String,String]
for ( line <- Source.fromURL("http://www.census.gov/genealogy/names/dist.male.first").getLines ) {
  val name = line.split(" ")(0)
  val sname = name.split("").toList.filter(x => x.length > 0).sort(_ < _).reduceLeft[String]((x,y) => x + y)
  if ( nameset.contains(sname) ) {
    println(name + " - " + nameset(sname))
  } else {
    nameset(sname) = name
  }
}

still figuring scala out so this is probably dumb but still within a reasonable #LOC...

From: (Anonymous)

2008-01-10 05:16 am (UTC)

Now in the Io programming language

Just for fun, same thing in the Io programming language (http://www.iolanguage.com):



#! /usr/bin/env io



names := File open("dist.male.first") readLines map(line, line split at(0))



by_anagram := Map clone

sorted := Object clone



names foreach(name,

sorted = name clone;

sorted sort;

(by_anagram hasKey(sorted)) ifFalse(by_anagram atPut(sorted, List clone));

by_anagram at(sorted) append(name))



by_anagram values select(li, li size > 1) map(asString) join("\n") println

From: (Anonymous)

2008-01-10 05:46 am (UTC)

shorter python code

My shorter python version (also at http://pastebin.ca/849068):

#!/usr/bin/env python
by_anagram = {}
names_file = open('dist.male.first')
for line in names_file:
    # Each line holds a name followed by other fields; keep only the name.
    name, _ = line.split(None, 1)
    sorted_name = ''.join(sorted(name))
    if sorted_name not in by_anagram:
        by_anagram[sorted_name] = []
    by_anagram[sorted_name].append(name)
# Print only the groups that contain more than one name.
print('\n'.join(', '.join(names) for names in by_anagram.values() if len(names) > 1))

From: (Anonymous)

2008-01-10 10:02 am (UTC)

Idiomatic Perl version :-)

perl -lne'push@{$a{join"",sort/\G\w/g}},/(\w+)/}{@$_>1&&print"@$_"for values%a' dist.male.first