leonardo

Subject: Merging sequences
Date: October 6th, 2008, 11:16 pm
Tags: programming

Inspired by this blog post:

http://www.obfuscated.org/2008/09/29/help-me-write-idiomatically-correct-python/



This is my modified solution to that small problem, which was solved there in Python and Haskell. Note that in Python, operations on sets and dict items are very fast, so it's better to perform them on sets that have already been built. This takes two sorted sequences of items and returns a list of pairs in sorted order:

    def cmpseq(seqa, seqb):
        a = set(seqa)
        b = set(seqb)
        s = sorted(a | b)  # sorted union of the two sets
        return [((x if x in a else None), (x if x in b else None)) for x in s]

    print(cmpseq([1,2,3,4,5], [2,4,6,8,10]))

Output:



[(1, None), (2, 2), (3, None), (4, 4), (5, None), (None, 6), (None, 8), (None, 10)]



A partially lazy version of that Python code can be produced by returning a generator expression instead of a list:

    return (((x if x in a else None), (x if x in b else None)) for x in s)

Now I try to translate that code to D, using my libs. There are several ways to do it. The D language has no boxed integers. You can define them, but no other part of the language uses them. So it may be better to use the simplest possible thing: I've used pairs of pointers to items (integers here). The D GC will take care not to delete items that are still referenced.
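Back on the Python side for a moment, the lazy variant can be written out in full as below. This is only a minimal sketch, and the name `cmpseq_lazy` is mine, not from the original post. It also shows why the version is only "partially" lazy: the two sets and the sorted union are still built eagerly, and only the pairs are produced on demand.

```python
def cmpseq_lazy(seqa, seqb):
    # Hypothetical name; same logic as cmpseq, but returns a generator.
    a = set(seqa)
    b = set(seqb)
    s = sorted(a | b)  # still eager: the union is built and sorted up front
    # Only the pairs themselves are computed lazily, one per iteration.
    return ((x if x in a else None, x if x in b else None) for x in s)


pairs = cmpseq_lazy([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(next(pairs))  # (1, None)
print(list(pairs))  # the remaining pairs
```

Because the result is a generator, it can be consumed only once. Wrap it in `list()` if the pairs are needed again.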



The following D version works only with arrays as inputs, but it's easy to generalize. (This code requires the Set data structure of the libs to have the opOr() method too; it's implemented but commented out, still waiting for more testing.)

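    import d.all;

    Struct!(T*, T*)[] cmpArr(T)(T[] seq_a, T[] seq_b) {
        auto a = set(seq_a);
        auto b = set(seq_b);
        auto s = sorted(a | b);
        // Note: &x is the address of the delegate's own parameter, not of an
        // item in the input arrays, which is why the output below shows the
        // same address in every pair.
        return map((T x){ return toStruct(x in a ? &x : cast(T*)null,
                                          x in b ? &x : cast(T*)null); }, s);
    }

    void main() {
        putr(cmpArr([1,2,3,4,5], [2,4,6,8,10]));
    }

A possible output: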



[toStruct(0x12FEC8, null), toStruct(0x12FEC8, 0x12FEC8), toStruct(0x12FEC8, null), toStruct(0x12FEC8, 0x12FEC8), toStruct(0x12FEC8, null), toStruct(null, 0x12FEC8), toStruct(null, 0x12FEC8), toStruct(null, 0x12FEC8)]



It can be modified to accept general iterables:

    Struct!(BaseType1!(Tya)*, BaseType1!(Tya)*)[] cmpArr(Tya, Tyb)(Tya seqa, Tyb seqb) {
        alias BaseType1!(Tya) TyItem;
        static assert(is(TyItem == BaseType1!(Tyb)), "Different base type");
        auto a = set(seqa);
        auto b = set(seqb);
        auto s = sorted(a | b);
        TyItem* none = null;
        return map((TyItem x){ return toStruct(x in a ? &x : none,
                                               x in b ? &x : none); }, s);
    }

The code can be improved further to let it digest, for example, an iterable of ints and an iterable of longs.



That's O(n log n) code because of the sort, but the input data is already sorted, so the problem can be solved in O(n). The following works only on arrays as inputs, but it can be generalized somewhat (not much tested):

    Struct!(T*, T*)[] cmpArr(T)(T[] seq_a, T[] seq_b) {
        T* pa = seq_a.ptr;
        T* pb = seq_b.ptr;
        T* enda = pa + len(seq_a);
        T* endb = pb + len(seq_b);

        ArrayBuilder!(Struct!(T*, T*)) result;
        result.reserve = max(len(seq_a), len(seq_b));

        // Walk both arrays in step, always advancing the smaller item
        while (pa != enda && pb != endb) {
            if (*pa < *pb) {
                result ~= toStruct(pa, cast(T*)null);
                pa++;
            } else if (*pa > *pb) {
                result ~= toStruct(cast(T*)null, pb);
                pb++;
            } else {
                result ~= toStruct(pa, pb);
                pa++;
                pb++;
            }
        }

        // Items left over in seq_a
        while (pa != enda) {
            result ~= toStruct(pa, cast(T*)null);
            pa++;
        }

        // Items left over in seq_b
        while (pb != endb) {
            result ~= toStruct(cast(T*)null, pb);
            pb++;
        }

        return result.toarray;
    }

But as you can see, the code is quite a bit hairier.
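For comparison, the same linear merge can be sketched in Python with two indices in place of the pointers. This is my own illustration, not code from the original post, and the name `merge_pairs` is made up. It assumes both inputs are sorted and support indexing and `len()`.

```python
def merge_pairs(seqa, seqb):
    """Pair up two sorted sequences in O(len(seqa) + len(seqb)) time."""
    result = []
    i = j = 0
    # Walk both sequences in step, always advancing the smaller item.
    while i < len(seqa) and j < len(seqb):
        x, y = seqa[i], seqb[j]
        if x < y:
            result.append((x, None))
            i += 1
        elif x > y:
            result.append((None, y))
            j += 1
        else:
            result.append((x, y))
            i += 1
            j += 1
    # At most one of the two sequences still has items left.
    result.extend((x, None) for x in seqa[i:])
    result.extend((None, y) for y in seqb[j:])
    return result


print(merge_pairs([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Unlike the set-based version, this one does not collapse duplicates. Repeated items are paired one by one while both sides have them, and any extras are emitted with None on the other side.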

Update: see following post:

http://leonardo-m.livejournal.com/70252.html
