Efficiency is doing things right; effectiveness is doing the right things. - Peter F. Drucker

Summary

Foreword

You get three choices, and may choose two:

Note: This is in no way a 100% accurate representation of these text editors. A text editor is a complex application, so there are many pros (and cons) that this review ignores or does not cover.

September 6th 2005





"GNU Moe is a powerful, 8-bit clean, console text editor for ISO-8859 and ASCII character encodings. It has a modeless, user-friendly interface, online help, multiple windows, unlimited undo/redo capability, unlimited line length, global search/replace (on all buffers at once), block operations, automatic indentation, word wrapping, file name completion, directory browser, duplicate removal from prompt histories, delimiter matching, text conversion from/to UTF-8, romanization, etc."

data[line][col]

(vector*(10000*lines))

"Sam is an interactive multi-file text editor intended for bitmap displays. A textual command language supplements the mouse-driven, cut-and-paste interface to make complex or repetitive editing tasks easy to specify."

((20GB/8KiB) - 8KiB)
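To put a rough number on the 20 GB example from the Sam discussion, here is a minimal sketch that only counts 8 KiB blocks, with any leftover bytes going into a final block. It is not Sam's code: the function name `blocks_needed` is made up for illustration, and it reads "20 GB" as 20 GiB.

```python
BLOCK_SIZE = 8 * 1024  # 8 KiB, the block size given in this article


def blocks_needed(n_bytes: int) -> int:
    """Number of 8 KiB blocks needed for n_bytes, counting a final partial block."""
    full, remainder = divmod(n_bytes, BLOCK_SIZE)
    return full + (1 if remainder else 0)


print(blocks_needed(8 * 1024 + 1))   # one byte over a block needs a second block
print(blocks_needed(20 * 1024**3))   # block count for a 20 GiB file
```

The point is only that a file this large turns into millions of small blocks, which is where the disk-caching concern comes from.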

"Vi (visual) is a display oriented text editor based on ex(1). Ex and vi run the same code; it is possible to get to the command mode of ex from within vi and vice-versa."

"GTK+ has an extremely powerful framework for multiline text editing. The primary objects involved in the process are GtkTextBuffer, which represents the text being edited, and GtkTextView, a widget which can display a GtkTextBuffer. Each buffer can be displayed by any number of views.

One of the important things to remember about text in GTK+ is that it's in the UTF-8 encoding."

" GNU Emacs is an extensible, customizable text editor—and more. At its core is an interpreter for Emacs Lisp, a dialect of the Lisp programming language with extensions to support text editing."

" SciTE is a SCIntilla based Text Editor. Originally built to demonstrate Scintilla, it has grown to be a generally useful editor with facilities for building and running programs. It is best used for jobs with simple configurations - I use it for building test and demonstration programs as well as SciTE and Scintilla, themselves. "

Scintilla unfortunately opts for inferior textual data management. It is very similar to GNU Moe, using a list of lines. Unlike GNU Moe though,

"Author of Scintilla here. Scintilla does not use a list of lines. The text is stored in a gap buffer (the substance field in the code shown), like EMACS. Line start positions (added by the InsertLine method) are also stored in a gap buffer but with a 'step' which enables modifications in close proximity to affect few elements."

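The gap buffer that the comment above mentions, and that this article describes for Emacs, can be sketched roughly as follows. This is a toy illustration, not Emacs or Scintilla code: the class name `GapBuffer`, its methods, and the gap sizes are all assumptions, and it ignores the separate line-start bookkeeping the comment talks about.

```python
class GapBuffer:
    """Text held in one array laid out as [first part][gap][second part]."""

    def __init__(self, text: str = "", gap_size: int = 16) -> None:
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)       # first free slot of the gap
        self.gap_end = len(self.buf)     # first slot after the gap

    def __len__(self) -> int:
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _move_gap(self, pos: int) -> None:
        """Slide the gap so it starts at pos; cost grows with the distance moved."""
        while self.gap_start > pos:      # move the gap left
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:      # move the gap right
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow_gap(self, extra: int) -> None:
        """Open fresh free space at the gap when the old gap is too small."""
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos: int, text: str) -> None:
        pos = max(0, min(pos, len(self)))
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow_gap(len(text) + 16)
        for ch in text:                  # typing fills the gap from the left
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos: int, count: int) -> None:
        pos = max(0, min(pos, len(self)))
        count = max(0, min(count, len(self) - pos))
        self._move_gap(pos)
        self.gap_end += count            # deleted characters are absorbed by the gap

    def text(self) -> str:
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


gb = GapBuffer("hello")
gb.insert(2, "XY")
print(gb.text())  # heXYllo
```

Edits right at the gap touch only a few slots, while an edit far away first pays for moving the gap, which is the trade-off discussed in the Emacs section.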
Conclusion

Not quite Chinese, but sounds wise enough.

A review of how GNU Moe, Plan 9 Sam, Bill Joy's Ex/Vi, GNOME GtkTextView/GtkTextBuffer, GNU Emacs, and, due to popular demand, Scintilla manage their textual data. This article is subject to updates.

This is a short review of how various text editors manage their textual data within a single buffer. Like many programmers, I have a fascination with text editors. We only want the best tools for the job, so what does "best" mean? Some want editing to have high physical efficiency: keyboard shortcuts and command keys, with the mouse coming in handy only now and then. Others want their editors to be virtually efficient: to make the most of their resources (RAM, disk space), or to be "small". There is a race to have all of these, though, and the editors covered in this article show what their authors decided to go with.

Each editor section includes a copy of this diagram with the properties they have highlighted.

And with that, let's begin our journey.

First public release: September 6th 2005

TL;DR: Almost like GNU Nano (on the surface).

From my observations, Moe is a small C++ console text editor. It is easy to use. The code base is easy to navigate, as it remains small and simple. It took me around 5-10 minutes to find the insertion-related code. Like it says in the TL;DR, if Nano didn't exist, Moe would certainly be its replacement.

Moe sees its text as a vector of lines. It accesses the data as a 2D array: data[line][col]

The buffer is initialized as a vector with 10,000 lines. If a file is loaded with more than 10,000 lines, or surpasses this limit while editing, it will add 1/10th of the maximum number of lines to the buffer.

This means Moe always uses (vector*(10000*lines)) when it's started. It isn't often I open files that are 10,000 lines long, so this would be wasteful for my uses. This behavior only becomes useful after the 10,000 line mark, although it is still a little wasteful. If, say, while typing we hit the 1,000,000 line mark, Moe will allocate space for 100,000 more lines.
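The growth rule described above (start with room for 10,000 lines, then add a tenth of the current maximum whenever that is exceeded) can be sketched like this. It is a toy model of the behavior as I understood it, not Moe's actual C++ code; `LineBuffer`, `grow_capacity`, and `INITIAL_LINES` are names invented for the example.

```python
INITIAL_LINES = 10_000  # capacity reserved at start-up, per this article


def grow_capacity(capacity: int, needed: int) -> int:
    """Add one tenth of the current capacity until `needed` lines fit."""
    while capacity < needed:
        capacity += capacity // 10
    return capacity


class LineBuffer:
    """Text stored as a list of lines, addressed as data[line][col]."""

    def __init__(self) -> None:
        self.capacity = INITIAL_LINES
        self.data: list[str] = []

    def append_line(self, text: str) -> None:
        # Reserve more room only once the current maximum is exceeded.
        self.capacity = grow_capacity(self.capacity, len(self.data) + 1)
        self.data.append(text)


buf = LineBuffer()
buf.append_line("hello")
print(buf.data[0][1])                       # character at line 0, column 1
print(grow_capacity(1_000_000, 1_000_001))  # capacity right after passing one million lines
```

Crossing the one-million mark by a single line reserves room for another 100,000 lines in one step.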
Allocating that much at once is not very resource efficient, unless you are someone who typically types an additional 100,000 lines after your millionth line. Besides that, though, Moe has a simple interface, easy and memorable keyboard shortcuts, and a small program size, which makes it suitable for embedded systems, but only if they are equipped with a generous amount of RAM.

First public release: 1980s

Sam is relatively small when compared to Vim and Emacs; in fact, it is probably the smallest or second-smallest editor mentioned in this review. It is ed's successor, but much more generic. Sam works at the character level instead of the line level. It is one of the first editors to separate its UI from the actual editor: Sam can be used both on the command line and as a graphical text editor. Its command language makes it scriptable too.

Sam's data is linear.

Sam manages a buffer as a variable-sized block, starting at 8 KiB. It will expand itself by n bytes after the 8 KiB point or, if n bytes is greater than 8 KiB, by however many times n fits into 8 KiB blocks. The remainder of the bytes are put into a final 8 KiB block.

This makes Sam a fairly efficient editor. From what I've read, it will cache and write blocks to disk when navigating through its data to preserve RAM. An issue only becomes apparent when you are working with huge files, and it is unlikely to occur. As a user, if you decide to open a 20 GB file (bear with me), Sam will write all ((20GB/8KiB) - 8KiB) blocks to disk. Your disk may be 40 GB large, and half of that is taken up by this monstrous 20 GB file. That leaves us with 20 GB, of which, let's say, 10 GB is taken by the operating system and other files. Now we have 10 GB, which is not enough for Sam's caching. This is probably a rare case, though, and is most likely safe to ignore.

Overall I'd say Sam is fairly efficient at managing the growth of its buffer, but requires O(n) time to insert/delete text. It is not very physically efficient either.
Its command language definitely helps (and is the best part about Sam), but leaves more to be desired. As mentioned earlier, its program size is fairly small too, even for being statically linked on Plan 9.

First public release: 1976

I was shocked to see what the original vi's code looked like. It was full of hacks and tried to be clever by managing the terminal (it handles 2 separate real terminals) and using it to read lines back to the buffer. It took me almost an hour to find the code that actually placed text into the buffer. On the bright side, though...

...Vi is still small when compared to its bigger brother Vim. It is fast and efficient because it was designed for 300 baud connections and terminals. Like Sam, Vi also has its own command language that can be both scripted and keyed, meaning the commands happen as you type.

As a descendant of Ed, Vi is also a line-based text editor.

Vi represents its buffer as an array of lines. It simply adds or deletes lines where it has to, which is very similar to how GNU Moe does it, as we discussed earlier.

When Vi is at or past the end of its buffer, it will request more lines. If the system supports pages, and the page size isn't zero or a ridiculously high number, then Vi will allocate the page size divided by the size of a line in bytes, and resize the buffer.

If the page size is zero or a ridiculous number, then it will extend the buffer by 4 KiB.

Otherwise, if none of the above happens, Vi will extend the buffer by 1 KiB worth of lines.

So Vi uses a similar method to Sam to grow its buffer. This is very efficient, but the consequence is that Vi will become slow when working with huge files, because it has to traverse a bigger linear array. It will also have a hard time working with very long lines. Luckily, in the real world, very long lines are not a common thing to encounter in source code files.
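Vi's three-way growth rule described above can be written down as a small decision function. This is only my reading of the rule, not a translation of the original source: `LINE_ENTRY_SIZE` and `MAX_SANE_PAGE` are made-up values standing in for "the size of a line in bytes" and "a ridiculously high number".

```python
from typing import Optional

LINE_ENTRY_SIZE = 8      # assumed size in bytes of one line entry (hypothetical)
MAX_SANE_PAGE = 1 << 20  # assumed cut-off for a "ridiculous" page size (hypothetical)


def extra_line_slots(page_size: Optional[int]) -> int:
    """How many line slots to add when the buffer runs out of room."""
    if page_size is None:                 # the system does not report a page size
        return 1024 // LINE_ENTRY_SIZE    # 1 KiB worth of lines
    if page_size <= 0 or page_size > MAX_SANE_PAGE:
        return 4096 // LINE_ENTRY_SIZE    # fall back to 4 KiB
    return page_size // LINE_ENTRY_SIZE   # one page worth of lines


print(extra_line_slots(4096))  # a typical 4 KiB page
print(extra_line_slots(None))  # no paging information available
```

Whatever the real constants are, the buffer always grows by a small fixed chunk, which is why growth itself stays cheap.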
Vi is mostly known for its powerful physical efficiency, though, so people who use it rarely care about slow handling of huge files or very long lines.

First public release: April 14th 1998

Featureful and developed over the years, GtkTextView coupled with GtkTextBuffer makes extending the functionality of this editor easy. A good example of extending these is GtkSourceView, which adds source code support. It is easy to hack and build an editor from scratch in a night (see: tyreese, my very simple editor). Unlike the majority of editors, it supports UTF-8 and separates the UI from the editor (actually, in this case, it goes a step further and separates the UI from the editor and the editor from the buffer!). It is very well made. Essentially the definition of modern text editing.

GtkTextBuffer actually uses a GtkTextBTree. This is a balanced tree (a B-tree rather than a plain binary tree) whose nodes lead down to segments of text, which makes it a close relative of the rope data structure.

When text is inserted, GtkTextBTree creates new segments (nodes) and puts the new text into them.

When text is deleted, the segments are simply removed.

The only issue with this is that you have to build an editor out of these components. This makes the virtual efficiency very high, but consequently has terrible physical efficiency, and you have to invest the time into creating a reliable interface (most likely just a vi overlay). I'm surprised programmers haven't created overlays for Vi or Emacs for the GtkTextView. Definitely an opportunity for someone there. Another disadvantage is that in order to use these components, you have to include the entire GTK+ library. This makes any GTK+ based editor larger than other editors (although rare, surprisingly!), but very portable. A GTK+ editor has a good chance of running on Windows, Mac, GNU/Linux, *BSD, and even in web browsers, thanks to the Broadway project.

First public release: March 20th 1985

A better description would probably be "Emacs can be anything you want it to be".

I actually began to use Emacs again. Its extensibility is just unparalleled.
It also separates its UI from the editor part, allowing for different UIs and not being restricted to terminals. This is absolutely awesome for people coming from Vi. I'm one of those people. As soon as I installed Emacs 24, all I had to do to get Vi functionality was M-x list-packages, navigate to the Evil package, install, restart Emacs, and boom. I was right at home. This makes Emacs dangerously powerful. To be able to slap on any front end and get the same Emacs functionality too is just awesome.

Emacs uses a data structure called a gap buffer. It splits the text into three parts: the first part, the gap, and the second part.

The gap grows and shrinks depending on insertions and deletions.

When the gap becomes too small, a new gap is created. This is very efficient for insertions and deletions, and O(n) for navigation, since the gap has to be moved to wherever the next edit happens. This makes it slightly worse than ropes, but better than the majority of solutions out there. In fact, because of its simplicity, editor creators opt for gap buffers because they make their code easier to maintain and the data easier to work with.

Emacs is both physically (if you, for example, install the Evil package) and virtually efficient, but at the price of a bigger program size. In the age of computers with at least 100 GB disks and at least 1 GB of RAM, this is not an issue. This is why I've opted for Emacs as my editor for the time being...

First public release: March 14th 1999

Due to popular demand, I've added Scintilla to this page. Scintilla is the basis for many text editors on Windows. It is architecturally very similar to GtkTextView/GtkTextBuffer. Scintilla separates itself into 3 parts: Editor, Document, and CellBuffer. As the authors of Scintilla have noted, it is best suited for prototypes ("simple configurations").

Scintilla is thread-safe and uses a reader-writer architecture.

Huge apology to Neil, I messed up.
His comment, quoted above, clarifies what Scintilla really does.

Scintilla parses a string until it sees a carriage return or newline control code (trying to compensate for UNIX and Windows line endings). Once it does, it simply adds the line to the buffer.

This means Scintilla doesn't waste space while growing its buffer. It will always use the optimum amount of space for storing text in memory.

What is inefficient about this is that if you have a giant line, Scintilla will slow right down, because it goes through every character in the block to determine where the newlines are. So if you copy and paste from an external source, you could potentially see a slowdown. As a code editor, though, this should rarely be a problem. Like GtkTextView, Scintilla's Editor interface is lacking key binding overlays to allow users to use Vi or Emacs keybindings out of the box.

It was interesting to see how these editors each approached the textual data management problem. What amazes me even more is that this was only feasible because of the FOSS world we have. If these projects were not open source, I would never have done this study to improve my knowledge of how the tools I use every day work. Thank you, FOSS enthusiasts.

If you'd like me to add an editor, email me or leave a comment here. I'd be glad to look at it for you :)

Keep being awesome