Many services nowadays ask people to “copy and paste cells from Excel”. Maybe they should consider using the pure-JS XLS or XLSX libraries :) But that’s for another day.

Somehow, Excel seems to know more than it lets on. For example, consider

Copying and pasting to a plaintext buffer reveals that tab characters are used as delimiters (tab characters highlighted).
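To make the tab-delimited layout concrete, here is a minimal sketch of how a service might split that plaintext into rows and cells. The function name `parseTabDelimited` is made up for illustration, and the sketch assumes tabs separate cells, newlines separate rows, and no quoting is involved (which is exactly the assumption that breaks later in this post).

```javascript
// Hypothetical helper: split clipboard plaintext into rows and cells.
// Assumes tab-delimited fields and newline-separated rows, with no
// quote handling at all.
function parseTabDelimited(text) {
  return text
    .replace(/\r\n?/g, "\n") // normalize CRLF and bare CR line endings
    .replace(/\n$/, "")      // drop a single trailing newline
    .split("\n")
    .map(function (row) { return row.split("\t"); });
}

// Example with made-up cell values:
// parseTabDelimited("a\tb\tc\r\n1\t2\t3\r\n")
//   gives [["a", "b", "c"], ["1", "2", "3"]]
```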

And pasting back works perfectly fine! At least, that’s the common impression.

However, there is one area of nastiness: double quotes. Consider:

Copying this into a plaintext buffer reveals no magic for the quotes:

And pasting this text back into Excel gives you different output:

WHAT HAPPENED TO MY DATA?

Well, Excel doesn’t properly wrap the text. It should have generated something like

"""foo" """bar" """baz" """qux"

(wrapping each field in quotes and doubling every embedded double quote). But somehow, copying and pasting the cells directly works perfectly.
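The quoting rule described above (wrap the field in double quotes, double any quote inside it) is easy to sketch. The helper names `quoteField` and `toQuotedRow` are hypothetical, and this is only an illustration of the rule, not what Excel does internally.

```javascript
// Hypothetical helper: wrap a field in double quotes and double every
// embedded double quote, CSV-style.
function quoteField(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Join quoted fields with tabs, matching the plaintext clipboard layout.
function toQuotedRow(cells) {
  return cells.map(quoteField).join("\t");
}

// The four cells from the example, each starting with a double quote:
// toQuotedRow(['"foo', '"bar', '"baz', '"qux'])
//   gives '"""foo"\t"""bar"\t"""baz"\t"""qux"'
```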

How does that work? To investigate further, I used a tool that lets you see the pasteboards on OS X (source on GitHub). The answer is that Excel actually populates multiple clipboards. For example, the HTML clipboard shows:

<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta name=ProgId content=Excel.Sheet>
<meta name=Generator content="Microsoft Excel 14">
<link id=Main-File rel=Main-File
href="file://localhost/Users/SheetJS/Library/Caches/TemporaryItems/msoclip/0/clip.htm">
<link rel=File-List
href="file://localhost/Users/SheetJS/Library/Caches/TemporaryItems/msoclip/0/clip_filelist.xml">
<style>
<!--table
{mso-displayed-decimal-separator:".";
mso-displayed-thousand-separator:"\,";}
@page
{margin:1.0in .75in 1.0in .75in;
mso-header-margin:.5in;
mso-footer-margin:.5in;}
td
{padding-top:1px;
padding-right:1px;
padding-left:1px;
mso-ignore:padding;
color:black;
font-size:12.0pt;
font-weight:400;
font-style:normal;
text-decoration:none;
font-family:Calibri, sans-serif;
mso-font-charset:0;
mso-number-format:General;
text-align:general;
vertical-align:bottom;
border:none;
mso-background-source:auto;
mso-pattern:auto;
mso-protection:locked visible;
white-space:nowrap;
mso-rotate:0;}
-->
</style>
</head>
<body link=blue vlink=purple>
<table border=0 cellpadding=0 cellspacing=0 width=260 style='border-collapse:
collapse;width:260pt'>
<col width=65 span=4 style='width:65pt'>
<tr height=15 style='height:15.0pt'>
<!--StartFragment-->
<td height=15 width=65 style='height:15.0pt;width:65pt'>"foo</td>
<td width=65 style='width:65pt'>"bar</td>
<td width=65 style='width:65pt'>"baz</td>
<td width=65 style='width:65pt'>"qux</td>
<!--EndFragment-->
</tr>
</table>
</body>
</html>

(See the session transcript)

Whoa! HTML! And seemingly complete HTML too (no external stylesheet dependencies).
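Since each cell lives in its own `<td>`, the quotes no longer collide with the delimiters. Here is a rough sketch of pulling the cell text out of HTML shaped like the dump above. The function names are hypothetical and the regex approach is an assumption made for brevity; it is brittle (no nested tables, only a handful of entities decoded), and in a browser a real HTML parser would be the safer choice.

```javascript
// Decode the handful of entities most likely to show up in cell text.
function decodeEntities(s) {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&"); // done last, so "&amp;lt;" ends up as "&lt;"
}

// Hypothetical helper: return an array of rows, each an array of cell
// strings, from table HTML like the clipboard dump.
function cellsFromClipboardHtml(html) {
  var rows = [];
  var rowRe = /<tr[\s>][\s\S]*?<\/tr>/gi;
  var cellRe = /<td[^>]*>([\s\S]*?)<\/td>/gi;
  var rowMatch, cellMatch;
  while ((rowMatch = rowRe.exec(html)) !== null) {
    var cells = [];
    cellRe.lastIndex = 0; // restart the cell scan for each row
    while ((cellMatch = cellRe.exec(rowMatch[0])) !== null) {
      // Strip any nested tags, then decode entities.
      cells.push(decodeEntities(cellMatch[1].replace(/<[^>]*>/g, "")));
    }
    if (cells.length) rows.push(cells);
  }
  return rows;
}
```

Fed the table from the dump, this should yield a single row whose four cells keep their leading double quotes intact.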

Can I use this in the browser?

There are APIs for reading the various clipboards, but they are vendor-specific. For example, Chrome supports the onpaste event, as demonstrated in this example.
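As a rough idea of what such a handler can look like, the sketch below collects every format a paste event offers. It assumes the browser exposes `event.clipboardData` with a `types` list and a `getData` method; the helper name `readPasteFormats` is made up, and which formats actually show up varies by browser and platform.

```javascript
// Hypothetical helper: gather every format offered by a paste event
// into a plain object keyed by type (e.g. "text/plain", "text/html").
function readPasteFormats(event) {
  var data = event.clipboardData;
  var out = {};
  if (!data) return out; // no clipboard access on this event
  // `types` may be an array or an array-like list, so copy it first.
  var types = Array.prototype.slice.call(data.types || []);
  types.forEach(function (type) {
    out[type] = data.getData(type);
  });
  return out;
}

// Browser-only wiring: log the formats on every paste.
if (typeof document !== "undefined") {
  document.addEventListener("paste", function (event) {
    var formats = readPasteFormats(event);
    console.log(Object.keys(formats), formats["text/html"]);
  });
}
```

If the `text/html` entry is present, it should carry the table markup rather than the lossy tab-delimited text.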