No implementations of these methods are added to the builtin or standard library types. However, a number of projects have reached consensus on the recommended semantics for these operations; see Intended usage details below.
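As a rough illustration of what this means in practice, here is a minimal sketch. It assumes that "these methods" refers to the `__matmul__` and `__rmatmul__` special methods that Python 3.5+ uses to dispatch `@`; the `Scale` class is a hypothetical toy type invented for this example and is not part of any library.

```python
import operator

# Built-in numeric types do not implement '@', so this raises TypeError.
x, y = 3, 4
try:
    x @ y
except TypeError as exc:
    builtin_error = type(exc).__name__


class Scale:
    """Hypothetical toy type: '@' composes two scalings."""

    def __init__(self, factor):
        self.factor = factor

    def __matmul__(self, other):
        # Handles `self @ other`.
        if isinstance(other, Scale):
            return Scale(self.factor * other.factor)
        return NotImplemented

    def __rmatmul__(self, other):
        # Handles `other @ self` when type(other) does not support '@'.
        if isinstance(other, (int, float)):
            return Scale(other * self.factor)
        return NotImplemented


composed = Scale(2) @ Scale(5)
reflected = 3 @ Scale(2)
via_function = operator.matmul(Scale(2), Scale(5))
```

The operator follows the same dispatch pattern as the other binary operators: the left operand is tried first, and the reflected method is the fallback.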

Executive summary

In numerical code, there are two important operations which compete for use of Python's * operator: elementwise multiplication, and matrix multiplication. In the nearly twenty years since the Numeric library was first proposed, there have been many attempts to resolve this tension; none have been really satisfactory. Currently, most numerical Python code uses * for elementwise multiplication, and function/method syntax for matrix multiplication; however, this leads to ugly and unreadable code in common circumstances. The problem is bad enough that significant amounts of code continue to use the opposite convention (which has the virtue of producing ugly and unreadable code in different circumstances), and this API fragmentation across codebases then creates yet more problems. There does not seem to be any good solution to the problem of designing a numerical API within current Python syntax -- only a landscape of options that are bad in different ways. The minimal change to Python syntax which is sufficient to resolve these problems is the addition of a single new infix operator for matrix multiplication.

Matrix multiplication has a singular combination of features which distinguish it from other binary operations, which together provide a uniquely compelling case for the addition of a dedicated infix operator:

* Just as for the existing numerical operators, there exists a vast body of prior art supporting the use of infix notation for matrix multiplication across all fields of mathematics, science, and engineering; @ harmoniously fills a hole in Python's existing operator system.
* @ greatly clarifies real-world code.
* @ provides a smoother onramp for less experienced users, who are particularly harmed by hard-to-read code and API fragmentation.
* @ benefits a substantial and growing portion of the Python user community.
* @ will be used frequently -- in fact, evidence suggests it may be used more frequently than // or the bitwise operators.
* @ allows the Python numerical community to reduce fragmentation, and finally standardize on a single consensus duck type for all numerical array objects.

Background: What's wrong with the status quo?

When we crunch numbers on a computer, we usually have lots and lots of numbers to deal with. Trying to deal with them one at a time is cumbersome and slow -- especially when using an interpreted language. Instead, we want the ability to write down simple operations that apply to large collections of numbers all at once. The n-dimensional array is the basic object that all popular numeric computing environments use to make this possible. Python has several libraries that provide such arrays, with numpy being at present the most prominent.

When working with n-dimensional arrays, there are two different ways we might want to define multiplication. One is elementwise multiplication:

    [[1, 2],     [[11, 12],     [[1 * 11, 2 * 12],
     [3, 4]]  x   [13, 14]]  =   [3 * 13, 4 * 14]]

and the other is matrix multiplication:

    [[1, 2],     [[11, 12],     [[1 * 11 + 2 * 13, 1 * 12 + 2 * 14],
     [3, 4]]  x   [13, 14]]  =   [3 * 11 + 4 * 13, 3 * 12 + 4 * 14]]

Elementwise multiplication is useful because it lets us easily and quickly perform many multiplications on a large collection of values, without writing a slow and cumbersome for loop. And this works as part of a very general schema: when using the array objects provided by numpy or other numerical libraries, all Python operators work elementwise on arrays of all dimensionalities. The result is that one can write functions using straightforward code like a * b + c / d, treating the variables as if they were simple values, but then immediately use this function to efficiently perform this calculation on large collections of values, while keeping them organized using whatever arbitrarily complex array layout works best for the problem at hand.

Matrix multiplication is more of a special case.
It's only defined on 2d arrays (also known as "matrices"), and multiplication is the only operation that has an important "matrix" version -- "matrix addition" is the same as elementwise addition; there is no such thing as "matrix bitwise-or" or "matrix floordiv"; "matrix division" and "matrix to-the-power-of" can be defined but are not very useful, etc. However, matrix multiplication is still used very heavily across all numerical application areas; mathematically, it's one of the most fundamental operations there is.

Because Python syntax currently allows for only a single multiplication operator *, libraries providing array-like objects must decide: either use * for elementwise multiplication, or use * for matrix multiplication. And, unfortunately, it turns out that when doing general-purpose number crunching, both operations are used frequently, and there are major advantages to using infix rather than function call syntax in both cases. Thus it is not at all clear which convention is optimal, or even acceptable; often it varies on a case-by-case basis.

Nonetheless, network effects mean that it is very important that we pick just one convention. In numpy, for example, it is technically possible to switch between the conventions, because numpy provides two different types with different __mul__ methods. For numpy.ndarray objects, * performs elementwise multiplication, and matrix multiplication must use a function call (numpy.dot). For numpy.matrix objects, * performs matrix multiplication, and elementwise multiplication requires function syntax. Writing code using numpy.ndarray works fine. Writing code using numpy.matrix also works fine. But trouble begins as soon as we try to integrate these two pieces of code together. Code that expects an ndarray and gets a matrix, or vice-versa, may crash or return incorrect results.
Keeping track of which functions expect which types as inputs, and return which types as outputs, and then converting back and forth all the time, is incredibly cumbersome and impossible to get right at any scale. Functions that defensively try to handle both types as input and DTRT, find themselves floundering into a swamp of isinstance and if statements.

PEP 238 split / into two operators: / and //. Imagine the chaos that would have resulted if it had instead split int into two types: classic_int, whose __div__ implemented floor division, and new_int, whose __div__ implemented true division. This, in a more limited way, is the situation that Python number-crunchers currently find themselves in.

In practice, the vast majority of projects have settled on the convention of using * for elementwise multiplication, and function call syntax for matrix multiplication (e.g., using numpy.ndarray instead of numpy.matrix). This reduces the problems caused by API fragmentation, but it doesn't eliminate them. The strong desire to use infix notation for matrix multiplication has caused a number of specialized array libraries to continue to use the opposing convention (e.g., scipy.sparse, pyoperators, pyviennacl) despite the problems this causes, and numpy.matrix itself still gets used in introductory programming courses, often appears in StackOverflow answers, and so forth. Well-written libraries thus must continue to be prepared to deal with both types of objects, and, of course, are also stuck using unpleasant funcall syntax for matrix multiplication. After nearly two decades of trying, the numerical community has still not found any way to resolve these problems within the constraints of current Python syntax (see Rejected alternatives to adding a new operator below).

This PEP proposes the minimum effective change to Python syntax that will allow us to drain this swamp.
It splits * into two operators, just as was done for /: * for elementwise multiplication, and @ for matrix multiplication. (Why not the reverse? Because this way is compatible with the existing consensus, and because it gives us a consistent rule that all the built-in numeric operators also apply in an elementwise manner to arrays; the reverse convention would lead to more special cases.)

So that's why matrix multiplication doesn't and can't just use *. Now, in the rest of this section, we'll explain why it nonetheless meets the high bar for adding a new operator.
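To make the split between the two operators concrete, here is a minimal pure-Python sketch that reproduces the 2x2 worked example from this section. The `Matrix` class is hypothetical and written only for illustration; it does no shape checking and is not how numpy or any other library implements arrays.

```python
class Matrix:
    """Hypothetical minimal 2d array type, for illustration only."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.rows == other.rows

    def __repr__(self):
        return f"Matrix({self.rows!r})"

    def __mul__(self, other):
        # Elementwise product: multiply entries at the same position.
        if not isinstance(other, Matrix):
            return NotImplemented
        return Matrix(
            [a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(self.rows, other.rows)
        )

    def __matmul__(self, other):
        # Matrix product: each entry is a row-by-column dot product.
        if not isinstance(other, Matrix):
            return NotImplemented
        columns = list(zip(*other.rows))
        return Matrix(
            [sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in self.rows
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[11, 12], [13, 14]])

elementwise = a * b   # Matrix([[11, 24], [39, 56]])
matrix_prod = a @ b   # Matrix([[37, 40], [85, 92]])
```

With both operators available, a single type can offer both products in infix form, and the choice between them is visible at the call site.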

Why should matrix multiplication be infix?

Right now, most numerical code in Python uses syntax like numpy.dot(a, b) or a.dot(b) to perform matrix multiplication. This obviously works, so why do people make such a fuss about it, even to the point of creating API fragmentation and compatibility swamps?

Matrix multiplication shares two features with ordinary arithmetic operations like addition and multiplication on numbers: (a) it is used very heavily in numerical programs -- often multiple times per line of code -- and (b) it has an ancient and universally adopted tradition of being written using infix syntax. This is because, for typical formulas, this notation is dramatically more readable than any function call syntax. Here's an example to demonstrate:

One of the most useful tools for testing a statistical hypothesis is the linear hypothesis test for OLS regression models. It doesn't really matter what all those words I just said mean; if we find ourselves having to implement this thing, what we'll do is look up some textbook or paper on it, and encounter many mathematical formulas that look like:

    S = (H \beta - r)^T (H V H^T)^{-1} (H \beta - r)

Here the various variables are all vectors or matrices.

Now we need to write code to perform this calculation. In current numpy, matrix multiplication can be performed using either the function or method call syntax. Neither provides a particularly readable translation of the formula:

    import numpy as np
    from numpy.linalg import inv, solve

    # Using dot function:
    S = np.dot((np.dot(H, beta) - r).T,
               np.dot(inv(np.dot(np.dot(H, V), H.T)), np.dot(H, beta) - r))

    # Using dot method:
    S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

With the @ operator, the direct translation of the above formula becomes:

    S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

Notice that there is now a transparent, 1-to-1 mapping between the symbols in the original formula and the code that implements it.
Of course, an experienced programmer will probably notice that this is not the best way to compute this expression. The repeated computation of Hβ − r should perhaps be factored out; and, expressions of the form dot(inv(A), B) should almost always be replaced by the more numerically stable solve(A, B). When using @, performing these two refactorings gives us:

    # Version 1 (as above)
    S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

    # Version 2
    trans_coef = H @ beta - r
    S = trans_coef.T @ inv(H @ V @ H.T) @ trans_coef

    # Version 3
    S = trans_coef.T @ solve(H @ V @ H.T, trans_coef)

Notice that when comparing between each pair of steps, it's very easy to see exactly what was changed. If we apply the equivalent transformations to the code using the .dot method, then the changes are much harder to read out or verify for correctness:

    # Version 1 (as above)
    S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

    # Version 2
    trans_coef = H.dot(beta) - r
    S = trans_coef.T.dot(inv(H.dot(V).dot(H.T))).dot(trans_coef)

    # Version 3
    S = trans_coef.T.dot(solve(H.dot(V).dot(H.T)), trans_coef)

Readability counts! The statements using @ are shorter, contain more whitespace, can be directly and easily compared both to each other and to the textbook formula, and contain only meaningful parentheses. This last point is particularly important for readability: when using function-call syntax, the required parentheses on every operation create visual clutter that makes it very difficult to parse out the overall structure of the formula by eye, even for a relatively simple formula like this one. Eyes are terrible at parsing non-regular languages. I made and caught many errors while trying to write out the 'dot' formulas above. I know they still contain at least one error, maybe more. (Exercise: find it. Or them.) The @ examples, by contrast, are not only correct, they're obviously correct at a glance.
If we are even more sophisticated programmers, and writing code that we expect to be reused, then considerations of speed or numerical accuracy might lead us to prefer some particular order of evaluation. Because @ makes it possible to omit irrelevant parentheses, we can be certain that if we do write something like (H @ V) @ H.T, then our readers will know that the parentheses must have been added intentionally to accomplish some meaningful purpose. In the dot examples, it's impossible to know which nesting decisions are important, and which are arbitrary.

Infix @ dramatically improves matrix code usability at all stages of programmer interaction.
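To illustrate why the order of evaluation can matter for speed, here is a small sketch that counts scalar multiplications for the two ways of parenthesizing a three-factor product. The shapes and the names `A`, `B`, `v`, and `matmul_cost` are hypothetical and chosen only for this example, and the cost model assumes the naive algorithm (n * m * p multiplications for an (n, m) by (m, p) product), which real libraries may improve on.

```python
def matmul_cost(shape_a, shape_b):
    """Return (scalar multiplications, result shape) for a naive product."""
    n, m = shape_a
    m2, p = shape_b
    if m != m2:
        raise ValueError(f"shape mismatch: {shape_a} @ {shape_b}")
    return n * m * p, (n, p)


# Hypothetical shapes: two square matrices and a column vector.
A = (1000, 1000)
B = (1000, 1000)
v = (1000, 1)

# (A @ B) @ v: the matrix-matrix product is computed first.
cost_ab, shape_ab = matmul_cost(A, B)
cost_ab_v, _ = matmul_cost(shape_ab, v)
left_first = cost_ab + cost_ab_v

# A @ (B @ v): only matrix-vector products are needed.
cost_bv, shape_bv = matmul_cost(B, v)
cost_a_bv, _ = matmul_cost(A, shape_bv)
right_first = cost_bv + cost_a_bv
```

Under these assumptions the two groupings give the same mathematical result but differ in cost by a factor of about 500, which is the kind of situation where explicit parentheses carry real meaning. In Python as released, an unparenthesized `A @ B @ v` groups from the left.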

Transparent syntax is especially crucial for non-expert programmers

A large proportion of scientific code is written by people who are experts in their domain, but are not experts in programming. And there are many university courses run each year with titles like "Data analysis for social scientists" which assume no programming background, and teach some combination of mathematical techniques, introduction to programming, and the use of programming to implement these mathematical techniques, all within a 10-15 week period. These courses are more and more often being taught in Python rather than special-purpose languages like R or Matlab.

For these kinds of users, whose programming knowledge is fragile, the existence of a transparent mapping between formulas and code often means the difference between succeeding and failing to write that code at all. This is so important that such classes often use the numpy.matrix type which defines * to mean matrix multiplication, even though this type is buggy and heavily disrecommended by the rest of the numpy community for the fragmentation that it causes. This pedagogical use case is, in fact, the only reason numpy.matrix remains a supported part of numpy. Adding @ will benefit both beginning and advanced users with better syntax; and furthermore, it will allow both groups to standardize on the same notation from the start, providing a smoother on-ramp to expertise.

But isn't matrix multiplication a pretty niche requirement?

The world is full of continuous data, and computers are increasingly called upon to work with it in sophisticated ways. Arrays are the lingua franca of finance, machine learning, 3d graphics, computer vision, robotics, operations research, econometrics, meteorology, computational linguistics, recommendation systems, neuroscience, astronomy, bioinformatics (including genetics, cancer research, drug discovery, etc.), physics engines, quantum mechanics, geophysics, network analysis, and many other application areas. In most or all of these areas, Python is rapidly becoming a dominant player, in large part because of its ability to elegantly mix traditional discrete data structures (hash tables, strings, etc.) on an equal footing with modern numerical data types and algorithms.

We all live in our own little sub-communities, so some Python users may be surprised to realize the sheer extent to which Python is used for number crunching -- especially since much of this particular sub-community's activity occurs outside of traditional Python/FOSS channels. So, to give some rough idea of just how many numerical Python programmers are actually out there, here are two numbers:

* In 2013, there were 7 international conferences organized specifically on numerical Python.
* At PyCon 2014, ~20% of the tutorials appear to involve the use of matrices.

To quantify this further, we used Github's "search" function to look at what modules are actually imported across a wide range of real-world code (i.e., all the code on Github). We checked for imports of several popular stdlib modules, a variety of numerically oriented modules, and various other extremely high-profile modules like django and lxml (the latter of which is the #1 most downloaded package on PyPI).
Starred lines indicate packages which export array- or matrix-like objects which will adopt @ if this PEP is approved:

    Count of Python source files on Github matching given search terms
                     (as of 2014-04-10, ~21:00 UTC)
    ================ ==========  ===============  =======  ===========
    module           "import X"  "from X import"  total    total/numpy
    ================ ==========  ===============  =======  ===========
    sys                 2374638            63301  2437939         5.85
    os                  1971515            37571  2009086         4.82
    re                  1294651             8358  1303009         3.12
    numpy ************** 337916 ********** 79065 * 416981 ******* 1.00
    warnings             298195            73150   371345         0.89
    subprocess           281290            63644   344934         0.83
    django                62795           219302   282097         0.68
    math                 200084            81903   281987         0.68
    threading            212302            45423   257725         0.62
    pickle+cPickle       215349            22672   238021         0.57
    matplotlib           119054            27859   146913         0.35
    sqlalchemy            29842            82850   112692         0.27
    pylab *************** 36754 ********** 41063 ** 77817 ******* 0.19
    scipy *************** 40829 ********** 28263 ** 69092 ******* 0.17
    lxml                  19026            38061    57087         0.14
    zlib                  40486             6623    47109         0.11
    multiprocessing       25247            19850    45097         0.11
    requests              30896              560    31456         0.08
    jinja2                 8057            24047    32104         0.08
    twisted               13858             6404    20262         0.05
    gevent                11309             8529    19838         0.05
    pandas ************** 14923 *********** 4005 ** 18928 ******* 0.05
    sympy                  2779             9537    12316         0.03
    theano *************** 3654 *********** 1828 *** 5482 ******* 0.01
    ================ ==========  ===============  =======  ===========

These numbers should be taken with several grains of salt (see footnote for discussion), but, to the extent they can be trusted, they suggest that numpy might be the single most-imported non-stdlib module in the entire Pythonverse; it's even more-imported than such stdlib stalwarts as subprocess, math, pickle, and threading. And numpy users represent only a subset of the broader numerical community that will benefit from the @ operator.
Matrices may once have been a niche data type restricted to Fortran programs running in university labs and military clusters, but those days are long gone. Number crunching is a mainstream part of modern Python usage.

In addition, there is some precedent for adding an infix operator to handle a more-specialized arithmetic operation: the floor division operator //, like the bitwise operators, is very useful under certain circumstances when performing exact calculations on discrete values. But it seems likely that there are many Python programmers who have never had reason to use // (or, for that matter, the bitwise operators). @ is no more niche than //.

So @ is good for matrix formulas, but how common are those really?

We've seen that @ makes matrix formulas dramatically easier to work with for both experts and non-experts, that matrix formulas appear in many important applications, and that numerical libraries like numpy are used by a substantial proportion of Python's user base. But numerical libraries aren't just about matrix formulas, and being important doesn't necessarily mean taking up a lot of code: if matrix formulas only occurred in one or two places in the average numerically-oriented project, then it still wouldn't be worth adding a new operator. So how common is matrix multiplication, really?

When the going gets tough, the tough get empirical. To get a rough estimate of how useful the @ operator will be, the table below shows the rate at which different Python operators are actually used in the stdlib, and also in two high-profile numerical packages -- the scikit-learn machine learning library, and the nipy neuroimaging library -- normalized by source lines of code (SLOC). Rows are sorted by the 'combined' column, which pools all three code bases together. The combined column is thus strongly weighted towards the stdlib, which is much larger than both projects put together (stdlib: 411575 SLOC, scikit-learn: 50924 SLOC, nipy: 37078 SLOC).

The dot row (marked ******) counts how common matrix multiply operations are in each codebase.
    Operator uses per 10,000 SLOC
    ====  ======  ============  ====  ========
      op  stdlib  scikit-learn  nipy  combined
    ====  ======  ============  ====  ========
       =    2969          5536  4932      3376
       -     218           444   496       261
       +     224           201   348       231
      ==     177           248   334       196
       *     156           284   465       192
       %     121           114   107       119
      **      59           111   118        68
      !=      40            56    74        44
       /      18           121   183        41
       >      29            70   110        39
      +=      34            61    67        39
       <      32            62    76        38
      >=      19            17    17        18
      <=      18            27    12        18
     dot ***** 0 ********** 99 ** 74 ****** 16
       |      18             1     2        15
       &      14             0     6        12
      <<      10             1     1         8
      //       9             9     1         8
      -=       5            21    14         8
      *=       2            19    22         5
      /=       0            23    16         4
      >>       4             0     0         3
       ^       3             0     0         3
       ~       2             4     5         2
      |=       3             0     0         2
      &=       1             0     0         1
     //=       1             0     0         1
      ^=       1             0     0         0
     **=       0             2     0         0
      %=       0             0     0         0
     <<=       0             0     0         0
     >>=       0             0     0         0
    ====  ======  ============  ====  ========

These two numerical packages alone contain ~780 uses of matrix multiplication. Within these packages, matrix multiplication is used more heavily than most comparison operators (<, !=, <=, >=). Even when we dilute these counts by including the stdlib into our comparisons, matrix multiplication is still used more often in total than any of the bitwise operators, and 2x as often as //. This is true even though the stdlib, which contains a fair amount of integer arithmetic and no matrix operations, makes up more than 80% of the combined code base.

By coincidence, the numeric libraries make up approximately the same proportion of the 'combined' codebase as numeric tutorials make up of PyCon 2014's tutorial schedule, which suggests that the 'combined' column may not be wildly unrepresentative of new Python code in general. While it's impossible to know for certain, from this data it seems entirely possible that across all Python code currently being written, matrix multiplication is already used more often than // and the bitwise operations.
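For readers who want to check the arithmetic behind these claims, the following sketch recomputes them from the SLOC figures and the dot row quoted in this section. The variable names are our own, and because the published rates are rounded to whole numbers per 10,000 SLOC, the recovered counts are approximations rather than the original raw tallies.

```python
# Figures quoted in the text: code base sizes and the 'dot' row of the table.
sloc = {"stdlib": 411575, "scikit-learn": 50924, "nipy": 37078}
dot_rate = {"stdlib": 0, "scikit-learn": 99, "nipy": 74}  # per 10,000 SLOC

# Approximate absolute number of matrix multiplications in each code base.
dot_uses = {name: dot_rate[name] * sloc[name] / 10_000 for name in sloc}

# Uses in the two numerical packages alone (the "~780" figure).
numeric_uses = dot_uses["scikit-learn"] + dot_uses["nipy"]

# Pooled rate across all three code bases (the 'combined' column).
total_sloc = sum(sloc.values())
combined_rate = sum(dot_uses.values()) / total_sloc * 10_000

# Fraction of the combined code base contributed by the stdlib.
stdlib_share = sloc["stdlib"] / total_sloc
```

The recomputed values agree with the text: a little under 780 uses in the two packages, a combined rate that rounds to 16 per 10,000 SLOC, and a stdlib share of roughly 82%.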