The idea that evaluation can affect the types of a program may seem unconventional at first, but this is just a matter of perspective. A Histogram program is a list of interactions and evaluation is one such interaction. The type of a value represented by a reference thus only depends on the interactions -- or parts of a program -- involving the reference.

The way this behaviour is implemented in Histogram is quite simple. Each library, such as the one providing the global `data` value, can define a $\textit{refine}$ function that looks as follows:

```mathjax
\textit{refine} : \textit{type} \times \textit{value} \rightarrow \textit{type}
```

When processing the $\text{evaluate}$ interaction, Histogram computes the value for a specified reference, stores it as part of the current program state and invokes the $\textit{refine}$ function to obtain a new, more specific type for the reference. In our prototype, the types returned by the $\textit{refine}$ operation are always subtypes of the original type and so this preserves the property that programs created using well-typed interactions remain well-typed.

### _5.3_ User experience of type refinement

The user experience that our approach makes possible is, in fact, familiar from a number of existing developer environments. Recall the motivating example of using the pandas library in a Jupyter notebook to load and process data. Python defers all checking to runtime, but the way you typically interact with the system is that you write code to load an input file (or a sample input file), run it and only then continue writing code to process it. In our example, we only wrote `raw.value` after we ran the code to load the `raw` data table and after we saw the available columns in the preview. Histogram captures this way of working and makes it an inherent part of its type system.
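To make this concrete, the following is a minimal sketch of how an evaluate interaction could refine the type of a reference so that members such as `raw.value` only become available after evaluation. It is written in Python rather than the F# of the prototype, and every name in it (`TableType`, `Program`, `define`, `completions`, `is_subtype`) is a hypothetical stand-in introduced for illustration, not part of the Histogram implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class TableType:
    """Type of a data table; `columns` is None while the columns are unknown."""
    columns: Optional[Tuple[str, ...]] = None


def refine(typ: TableType, value: List[Row]) -> TableType:
    """refine : type x value -> type (assumed shape, following the signature above)."""
    if not value:
        return typ  # an empty table tells us nothing new
    return TableType(columns=tuple(value[0].keys()))


def is_subtype(sub: TableType, sup: TableType) -> bool:
    """A table with known columns is treated as a subtype of the unrefined table."""
    return sup.columns is None or sub.columns == sup.columns


class Program:
    """Hypothetical program state: a formula, a type and (once evaluated) a value per reference."""

    def __init__(self) -> None:
        self.formulas: Dict[str, Callable[[], List[Row]]] = {}
        self.types: Dict[str, TableType] = {}
        self.values: Dict[str, List[Row]] = {}

    def define(self, ref: str, formula: Callable[[], List[Row]], typ: TableType) -> None:
        self.formulas[ref] = formula
        self.types[ref] = typ

    def evaluate(self, ref: str) -> None:
        """The evaluate interaction: compute, store the value, then refine the type."""
        value = self.formulas[ref]()
        self.values[ref] = value
        refined = refine(self.types[ref], value)
        assert is_subtype(refined, self.types[ref])  # refinement never widens a type
        self.types[ref] = refined

    def completions(self, ref: str) -> List[str]:
        """Members that auto-complete could offer for the reference."""
        return list(self.types[ref].columns or ())


program = Program()
program.define("raw", lambda: [{"year": 2019, "value": 1.5}], TableType())
before = program.completions("raw")  # no columns known yet
program.evaluate("raw")
after = program.completions("raw")   # columns read off the evaluated table
```

In this sketch, `before` is empty and `after` lists the two columns of the sample row, which mirrors the notebook workflow of loading data first and writing `raw.value` second.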
A similar user experience arises when writing JavaScript code in the console window of [Firefox Developer tools](https://developer.mozilla.org/en-US/docs/Tools). When you type `"test".length` followed by a `.` you get a completion list with members of a number value. However, if you define a function `length` that returns the `length` member of its argument and type `length("test")` followed by a `.` you do not get any suggestions. The difference is that, in the first case, the editor silently evaluates the code, because it assumes this will not have unexpected consequences, but in the second case, it does not. This is a somewhat ad-hoc decision made by the tool authors and Histogram provides a more principled way of capturing it.

Finally, our approach is also related to the way type providers ([Syme et al., 2013](#refs), [Christiansen, 2013](#refs)) work. Type providers execute code at compile-time to generate static types from external data sources. They can also take static parameters such as a database name or a service URL, but those have to be constants that are known at compile-time. Provided types are then used to generate auto-complete suggestions when writing code. Type providers are similar to the way Histogram works in that some code is evaluated in order to give more precise type information. However, type providers are limited to a two-stage way of working -- they run at compile-time and can only depend on constant parameters. The mechanism in Histogram can be seen as a generalisation that supports multiple stages. This can be useful in data science scenarios that transform data dynamically and might, for example, drop certain columns based on the values they contain.

**_6._ Implementation:** Building an interactive essay
------------------------------------------------------

Before discussing the design choices and future work, it is worth adding a few notes on how this interactive essay has been implemented.
This essay uses the capabilities provided by the web to allow the reader to experience some aspects of the design of the Histogram system. We believe that this way of presenting the work lets us focus on the important aspects of our work -- rather than developing theoretical foundations of Histogram or conducting user studies, we want to communicate the style of interaction and some of the consequences of our system design. This essay thus follows some of the ideas discussed in a recent position paper on evaluating programming systems design by [Edwards et al. (2019)](#refs).

The code behind the present essay is available [on GitHub](https://github.com/tpetricek/histogram). The implementation is written in F# and uses the [Fable compiler](https://fable.io) to compile it into JavaScript. One technically interesting aspect of our implementation is how the scrollytelling effect is implemented, especially because our approach could likely be reused in other similar essays.

Our system uses the Elm architecture, also known as model-view-update ([Czaplicki et al., 2019](#refs)), where a user interface consists of a type representing the current state (model) together with a type representing events (updates) that can occur in the user interface, a function that updates the current state when an event occurs and a function that can render the current state. In our case, the model type includes the Histogram program and various properties of the user interface. The updates include various user interface events to highlight and select references in the Histogram program and choose items in a menu, but also an event that represents a Histogram interaction as defined in section 3.2. The interactive demos in this essay are represented simply as lists of events that are passed to the update function of our system as the reader scrolls through the page.
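The pattern of replaying a list of events through the update function can be sketched as follows. This is a simplified illustration in Python, not the F# code of the essay: the `Model` fields, the `Select` and `Interact` event types, the interaction strings and the `replay` helper are all assumptions made for the example. Rebuilding the model from the initial state for each scroll position is likewise just one possible design, chosen here because it makes scrolling backwards trivial.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Model:
    """Current state: the program (a list of interactions) plus UI properties."""
    program: Tuple[str, ...] = ()
    selected: Optional[str] = None


@dataclass(frozen=True)
class Select:
    """User interface event: highlight or select a reference."""
    ref: str


@dataclass(frozen=True)
class Interact:
    """Event wrapping a Histogram interaction (represented as plain text here)."""
    interaction: str


Event = Union[Select, Interact]


def update(model: Model, event: Event) -> Model:
    """Return the new state after an event; the old state is left untouched."""
    if isinstance(event, Select):
        return replace(model, selected=event.ref)
    if isinstance(event, Interact):
        return replace(model, program=model.program + (event.interaction,))
    raise TypeError(f"unknown event: {event!r}")


def render(model: Model) -> str:
    """Render the state as text; a real view would produce HTML."""
    lines = [f"selected: {model.selected}"]
    lines += [f"{i + 1}. {step}" for i, step in enumerate(model.program)]
    return "\n".join(lines)


def replay(demo: List[Event], steps: int) -> Model:
    """State after the first `steps` events of a demo, e.g. for a scroll position."""
    model = Model()
    for event in demo[:steps]:
        model = update(model, event)
    return model


# A demo is just data: UI events mixed with Histogram interactions.
demo: List[Event] = [
    Select("data"),
    Interact("complete data.load"),
    Interact("evaluate raw"),
]
halfway = replay(demo, 2)
finished = replay(demo, len(demo))
```

Because a demo is an ordinary list, the same `update` function serves both live user input and scripted walkthroughs, and `render(replay(demo, n))` gives the view for any point in the scroll.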
We want to show what interacting with the prototype user interface looks like and so those events include both Histogram interactions and other user interface events. Although our current implementation is not directly reusable, the general pattern of using the Elm architecture, representing demos as lists of events and exposing them via scrollytelling seems to be an easy-to-reuse method for building interactive essays.

**_7._ Remarks:** Design choices and future work
------------------------------------------------

This essay does not aim to present a fully developed and functional programming system. Instead, it should be seen as an exploration of ideas in a certain design space. As such, it leaves many open questions, unexplored links and possible directions for future work. In this section, we briefly review some of those.

### _7.1_ From constructing to editing

The most obvious limitation of our system is that it does not allow the user to modify a program once it has been constructed. You can create a program, but the only way of changing it is to undo interactions and redo them differently. This lets users correct small mistakes, but it is inadequate for a real-world system.

There are several ways in which editing of programs could be supported. We could either rewrite the history and treat program edits as meta-interactions, or we could add new kinds of interactions to represent different edit operations. In the future, we plan to explore the latter approach, but both of these pose the same interesting problem. Program edits can make programs (at least temporarily) invalid. This can be addressed by introducing holes as done by Hazel ([Omar et al., 2019](#refs), [Omar et al., 2017](#refs)), but our program representation allows another possible approach. Imagine that we add an interaction that removes a formula at a given reference.
To keep the displayed program valid, we would also recursively remove all formulas that depend on the one removed by the user. This can remove quite a lot of code, but this code is not lost. The environment knows about it, because it is a part of the history, and it could then offer to restore the temporarily removed code once the errors caused by the initial deletion are resolved. The user experience of this approach remains to be investigated, but we believe this might be a compelling (or a complementary) alternative to an approach based on holes.

### _7.2_ Simplifying the spreadsheet interface

Our prototype comes with a somewhat more programmer-focused code-based interface and a somewhat more user-focused interface akin to a spreadsheet. Both of these could be further improved to better support their respective users.

Our spreadsheet view displays code in cells, whereas a typical spreadsheet displays values. We could do this for references that have been evaluated, but many of our values would not fit in a single spreadsheet cell and we would need to be able to map a reference to a range in the spreadsheet. We could then display a data table or a row as an actual data table in the spreadsheet. There are other values that might not be easy to display in a spreadsheet such as operations -- we briefly discuss these in section 7.5. Finally, our spreadsheet uses automatic layout and it would be interesting to find ways of giving the user (more) control over how this is done.

### _7.3_ Experts and keyboard-based interactions

A complementary problem is how to better serve expert users. Our current prototype can display programs as code, but the interactions with those are through an arguably cumbersome user interface. An appealing alternative would be to offer a command prompt where programmers can enter interactions using a keyboard. This could be made efficient by using suitable shortcuts and auto-complete.
Interestingly, this way of interacting with the system is somewhat similar to the way in which theorem provers like Coq ([Barras et al., 1997](#refs)) work. To prove a theorem in Coq, one invokes a series of tactics that transform the proof obligations. A proof is a list of tactic invocations, which is similar to our case where a program is a list of programming interactions.

### _7.4_ Histogram programs as Histogram values

In a recent exchange, [Basman et al. (2018)](#refs) and [Petricek (2018)](#refs) discuss how to design a programming substrate, or programming environment, that would enable open authorship, i.e. allow the users to gradually progress from using a system to making small changes and, eventually, to modifying the system itself. Smalltalk ([Goldberg and Robson, 1983](#refs)) is a canonical example of a system that can be modified through itself, although this requires expert programming skills.

The current Histogram prototype is very simple and does not even let you create new objects such as the built-in `data` value. However, it is interesting to consider how this could be changed. The first interesting step would be to make a Histogram program, i.e. the list of interactions, a value that could itself be edited and modified in the Histogram environment. More specifically, we could represent individual interactions as members that are invoked on an object that represents the program. This could make Histogram more powerful by enabling an interesting form of meta-programming in the language.

### _7.5_ Lowering the cost of abstraction

Finally, the current Histogram prototype makes it easier to construct functions by starting with concrete data, writing code to process it and then extracting a function by choosing some inputs as parameters. This is more concrete than writing a function in a traditional programming language, but we are still left with conceptually challenging function values in our program.
We would like to make this experience more akin to using the "copy down" functionality in Excel that copies a formula to a table of inputs.

One possible approach is to use something akin to linked editing ([Toomim et al., 2004](#refs)) or managed copy-and-paste as known from Subtext ([Edwards, 2006](#refs)). This would make it possible to reuse the same code in multiple contexts, perhaps with modifications. It would also fit well with our approach as copy-and-paste could be just another kind of interaction. Another approach would be to represent functions as hypothetical interactions -- a function constructed from a concrete computation would be a sequence of interactions that copy its arguments to the input references, followed by an evaluate interaction and an interaction that copies the result back to a reference specified by the caller. This is, in fact, how our current prototype implements functions, but we do not currently expose this to the user.

### _7.6_ Exploring the design space

This essay should be primarily seen as an exploration of a design space in the sense discussed by [Edwards et al. (2019)](#refs). We chose simple data exploration as our problem domain and the idea of representing programs as lists of interactions as our starting point. We then followed a path towards a minimal demo-able prototype, making a number of design decisions along the way.

Numerous related projects occupy a similar space, but make different design decisions. In Subtext ([Edwards, 2005](#refs)), programmers interact with trees and treat copying as a central operation; work on direct programming ([Edwards, 2018](#refs)) moves towards unifying program and data in a way where both can be edited by direct manipulation. Hazel and Hazelnut ([Omar et al., 2019](#refs), [Omar et al., 2017](#refs)) form a structured programming environment with a calculus of interactions at its core, but with a focus on editing incomplete programs represented as abstract syntax trees.
Our work briefly explored the idea of editing code via direct manipulation with a preview. We did so via a mechanism where the user interface can trigger interactions, which are then added to the program. The Sketch-n-Sketch project ([Hempel et al., 2019](#refs)) takes the idea of direct manipulation much further and explores ways of synthesizing program updates to synchronize the code and preview ([Chugh et al., 2016](#refs)). This can be captured using the notion of bidirectional evaluation ([Mayer et al., 2018](#refs)).

**_8._ Conclusions:** Glimpse of the future
-------------------------------------------

Every now and then, somebody remarks that all programming languages have already been invented or, at least, that we have already explored most of the interesting corners of the design space for programming languages. The purpose of this essay is to show that there still are unexplored corners and that we can make relatively simple design decisions that will have interesting consequences.

The key idea in this essay was to represent programs as lists of interactions that were used to create the program. This way, we want to shift focus from thinking about _programs_ to thinking about _programming_. The key concepts of a programming language are then no longer different kinds of expressions, but rather operations such as refactoring, the use of auto-complete and interactive evaluation of parts of the code.

We explored the idea in the context of a simple programming environment for data exploration. Our way of representing programs has a number of interesting consequences. If we construct functions via a refactoring, we can keep the original inputs as sample values for previews; if we include evaluation as an interaction, our type system can give more precise information for parts of a program that have been evaluated. Our way of representing programs also enables valuable user interface features.
We can display the same program both as source code and as a spreadsheet and we can easily let users construct programs by directly manipulating data in previews.

As the quote by Carl Sagan that we borrowed for the title of this essay says, "You have to know the past to understand the present". When it comes to programs, knowing their past might not be strictly necessary for understanding their present, but it certainly enables a range of interesting user experiences that are worth further study or, perhaps, even a new programming paradigm.

References
----------

1. Aaron, S., & Blackwell, A. F. (2013). _From sonic Pi to overtone: creative musical experiences with domain-specific and functional languages_. In Proceedings of the first ACM SIGPLAN workshop on Functional art, music, modeling & design (pp. 35-46). ACM.
1. Barras, B., Boutin, S., Cornes, C., Courant, J., Filliatre, J. C., Gimenez, E., Herbelin, H., Huet, G., Munoz, C., Murthy, C., et al. (1997). _The Coq proof assistant reference manual: Version 6.1_. Research Report RT-0203, INRIA, 1997, pp. 214.
1. Basman, A., Tchernavskij, P., Bates, S., & Beaudouin-Lafon, M. (2018). _An anatomy of interaction: co-occurrences and entanglements_. In Conference Companion of the 2nd International Conference on Art, Science, and Engineering of Programming (pp. 188-196). ACM.
1. Burnett, M. M., Atwood, J. W., & Welch, Z. T. (1998). _Implementing level 4 liveness in declarative visual programming languages_. In Proceedings. 1998 IEEE Symposium on Visual Languages (Cat. No. 98TB100254) (pp. 126-133). IEEE.
1. Christiansen, D. R. (2013). _Dependent type providers_. In Proceedings of the 9th ACM SIGPLAN workshop on Generic programming (pp. 25-34). ACM.
1. Chugh, R., Hempel, B., Spradlin, M., & Albers, J. (2016). _Programmatic and direct manipulation, together at last_. In ACM SIGPLAN Notices (Vol. 51, No. 6, pp. 341-354). ACM.
1. Czaplicki, E. and contributors (2019). _The Elm Architecture_.
Available online at [https://guide.elm-lang.org/architecture/](https://guide.elm-lang.org/architecture/)
1. DeLine, R., Fisher, D., Chandramouli, B., Goldstein, J., Barnett, M., Terwilliger, J. F., & Wernsing, J. (2015). _Tempe: Live scripting for live data_. In VL/HCC (pp. 137-141).
1. Edwards, J. (2006). _First Class Copy & Paste_. Computer Science and Artificial Intelligence Laboratory Technical Report, MIT-CSAIL-TR-2006-037.
1. Edwards, J. (2005). _Subtext: uncovering the simplicity of programming_. In ACM SIGPLAN Notices (Vol. 40, No. 10, pp. 505-518). ACM.
1. Edwards, J. (2018). _Direct programming_. In Proceedings of the 29th Annual Workshop of the Psychology of Programming Interest Group, PPIG 2018. Available online at [https://vimeo.com/274771188](https://vimeo.com/274771188)
1. Edwards, J., Kell, S., Petricek, T., & Church, L. (2019). _Evaluating programming systems design_. To appear in Proceedings of the 30th Annual Workshop of the Psychology of Programming Interest Group, PPIG 2019.
1. Goldberg, A., & Robson, D. (1983). _Smalltalk-80: the language and its implementation_. Addison-Wesley Longman Publishing Co., Inc.
1. Hempel, B., Lubin, J., & Chugh, R. (2019). _Sketch-n-Sketch: Output-Directed Programming for SVG_. In Proceedings of the ACM Symposium on User Interface Software and Technology (UIST), New Orleans, LA, October 2019.
1. Hundhausen, C. D., Farley, S. F., & Brown, J. L. (2009). _Can direct manipulation lower the barriers to computer programming and promote transfer of training?: An experimental study_. ACM Transactions on Computer-Human Interaction (TOCHI), 16(3), 13.
1. Kluyver, T., Ragan-Kelley, B., Pérez, F., Granger, B.E., Bussonnier, M., Frederic, J., Kelley, K., Hamrick, J.B., Grout, J., Corlay, S., & Ivanov, P. (2016). _Jupyter Notebooks-a publishing format for reproducible computational workflows_. In ELPUB (pp. 87-90).
1. Ludäscher, B., Altintas, I., Berkley, C., Higgins, D., Jaeger, E., Jones, M., ... & Zhao, Y. (2006).
_Scientific workflow management and the Kepler system_. Concurrency and Computation: Practice and Experience, 18(10), 1039-1065.
1. Maloney, J. H., & Smith, R. B. (1995). _Directness and liveness in the morphic user interface construction environment_. In ACM Symposium on User Interface Software and Technology (Vol. 95, pp. 21-28).
1. Mayer, M., Kunčak, V., & Chugh, R. (2018). _Bidirectional Evaluation with Direct Manipulation_. In Proceedings of the ACM on Programming Languages (PACMPL), Issue OOPSLA, Boston, MA, November 2018.
1. McDirmid, S. (2013). _Usable live programming_. In Proceedings of the 2013 ACM international symposium on New ideas, new paradigms, and reflections on programming & software (pp. 53-62). ACM.
1. McDirmid, S. (2007). _Living it up with a live programming language_. In ACM SIGPLAN Notices (Vol. 42, No. 10, pp. 623-638). ACM.
1. McKinney, W. (2011). _pandas: a foundational Python library for data analysis and statistics_. Python for High Performance and Scientific Computing, 14.
1. Omar, C., Voysey, I., Chugh, R., & Hammer, M. A. (2019). _Live functional programming with typed holes_. Proceedings of the ACM on Programming Languages, 3(POPL), 14.
1. Omar, C., Voysey, I., Hilton, M., Aldrich, J., & Hammer, M. A. (2017). _Hazelnut: a bidirectionally typed structure editor calculus_. ACM SIGPLAN Notices, 52(1), 86-99.
1. Petricek, T. (2018). _Critique of 'An anatomy of interaction: co-occurrences and entanglements'_. In Conference Companion of the 2nd International Conference on Art, Science, and Engineering of Programming (pp. 197-201). ACM.
1. Petricek, T., Guerra, G., & Syme, D. (2016). _Types from data: Making structured data first-class citizens in F#_. In ACM SIGPLAN Notices (Vol. 51, No. 6, pp. 477-490). ACM.
1. Sandewall, E. (1978). _Programming in an interactive environment: the LISP experience_. ACM Computing Surveys, 10(1), 35-71.
1. Sarkar, A., & Gordon, A. (2018). _How do people learn to use spreadsheets? (Work in progress)_.
In Proceedings of the 29th Annual Conference of the Psychology of Programming Interest Group (PPIG 2018), pp. 28-35.
1. Scherlis, W. L., & Scott, D. S. (1983). _First steps towards inferential programming_. In Program Verification (pp. 99-133). Springer, Dordrecht, 1993.
1. Schiller, J., Turbak, F., Abelson, H., Dominguez, J., McKinney, A., Okerlund, J., & Friedman, M. (2014). _Live programming of mobile apps in App Inventor_. In Proceedings of the 2nd Workshop on Programming for Mobile & Touch (pp. 1-8). ACM.
1. Seyser, D., & Zeiller, M. (2018). _Scrollytelling -- An Analysis of Visual Storytelling in Online Journalism_. In 2018 22nd International Conference Information Visualisation (IV) (pp. 401-406). IEEE.
1. Shneiderman, B. (1981). _Direct manipulation: A step beyond programming languages_. In ACM SIGSOC Bulletin (Vol. 13, No. 2-3, p. 143). ACM.
1. Syme, D., Battocchi, K., Takeda, K., Malayeri, D., & Petricek, T. (2013). _Themes in information-rich functional programming for internet-scale data sources_. In Proceedings of the 2013 workshop on Data driven functional programming (pp. 1-4). ACM.
1. Tanimoto, S. L. (2013). _A perspective on the evolution of live programming_. In Proceedings of the 1st International Workshop on Live Programming (pp. 31-34). IEEE Press.
1. Toomim, M., Begel, A., & Graham, S. L. (2004). _Managing duplicated code with linked editing_. In 2004 IEEE Symposium on Visual Languages-Human Centric Computing (pp. 173-180). IEEE.
1. Victor, B. (2012). _Inventing on principle_. CUSEC 2012. Available online at: [https://vimeo.com/36579366](https://vimeo.com/36579366)
1. Victor, B. (2012). _Learnable programming? Designing a programming system for understanding programs_. Available online at: [http://worrydream.com/LearnableProgramming/](http://worrydream.com/LearnableProgramming/)