Application Intelligence

Product. How can we use this data to improve the product or better understand users? How can we allow users to learn from the data themselves?

Marketing. What does this data tell us about how users are using the application? Can this application data help us better understand marketing ROI?

Support. How can this data be used to help identify and resolve issues that users are having?

Managers. What does this data tell us about resource allocation? How can we tie this data to sales and other data sets?

IT. What type of data is being generated by the application, and how might we tune the database for this kind of data?





Coding. Have a developer write low-level, one-off code against the MongoDB API to discover the data stored there and answer relevant questions.

ETL. Develop a workflow to migrate, homogenize, and normalize the data into an RDBMS, where you can use existing analytics tooling (albeit on a data model that does not accurately represent the underlying data).
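To give a feel for what the ETL route involves, here is a minimal Python sketch of the "homogenize and normalize" step. The sample documents and the helper names `flatten` and `homogenize` are hypothetical, invented for illustration; a real workflow would also have to deal with arrays (typically by splitting them into child tables), which this sketch does not attempt.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into one level, using dotted column names."""
    row = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=path + "."))
        else:
            row[path] = value
    return row


def homogenize(docs):
    """Give every row the same columns, filling missing values with None."""
    rows = [flatten(d) for d in docs]
    columns = sorted({col for row in rows for col in row})
    return columns, [[row.get(col) for col in columns] for row in rows]


# Hypothetical documents with different shapes, as MongoDB allows.
DOCS = [
    {"user": "ann", "address": {"city": "Boulder"}},
    {"user": "bob", "age": 41},
]

COLUMNS, TABLE = homogenize(DOCS)
```

Here `COLUMNS` is `['address.city', 'age', 'user']` and each row in `TABLE` is padded with `None` where a document lacked a field. The padded table is what relational tooling wants, but it no longer reflects the shape of the original documents.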





SELECT DISTINCT user_name, SUM(music[*].likes[*].strength) AS strength
FROM collection
WHERE music[*].likes[*].name = 'david bowie'
GROUP BY user_name
ORDER BY strength DESC
LIMIT 10
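To make the intent of this query concrete, here is a minimal Python sketch that computes the same kind of result over a few in-memory documents. The sample data, the assumed document layout (`music` as a list of entries, each holding a `likes` list of name/strength objects), and the reading of `[*]` as "every element of the array" are assumptions for illustration. In particular, the sketch assumes that only the strengths of matching likes are summed. It is not how SlamData executes the query.

```python
from collections import defaultdict

# Hypothetical sample documents; the layout is inferred from the query text.
DOCS = [
    {"user_name": "ann", "music": [
        {"likes": [{"name": "david bowie", "strength": 5},
                   {"name": "queen", "strength": 2}]},
        {"likes": [{"name": "david bowie", "strength": 3}]},
    ]},
    {"user_name": "bob", "music": [
        {"likes": [{"name": "david bowie", "strength": 4}]},
    ]},
    {"user_name": "cat", "music": [
        {"likes": [{"name": "queen", "strength": 9}]},
    ]},
]


def top_fans(docs, artist, limit=10):
    """Sum like strength per user for one artist, strongest first."""
    totals = defaultdict(int)
    for doc in docs:
        for entry in doc.get("music", []):        # music[*]
            for like in entry.get("likes", []):   # likes[*]
                if like.get("name") == artist:    # WHERE name = artist
                    totals[doc["user_name"]] += like.get("strength", 0)
    # ORDER BY strength DESC, then LIMIT
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:limit]


RESULT = top_fans(DOCS, "david bowie")
```

With the sample data above, `RESULT` is `[('ann', 8), ('bob', 4)]`. Even this small hand-written version needs two nested loops and explicit grouping, which is the kind of one-off code the SQL form is meant to replace.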

Structural type inference. SlamData does not scan the database to learn the structure of the data. Instead, SlamData uses a structural type system, complete with bidirectional type inference, which allows SlamData to parse the intent of a query and generate an execution plan consistent with that intent. For example, if your query uses a field as if it were a string, then SlamData will look for documents in which the field is a string. SlamData will also warn you when you attempt to do nonsensical things, like adding 4 to a string, because even though SlamData doesn’t know what’s in the database, it does know what operations make sense on what data types.

Multi-dimensional relational algebra. SlamSQL is built on a formal extension of relational algebra called multi-dimensional relational algebra (MRA). This more powerful (but backward-compatible) foundation allows slicing, dicing, and aggregating nested, non-uniform data. As a pleasant side effect, it also gives sensible semantics to many SQL queries which are not allowed in ANSI SQL (for example, SELECT price / SUM(price) AS percent FROM ORDERS).

Advanced multi-staged compilation. MongoDB has three distinct mechanisms for executing a query (one of them being full-fledged map/reduce), and each has different strengths and weaknesses. In general, efficiently executing a complex query might require a combination of all three. SlamData has an advanced multi-stage, optimizing planner which attempts to find the optimal combination of all three mechanisms.

In-database execution. SlamData is extremely aggressive about pushing execution of queries into the database. In fact, 100% of every query will be executed directly in the database, with no streaming back to the client for post-processing. Other attempts at solving this problem rely on client-side processing for most queries, because executing every part of every query inside the database is extremely difficult to do in a performant way (hence the need for the advanced, multi-staged compilation).
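As an illustration of the `SELECT price / SUM(price) AS percent FROM ORDERS` example, the Python sketch below shows one plausible reading of that query: each order's price divided by the total over all orders. This interpretation, the sample `ORDERS` data, and the helper name `percent_of_total` are assumptions made for illustration, not a statement of SlamSQL's exact semantics.

```python
# Hypothetical orders; only the price field matters for this example.
ORDERS = [{"price": 10.0}, {"price": 30.0}, {"price": 60.0}]


def percent_of_total(orders):
    """Return each order's share of the summed price."""
    total = sum(order["price"] for order in orders)  # SUM(price)
    return [order["price"] / total for order in orders]  # price / SUM(price)


PERCENTS = percent_of_total(ORDERS)
```

The result has one value per order, and the values sum to 1. ANSI SQL rejects the query because it mixes a per-row value with an aggregate without a GROUP BY; under the reading sketched here, the aggregate is simply computed once and applied to every row.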





By John A. De Goes (SlamData), Dec 2014.

SlamData is an open source tool that makes analytics on MongoDB easy and accessible to developers and non-developers alike. We just launched v1.1, which greatly increases the power of the tool and fixes a number of issues identified in the 1.0 release.

MongoDB is currently the fastest growing and most successful NoSQL database. Companies are using the database primarily to build web and mobile applications.

Successful applications built on MongoDB end up capturing or generating large amounts of data. The process of understanding this data, which I call Application Intelligence, is vital to multiple stakeholders in the business: product, marketing, support, managers, and IT.

MongoDB does not have rigid schemas (every “row” may have a different structure from every other “row”), and allows arbitrary nesting of data (“rows” can contain other “tables”).

While this flexibility leads to faster application development and better performance and scaling properties, it comes at a cost.

In the relational world, to answer these types of questions you’d simply use a data discovery and ad hoc analytics tool. But in the world of MongoDB, if you need Application Intelligence, you have two choices: coding or ETL.

Ultimately, neither approach is scalable, which is why we started SlamData, an open source project based on the premise that NoSQL data is [...], and analytics tooling needs to [...] with modern data.

SlamData provides a standard SQL interface to NoSQL data stored in MongoDB. Every SQL query is executed 100% in the database (or in a replica set), and operates on the actual structure of the data.

This approach differs substantially from other solutions to the problem, which stream data from the database to handle complex queries, and which superimpose a fake relational view of the underlying data (even when it is not relational).

SlamData’s dialect of SQL (called SlamSQL) extends ANSI SQL to support nested data, heterogeneous data, and aggregation over nested dimensions (for example, summing elements in an array stored inside a row).

The example SlamSQL query in this article ranks users by how strongly they like 'david bowie'. In this query, documents which are doubly nested in arrays are being used to filter and sum values in the overall result. This query would be impossible in an RDBMS, and the equivalent code for the MongoDB API would be very difficult to write, troubleshoot, and understand.

By leveraging industry standard SQL, SlamData makes it possible for a wide range of users and tools to interface with MongoDB, and helps teams quickly and easily understand the data generated or collected by their MongoDB applications.

In the current 1.1 release, all standard SQL clauses are supported, including SELECT, AS, FROM, JOIN, WHERE, GROUP BY, HAVING, OUTER JOIN, CROSS, and more.

The SlamData project innovates in several key ways: structural type inference, multi-dimensional relational algebra, advanced multi-staged compilation, and in-database execution.

The combination of these features makes SlamData “point and query”: point SlamData at your MongoDB database, and do [...] you want on [...] of data. SlamData will generate the optimal query plan and execute it 100% in the database.

If you are using MongoDB and would like to try SlamData, you can find installers on the official website, or you can compile the project from source code on GitHub.

SlamData is a 100% open source project, so if you like what you see, please consider supporting the project. We also have a newsletter you can sign up for on the official website. Enjoy!

John A. De Goes, @jdegoes, is a founder and CTO of SlamData, and a contributor to the open source SlamData project. Previously, he was General Manager of DataMesh, Principal Architect at RichRelevance, and CEO/CTO of Precog.