So we have a big vision here at Codebox to keep experimenting. We are looking to start pushing the boundaries of what dev tools with intelligence could look like.

One such experiment that we have a working prototype internally is something called Syntax AI.

Core Concepts

Syntax AI consists of the following:

An inline editor assistant that is context aware of your projects code and dependencies.

Using voice that allows for “natural” (more on this later) language, giving Syntax AI the context in which it is working, such as your project and the libraries you are using.

Use the open source community code that exists on such sites as GitHub, GitLab and others to help train and feed the Syntax AI assistant.

For the example below we have used React.js (not limited to) as the context in which we refer to as the framework.

Early Demo of Speech without ML

The above does not have machine learning enabled. Also excuse my brummie voice :D

The above has machine learning enabled and additional error checking for React DOM elements.

Phases

As per the demo let’s work with the phrase:

“create hello world component with props name, age and gender as a span element”

As an example of our current API this is currently how it looks, (prone to many changes):



import { terms, transform } from ' import lexer from ' @syntax -ai/language';import { terms, transform } from ' @syntax -ai/react'; const phrase = lexer(

'create hello world component with props name age gender as a span element',

terms

); const { code, errors } = transform(phrase);

In order to start digging into this and hooking it up so that we can generate code we need some how to start understanding the phrase in which your voice is saying.

Speech Recognition

Quite simply we can use the Google Cloud Speech API to stream audio to recognise phrases into text form.

In addition the Google Cloud Speech API allows us to feed it the relevant framework terms. In this case we pass in the following terms for our demo:

const TERMS = [

"component",

"components",

"props",

"prop",

"prop types",

"prop type",

"element"

]; export default TERMS;

We still need to work on stemming, so apologies for the dupes ;o)

The API allows it to improve accuracy when speaking such terms that are not in the normal dictionary for a specific language.

Lexing Phase

The lexing phase looks at our example phrase and starts tokenising it so that it understands the context in which it needs to then generate the relevant code.

So you end up with a tokenized string such as:

Code AST Phase

After we have this we can then move on to the next phase to then generate the syntax AST. For this we use Babel to generate the AST then output the code based on the tokenized phrase generated.

We for see each framework library such as ‘@syntax-ai/react’, ‘@syntax-ai/angular’, ‘@syntax-ai/vue’ etc an implementation of phrases it is aware off based on the tokenized phrases.

Machine Learning Phase

After the code is generated in parallel we also have a phase of machine learning that you can attach to each of the different parts of the code AST generated. For our example you notice we did not specify the types for the props “name, age and gender”.

We scrape the data from the public GitHub source code to generate a model that stores prop type names and the type in which they have been set. Once we have a huge set of sample data we then feed this into AWS Machine Learning as a multi-class classification model.

http://docs.aws.amazon.com/machine-learning/latest/dg/types-of-ml-models.html

Demo of our prediction, we send “age” and get a prediction it will be of type number:

This model is then sent the prop type name and we then get a probability of what the prop type “should” be. We believe this is the key in perhaps unlocking the ability to a more natural phrase for an assistant.

These models we hope to host and have many types that can help understand and assist developers when writing code within their editor as naturally as possible.

Potential Use Cases

Accessibility

An assistant such as this could help greatly with accessibility and helping people code with certain disabilities. It can also help lessen the load of repetition day to day and help alleviate RSI.

Teacher

A way of helping people learn to code, drawing of knowledge across the whole of the open source eco system and your local project (in the piepline) this could assist from less experienced to senior engineers generate code based on the frameworks and coding styles your project and the wider open source community.

Security

With all of this data potential advisories such as our integration with the Node Security Platform could recognize them and flag them immediately as you code.

Summary

We are only touching the surface of what this potentially could do to help assist engineers to get on and focus on building features rather than repetitive tasks day to day.

Watch this space for our new package cli that allows a consistent bi-directional stream to enhance your developer workflow at speed with Codebox registries such as a private npm registry.

https://codebox.sh