Natural language processing algorithms are a web developer’s best friend. These powerful tools easily integrate into apps, projects, and prototypes to extract insights and help developers understand text-based content.

We’ve put together a collection of the most common NLP algorithms freely available on Algorithmia. Think of these tools as the building blocks needed to quickly create smart, data-rich applications.

New to NLP? Start with our guide to natural language processing algorithms.

Quickly get the tl;dr on any article. Summarizer creates a summary of a text document while retaining its most important parts. It takes a large block of unstructured text as a string and extracts the key sentences based on the frequency of the topics and terms it finds. Try out Summarizer here.

Sample Input:

"A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending. We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work. The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power. As long as a majority of CPU power is controlled by nodes that are not cooperating to attack the network, they'll generate the longest chain and outpace attackers. The network itself requires minimal structure. Messages are broadcast on a best effort basis, and nodes can leave and rejoin the network at will, accepting the longest proof-of-work chain as proof of what happened while they were gone."

Sample Output:

We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work.
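Extractive summarizers like this work by scoring each sentence on how frequent its terms are across the whole document, then keeping the top-scoring sentences in their original order. Here is a minimal sketch of that idea in Python — not Algorithmia's actual implementation; the stopword list and scoring function are simplified assumptions:

```python
import re
from collections import Counter

def summarize(text, num_sentences=2):
    """Rank sentences by the average document-wide frequency of the
    terms they contain, and keep the top few in original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    words = re.findall(r'[a-z]+', text.lower())
    # Toy stopword list -- a real summarizer would use a fuller one.
    stopwords = {"a", "an", "the", "of", "to", "and", "is", "that", "it",
                 "as", "on", "for", "by", "be", "are", "from", "not", "in"}
    freq = Counter(w for w in words if w not in stopwords)

    def score(sentence):
        terms = re.findall(r'[a-z]+', sentence.lower())
        return sum(freq[t] for t in terms) / max(len(terms), 1)

    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    # Re-emit the chosen sentences in their original document order.
    return " ".join(s for s in sentences if s in top)
```

Running this on the Bitcoin abstract above would pull out the sentences densest in high-frequency terms like "network", "proof-of-work", and "chain" — the same intuition behind the sample output.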

Want to understand the tone or attitude of a piece of text? Use sentiment analysis to determine its positive or negative feelings, opinions, or emotions. You can quickly identify and extract subjective information from text simply by passing this algorithm a string. The algorithm assigns a sentiment rating from 0 to 4, where 0 is very negative, 1 is negative, 2 is neutral, 3 is positive, and 4 is very positive. Get started with sentiment analysis here.

Sample Input:

"Algorithmia loves you!"

Sample Output:

3
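The 0–4 scale is easy to reason about with a toy lexicon-based scorer: count positive and negative words, then shift the result from the neutral midpoint of 2. This is purely illustrative — the word lists below are hypothetical and far smaller than what a real sentiment model uses:

```python
# Hypothetical toy lexicons -- not the algorithm's actual vocabulary.
POSITIVE = {"love", "loves", "great", "awesome", "good", "amazing"}
NEGATIVE = {"hate", "hates", "terrible", "awful", "bad", "sucks"}

def rate_sentiment(text):
    """Return a rating from 0 (very negative) to 4 (very positive),
    with 2 as the neutral midpoint."""
    words = text.lower().replace("!", "").replace(".", "").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return max(0, min(4, 2 + score))
```

For the sample input above, the single positive word "loves" shifts the score one step above neutral, giving 3 — matching the sample output.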

Need to analyze the sentiment of social media? The Social Sentiment Analysis algorithm is tuned specifically for bite-sized content, like tweets and status updates. This algorithm also provides additional information, such as confidence levels for the positive and negative components of the text.

Sample Input:

{ "sentence": "This old product sucks! But after the update it works like a charm!" }

Sample Output:

{ "positive": 0.335, "negative": 0.145, "sentence": "This old product sucks! But after the update it works like a charm!", "neutral": 0.521, "compound": 0.508 }
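The most useful field in that response is usually the compound score, a single normalized value summarizing the whole sentence. A common convention for VADER-style analyzers (and a reasonable assumption here, though the exact thresholds are ours, not documented by the algorithm) is to treat compound scores at or above +0.05 as positive and at or below -0.05 as negative:

```python
def dominant_sentiment(result, threshold=0.05):
    """Classify a Social Sentiment Analysis-style response dict by its
    compound score. The +/-0.05 cutoff is a common VADER convention,
    assumed here rather than taken from the algorithm's docs."""
    c = result["compound"]
    if c >= threshold:
        return "positive"
    if c <= -threshold:
        return "negative"
    return "neutral"
```

Applied to the sample output above (compound 0.508), the update-fixed-it sentence comes out positive overall despite the negative opening.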

Want to learn more about sentiment analysis? Try our guide to sentiment analysis.

LDA (Latent Dirichlet Allocation) is a core NLP algorithm that takes a group of text documents and identifies the most relevant topic tags for each. It's perfect for anybody needing to extract the key topics and tags from text. Many web developers use LDA to dynamically generate tags for blog posts and news articles. This LDA algorithm takes an object containing an array of strings, and returns an array of objects with the associated topic tags. Try it here.

We have a second algorithm called Auto-Tag URL, which is dead-simple to use: pass in a URL, and the algorithm returns the tags with the associated frequency. Try both, and see what works for your situation.

Sample Input:

"https://raw.githubusercontent.com/mbernste/machine-learning/master/README.md"

Sample Output:

{ "algorithm": 4, "structure": 2, "bayes": 2, "classifier": 2, "naïve": 2, "search": 2, "algorithms": 2, "bayesian": 2 }
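The shape of that output — significant terms mapped to their counts — can be approximated with a naive term-frequency tagger. Real LDA models topics probabilistically across documents, so this sketch is only a rough stand-in for the response format; the stopword list and thresholds are our own assumptions:

```python
import re
from collections import Counter

def auto_tag(text, min_count=2):
    """Return a dict of candidate tags -> frequency, keeping only terms
    that appear at least min_count times. A naive stand-in for LDA."""
    # Toy stopword list -- purely illustrative.
    stopwords = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
                 "for", "on", "with", "that", "this", "as", "are", "be"}
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words
                     if w not in stopwords and len(w) > 2)
    return {w: n for w, n in counts.items() if n >= min_count}
```

Feed it the README from the sample input and you'd get back repeated machine-learning vocabulary ("algorithm", "bayes", "classifier") much like the sample output above.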

Search Twitter for a keyword, and then analyze the matching tweets for sentiment and LDA topics. Marketing and brand managers can use Analyze Tweets to monitor their brands and products for mentions to determine what customers are saying, and how they feel about a product. This microservice outputs a list of all the matching tweets with corresponding sentiment, and groups the top positive and negative LDA topics. It also provides positive-specific and negative-specific tweets, in case you want to look at just one flavor of sentiment. Before using this algorithm, you’ll need to generate free API keys through Twitter.

Need to analyze a specific Twitter account? We also provide Analyze Twitter User, to help you understand what a specific user tweets about positively, or negatively. This is great for identifying influencers.

Sample Input:

{ "query": "seattle", "numTweets": "1000", "auth": { "app_key": "<YOUR TWITTER API KEY>", "app_secret": "<YOUR TWITTER SECRET>", "oauth_token": "<YOUR TWITTER OAUTH TOKEN>", "oauth_token_secret": "<YOUR TWITTER TOKEN SECRET>" } }

Sample Output:

{ "negLDA": [ { "seattle's": 2, "adults": 2, "breaking": 2, "oldest": 1, "liveonkomo": 1, "teens": 2, "seattle": 5, "charged": 3 } ], "allTweets": [ { "text": "RT @Harry_Styles: Thanks for having us Seattle. You were amazing tonight. Hope you enjoyed the show. All the love", "created_at": "Thu Feb 04 19:34:18 +0000 2016", "negative_sentiment": 0, "tweet_url": "https://twitter.com/statuses/695329550536949761", "positive_sentiment": 0.55, "overall_sentiment": 0.9524, "neutral_sentiment": 0.45 }, { "text": "@kexp is super moody, super fun. Really have dug listening to them while in #Seattle nnProbably most lined up with my listening prefs", "created_at": "Thu Feb 04 19:31:04 +0000 2016", "negative_sentiment": 0.077, "tweet_url": "https://twitter.com/statuses/695328736531451904", "positive_sentiment": 0.34, "overall_sentiment": 0.8625, "neutral_sentiment": 0.583 } ], "posLDA": [ { "urban": 2, "love": 3, "lil": 2, "gentrified": 2, "openingday": 2, "cores": 2, "seattle": 8, "enjoyed": 3 } ], "negTweets": [ { "text": "I couldn't be more excited about leading a missions team to Seattle, Washington this summer! Join us!", "created_at": "Thu Feb 04 19:52:51 +0000 2016", "negative_sentiment": 0.157, "tweet_url": "https://twitter.com/statuses/695334220764463104", "positive_sentiment": 0.122, "overall_sentiment": -0.1622, "neutral_sentiment": 0.721 } ], "posTweets": [ { "text": "RT @Harry_Styles: Thanks for having us Seattle. You were amazing tonight. Hope you enjoyed the show. All the love", "created_at": "Thu Feb 04 19:34:18 +0000 2016", "negative_sentiment": 0, "tweet_url": "https://twitter.com/statuses/695329550536949761", "positive_sentiment": 0.55, "overall_sentiment": 0.9524, "neutral_sentiment": 0.45 } ] }

We’ve truncated the above output, because there is way too much to display on the blog (more than 1,000 tweets were analyzed!). Check out the Analyze Tweets page to play with the algorithm, and see the full output.
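Once you have the full response, a common next step is to re-bucket the tweets yourself by their per-tweet sentiment scores. This sketch assumes the field names shown in the sample output above (`overall_sentiment` and friends); the threshold is our own illustrative choice:

```python
def bucket_tweets(tweets, threshold=0.05):
    """Split a list of analyzed tweets into positive, negative, and
    neutral buckets by their overall_sentiment score (field name taken
    from the sample Analyze Tweets output)."""
    buckets = {"positive": [], "negative": [], "neutral": []}
    for tweet in tweets:
        s = tweet["overall_sentiment"]
        if s >= threshold:
            buckets["positive"].append(tweet)
        elif s <= -threshold:
            buckets["negative"].append(tweet)
        else:
            buckets["neutral"].append(tweet)
    return buckets
```

A brand manager could run this over the `allTweets` array and chart the positive/negative split over time, rather than relying only on the pre-grouped `posTweets` and `negTweets` lists.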

Need to quickly scrape a URL, and get the content and metadata back? Analyze URL is perfect for any web developer wanting to quickly turn HTML pages into structured data. Pass the algorithm a URL as a string, and it returns an object containing the text of the page, the title, a summary, the thumbnail (if one exists), the timestamp, and the status code. Get Analyze URL here.

Sample Input:

["https://algorithmia.com/blog/2016/02/algorithm-economy-containers-microservices/"]

Sample Output:

{ "summary": "Algorithms as Microservices: How the Algorithm Economy and Containers are Changing the Way We Build and Deploy Apps Today", "date": "2016-02-22T11:38:11-07:00", "thumbnail": "https://algorithmia.com/blog/wp-content/uploads/2016/02/algorithm-economy.png", "marker": true, "text": "Build tomorrow's smart apps today In the age of Big Data, algorithms give companies a competitive advantage. Today’s most important technology companies all have algorithmic intelligence built into the core of their product: Google Search, Facebook News Feed, Amazon’s and Netflix’s recommendation engines. “Data is inherently dumb,” Peter Sondergaard, senior vice president at Gartner and global head of Research, said in The Internet of Things Will Give Rise To The Algorithm Economy. “It doesn’t actually do anything unless you know how to use it.” Google, Facebook, Amazon, Netflix and others have built both the systems needed to acquire a mountain of data (i.e. search history, engagement metrics, purchase history, etc), as well as the algorithms responsible for extracting actionable insights from that data. As a result, these companies are using algorithms to create value, and impact millions of people a day. “Algorithms are where the real value lies,” Sondergaard said. “Algorithms define action.” For many technology companies, they’ve done a good job of capturing data, but they’ve come up short on doing anything valuable with that data. Thankfully, there are two fundamental shifts happening in technology right now that are leading to the democratization of algorithmic intelligence, and changing the way we build and deploy smart apps today: The confluence of the algorithm economy and containers creates a new value chain, where algorithms as a service can be discovered and made accessible to all developers through a simple REST API. 
Algorithms as containerized microservices ensure both interoperability and portability, allowing for code to be written in any programming language, and then seamlessly united across a single API. By containerizing algorithms, we ensure that code is always “on,” and always available, as well as being able to auto-scale to meet the needs of the application, without ever having to configure, manage, or maintain servers and infrastructure. Containerized algorithms shorten the time for any development team to go from concept, to prototype, to production-ready app. Algorithms running in containers as microservices is a strategy for companies looking to discover actionable insights in their data. This structure makes software development more agile and efficient. It reduces the infrastructure needed, and abstracts an application’s various functions into microservices to make the entire system more resilient. The “algorithm economy” is a term established by Gartner to describe the next wave of innovation, where developers can produce, distribute, and commercialize their code. The algorithm economy is not about buying and selling complete apps, but rather functional, easy to integrate algorithms that enable developers to build smarter apps, quicker and cheaper than before. Algorithms are the building blocks of any application. They provide the business logic needed to turn inputs into useful outputs. Similar to Lego blocks, algorithms can be stacked together in new and novel ways to manipulate data, extract key insights, and solve problems efficiently. The upshot is that these same algorithms are flexible, and easily reused and reconfigured to provide value in a variety of circumstances. For example, we created a microservice at Algorithmia called Analyze Tweets, which searches Twitter for a keyword, determining the sentiment and LDA topics for each tweet that matches the search term. 
This microservice stacks our Retrieve Tweets With Keywords algorithm with our Social Sentiment Analysis and LDA algorithms to create a simple, plug-and-play utility. The three underlying algorithms could just as easily be restacked to create a new use case. For instance, you could create an Analyze Hacker News microservice that uses the Scrape Hacker News and URL2Text algorithms to extract the text for the top HN posts. Then, you’d simply pass the text for each post to the Social Sentiment Analysis, and LDA algorithms to determine the sentiment and topics of all the top posts on HN. The algorithm economy also allows for the commercialization of world class research that historically would have been published, but largely under-utilized. In the algorithm economy, this research is turned into functional, running code, and made available for others to use. The ability to produce, distribute, and discover algorithms fosters a community around algorithm development, where creators can interact with the app developers putting their research to work. Algorithm marketplaces function as the global meeting place for researchers, engineers, and organizations to come together to make tomorrow’s apps today. Containers are changing how developers build and deploy distributed applications. In particular, containers are a form of lightweight virtualization that can hold all the application logic, and run as an isolated process with all the dependencies, libraries, and configuration files bundled into a single package that runs in the cloud. “Instead of making an application or a service the endpoint of a build, you’re building containers that wrap applications, services, and all their dependencies,” Simon Bisson at InfoWorld said in How Containers Change Everything. 
“Any time you make a change, you build a new container; and you test and deploy that container as a whole, not as an individual element.” Containers create a reliable environment where software can run when moved from one environment to another, allowing developers to write code once, and run it in any environment with predictable results — all without having to provision servers or manage infrastructure. This is a shot across the bow for large, monolithic code bases. “[Monoliths are] being replaced by microservices architectures, which decompose large applications – with all the functionality built-in – into smaller, purpose-driven services that communicate with each other through common REST APIs,” Lucas Carlson from InfoWorld said in 4 Ways Docker Fundamentally Changes Application Development. The hallmark of microservice architectures is that the various functions of an app are unbundled into a series of decentralized modules, each organized around a specific business capability. Martin Fowler, the co-author of the Agile Manifesto, describes microservices as “an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API.” By decoupling services from a monolith, each microservice becomes independently deployable, and acts as a smart endpoint of the API. “There is a bare minimum of centralized management of these services,” Fowler said in Microservices: A Definition of this New Architectural Term, “which may be written in different programming languages and use different data storage technologies.” Similar to the algorithm economy, containers are like Legos for cloud-based application development. 
“This changes cloud development practices,” Carlson said, “by putting larger-scale architectures like those used at Facebook and Twitter within the reach of smaller development teams.” Diego Oppenheimer, founder and CEO of Algorithmia, is an entrepreneur and product developer with extensive background in all things data. Prior to founding Algorithmia he designed , managed and shipped some of Microsoft’s most used data analysis products including Excel, Power Pivot, SQL Server and Power BI. Diego holds a Bachelors degree in Information Systems and a Masters degree in Business Intelligence and Data Analytics from Carnegie Mellon University. More Posts - Website Follow Me:", "title": "The Algorithm Economy, Containers, and Microservices", "url": "https://algorithmia.com/blog/2016/02/algorithm-economy-containers-microservices/", "statusCode": 200 }
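One small piece of what a scraper like Analyze URL does — pulling the page title out of raw HTML — can be sketched with Python's standard-library parser. This is not how the algorithm itself is implemented, just an illustration of the kind of extraction involved:

```python
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collect the text inside the <title> element of an HTML page."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

A full scraper layers many such extractors (title, text, thumbnail, timestamps) and packages the results into the structured object shown in the sample output.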

The site map algorithm starts with a single URL, and crawls all the pages on the domain before returning a graph representing the link structure of the site. You can use the site map algorithm for many purposes, such as building dynamic XML site maps for search engines, doing competitive analysis, or simply auditing your own site to better understand what pages link where, and whether there are any orphaned or dead-end pages.  Try it out here.

Sample Input:

["https://algorithmia.com",1]

Sample Output:

{ "https://algorithmia.com/about": [ "https://algorithmia.com/reset", "https://algorithmia.com/about", "https://algorithmia.com/contact", "https://algorithmia.com/api", "https://algorithmia.com/privacy", "https://algorithmia.com/terms", "https://algorithmia.com/signin", "https://algorithmia.com/" ], "https://algorithmia.com/privacy": [ "https://algorithmia.com/reset", "https://algorithmia.com/about", "https://algorithmia.com/contact", "https://algorithmia.com/privacy", "https://algorithmia.com/terms", "https://algorithmia.com/signin", "https://algorithmia.com/" ], "https://algorithmia.com": [ "https://algorithmia.com/reset", "https://algorithmia.com/about", "https://algorithmia.com/contact", "https://algorithmia.com/privacy", "https://algorithmia.com/terms", "https://algorithmia.com/signin", "https://algorithmia.com/" ] }
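The returned graph is just a dict of page to outbound links, so auditing it for trouble spots is a few lines of set arithmetic. Here's one reasonable interpretation of the audit (our definitions: a "dead end" is a linked-to page that was never crawled as a source, an "orphan" is a crawled page nothing links to):

```python
def audit_site_map(graph):
    """Given a site map of {page: [linked pages...]}, return
    (dead_ends, orphans) as sets of URLs."""
    sources = set(graph)
    targets = {t for links in graph.values() for t in links}
    dead_ends = targets - sources   # linked to, but never crawled from
    orphans = {s for s in sources if s not in targets}  # nothing links here
    return dead_ends, orphans
```

Run against the sample output above, this would flag pages like /reset and /contact as dead ends at crawl depth 1, since they appear as link targets but have no outbound-link entries of their own.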

For more information, visit our Introduction to Microservices blog post.

Ready to learn more? Our free eBook, Five Algorithms Every Web Developer Can Use and Understand, is a short primer on when and how web developers can harness the power of natural language processing algorithms and other text analysis tools in their projects and apps.