Managing data from a massive stack of cloud applications has become a new challenge for many companies. A common solution is to use an ETL tool to pipe all that data into a data warehouse that organizes and stores it in one place.

There are a lot of great enterprise-grade tools, such as Informatica, SAS, ODI, and Pentaho, as well as open source ones like Apache NiFi and StreamSets Data Collector. For the scope of this post, however, our ETL tools list highlights tools that have proven to work well for our customers: fast-growing startups.

In our list, you’ll find an overview of the ETL tools with pricing, supported data sources and data warehouses, and other valuable info.

At first glance, the tools in our list seem pretty similar. Most can pull data from different sources and push it into various warehouses. However, each of them addresses specific business needs. We tried to consider these nuances and make an objective comparison.

Stitch

Stitch is a simple and powerful solution that is great for cheaply loading data from application databases. It connects to data sources such as MongoDB and MySQL, as well as SaaS tools like Zendesk and Salesforce.

Because Stitch is built on the open source Singer project, users can create and run their own integrations using a standard JSON-based format. Automated monitoring and alerting come with a simple UI where you can check the number of rows synced per data source, receive immediate notifications, and get activity reports.

Its self-service, freemium model makes Stitch a good and easy-to-start option.

| Plan Name | Price/month, $ | Rows included | Additional rows |
|---|---|---|---|
| Free | $0 | 5 million | |
| Starter | $100 | 5 million | $20 per million |
| Basic | $500 | 100 million | $10 per million |
| Premier | $1,000 | 250 million | $5 per million |
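To see how these plans compare at a given volume, the arithmetic can be sketched in a few lines of Python. This is only a rough estimate built on the Stitch plan figures above. It assumes that additional rows are billed per started million and that the Free plan cannot exceed its included rows, neither of which is confirmed here. The names `STITCH_PLANS`, `monthly_cost`, and `cheapest_plan` are made up for this illustration.

```python
import math
from typing import Optional, Tuple

# Plan figures copied from the Stitch pricing above:
# (name, base price in $, included rows in millions,
#  price in $ per additional million rows, or None when none is listed).
STITCH_PLANS = [
    ("Free", 0, 5, None),
    ("Starter", 100, 5, 20),
    ("Basic", 500, 100, 10),
    ("Premier", 1000, 250, 5),
]


def monthly_cost(base: float, included_m: float,
                 overage_per_m: Optional[float], rows_m: float) -> Optional[float]:
    """Estimate the monthly cost in $ for rows_m million rows.

    Returns None when the plan has no listed overage price and the
    volume exceeds the included rows.
    """
    extra_m = max(0.0, rows_m - included_m)
    if extra_m == 0:
        return float(base)
    if overage_per_m is None:
        return None
    # Assumption: overage is charged for every started million rows.
    return float(base + math.ceil(extra_m) * overage_per_m)


def cheapest_plan(rows_m: float) -> Tuple[float, str]:
    """Return (estimated cost, plan name) of the cheapest plan that fits."""
    options = []
    for name, base, included_m, overage_per_m in STITCH_PLANS:
        cost = monthly_cost(base, included_m, overage_per_m, rows_m)
        if cost is not None:
            options.append((cost, name))
    return min(options)
```

Under these assumptions, `cheapest_plan(20)` picks Starter at $100 + 15 × $20 = $400, while at 30 million rows Basic ($500) becomes cheaper than Starter ($600).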

Segment

Segment is both an ETL and a data collection tool that can collect events from your mobile apps, websites, and servers. You can read about how Segment routes events from different sources in our overview of the best data collection tools specifically for event data.

On the ETL side, Segment is a good option for extracting data from cloud apps like Stripe, Salesforce, or Intercom. Note that it doesn’t support application databases (such as MySQL, MongoDB, or PostgreSQL). Segment can be used alongside Stitch, with Stitch extracting data from application databases and Segment extracting data from cloud apps.

| Plan Name | Price/month, $ | Volume | Sources |
|---|---|---|---|
| Developer | $0 | 1,000 MTU* | 2 |
| Team | $100 | 10,000 MTU* | Unlimited |

*MTU: Monthly Tracked Users, the number of anonymous and logged-in visitors that you track with Segment.

Blendo

Blendo focuses on integrating data from SaaS services and is built with analysts and non-technical users in mind. It supports around 30 different data sources. It can be used to handle data from all of your SaaS services or to complement Segment, since Blendo has several sources that are not supported by Segment or Stitch. The team focuses on data completion, so the integrations you get from Blendo have rich data models. All popular destinations, such as Redshift, BigQuery, and Snowflake, are supported.

Blendo offers a $300 base plan, where you can select from a range of 50+ integrations like Zendesk, databases, or NetSuite.

Fivetran

Fivetran is an ETL tool that offers data connectors to extract data from databases and cloud services and load it into a data warehouse. That way, business users gain access to up-to-date, row-level data.

Fivetran allows business users to manage many integrations without having to write or maintain any code. The list of integrations includes PostgreSQL, MySQL, Redshift, and many more.

Pricing is available on request.

Alooma

Alooma provides a data pipeline as a service. You can use almost any input data source, with outputs such as BigQuery, Redshift, and Snowflake. Each connector is optimized for its particular data source, which gives high throughput and helps avoid data loss and duplicates even when third-party services fail.

Alooma offers a set of features that give you visibility into and control over the whole ETL process. Among them are real-time visualization and querying of data streams, a code engine, a mapper, and a restream feature for catching any errors.

Pricing is available only on request and is tailored to your particular needs.

Improvado

Improvado targets marketing data sources, such as Google AdWords, Facebook Ads, YouTube, and more. You can explore the full catalog of supported integrations. Among the tools covered so far, it has the biggest list of marketing data sources.

Improvado can load data into their managed warehouse, which is built on PostgreSQL, or into your own data warehouse.

Pricing is available on request.

FlyData

FlyData is an ETL tool that loads data only into Amazon Redshift. As data sources, it supports Amazon RDS, MySQL, PostgreSQL, MariaDB, Percona, and logs in CSV, TSV, or JSON format.

It’s a good choice if you want to move your data into a modern database suited for aggregate processing. AWS often introduces FlyData to its prospects to help companies migrate to Amazon Redshift without technical difficulties.

| Plan Name | Price/month, $ | Rows included | Additional rows |
|---|---|---|---|
| FlyData Plan 5MR | $199 | 5 million | $49/MR |
| FlyData Plan 15MR | $398 | 15 million | $30/MR |
| FlyData Plan 40MR | $730 | 40 million | $20/MR |
| FlyData Plan 75MR | $1,163 | 75 million | $14/MR |
| FlyData Plan 150MR | $1,994 | 150 million | $12/MR |

Singer

Singer is an open source project for ETL integrations. There are more than 20 open source integrations to data sources (so-called “taps”), and more are being built all the time. If you need to develop an integration for a specific source, you can reuse code from the existing taps and helper utilities.

Taps extract data from any source and write it to a standard stream in a JSON-based format. Targets consume data from taps and load it into a file, API, or database. For now, targets for BigQuery, Stitch, Rakam, and some others are available.
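The tap and target split can be illustrated with a toy example in plain Python. This is a minimal sketch, not an official Singer integration. It uses the three message types from the Singer specification (SCHEMA, RECORD, and STATE), written one JSON object per line. The `users` stream, its fields, and the function names `run_tap` and `run_target` are hypothetical. A real tap would read from an actual source, and a real target would write to a file, API, or database instead of a dictionary.

```python
import io
import json
import sys


def run_tap(rows, out=sys.stdout):
    """Toy tap: write SCHEMA, RECORD, and STATE messages as JSON lines."""
    schema = {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "email": {"type": "string"},
        },
    }
    # Describe the (hypothetical) "users" stream before sending any records.
    out.write(json.dumps({
        "type": "SCHEMA",
        "stream": "users",
        "schema": schema,
        "key_properties": ["id"],
    }) + "\n")
    for row in rows:
        out.write(json.dumps(
            {"type": "RECORD", "stream": "users", "record": row}) + "\n")
    if rows:
        # A bookmark that a later run could use to resume from this point.
        out.write(json.dumps(
            {"type": "STATE", "value": {"last_id": rows[-1]["id"]}}) + "\n")


def run_target(lines):
    """Toy target: collect records per stream and keep the latest state."""
    records = {}
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message["type"] == "RECORD":
            records.setdefault(message["stream"], []).append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
    return records, state
```

To try it without a shell pipe, write the tap output into an `io.StringIO()` buffer and pass `buffer.getvalue().splitlines()` to `run_target`. In practice, a tap and a target are separate programs connected through standard output and standard input.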

Conclusion

If this ETL tools list seems too extensive, start by defining the most important criteria for your company’s needs. These might include support for data extraction, cleansing, aggregation, transformation, or calculation; the type and number of supported data streams; a reasonable price; or the scope of services provided.

You can always reach out to our data team to get help.

To make the challenge of choosing easier for you, we’ve prepared this comparison spreadsheet of ETL tools from our list. Enjoy!