10 Data Visualization Tools

Data Visualization Tools

There is no shortage of data analytics tools that cover the entire process from data generation to visualization; Figure 1 shows the landscape of this infrastructure.  According to Turck (2016), the primary data analytics visualization tools are Tableau, Google Cloud Platform, Qlik, Looker, RoamBI, Chartio, Datorama, Zoomdata, Sisense, and Zeppelin.  Machlis (2011) lists 22 different data visualization tools: R, DataWrangler, Google Refine, Google Fusion Tables, Impure, Tableau Public, Many Eyes, VIDI, Zoho Reports, Choosel, Exhibit, Google Chart Tools, JavaScript InfoVis Toolkit, Protovis, Quantum GIS (QGIS), OpenHeatMap, OpenLayers, OpenStreetMap, TimeFlow, IBM Word-Cloud Generator, Gephi, and NodeXL. Jones (2014) listed a top 10: Tableau Public, OpenRefine, KNIME, RapidMiner, Google Fusion Tables, NodeXL, Import.io, Google Search Operators, Solver, and WolframAlpha.  The California HealthCare Foundation (CHCF, 2014) recommended data visualization tools that anyone in the healthcare industry could use: Google Charts & Maps, Tableau Public, Mapbox, Infogram, Many Eyes, iCharts, and Datawrapper. CHCF (2014) also recommended data visualization tools for developers in the healthcare industry, such as High Charts, TileMill, D3.js, FLOT, Fusion Charts, OpenLayers, and JSMap.

These four sources show that no matter which data visualization software is discussed here, a plethora of others exists, and there is currently no authoritative source listing all of them.  This discussion does not attempt to be a comprehensive or authoritative list either.

Figure 1: Big Data Landscape 2016, which categorizes big data tools and applications by Infrastructure, Analytics, Application, Open Source, Data Sources & APIs, Incubators & Schools, and Cross-Infrastructure/Analytics (adapted from Turck, 2016).

Ten Data Visualization Tools and their strengths and weaknesses

Based on the sources above, the following ten visualization tools are discussed:  Tableau Desktop & Tableau Public, Google Fusion Tables, OpenLayers, Chartio, Datorama, Zoomdata, NodeXL, Qlik, Looker, and RoamBI.

Tableau Desktop & Tableau Public

Tableau Desktop costs $1000-1200, whereas Tableau Public is free. Both are marketed as end-user interactive business intelligence software that helps surface insights hidden in the data (Jones, 2014; Machlis, 2011; Phillipson, 2016; Tableau, n.d.).  Tableau is touted as 10-100x faster than most other commercial visualization software thanks to its intuitive, no-coding, drag-and-drop products (Tableau, n.d.).  Tableau can take in data from Excel spreadsheets, Hadoop, the cloud, etc. and bring them together for comprehensive data analysis (Jones, 2014; Phillipson, 2016; Tableau, n.d.).  The strength of Tableau Public is that it provides all the functionality of Tableau Desktop for free. However, any data stored in Tableau Public is made freely available to others within the community (Jones, 2014; Machlis, 2011).  If data privacy is needed, Tableau Desktop allows the data scientist to analyze the data locally, without sharing key information with the world, but at a price (Machlis, 2011; Tableau, n.d.).

Google Fusion Tables

Fusion Tables is a web-based tool that is accessible to anyone with a Google Drive account. It gives control over many aspects of a data visualization: data scientists can limit the amount of data shown, summarize the data, choose from different chart types, and customize legends without needing to know how to code (Google, n.d.; Machlis, 2011). Jones (2014) calls Fusion Tables the “Google Spreadsheets cooler, larger, and much nerdier cousin.” Data can be found through the Google search engine or imported quickly from CSV, TSV, UTF-8 encoded files, etc. (Google, n.d.; Jones, 2014). Data can also be exported to JSON files, and all the data can be analyzed in private or made public (Machlis, 2011). Interactive charts provided by Fusion Tables include network graphs, zoomable line charts, map charts, heat maps, timelines, storylines, animations, pie charts, tables, scatter plots, etc. (Google, n.d.; Machlis, 2011).  The downsides of this tool are that editing multiple cell entries can become tedious, the customizations are quite limited, and for large data sets the API can demand so many resources that execution slows down (Machlis, 2011).
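The import, summarize, limit, and export workflow described above can be sketched outside of any particular tool. The following is a minimal illustration in plain Python, not the tool's own API; the sample data, the column names, and the function names `summarize` and `export_json` are all assumptions made for this example.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample data; the column names are assumptions for illustration.
SAMPLE_CSV = "region,sales\nEast,120\nWest,80\nEast,30\nNorth,50\n"


def summarize(text, key, value, delimiter=","):
    # Sum the value column for each distinct key; pass delimiter="\t" for TSV.
    totals = defaultdict(float)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        totals[row[key]] += float(row[value])
    return dict(totals)


def export_json(totals, limit=None):
    # Sort largest first, optionally keep only the top `limit` rows, emit JSON.
    rows = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if limit is not None:
        rows = rows[:limit]
    return json.dumps([{"label": label, "total": total} for label, total in rows])


totals = summarize(SAMPLE_CSV, "region", "sales")
print(export_json(totals, limit=2))
# prints [{"label": "East", "total": 150.0}, {"label": "West", "total": 80.0}]
```

A no-code tool hides these steps behind menus, but the underlying operations (group, aggregate, truncate, serialize) are roughly the same.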

OpenLayers

OpenLayers is an open source JavaScript library for displaying geolocation data on maps; it is easily customizable and extensible and uses cutting-edge tiled or vector layer mapping formats (Machlis, 2011; OpenLayers, n.d.). Examples of what can be built include animated maps, blended layers, attributions, clustered features, integration with Bing Maps, D3 integration, drag-and-drop interaction, dynamically added data, etc. (OpenLayers, n.d.).  One drawback is that it requires some coding skill in JavaScript, and certain integrations with popular maps are still under development. However, it can run in any web browser (Machlis, 2011).

Chartio

Chartio is a software-as-a-service visual query tool that pulls and joins data from multiple sources easily, without requiring knowledge of SQL (Rist & Strom, 2016).  Chartio turns the data into visualizations and dashboards that help build a case with data-driven analytics, all for $2000 (Chartio, n.d.; Rist & Strom, 2016). Chartio’s commercial product can pull data from Amazon RDS, Cassandra, CSV files, DB2, Google Cloud SQL, Google BigQuery, Hadoop, MongoDB, Oracle, Rackspace Cloud, Microsoft SQL Server, Windows Azure Cloud, etc. (Chartio, n.d.; Rist & Strom, 2016). Chartio (n.d.) boasts that connecting to any of these databases requires just two terminal commands and that data is pulled through read-only connections to protect it. However, Rist and Strom (2016) had initial problems uploading data into the tool, mostly due to the responsiveness of the API. They also found the user interface poorly designed, with a steeper learning curve than other data visualization tools (Rist & Strom, 2016).

Datorama

Datorama, an Israel-based company, offers a cloud-based system and tool for marketing analytics (Gilad, 2016).  Its data sources include Facebook, Google, ad exchanges, networks, direct publisher sites, affiliate programs, etc., and it can visually present and help monetize the marketing data (Datorama, n.d.; Gilad, 2016).  Datorama allows for multi-level authentication for advanced security (Phillipson, 2016). According to Datorama (n.d.), the tool allows online and offline marketing analyses to be compared on a single dashboard. Unfortunately, the tool is limited primarily to marketing and sales data, and other tools exist that analyze marketing and sales data and much more (Gilad, 2016).  The cost of the software is available only by quote (Phillipson, 2016).

Zoomdata

Zoomdata offers an intuitive and collaborative way to visualize data; it was built with HTML5, JavaScript, WebSockets, and CSS, along with extensible libraries such as D3, Leaflet, NVD3, etc. (Zoomdata, n.d.).  Graphing features include dynamic dashboards with drill-down capabilities on tabular data, geodata, pie charts, line graphs, scatter plots, bar charts, stacked bar charts, etc. (Darrow, 2016; Zoomdata, n.d.).  Zoomdata supports web-browser and touch-oriented analysis and can handle real-time data streams and billions of rows of data (Zoomdata, n.d.). It can connect to Hadoop, Cloudera, MongoDB, Amazon, NoSQL, MPP, and SQL databases, cloud applications, etc. (Darrow, 2016; Zoomdata, n.d.). The downside is that this software as a service is a commercial product that costs $1.91/hour (Darrow, 2016).

NodeXL

NodeXL Basic is an open source plug-in for Microsoft Excel 2007-2016 that makes it easy to graph and explore networks and relationships by entering network edge lists (Jones, 2014; Machlis, 2011; NodeXL, 2015).  NodeXL Pro ($29/year-$749/year) extends the basic version with features such as handling data streams from social networks, text and sentiment analysis, etc. (NodeXL, 2015).  Data pulled from Facebook, Flickr, YouTube, LinkedIn, and Twitter can be represented with this tool (Jones, 2014; Machlis, 2011). Graph metrics such as degree, closeness centrality, PageRank, clustering, and graph density are all available in NodeXL (Jones, 2014; NodeXL, 2015).  The appearance of a graph, such as color, shape, size, label, and opacity, can be edited in both versions (NodeXL, 2015). Unfortunately, the tool is limited mostly to network analysis (Jones, 2014; Machlis, 2011; NodeXL, 2015).
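To make a few of these graph metrics concrete, here is a minimal sketch that computes degree, graph density, and closeness centrality from an edge list. It uses plain Python rather than NodeXL itself; the edge list, the node names, and the function names are hypothetical, and the graph is assumed to be undirected and unweighted.

```python
from collections import defaultdict, deque

# Hypothetical edge list for a small undirected network.
EDGES = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "dan")]


def build_adjacency(edges):
    # Map each node to the set of its neighbours (undirected, no self-loops).
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def degree(adjacency):
    # Degree = number of distinct neighbours of each node.
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def density(adjacency):
    # Density = actual edges / possible edges = 2m / (n * (n - 1)).
    n = len(adjacency)
    m = sum(len(neighbours) for neighbours in adjacency.values()) // 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


def closeness(adjacency, source):
    # Closeness = (reachable nodes) / (sum of shortest-path lengths), via BFS.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    return 0.0 if total == 0 else (len(distance) - 1) / total


graph = build_adjacency(EDGES)
print(degree(graph)["cat"])      # 3
print(round(density(graph), 3))  # 0.667
print(closeness(graph, "dan"))   # 0.6
```

In this toy network, "cat" touches every other node directly, so it has the highest degree and a closeness of 1.0, while "dan" sits on the periphery.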

Qlik

Qlik is a free self-service data visualization tool for creating dynamic and interactive visualizations while keeping the data on one’s own desktop, without having to release it to the public (Machlis, 2015; Qlik, n.d.).  It is free for both personal and internal business use (Qlik, n.d.).  Unfortunately, it isn’t easy to share data or visualizations with peers, although Qlik does allow private sharing with up to five people through its cloud services (Machlis, 2015). Qlik integrates data, without a data warehouse, from sources like Hadoop, Microsoft Excel, LinkedIn, Twitter, Facebook, cloud, databases, etc. (Qlik, n.d.). Though there is a learning curve to this software, it is not insurmountable, and a user can quickly learn how to build basic graphs with multiple filters (Machlis, 2015).

Looker

Looker aims to be a data visualization and exploration tool used by multiple people, removing the data analytics bottleneck caused by data scientists controlling all the data (Looker, n.d.).  The tool supports data models, custom metrics, real-time analysis, and blending between data sets to produce drill-down dashboards with basic charts, graphs, and maps (Looker, n.d.; Software Advice, n.d.). These data models help define all the measures and dimensions behind the data (Software Advice, n.d.). Data inputs can come from commercial off-the-shelf products like Salesforce or from internally created software and applications (Looker, n.d.). According to one Looker customer, the documentation lags behind, making certain tasks hard to do; another customer says that once a data model is in the application, it becomes hard to edit (Software Advice, n.d.).

RoamBI

RoamBI is a data visualization tool that can be taken anywhere; it is built primarily for mobile devices and can include data from Microsoft Excel, CSV data, SQL Server, Cognos, Salesforce, SAP, Box data, etc. (Bigelow, 2016; MeLLmo Inc., n.d.).  It is designed to support data sharing, exploration, and presentation on mobile devices (MeLLmo Inc., n.d.). It is so popular that all ten major pharmaceutical companies use RoamBI on their iPads (Bigelow, 2016). Visualizations include tabular data, sparklines, bar charts, line charts, stacked bar charts, pie charts, bubble charts, KPIs, etc., all on a dashboard, but they are not customizable and the reporting dimensions are limited (Authors, 2010; MeLLmo Inc., n.d.).  The free version of the application allows local data to be uploaded and used, whereas the Pro version ($99/year or $795 perpetual) allows data connections from online sources (Authors, 2010).

 

In the end, each of the ten data visualization tools has its advantages and disadvantages, along with a different price point.  The best way to select the right tool is to know one’s data visualization needs and to compare these and other tools against those needs.  The tool that meets most or all of the needs should then be selected.
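One way to make this needs-based comparison explicit is a simple weighted checklist. The sketch below is only an illustration: the needs, their weights, and the placeholder tools ("Tool A" and so on) are invented for the example and are not assessments of any product discussed above.

```python
# Hypothetical needs with importance weights (higher = more important).
NEEDS = {"free": 3, "private_data": 2, "maps": 1}

# Placeholder tools and the needs each one is assumed to satisfy.
TOOLS = {
    "Tool A": {"free", "maps"},
    "Tool B": {"private_data", "maps"},
    "Tool C": {"free", "private_data", "maps"},
}


def score_tool(features, needs):
    # Fraction of total need weight that the tool's features cover (0.0 to 1.0).
    total_weight = sum(needs.values())
    met_weight = sum(weight for need, weight in needs.items() if need in features)
    return met_weight / total_weight if total_weight else 0.0


def rank_tools(tools, needs):
    # Return (name, score) pairs sorted from best to worst match.
    scored = [(name, score_tool(features, needs)) for name, features in tools.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for name, score in rank_tools(TOOLS, NEEDS):
    print(f"{name}: {score:.2f}")
```

With these made-up inputs, "Tool C" covers every need and ranks first. In practice the weights and feature lists would come from one's own requirements and from trying the tools.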

Resources:
