ShareInsights 3.0: The First No-Code Platform for Analytics on AWS Data Lakes
At this year’s AWS re:Invent, I asked every visitor to our booth whether they had a data lake. Invariably, the answer was “yes”. As I dug deeper, I realized that what they meant by a data lake was lots of useful data stored in S3 buckets. But having useful data stored in S3 buckets isn’t the same as having a data lake. A useful data lake comprises cataloged, cleaned, well-understood data; tools to derive insights from that data and expose them as APIs or dashboards; and the ability to run machine learning workloads on it.
Today’s enterprises are increasingly in the habit of not throwing away data. Most data that is deemed useful is stored for a long time, though most of it is dark: never or infrequently used. But a well-thought-out data lake can yield tremendous insights for the enterprise, augmenting or replacing the traditional warehouse approach, which limits the kinds of insights that can readily be derived from it because of its structured-data constraints or its cost.
Data lakes are used by several personas in an enterprise today, including:
- IT users who manage the lake, data ingestion, and governance
- Data engineers who prep and present cataloged, cleaned, documented data
- Analysts who understand the data, blend data and create insights as well as new data products
- Business users who create and consume visual analyses
- Data scientists who apply machine learning to the data for various purposes
Public clouds, with AWS the leader in analytics services as of today, have developed a plethora of technologies that are very powerful and cost-effective compared to their on-premises alternatives. AWS’s technologies include Athena, Glue, SageMaker, Redshift Spectrum, EMR, and others. As Andy Jassy proudly proclaimed at re:Invent, AWS is focused on building powerful technologies that help “builders” achieve great results. But even for top-notch “builders”, it can be hard to pick the right AWS tool for each workload and to keep up with the blinding pace at which AWS evolves its services landscape. As a result, many people end up choosing the tool that is easiest to understand or use, often reflecting what they originally did in their on-premises environment, even though that tool may be poorly suited to the workload, taking longer and costing more than necessary.
ShareInsights 3.0 addresses all these issues in the AWS analytics stack by unifying in one tool the myriad analytical operations that any enterprise needs, from cataloging to preparation to visualization to machine learning. It lets users create their data pipelines in a visual environment, seamlessly uses the best AWS native analytics service for each workload, orchestrates that service, executes the workload, and returns the results. Crucially, it offers price and time forecasting, so users understand how each service will perform on the individual workloads they intend to run. In such a highly specialized and complex environment, choosing the wrong tool can have a huge impact on cost and time. In some cases, moving a workload from one service to another, better-suited service reduces the price by as much as 20x. ShareInsights 3.0 is the first tool to offer such price and time forecasts. Before, users simply had to deal with sticker shock from the AWS bill.
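To make the idea of per-workload price forecasting concrete, here is a minimal Python sketch. It is not how ShareInsights computes its forecasts, which this article does not describe. The `Workload` and `Forecast` types, the `forecast_costs` function, the two service categories, and the dollar rates are all hypothetical placeholders chosen for illustration; they are not actual AWS prices.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of one analytics job."""
    data_scanned_tb: float    # data the job is expected to read
    est_runtime_hours: float  # rough runtime on a provisioned cluster


@dataclass(frozen=True)
class Forecast:
    service: str
    est_cost_usd: float


# Illustrative cost models only. The rates below are placeholders,
# not real AWS pricing, and the service names are generic categories.
COST_MODELS: Dict[str, Callable[[Workload], float]] = {
    "pay-per-scan query service": lambda w: 5.00 * w.data_scanned_tb,
    "hourly cluster service": lambda w: 12.00 * w.est_runtime_hours,
}


def forecast_costs(workload: Workload) -> List[Forecast]:
    """Estimate the cost of a workload on each service, cheapest first."""
    forecasts = [
        Forecast(name, model(workload)) for name, model in COST_MODELS.items()
    ]
    return sorted(forecasts, key=lambda f: f.est_cost_usd)


if __name__ == "__main__":
    small_scan = Workload(data_scanned_tb=0.5, est_runtime_hours=2.0)
    for f in forecast_costs(small_scan):
        print(f"{f.service}: ${f.est_cost_usd:.2f}")
```

Under these placeholder rates, a job that scans little data is far cheaper on the pay-per-scan model, while a job that scans tens of terabytes in the same two hours comes out cheaper on the hourly cluster. That flip is the kind of difference a forecast is meant to surface before the bill arrives. A real forecast would also need to estimate run time, which this sketch takes as an input.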
Possibly just as important as its benefits to technical users is ShareInsights’ impact on business users. Public clouds are designed for developers and require a knowledge of programming. The complexity of the public cloud environment, which continues to grow rapidly, has caused silos to form between technical users and business users. Colleagues are forced to play a game of “telephone”: business users describe analytical needs to analysts, who translate those needs for a developer, who then builds the application. It’s a process rife with inefficiencies. ShareInsights 3.0 eliminates the need for so many different individuals to get involved. Instead, analysts can describe what they want to do and then view and analyze the results, all within a drag-and-drop UI. In this way, ShareInsights enables people to use machine learning in their daily workflows, democratizing the powerful machine learning applications that are driving a nascent era of innovation.
ShareInsights 3.0 shrinks the distance between analysts and insights. When business outcomes depend so heavily on an organization’s ability to act quickly on new information, that increased efficiency is the difference between a visionary company and an obsolete one. For more information, please visit https://accelerite.com/products/shareinsights/shareinsights-on-aws/.