Visualisation showdown

From Kautepedia

Background

This page documents a process for testing and comparing Power BI alternatives in the KPT context.

Issues with Power BI

Where to begin....

Alternative 1: AWS QuickSight

AWS QuickSight Standard Edition Pricing

  • Annual Plan: $9 per user/month
  • Monthly Plan: $12 per user/month
  • Included SPICE Capacity: 10 GB per user
  • Additional SPICE Capacity: $0.25 per GB/month

Note: The Standard Edition is primarily designed for individual users and small groups. It does not support the Reader role; thus, all users are considered Authors.
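To get a feel for what these list prices mean in practice, the arithmetic can be sketched as below. This is a rough estimate only: the user count and data volume in the usage example are hypothetical, and the function assumes the included SPICE allowance is pooled across all users in the account, which should be checked against current AWS documentation.

```python
def quicksight_standard_monthly_cost(users, spice_gb_needed, annual_plan=True):
    """Estimate the monthly QuickSight Standard Edition bill in USD.

    Uses the list prices quoted above. Assumes (unverified) that the
    10 GB per-user SPICE allowance is pooled across the account.
    """
    per_user = 9.00 if annual_plan else 12.00   # USD per user per month
    included_gb = 10 * users                    # SPICE included with the licences
    extra_gb = max(0, spice_gb_needed - included_gb)
    return users * per_user + extra_gb * 0.25   # extra SPICE at $0.25 per GB


# Hypothetical team: 8 users needing 100 GB of SPICE in total.
print(quicksight_standard_monthly_cost(8, 100))                     # 77.0
print(quicksight_standard_monthly_cost(8, 100, annual_plan=False))  # 101.0
```

Because Standard Edition has no Reader role, every dashboard consumer counts as a full user in this calculation.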

Features of interest

  • Fully managed, serverless BI service.
  • Integrates with AWS ecosystem (RDS etc).

Architecture

AWS QuickSight runs as a fully managed cloud service, eliminating the need for self-hosted infrastructure. Some architectural considerations:

  • Data Sources: Supports AWS-native sources (Redshift, S3, Athena, RDS) as well as external sources like MySQL, PostgreSQL, and SaaS applications.
  • SPICE Engine: In-memory caching and acceleration layer for interactive analysis.
  • Access Control: IAM-based security with additional user-based permissions.
  • Deployment Model: No on-premises installation required; fully cloud-based.

Issues with QuickSight

  • Price: Higher cost compared to open-source alternatives.
  • No Click-to-Filter by Default: Unlike Power BI or Superset, QuickSight does not offer click-to-filter interactions out of the box.
  • Limited Free Tier: Unlike some BI tools, QuickSight does not offer a fully functional free tier, requiring at least a paid Author license.
  • SPICE Limitations: While SPICE is useful for performance, the 10 GB per-user allocation can be restrictive, and additional storage incurs extra costs.
  • Customization Constraints: Fewer options for UI customization, branding, and layout adjustments.
  • Dependency on AWS Ecosystem: Works best with AWS-native data sources, making integration with non-AWS systems more complex.
  • Limited Advanced Visualizations: Compared to Power BI or Superset, QuickSight has fewer built-in visual customization options.


Alternative 2: Apache Superset

Test instance available at http://ec2-13-210-78-199.ap-southeast-2.compute.amazonaws.com:8088/login/

Features of interest

  • Almost unlimited configurability.
  • Click-to-filter by default on most visuals.
  • Row-level security (RLS).
  • CSS/branding/templating.
  • REST API for programmatic access.
  • OAuth (presumably requiring access to Azure Entra for SSO).
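As an illustration of the programmatic access mentioned above, here is a minimal sketch of obtaining a token from Superset's REST API using only the standard library. It assumes database authentication (the "db" provider) on the /api/v1/security/login endpoint; the base URL and credentials are placeholders, and an OAuth/SSO deployment would likely need a different flow.

```python
import json
import urllib.request


def build_login_request(base_url, username, password):
    """Build (but do not send) a login request for the Superset REST API."""
    payload = {
        "username": username,
        "password": password,
        "provider": "db",   # assumes database auth, not OAuth/LDAP
        "refresh": True,    # also ask for a refresh token
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/security/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_access_token(base_url, username, password):
    """Send the login request and return the access token from the response."""
    request = build_login_request(base_url, username, password)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]
```

The returned token would then be sent as an `Authorization: Bearer <token>` header on later API calls.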

Architecture

There are many deployment options. The default/quickstart is to run Superset as a Docker container; however, this is not a good production solution. Please refer to the extensive Superset docs, but broadly a sensible approach appears to be using Kubernetes to serve the app:

  • App Host:
    • Minikube can be used to serve the app, deployed on an EC2 instance.[1] Minikube is good for simple, single-instance deployments.
    • Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service which can automatically scale up/down and provision EC2 resources as needed.
    • AWS Fargate could also be used in conjunction with EKS to reduce the management/admin burden. The relative cost/benefit is hard to quantify at this stage, but it should be looked at.

Setup notes

Some random things to note about install/setup here:

  • Ubuntu hosts need to install the docker-compose-v2 package, NOT docker-compose. I have no idea why even Ubuntu 24 doesn't ship v2 by default, because the older version fails when deploying containers that have already shifted to newer and more secure Docker Compose practices. If you get any errors with Docker, this is probably your issue.
  • There is a sophisticated but non-intuitive configuration model for Superset. Please refer to this and this, for example.
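The core of the configuration model is simpler than it first looks: Superset ships its defaults in its own config module, then imports an optional superset_config.py, and any uppercase name defined there replaces the default of the same name. A minimal sketch of such an override file follows; the values are illustrative, not recommendations.

```python
# superset_config.py: a minimal override module (sketch).
# Superset imports this after loading its own defaults, so any uppercase
# name defined here replaces the default of the same name.

# Lower the default row limit applied to queries (illustrative value).
ROW_LIMIT = 5000

# Trust X-Forwarded-* headers, for when the app sits behind a reverse
# proxy or load balancer that terminates TLS.
ENABLE_PROXY_FIX = True
```

Superset finds the file either on PYTHONPATH or through the SUPERSET_CONFIG_PATH environment variable, which should hold the file's full path.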

Security

This needs a lot of attention, since Superset will theoretically provide access to identifiable client data. Some things to consider:

  • Ensuring that DB connection is secure.[2]
  • Ensuring that Superset only uses dedicated service-account DB roles. For auditing we must ensure that no regular user/team credentials are used, and that least-privilege principles are followed in terms of what the DB role can access.
  • The host app should have SSL/TLS enabled and be available via port 443 (forwarded from 8088, or perhaps some other host config is possible).
  • 'Secret key' configuration: SECRET_KEY should be set via superset_config.py. See here for not much detail.
  • Metadata database. Whilst Superset docs state that SQLite is used by default, the quickstart container actually stands up a Postgres instance. Whatever final solution is deployed, we absolutely need to ensure that the metadata backend is running on a properly secured production database.[3]
  • Superset uses Flask-AppBuilder (FAB), which (apparently) supports OAuth2/LDAP providers, including Azure/Entra, out of the box. We should use this to manage access, and permit use of existing AD groups for access (membership of which can be centrally managed via our MSP). OAuth groups can be mapped to Superset roles by setting an AUTH_ROLES_MAPPING dictionary.[4]
  • CORS might need to be set.
  • Superset roles: remove access from the Public Superset role.
  • Admin role: reset the default admin password, or preferably deactivate the default admin account.
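Several of the points above come down to settings in superset_config.py. The sketch below shows one possible shape, under these assumptions: the environment variable names and AD group names are hypothetical, the settings are built in a function so that missing secrets fail loudly, and the OAuth provider setup itself (AUTH_TYPE, OAUTH_PROVIDERS, and how group claims reach FAB) is omitted. As far as I can tell, FAB reads AUTH_ROLES_MAPPING from this same file.

```python
import os


def read_secret(name, env=os.environ):
    """Return a required setting from the environment, failing loudly if unset."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} must be set before Superset starts")
    return value


def build_config(env=os.environ):
    """Assemble security-related Superset settings (variable names hypothetical)."""
    return {
        # Signs session cookies and encrypts stored credentials; never hard-code it.
        "SECRET_KEY": read_secret("SUPERSET_SECRET_KEY", env),
        # Metadata backend, e.g. a Postgres URI rather than a local SQLite file.
        "SQLALCHEMY_DATABASE_URI": read_secret("SUPERSET_METADATA_DB_URI", env),
        # Map identity-provider groups (hypothetical names) to Superset roles.
        "AUTH_ROLES_MAPPING": {
            "kpt-bi-admins": ["Admin"],
            "kpt-bi-viewers": ["Gamma"],
        },
        # Re-apply the mapping on every login so group changes take effect.
        "AUTH_ROLES_SYNC_AT_LOGIN": True,
    }


# In a real superset_config.py the values would be bound at module level:
# globals().update(build_config())
```

Keeping the secret key and database URI out of the file means the config can live in version control without exposing credentials.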


Alternative 3: Microsoft On-premises data gateway with Power BI

Cost

  • MS Gateway - Free
  • EC2 (t3a.large, 2 vCPU & 8 GiB memory) ~ $60/month

References

  1. A medium instance size is needed, as a minimum.
  2. At the time of writing, we plan to put the RDS instance in an isolated VPC which cannot be directly accessed over the internet. The App host will therefore need to manage incoming public traffic and securely route database requests accordingly.
  3. AKA Postgres.
  4. I don't know where this dict is supposed to live, but some documentation can be found here.