Installing Apache Superset on Windows

Apache Superset is a fairly well-known open-source tool among BI analysts. The simplest way to install it locally is with Docker on Linux or macOS. However, most dashboard users work on Windows, which Superset unfortunately does not officially support. You can work around this either by installing Docker on an Ubuntu virtual machine in VirtualBox or by using the WSL2 support in Docker Desktop. We use the second option in our projects, and this article walks through an installation based on WSL2.

1. Installing WSL2

1.1. Make sure your Windows 10 build is 19041 or later. If it is not, update Windows first.
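The build check in step 1.1 can also be scripted. The following is a minimal sketch, assuming Python is available on the Windows side and that platform.version() reports a string such as 10.0.19045; the helper names parse_build and supports_wsl_install are hypothetical and the threshold is the one quoted in step 1.1.

```python
import platform

MIN_WSL2_BUILD = 19041  # minimum Windows 10 build quoted in step 1.1


def parse_build(version: str) -> int:
    """Return the build number from a version string such as '10.0.19045'."""
    return int(version.split(".")[2])


def supports_wsl_install(version: str) -> bool:
    """True when the build is at least the minimum from step 1.1."""
    return parse_build(version) >= MIN_WSL2_BUILD


if __name__ == "__main__" and platform.system() == "Windows":
    current = platform.version()
    print(current, "OK" if supports_wsl_install(current) else "update Windows first")
```

Running the file on Windows prints the detected version and whether it meets the threshold; on other systems it prints nothing.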

1.2. Next, open PowerShell as administrator and run the WSL install command:

wsl --install

When the installation finishes, a message appears asking you to restart the computer.

1.3. After the restart, the WSL application is available in a separate Ubuntu window. If this is the first time you are using virtualization on this system, the required components are disabled and an error appears.

To fix it, search for «Turn Windows features on or off» and enable Hyper-V (if Hyper-V cannot be enabled, turn on virtualization support in the BIOS).

Click «OK»; a message appears saying that a restart is required. Click «Restart now».

1.4. After the restart, launch the Ubuntu application and enter a UNIX username and password.

1.5. Check access to drive C from Ubuntu: ll /mnt/c/

1.6. Check access to the Ubuntu file system from Windows: \\wsl$
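Steps 1.5 and 1.6 rely on Windows drives being mounted under /mnt inside Ubuntu. As an illustration of that mapping, here is a small sketch; the function name windows_to_wsl is hypothetical, and it assumes the default /mnt/<drive letter> mount layout (inside WSL itself, the wslpath utility does this conversion for you).

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Map a drive-letter path like 'C:\\Users\\me' to '/mnt/c/Users/me'."""
    win = PureWindowsPath(path)
    drive = win.drive.rstrip(":").lower()
    if len(drive) != 1:
        # UNC paths such as \\wsl$\... and relative paths are not handled here.
        raise ValueError(f"expected a drive-letter path, got {path!r}")
    return str(PurePosixPath("/mnt", drive, *win.parts[1:]))
```

For example, windows_to_wsl("D:\\superset") gives the path you would cd into from the Ubuntu shell.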

2. Installing Docker Desktop

2.1. Follow the link and download Docker Desktop:

https://docs.docker.com/desktop/install/windows-install/

2.2. Run the installer and tick the checkboxes in the configuration window.

After the installation you will need to agree to restart the computer and accept the license agreement. The Docker window then opens.

2.3. Open Settings (the gear button at the top right), then Resources > WSL integration, and enable Ubuntu.

Then click «Apply & restart» and «Cancel».

2.4. Open Ubuntu and check that Docker is accessible from WSL by running:

docker -v
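The same check can be done from a script before continuing, for Docker and for git (needed in step 3.1). This is only a sketch: tool_version is a hypothetical helper, and it assumes the tool accepts a --version flag, which both docker and git do.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(command: str, flag: str = "--version") -> Optional[str]:
    """Return the first line a CLI prints for its version flag, or None if it is not on PATH."""
    if shutil.which(command) is None:
        return None
    result = subprocess.run([command, flag], capture_output=True, text=True, timeout=30)
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    for tool in ("docker", "git"):
        print(tool, "->", tool_version(tool) or "not found on PATH")
```

A None result for docker inside Ubuntu usually means the WSL integration from step 2.3 is not enabled for that distribution.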

3. Installing Apache Superset

3.1. Clone the project from GitHub:

git clone https://github.com/apache/superset.git

3.2. Change into the superset folder and start the containers:

cd superset

sudo docker-compose -f docker-compose-non-dev.yml up -d

3.3. Check in Docker Desktop that the containers are running.

3.4. Open http://localhost:8088/ and log in.

Login/password: admin/admin
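The containers can take a while to initialize after the compose command returns, so the page in step 3.4 may not respond immediately. A minimal sketch of a wait loop follows, assuming Superset's /health endpoint is served on the same port as the UI; the names is_up and wait_until_up are hypothetical.

```python
import time
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts and HTTP error statuses (URLError/HTTPError).
        return False


def wait_until_up(url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll the URL until it answers or the attempts run out."""
    for _ in range(attempts):
        if is_up(url):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    print("ready" if wait_until_up("http://localhost:8088/health") else "still not responding")
```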

Source: bilab blog

⚠️ WARN: This doc might be outdated. Use with caution. Only tested with Python v3.7



Pre-Requisites

  1. Install Microsoft Visual C++ 14.x standalone: Build Tools for Visual Studio 2019 (x86, x64, ARM, ARM64)

    1. Select latest version of MSVCv142 — VS 2019 C++ x64/x86 build tools

    2. Select Windows 10 SDK

  2. Install Python 3.7.x

    1. Install PIP within the installer

    2. Add Python 3.7 to PATH

  3. Use CMD to execute below commands (Recommended)

Installation

Ideally run these commands sequentially …

:: Create directory to host the files
mkdir D:\superset
cd /d D:\superset

:: Check Versions
python --version
pip --version
systeminfo | findstr /C:"OS"

:: Upgrade Setuptools & PIP
pip install --upgrade setuptools pip

:: Create Virtual Environment named venv
python -m venv venv

:: Activate Virtual Environment
venv\Scripts\activate

:: Workaround - Install PIP within Virtual Environment
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py --ssl-no-revoke
python get-pip.py

:: Workaround - Upgrade Setuptools & PIP within venv
pip install --upgrade setuptools pip

:: (OUTDATED) Install Superset Requirements - Not required towards end of 2020
:: -- ref: https://gist.github.com/mark05e/d9cccae129dd11a21d7219eddd7d9923#gistcomment-3614048
:: curl https://raw.githubusercontent.com/apache/incubator-superset/master/requirements.txt -o requirements.txt --ssl-no-revoke
:: pip install -r requirements.txt

:: Install Superset
pip install apache-superset

:: (OUTDATED) Workaround - Install specific versions for compatibility (USE WITH CAUTION - ONLY IF NEEDED)
:: pip uninstall pandas
:: pip install pandas==0.23.4
:: pip install Flask==1.0
:: pip install SQLAlchemy==1.2.18

:: Install DB Drivers - Postgres & MS SQL
pip install psycopg2
pip install pymssql

:: Open Scripts folder to do superset related stuff
cd venv\Scripts

:: Create application database
python superset db upgrade

:: Create admin user
set FLASK_APP=superset
flask fab create-admin

:: Load some data to play with (optional)
python superset load_examples

:: Create default roles and permissions
python superset init

:: Start web server on port 8088
python superset run -p 8088 --with-threads --reload --debugger
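If one of the steps above fails partway, it is useful to know which packages actually made it into the virtual environment. The sketch below checks importability with the standard library only; missing_modules is a hypothetical helper, and the module names assume that apache-superset imports as superset and that the drivers import under their package names.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Run with the venv's python so the check sees the venv's site-packages.
    print("missing:", missing_modules(["superset", "psycopg2", "pymssql"]) or "nothing")
```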

(OUTDATED) Sample requirements.txt

This sample can be used as a reference if something does not work correctly or in case of installation issues.

alembic==1.0.11
amqp==2.5.0
apispec==1.3.3
asn1crypto==0.24.0
attrs==19.1.0
Babel==2.7.0
billiard==3.6.0.0
bleach==3.1.0
boto3==1.4.7
botocore==1.7.48
cchardet==2.1.4
celery==4.3.0
certifi==2019.6.16
cffi==1.12.3
chardet==3.0.4
click==6.7
colorama==0.3.9
contextlib2==0.5.5
cryptography==2.7
defusedxml==0.6.0
docutils==0.15.2
et-xmlfile==1.0.1
Flask==1.0
Flask-AppBuilder==2.1.10
Flask-Babel==0.12.2
Flask-Caching==1.7.2
Flask-Compress==1.4.0
Flask-JWT-Extended==3.21.0
Flask-Login==0.4.1
Flask-Migrate==2.5.2
Flask-OpenID==1.2.5
Flask-SQLAlchemy==2.4.0
Flask-WTF==0.14.2
flower==0.9.3
future==0.16.0
geographiclib==1.49
geopy==1.20.0
gunicorn==19.9.0
humanize==0.5.1
idna==2.8
ijson==2.4
isodate==0.6.0
itsdangerous==1.1.0
jdcal==1.4.1
Jinja2==2.10.1
jmespath==0.9.4
jsonlines==1.2.0
jsonschema==3.0.2
kombu==4.6.3
linear-tsv==1.1.0
Mako==1.1.0
Markdown==3.1.1
MarkupSafe==1.1.1
marshmallow==2.19.5
marshmallow-enum==1.4.1
marshmallow-sqlalchemy==0.17.0
numpy==1.17.0
openpyxl==2.4.11
pandas==0.23.4
parsedatetime==2.4
pathlib2==2.3.4
polyline==1.4.0
prison==0.1.2
psycopg2==2.8.3
pure-sasl==0.6.1
pycparser==2.19
pydruid==0.5.6
PyHive==0.6.1
PyJWT==1.7.1
pymssql==2.1.4
pyrsistent==0.15.4
python-dateutil==2.8.0
python-editor==1.0.4
python-geohash==0.8.5
python3-openid==3.1.0
pytz==2019.2
PyYAML==5.1.2
requests==2.22.0
rfc3986==1.3.2
s3transfer==0.1.13
simplejson==3.16.0
six==1.12.0
SQLAlchemy==1.2.18
SQLAlchemy-Utils==0.34.1
sqlparse==0.3.0
superset==0.28.1
tableschema==1.6.0
tabulator==1.23.0
thrift==0.11.0
thrift-sasl==0.3.0
tornado==5.1.1
unicodecsv==0.14.1
Unidecode==1.1.1
urllib3==1.25.3
vine==1.3.0
webencodings==0.5.1
Werkzeug==0.15.5
WTForms==2.2.1
xlrd==1.2.0

This output can be obtained by running this command against your venv

pip freeze > requirements.txt

🔎 References

  1. Superset Installation Guide

  2. Windows Compilers for Python

  3. Direct Link to download Visual Studio Build Tools 2019 (might be broken)

  4. Stackoverflow — Remove PIP installed packages

  5. Workaround — Pandas package downgrade (Ref 1, Ref 2)

  6. Workaround: SQLAlchemy package downgrade (Ref 1)

  7. @philip-sparks suggestion from superset issue #5788

  8. @hainv suggestion

  9. @ronna update command — ref 1, ref 2


mark05e / apache-superset-on-windows10.md

Apache Superset is a powerful modern data exploration and visualization platform that allows users to visualize and explore their data in a user-friendly environment. Originally developed by Airbnb, Superset can connect to a wide variety of databases, making it an invaluable tool for data analysts and business intelligence professionals. In this article, we will walk you through the step-by-step process of installing Apache Superset on a Windows system using Windows Subsystem for Linux (WSL).

Prerequisites

Before you begin the installation, ensure you have the following prerequisites:

  • Windows 10 or later: WSL is supported on Windows 10 and later versions.

  • WSL Installed: To use Ubuntu in WSL, ensure you have it installed. You can check this by opening PowerShell and typing wsl --list --verbose.

    If you don’t have WSL installed, you can enable it by running wsl --install in PowerShell as administrator.

  • Ubuntu Distribution: You can install Ubuntu from the Microsoft Store.

  • Python (3.6 or higher): Superset requires Python for installation. You can check the version by running python3 --version. If it’s not installed, you can install Python using the following commands:

    sudo apt update
    sudo apt install python3 python3-pip python3-dev
    

Steps to Install Apache Superset on WSL

Step 1: Install Required Dependencies

Before installing Superset, you must install several dependencies. Start your Ubuntu WSL terminal and run the following commands:

sudo apt update
sudo apt install build-essential libssl-dev libffi-dev python3-dev \
    libsasl2-dev libmysqlclient-dev libpq-dev libjpeg-dev \
    libxml2-dev libxslt1-dev zlib1g-dev

Step 2: Create a Virtual Environment

Creating a Python virtual environment helps manage dependencies effectively for different projects. You can create a virtual environment using venv:

python3 -m venv superset-venv
source superset-venv/bin/activate

Step 3: Upgrade pip and Install Apache Superset

Once your virtual environment is activated, ensure you have the latest pip version and install Apache Superset:

pip install --upgrade pip
pip install apache-superset

Step 4: Configure the Superset Database

Apache Superset requires a database for its metadata. You’ll typically use SQLite for testing, but for a production scenario, it’s advisable to use PostgreSQL or MySQL.

To initialize the metadata database with the default SQLite backend, run superset db upgrade (the same command appears in Step 6).

If you opt for a different database, you will need to set the database URI. For instance, to use PostgreSQL:

  1. Install PostgreSQL dependencies:

    sudo apt install postgresql postgresql-contrib libpq-dev
    
  2. Create a new PostgreSQL user and database:

    sudo -u postgres createuser supersetuser -P
    sudo -u postgres createdb supersetdb -O supersetuser
    
  3. Configure the Superset database connection:
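    Set SQLALCHEMY_DATABASE_URI in superset_config.py. Superset does not read DATABASE_URL on its own, so the environment variable below only takes effect if your superset_config.py copies it into SQLALCHEMY_DATABASE_URI: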

    export DATABASE_URL="postgresql+psycopg2://supersetuser:yourpassword@localhost/supersetdb"
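One way to connect the environment variable to Superset is a small superset_config.py. The following is a sketch, not the project's official configuration: SQLALCHEMY_DATABASE_URI is the setting Superset reads, while DATABASE_URL, the helper metadata_db_uri and the SQLite fallback path are assumptions made for this example. Superset picks up the file when it is on PYTHONPATH or when SUPERSET_CONFIG_PATH points to it.

```python
import os
from typing import Mapping

# Fallback used only in this sketch when DATABASE_URL is not set.
FALLBACK_URI = "sqlite:///" + os.path.join(os.path.expanduser("~"), ".superset", "superset.db")


def metadata_db_uri(env: Mapping[str, str] = os.environ) -> str:
    """Return the metadata database URI from DATABASE_URL, or the SQLite fallback."""
    return env.get("DATABASE_URL") or FALLBACK_URI


# Superset reads this module-level name from superset_config.py.
SQLALCHEMY_DATABASE_URI = metadata_db_uri()
```

Keeping the lookup in a function makes it easy to check the result in a Python shell before starting Superset.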
    

Step 5: Create an Admin User

Now, create an admin user that will have access to the Superset dashboard:

export FLASK_APP=superset
superset fab create-admin --username admin --firstname Admin --lastname User --email admin@superset.com --password admin

Step 6: Initialize Superset

After setting up the admin user, you need to initialize Superset. Running the following commands will load sample data and create necessary roles and permissions:

superset db upgrade
superset load_examples
superset init

Step 7: Start Apache Superset

To run the Apache Superset server, execute the following command:

superset run -p 8088 --with-threads --reload --debugger

Step 8: Access Superset

Once the server is running, open your web browser and navigate to http://localhost:8088 (the port passed to superset run in Step 7).

Log in using the credentials you created for the admin user.

Conclusion

With the above steps, you have successfully installed Apache Superset on your Windows machine using Ubuntu WSL. You can explore its extensive features for data visualization and analytics, enabling you to harness the power of your data effectively.

Additional Resources

  • Apache Superset Documentation

  • WSL Installation Guide

By following this guide, you will empower your data visualization capabilities with Apache Superset on a Windows environment. Happy exploring!

Introduction

What is Superset

  • Superset is a Data Visualization tool which is cloud-native, highly available and scalable as it works very well with containers.
  • You can easily install it on a Kubernetes cluster using helm charts and then customize it to your needs
  • It was developed at Airbnb by the creator of Airflow, Maxime Beauchemin, and became a top-level project of the Apache Software Foundation in 2021.
    – Significant contributions were made by Lyft and Dropbox.
    – It is also available as a managed service by Maxime’s company Preset
  • Superset is currently run at scale at many companies. For example, Superset is run in Airbnb’s production environment inside Kubernetes and serves 600+ daily active users viewing over 100K charts a day.

What this article will cover

This article covers the steps to install Superset locally on Windows with DB drivers for MS SQL Server, Dremio and MySQL, plus pyodbc. I had to dig a lot to get this installation working. The result is the article below, and in particular the Dockerfile, which should save you that effort.

Prerequisites

Docker Desktop

If you do not already have Docker Desktop installed please follow the steps mentioned in the link below.

https://docs.docker.com/desktop/windows/install/

Once everything is installed correctly you should see a Docker icon in your system tray

Just double click on this icon to open the Docker Desktop window, where you can see all your containers whether stopped or running

MSSQL Server

This is only needed if you are using a local instance of MSSQL Server.

To install MSSQL Server, download the installer from the link below:

https://go.microsoft.com/fwlink/?linkid=866662

Once the server is installed, install SQL Server Management Studio (SSMS) from the Microsoft link below:

https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-ver15

Once SSMS is installed, use it to connect to the server at localhost with Windows Authentication.

After you are connected, create a new login with any name you like; I named mine superset. The important step is to enable “SQL Server and Windows Authentication mode” in the server properties. You can skip all of these steps if you already have a SQL Server available on your network.

Dremio

To install Dremio, pull its Docker image from Docker Hub:

https://hub.docker.com/r/dremio/dremio-oss/

You can do this from the Windows Command Prompt.

Step1:- Open Command Prompt

Step2:- Run the following command

docker pull dremio/dremio-oss

Step3:- Then run below docker command to start the container

docker run -p 9047:9047 -p 31010:31010 -p 45678:45678 dremio/dremio-oss

Once the command completes check your Docker containers list as shown in Docker step above (in Docker Desktop section), you should see a Dremio container running.
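Besides looking at the container list, you can confirm that the published ports accept connections. A minimal sketch, assuming the three ports from the docker run command above are mapped on localhost; port_open and DREMIO_PORTS are hypothetical names.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


DREMIO_PORTS = (9047, 31010, 45678)  # ports published by the docker run command above

if __name__ == "__main__":
    for port in DREMIO_PORTS:
        print(port, "open" if port_open("localhost", port) else "closed")
```

Dremio may need some time after the container starts before its ports accept connections, so a closed port right after docker run is not necessarily an error.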

Step4:- Then go to the page localhost:9047 on your web browser and for the first time it will ask you to setup a new user account. Create the account by following the instructions on the screen and keep a note of the username and password. You will need them while connecting from Superset to Dremio

Step5:- Once you are logged in to Dremio, upload a sample csv file to Dremio with data which you would like to visualize on superset. As you can see in below screenshot I have uploaded WheatWorldPSD.csv file using the button highlighted in red

Now you are all set to install Superset along with necessary drivers

Custom Dockerfile for Superset

I spent many hours getting this right so that everything works. Use the Dockerfile below, which installs the custom DB drivers for MSSQL, MySQL and Dremio.

FROM apache/superset
# Switching to root to install the required packages
USER root

# install FreeTDS and dependencies
RUN apt-get update \
 && apt-get install unixodbc -y \
 && apt-get install unixodbc-dev -y \
 && apt-get install freetds-dev -y \
 && apt-get install freetds-bin -y \
 && apt-get install tdsodbc -y \
 && apt-get install --reinstall build-essential -y \
 && apt-get install rpm2cpio -y \
 && apt-get install cpio -y

# populate "odbcinst.ini"
RUN echo "[FreeTDS]\n\
Description = FreeTDS unixODBC Driver\n\
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so\n\
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so" >> /etc/odbcinst.ini

# Install Dremio ODBC driver
RUN pip install pyodbc
RUN wget https://download.dremio.com/odbc-driver/1.5.4.1002/dremio-odbc-1.5.4.1002-1.x86_64.rpm

RUN rpm2cpio dremio-odbc-1.5.4.1002-1.x86_64.rpm | cpio -idmv
RUN mv opt/dremio-odbc /opt/dremio-odbc


# Example: installing the MySQL driver to connect to the metadata database
# if you prefer Postgres, you may want to use `psycopg2-binary` instead
RUN pip install mysqlclient
# Example: installing a driver to connect to Redshift
# Find which driver you need based on the analytics database
# you want to connect to here:
# https://superset.apache.org/installation.html#database-dependencies
#RUN pip install pyodbc
RUN pip install sqlalchemy-redshift
RUN pip install sqlalchemy-dremio
RUN pip install pymssql
# Switching back to using the `superset` user
USER superset
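To confirm that the FreeTDS entry written by the Dockerfile ended up in /etc/odbcinst.ini, the file can be parsed as an INI file. This is a sketch using only the standard library; odbc_drivers is a hypothetical helper and SAMPLE simply repeats the block the Dockerfile appends.

```python
import configparser
from typing import Dict


def odbc_drivers(ini_text: str) -> Dict[str, str]:
    """Map each driver section in odbcinst.ini text to its Driver library path."""
    # strict=False tolerates a section that was appended twice by repeated builds.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the original key case ("Driver", "Setup")
    parser.read_string(ini_text)
    return {section: parser[section].get("Driver", "") for section in parser.sections()}


SAMPLE = """[FreeTDS]
Description = FreeTDS unixODBC Driver
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
"""

if __name__ == "__main__":
    print(odbc_drivers(SAMPLE))
```

Inside the running container you would pass the contents of /etc/odbcinst.ini instead of SAMPLE.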

How to run the above docker file

Open the folder containing the Dockerfile above in Visual Studio Code.

Build the image (this may take some time) by running the command below in the VS Code terminal. The dot “.” at the end of the command is required; it tells Docker to use the current folder as the build context.

docker build -t custom_superset .

Run Docker container

docker run -d -p 8080:8088 --name superset custom_superset

Create Superset admin account

docker exec -it superset superset fab create-admin --username admin --firstname Superset --lastname Admin --email admin@superset.com --password admin

Migrate local DB to latest

docker exec -it superset superset db upgrade

Load Examples

docker exec -it superset superset load_examples

Initialize Superset

docker exec -it superset superset init

Then log in at the URL below with the username and password admin/admin:

http://localhost:8080/login/ (username/password: admin/admin)

There you have it: Superset running as a Docker container on your local machine 🙂 You are welcome!

Next time you want to run it, just start it from the container list in Docker Desktop. All your connectivity settings will remain intact.

Connectivity to MS SQL server

Step1:- Go to the Data > Databases dropdown and select Add Database.

Step2:- Choose Microsoft SQL Server from the Supported Databases dropdown in the popup. You can select Other if you don’t see it as an option.

Step3:- On the next screen, give your connection a meaningful name and enter the string below as the SQLALCHEMY URI. Replace the username and password, which are superset and superset90 in my case.

Because Superset runs in a Docker container and my MS SQL Server runs on the Windows host, I had to use host.docker.internal as the hostname. Replace it with your own SQL Server hostname if the server runs elsewhere.

Connection String

mssql+pymssql://superset:superset90@host.docker.internal:1433/RawData/
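If your password contains characters such as @ or /, they must be percent-encoded before they go into the URI, otherwise the string is parsed incorrectly. A small sketch of building the string follows; mssql_uri is a hypothetical helper, and the default port 1433 is taken from the connection string above.

```python
from urllib.parse import quote_plus


def mssql_uri(user: str, password: str, host: str, database: str, port: int = 1433) -> str:
    """Build a pymssql SQLAlchemy URI with percent-encoded credentials."""
    return f"mssql+pymssql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


if __name__ == "__main__":
    print(mssql_uri("superset", "superset90", "host.docker.internal", "RawData"))
```

Paste the printed value into the SQLALCHEMY URI field from Step3.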

Connectivity to Dremio

The steps for Dremio are the same as those shown above for MS SQL Server. Just use the connection string below.

Connection String

dremio://raghav.ds:dremio9003@172.17.0.3:31010/dremio?SSL=1

Again, replace the username and password with your own; mine were raghav.ds and dremio9003.

To replace the hostname (172.17.0.3 in my case), look up the bridge network IP address that Docker assigned to your Dremio container using the command below

 docker network inspect bridge
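The command prints a JSON array in which each network lists its attached containers with their addresses. The sketch below extracts those addresses from the output; container_ips is a hypothetical helper, and the container names and IDs in SAMPLE are made up for illustration (a container started without --name gets a generated name).

```python
import json
from typing import Dict


def container_ips(inspect_output: str) -> Dict[str, str]:
    """Map container names to IPv4 addresses from `docker network inspect` JSON output."""
    ips: Dict[str, str] = {}
    for network in json.loads(inspect_output):
        for container in (network.get("Containers") or {}).values():
            # Addresses are reported with a prefix length, e.g. "172.17.0.3/16".
            ips[container["Name"]] = container["IPv4Address"].split("/")[0]
    return ips


# Hypothetical, trimmed-down output used only to demonstrate the parsing.
SAMPLE = json.dumps([{"Name": "bridge", "Containers": {
    "abc123": {"Name": "dremio", "IPv4Address": "172.17.0.3/16"},
    "def456": {"Name": "superset", "IPv4Address": "172.17.0.2/16"},
}}])

if __name__ == "__main__":
    print(container_ips(SAMPLE))
```

To use it with real data, save the command's output to a file and pass the file's contents to container_ips.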

Conclusion

I had to research and try a lot of things to get everything working. I am sharing the result of my hard work here. If you like this please do share it with people who need it. Also let me know if you have any questions.

I have a Masters in Data Analytics and 15 years of experience in Data Engineering and Data Science space.
View all posts by Raghav
