DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Norm of a One-Dimensional Tensor in Python Libraries
  • Improving Sentiment Score Accuracy With FinBERT and Embracing SOLID Principles
  • CockroachDB TIL: Volume 11
  • What Does Synchronization With Asyncio Look Like

Trending

  • The Modern Data Stack Is Overrated — Here’s What Works
  • A Guide to Developing Large Language Models Part 1: Pretraining
  • The Role of Functional Programming in Modern Software Development
  • Apache Doris vs Elasticsearch: An In-Depth Comparative Analysis
  1. DZone
  2. Data Engineering
  3. Databases
  4. Exploring CockroachDB With Jupyter Notebook and Python

Exploring CockroachDB With Jupyter Notebook and Python

Today, we're going to explore CockroachDB from the Data Science perspective, using a popular exploratory web tool called Jupyter Notebook and Python language.

By 
Artem Ervits user avatar
Artem Ervits
DZone Core CORE ·
Jan. 11, 22 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
4.0K Views

Join the DZone community and get the full member experience.

Join For Free

This is the next installment in the series of tutorials on CockroachDB and Docker Compose.

I was inspired to write this post based on this article. The article goes over using Jupyter with Oracle, MySql, and Postgresql, we're going to do the same with Cockroach! One caveat here is the heavy reliance on ipython-sql library. We're going to use Pandas library as the ipython-sql magic functions are not compatible with Cockroach today. Hopefully, you will find it useful.

You can find the older posts here: Part 1, Part 2, Part 3, Part 4.

  • Information on CockroachDB can be found here.
  • Information on Jupyter Notebook can be found here.

We're going to continue using our trusty docker-compose. Jupyter project publishes its images on Docker Hub. We're going to use the minimal image for this tutorial.

Here's my compose file:

version: '3.9'

services:

 crdb:
   image: cockroachdb/cockroach:v21.2.3
   container_name: crdb-1
   ports:
     - "26257:26257"
     - "8080:8080"
   command: start-single-node --insecure
   volumes:
     - ${PWD}/cockroach-data/crdb:/cockroach/cockroach-data:rw

 jupyter:
   image: jupyter/minimal-notebook
   container_name: jupyter
   environment:
     - GRANT_SUDO=yes
   ports:
     - "8888:8888"
   volumes:
     - $PWD:/home/jovyan/work


If you've been following my posts, you may notice that I switched to CockroacihDB version 19.2.3. For the Jupyter environment, I'm passing GRANT_SUDO=yes as an environment variable in case, we need to install some additional software inside the container. I'm also exposing the notebook at port 8888.

  1. Open Jupyter notebook

When docker-compose up is executed, Jupyter will output an access token to the Jupyter notebook.

    To access the notebook, open this file in a browser:
        file:///home/jovyan/.local/share/jupyter/runtime/nbserver-6-open.html
    Or copy and paste one of these URLs:
        http://1b8009ed3252:8888/?token=37a818efee156c6e3b2d3bd21db07827ce250560a336987b
     or http://127.0.0.1:8888/?token=37a818efee156c6e3b2d3bd21db07827ce250560a336987b


You can also access this information with the `docker logs ' command. Copy one of the provided URLs and paste it into a browser.

  1. Once you're in the notebook, you will notice a work directory. Clicking into it will expose all of the files in your host's current directory.

This behavior can be explained by the following volume mapping.

volumes:
     - $PWD:/home/jovyan/work


Let's create a new notebook by clicking New and selecting Python 3. You can click on the title of the notebook and name it to your liking.

  1. Install prerequisite software

The beauty of Jupyter's environment is its ability to adapt to different scenarios. In the next step, we're going to install Python libraries using pip from within the notebook!

!pip install sqlalchemy cockroachdb pandas psycopg2-binary matplotlib


NOTE: Since the release of this post, cockroachdb adapter for sqlalchemy has been renamed to sqlalchemy-cockroachdb.

Then click on the Run icon to execute.

Before we go to the next step, let's switch to Cockroach and load some data.

  1. Initialize a workload.
docker exec -it crdb-1 ./cockroach workload init movr


This will initialize a workload for our application. We've chosen a sample movr application.

10:18 $ docker exec -it crdb-1 ./cockroach workload init movr
I200211 15:18:30.420866 1 workload/workloadsql/dataload.go:135  imported users (0s, 50 rows)
I200211 15:18:30.431182 1 workload/workloadsql/dataload.go:135  imported vehicles (0s, 15 rows)
I200211 15:18:30.468309 1 workload/workloadsql/dataload.go:135  imported rides (0s, 500 rows)
I200211 15:18:30.498384 1 workload/workloadsql/dataload.go:135  imported vehicle_location_histories (0s, 1000 rows)
I200211 15:18:30.528526 1 workload/workloadsql/dataload.go:135  imported promo_codes (0s, 1000 rows)


Now we're ready to populate the database:

docker exec -it crdb-1 ./cockroach workload run movr --duration=1m


The workload will run for 1m, specified by the --duration parameter.

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
   59.9s        0            439            7.3      1.9      1.9      2.4      2.5      4.5  addUser

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
   59.9s        0            124            2.1      3.6      3.7      4.5      4.5      5.0  addVehicle
...


There are different workload generators available with Cockroach, feel free to explore them here. When load completes, you can log in to the database and poke around in the database.

docker exec -it crdb-1 ./cockroach sql --insecure
root@:26257/defaultdb> show databases;
  database_name
+---------------+
  bank
  defaultdb
  movr
  postgres
  system
(5 rows)

Time: 2.847ms

root@:26257/defaultdb> use movr;
SET

Time: 856.8µs

root@:26257/movr> show tables;
          table_name
+----------------------------+
  promo_codes
  rides
  user_promo_codes
  users
  vehicle_location_histories
  vehicles
(6 rows)

Time: 2.459ms

root@:26257/movr>


While we're there, let's also add a user with access to our database.

  1. Create a user and grant access
CREATE USER IF NOT EXISTS maxroach;
GRANT ALL ON DATABASE movr TO maxroach;
GRANT ALL ON TABLE movr.* TO maxroach;
\q


We can test access for users maxroach with:

docker exec -it crdb-1 ./cockroach sql --insecure --user=maxroach --database=movr


  1. Access the database in Jupyter.
import pandas as pd
from sqlalchemy.engine import create_engine
engine = create_engine('cockroachdb://maxroach@crdb-1:26257/movr')
df = pd.read_sql('select count(*) from vehicles', engine)


If you're familiar with Python, sqlalchemy is a popular Python framework for working with databases. Feel free to explore our docs for more in-depth examples.

Once a database connection is established, we use pandas data frame to read the results of the select statement. Click on Run to execute. Nothing happened!

Because we stored the result in a variable called df, we need to use the methods provided by the library.

Type the following and hit Run.

df.head()


The good news, the result matches the total count of vehicles in the database!

Let's try another query:

df = pd.read_sql('select city, count(city) from vehicles group by city', engine)
print(df)


  1. Use matplotlib to graph a plot

Jupyter is also capable of graphing your results, here's a simple graph chart of the data frame.

df.plot(x ='city', y='count', kind = 'bar')


  1. Graph a pie chart.
df.plot.pie(y='count',figsize=(5, 5),autopct='%1.1f%%', startangle=90)


This is just a taste of the powerful Jupyter Notebook and CockroachDB Python capabilities. Feel free to visit the docs for more.

Please share your feedback in the comments.

jupyter notebook Database connection Python (language) CockroachDB

Published at DZone with permission of Artem Ervits. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Norm of a One-Dimensional Tensor in Python Libraries
  • Improving Sentiment Score Accuracy With FinBERT and Embracing SOLID Principles
  • CockroachDB TIL: Volume 11
  • What Does Synchronization With Asyncio Look Like

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!