DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
View Events Video Library
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Mobile Database Essentials: Assess data needs, storage requirements, and more when leveraging databases for cloud and edge applications.

Full-Stack Observability Essentials: Explore the fundamentals of system-wide observability and key components of the OpenTelemetry standard.

Monitoring and Observability for LLMs: Datadog and Google Cloud discuss how to achieve optimal AI model performance.

Automated Testing: The latest on architecture, TDD, and the benefits of AI and low-code tools.

Related

  • Common Mistakes to Avoid When Writing SQL Code
  • SQL Commands: A Brief Guide
  • Cloud SQL Guidelines for Cloud Database Administration
  • Amazon RDS vs Azure SQL — Know the Key Differentiators and Choose the Best

Trending

  • Getting Started With Prometheus Workshop: Instrumenting Applications
  • Unraveling Lombok's Code Design Pitfalls: Exploring Encapsulation Issues
  • Java Parallel GC Tuning
  • Microservices With Apache Camel and Quarkus (Part 5)
  1. DZone
  2. Data Engineering
  3. Databases
  4. Implementing SchemaSpy in your MySQL environment

Implementing SchemaSpy in your MySQL environment

Peter Zaitsev user avatar by
Peter Zaitsev
·
Jun. 05, 13 · Interview
Like (0)
Save
Tweet
Share
6.90K Views

Join the DZone community and get the full member experience.

Join For Free
this post comes from michael coburn at the mysql performance blog.


schemaspy lately i have been working with a set of customers on a longer term basis which has given me time to explore new tools using their environments.  one tool that i am finding very helpful is called schemaspy .

schemaspy is a java-based tool (requires java 5 or higher ) that analyzes the metadata of a schema in a database and generates a visual representation of it in a browser-displayable format. it lets you click through the hierarchy of database tables via child and parent table relationships as represented by both html links and entity-relationship diagrams. it’s also designed to help resolve the obtuse errors that a database sometimes gives related to failures due to constraints.

one of the things that i like about  erd tools is that i can quickly generate a graphic representation of the database to see which tables are referenced the most, and which are candidates for dropping (why keep old data if you don’t need to?).  i get up to speed a lot faster and can contribute to the development process a lot more effectively when i know the relationships between tables versus observing queries only.  an additional benefit is that schemaspy is command-line driven and builds the html after each run, so i find it convenient to set it up on crontab so that schema changes are automatically picked up.  nice, eh?

usage of this tool is very straightforward and it does a lot of the work for you.  basically if your database has foreign keys then you’re laughing, if not don’t despair, you just have a bit more work ahead of you.  most of the time schemaspy “does the right thing” but sometimes you need to give it a little help in the form of metadata files.

installation of schemaspy and dependencies

  • download the latest copy of schemaspy jar file (5.0.0 at time of writing)
  • make sure you have latest copy of java jre for your platform
  • you will need a java driver for your database — i’m using the mysql connector/j
  • install the graphiz package

creating a mysql.properties file

while not explicitly required, i prefer to create this once so that the  command line is shorter and neater.  in my case here is the properties file in use for my percona server 5.6.10 sandbox:

description=mysql
driver=com.mysql.jdbc.driver
connectionspec=jdbc:mysql://127.0.0.1:5610/schemaspy
driverpath=/usr/share/java/mysql-connector-java.jar

example schema

create table `parent` (
 `parent_id` int(10) unsigned not null auto_increment,
 primary key (`parent_id`)
) engine=innodb default charset=latin1
create table `child_a` (
 `id` int(10) unsigned not null auto_increment,
 `parent_id` int(10) unsigned default null,
 primary key (`id`)
) engine=innodb default charset=latin1
create table `child_b` (
 `id` int(10) unsigned not null auto_increment,
 `parent_id` int(10) unsigned default null,
 primary key (`id`),
 key `parent_id` (`parent_id`),
 constraint `child_b_ibfk_1` foreign key (`parent_id`) references `parent` (`parent_id`)
) engine=innodb default charset=latin1

notice that in the case of child_a there is an implicit fk relationship to parent via parent_id, however in child_b it is explicit.  this has an impact on how schemaspy identifies the relationship, and whether you need to provide a hint or not.

running schemaspy (with foreign keys)

when you run schemaspy without a metadata file, you will see that the relationship between parent and child_b is properly displayed:

fk_only

table child_a can be made to display if you tag the implied relationships box, but this generally only works if you have used the exact same name (in this case, parent_id) in the child and parent tables.  your environment may be like a lot of shops that we see, where the parent table is known as id, and on the child tables they reference it as parent_id — this might make sense to developers but it doesn’t help schemaspy at all.  this is where metadata files come in to play

implied

creating a metadata file

the format of the metadata file is xml-based.  i have included the metadata file i used to properly link child_a to parent table:

<?xml version="1.0" encoding="utf-8"?>
<schemameta xmlns:xsi="http://www.w3.org/2001/xmlschema-instance"
xsi:nonamespaceschemalocation="http://schemaspy.sourceforge.net/xmlschema/2011/02/05/schemaspy.meta.xsd">
<comments>
main production database. percona, michael coburn
</comments>
<tables>
<table name="parent" comments="parent table">
<column name="parent_id" primarykey="true">
</column>
</table>
<table name="child_a" comments="non-fk relationship, implicit relationship to parent">
<column name="id" primarykey="true">
</column>
<column name="parent_id">
<foreignkey table="parent" column="parent_id"/>
</column>
</table>
</tables>
</schemameta>

if you’re looking for other examples you can find a more comprehensive example here .

running schemaspy

the syntax below if specific for mysql, but note that schemaspy works for pretty much all the popular rdbms out there.

java -jar schemaspy_5.0.0.jar -t mysql -u msandbox_ro -p msandbox -meta schemaspy.meta.xml -o /var/www/schemaspy/

the -o directive tells schemaspy where to write the output to, make sure this is a directory that can then be accessed by a web server.  also note that since i created the mysql.properties file in the same directory as the jar file, i don’t need to pass any host:port or schema name information.

viewing schemaspy output

a completed relationships view of these three tables should look like this:

after_metadata_application

final thoughts

while i find the relationships tab the most useful component of schemaspy, i don’t want to leave you with the impression that this is the only component of the tool.  there are additional tabs:

  • tables – names, number of children, parents, count of columns, row counts, and comments — a great way for a high level overview of the table sizes and an easy way to use the search feature of your browser to zero in on a particular table
  • constraints – lists the explicit foreign key constraints in the database (this does not include constraints identified via metadata files!)
  • anomalies – identifies possible relationships between columns/tables based on names, tables without indexes, columns flagged ‘nullable’ and ‘must be unique’ (woops!), single column tables, incrementing column names in tables, and tables with the string null instead of the actual sql null value. this is basically a quick sanity check of your schema for any significant errors or items requiring review.
  • columns – a listing of all columns in the schema, really handy to sort by name to see if you have any implicit constraints that you might have missed and can then write into your metadata file.
  • donate – this is free software and john currier asks for donations so he can justify the time spent maintaining schemaspy to his wife :)

finally don’t forget to automate schemaspy via crontab once you’re done.

what are some erd tools you use and how do they compare with schemaspy?  feel free to answer via the comments.  thanks for reading!

Database MySQL Relational database sql Metadata

Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Common Mistakes to Avoid When Writing SQL Code
  • SQL Commands: A Brief Guide
  • Cloud SQL Guidelines for Cloud Database Administration
  • Amazon RDS vs Azure SQL — Know the Key Differentiators and Choose the Best

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: