TokuDB and PerconaFT Database File Management (Part 1 of 2)

Databases have a lot of files under the hood. Here's a quick overview of the files under TokuDB and PerconaFT and what they do.

In this blog post, we’ll look at TokuDB and PerconaFT database file management.

The TokuDB/PerconaFT file set consists of many different files that serve different purposes. These blog posts list the types of TokuDB and PerconaFT files, explain their purpose, show where they are located, and describe how to move them around.

Peter Zaitsev blogged on the same topic a few years ago. Once you have read back through Peter’s post and reached the end of this series, you should have some ideas to help you manage your data set more efficiently.

TokuDB and PerconaFT Files and File Types

  • tokudb.environment
    • This file is the root of the PerconaFT file set and contains various bits of metadata about the system, such as creation times, current file format versions, etc.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • tokudb.rollback
    • Every transaction within PerconaFT maintains its own transaction rollback log. These logs are stored together within a single PerconaFT dictionary file and take up space within the PerconaFT cachetable (just like any other PerconaFT dictionary).
    • The transaction rollback logs “undo” any changes made by a transaction if the transaction is explicitly rolled back, or if it is rolled back during recovery because it was still uncommitted when a crash occurred.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • tokudb.directory
    • PerconaFT maintains a mapping of a dictionary name (example: sbtest.sbtest1.main) to an internal file name (example: _sbtest_sbtest1_main_xx_x_xx.tokudb). This mapping is stored within this single PerconaFT dictionary file and takes up space within the PerconaFT cachetable just like any other PerconaFT dictionary.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • Dictionary files
    • TokuDB dictionary (data) files store actual user data. For each MySQL table there will be:
      • One “status” dictionary that contains metadata about the table.
      • One “main” dictionary that stores the full primary key (an imaginary key is used if one was not explicitly specified) and full row data.
      • One “key” dictionary for each additional key/index on the table.
    • These are typically named: _<database>_<table>_<key>_<internal_txn_id>.tokudb
    • PerconaFT creates/expects these files in the directory specified by tokudb_data_dir if set, otherwise the MySQL datadir is used.
  • Recovery log files
    • The PerconaFT recovery log records every operation that modifies a PerconaFT dictionary. Periodically, PerconaFT takes a snapshot of the system called a checkpoint. This checkpoint ensures that the modifications recorded within the PerconaFT recovery logs have been applied to the appropriate dictionary files up to a known point in time and synced to disk.
    • These files use a rolling naming convention of the form: log<log_file_number>.tokulog<log_file_format_version>
    • PerconaFT creates/expects these files in the directory specified by tokudb_log_dir if set, otherwise the MySQL datadir is used.
    • PerconaFT does not track which log files should or shouldn’t be present. Upon startup, it discovers the logs in the log directory and replays them in order. If the wrong logs are present, recovery aborts and may damage the dictionaries.
  • Temporary files
    • PerconaFT might need to create some temporary files in order to perform some operations. When the bulk loader is active, these temporary files might grow to be quite large.
    • As different operations start and finish, the files will come and go.
    • There are no temporary files left behind upon a clean shutdown.
    • PerconaFT creates/expects these files in the directory specified by tokudb_tmp_dir if set. If not, the tokudb_data_dir is used if set, otherwise the MySQL datadir is used.
  • Lock files
    • PerconaFT uses lock files to prevent multiple processes from accessing or writing to the files in each of its functional areas. Each lock file is in the same directory as the file(s) that it protects. These empty files are used only as semaphores across processes. They are safe to delete or ignore as long as no server instances are currently running and using the data set. The lock files are:
    • __tokudb_lock_dont_delete_me_environment
    • __tokudb_lock_dont_delete_me_recovery
    • __tokudb_lock_dont_delete_me_logs
    • __tokudb_lock_dont_delete_me_data
    • __tokudb_lock_dont_delete_me_temp
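
The naming conventions listed above are enough to sort a directory listing by file type. The following is a minimal Python sketch of that idea. The regular expressions, the helper names classify and inventory, and the category labels are assumptions made for illustration; they are not part of TokuDB or PerconaFT.

```python
import re

# File names taken from the list above.
ROOT_FILES = {"tokudb.environment", "tokudb.rollback", "tokudb.directory"}
LOCK_PREFIX = "__tokudb_lock_dont_delete_me_"

# Assumed patterns, derived from the naming conventions in the text:
#   log<log_file_number>.tokulog<log_file_format_version>
#   _<database>_<table>_<key>_<internal_txn_id>.tokudb
LOG_RE = re.compile(r"^log(\d+)\.tokulog(\d+)$")
DICT_RE = re.compile(r"^_.+\.tokudb$")


def classify(filename):
    """Return a rough category label for one file name (hypothetical helper)."""
    if filename in ROOT_FILES:
        return "root"
    # Check lock files before dictionaries: both start with an underscore.
    if filename.startswith(LOCK_PREFIX):
        return "lock"
    if LOG_RE.match(filename):
        return "recovery_log"
    if DICT_RE.match(filename):
        return "dictionary"
    return "other"


def inventory(filenames):
    """Group an iterable of file names by category, preserving input order."""
    groups = {}
    for name in filenames:
        groups.setdefault(classify(name), []).append(name)
    return groups
```

For example, classify("_sbtest_sbtest1_main_xx_x_xx.tokudb") returns "dictionary" and classify("__tokudb_lock_dont_delete_me_data") returns "lock". Temporary files fall into "other" because the list above gives no naming pattern for them.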

PerconaFT is extremely pedantic about validating its data set. If a file is missing, cannot be found where it is expected, or seems to contain nonsensical data, PerconaFT will assert, abort, or fail to start. It does this not to annoy you, but to protect you from doing any further damage to your data.
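
Because a missing file can keep the server from starting, it may help to check the layout before a restart. The Python sketch below resolves the directory fallbacks described earlier (tokudb_data_dir, tokudb_log_dir, tokudb_tmp_dir, and the MySQL datadir) and reports which of the three root files are absent. The function names resolve_dirs and missing_root_files are hypothetical, and the variable values are assumed to be supplied by the caller (for example, copied from the server configuration). This is an external sanity check, not how PerconaFT itself validates its data set.

```python
import os

# Root files that the text says live in the MySQL datadir.
ROOT_FILES = ("tokudb.environment", "tokudb.rollback", "tokudb.directory")


def resolve_dirs(datadir, tokudb_data_dir=None, tokudb_log_dir=None,
                 tokudb_tmp_dir=None):
    """Apply the fallback rules from the file list (hypothetical helper).

    data: tokudb_data_dir if set, otherwise datadir
    log:  tokudb_log_dir if set, otherwise datadir
    tmp:  tokudb_tmp_dir if set, then tokudb_data_dir, otherwise datadir
    """
    return {
        "root": datadir,
        "data": tokudb_data_dir or datadir,
        "log": tokudb_log_dir or datadir,
        "tmp": tokudb_tmp_dir or tokudb_data_dir or datadir,
    }


def missing_root_files(datadir):
    """Return the root file names that are not present in datadir."""
    return [name for name in ROOT_FILES
            if not os.path.isfile(os.path.join(datadir, name))]
```

With only datadir given, all four entries resolve to that directory. If tokudb_data_dir is set and tokudb_tmp_dir is not, temporary files follow tokudb_data_dir while the recovery logs stay in datadir.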

Look out for part 2 of this series for information on how to move your log, dictionary, and temp files around correctly.

Published at DZone with permission of
