Data Migration From GaussDB to GBase8a

This step-by-step tutorial shows how to export data from GaussDB and import it into GBase 8a, and compares the available export methods.

By Cong Li · Jul. 10, 24 · Tutorial

Exporting Data From GaussDB

Comparison of Export Methods

Export Tool 1: GDS (Export to a Regular File System)

Note: The GDS tool must be installed on the server to which the data files are exported.

Remote export mode: export business data from the cluster to an external host.
1. Plan the export path, create a GDS operating user, and grant that user write permission on the export path.
2. Install, configure, and start GDS on the server to which data will be exported.
3. Create an external table in the cluster, with the location path in the format "gsfs://192.168.0.90:5000/".

Local export mode: export business data from the cluster to the hosts where the cluster nodes reside. This mode is tailored for large numbers of small files.
1. Plan the export path, create a directory to store the exported data files on each DN in the cluster, such as "/output_data", and change the owner of this path to omm.
2. Install, configure, and start GDS on the server to which data will be exported.
3. Create an external table in the cluster, with the location path in the format "file:///output_data/".

Applicable scenarios: GDS suits high-concurrency, large-volume exports. It uses multi-DN parallelism to export data from the database to data files, which improves overall export performance. It does not support exporting directly to an HDFS file system.

Notes on remote export:
1. Multiple GDS services can export concurrently, but each GDS service can serve only one cluster at a time.
2. Deploy the GDS services on the same intranet as the cluster nodes. Export speed is limited by network bandwidth; a 10GE network is recommended.
3. Supported data file formats: TEXT, CSV, and FIXED. A single row must be smaller than 1 GB.

Notes on local export:
1. Data is split evenly and written to the specified directories on the cluster nodes, consuming disk space on those nodes.
2. Supported data file formats: TEXT, CSV, and FIXED. A single row must be smaller than 1 GB.

Export Tool 2: gs_dump and gs_dumpall

gs_dump exports a single database or its objects. gs_dumpall exports all databases in the cluster along with the global objects common to every database. Both tools can export at the database, schema, and table levels, and each level can be exported as full content, object definitions only, or data only.
Step 1: As the omm operating system user, log in to any host where the MPPDB service is installed and run "source ${BIGDATA_HOME}/mppdb/.mppdbgs_profile" to load the environment variables.
Step 2: Use gs_dump to export the postgres database: gs_dump -W Bigdata@123 -U jack -f /home/omm/backup/postgres_backup.tar -p 25308 postgres -F t
Applicable scenarios:
1. Export complete database information, including data and all object definitions.
2. Export the full information of all databases, including every database in the cluster and the common global objects (roles and tablespace information).
3. Export only object definitions, including tablespace, database, function, schema, table, index, and stored procedure definitions.
4. Export only data, excluding all object definitions.
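
As a sketch of modes 3 and 4 above (assuming the same port and credentials as the Step 2 example; gs_dump's flags mirror those of PostgreSQL's pg_dump, on which it is modeled):

Shell
 
# Object definitions only (-s), plain-text format (-F p)
gs_dump -W Bigdata@123 -U jack -p 25308 postgres -s -F p -f /home/omm/backup/postgres_ddl.sql 
# Data only (-a), tar format (-F t)
gs_dump -W Bigdata@123 -U jack -p 25308 postgres -a -F t -f /home/omm/backup/postgres_data.tar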


GDS External Table Remote Export Example

Shell
 
# On the GDS server: create the export directory and a dedicated GDS user
mkdir -p /output_data 
groupadd gdsgrp 
useradd -g gdsgrp gds_user 
chown -R gds_user:gdsgrp /output_data
# Start GDS as a daemon (-D), serving /output_data on 192.168.0.90:5000 for hosts in 10.10.0.1/24
/opt/bin/gds/gds -d /output_data -p 192.168.0.90:5000 -H 10.10.0.1/24 -D 

-- In the cluster: create a write-only foreign table pointing at the GDS service
CREATE FOREIGN TABLE foreign_tpcds_reasons 
( 
r_reason_sk integer not null, 
r_reason_id char(16) not null, 
r_reason_desc char(100) 
) SERVER gsmpp_server OPTIONS (LOCATION 'gsfs://192.168.0.90:5000/', FORMAT 'CSV', ENCODING 
'utf8', DELIMITER E'\x08', QUOTE E'\x1b', NULL '') WRITE ONLY; 
-- Export: write the source table's rows through the foreign table to the GDS server
INSERT INTO foreign_tpcds_reasons SELECT * FROM reasons; 

# When the export is finished, find and stop the GDS process
ps -ef | grep gds 
gds_user 128954 1 0 15:03 ? 00:00:00 gds -d /output_data -p 192.168.0.90:5000 -D 
gds_user 129003 118723 0 15:04 pts/0 00:00:00 grep gds 
kill -9 128954 
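
To sanity-check the export before shutting GDS down, you can list what was written on the export server (the path matches the example above):

Shell
 
# On the GDS server: confirm the exported files exist and are non-empty
ls -lh /output_data 
head -n 3 /output_data/*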


GDS External Table Local Export Example

Shell
 
# On each cluster node: create the export directory and give it to the omm user
mkdir -p /output_data 
chown -R omm:wheel /output_data 

-- In the cluster: create a write-only foreign table with a local file:// location
CREATE FOREIGN TABLE foreign_tpcds_reasons 
( 
r_reason_sk integer not null, 
r_reason_id char(16) not null, 
r_reason_desc char(100) 
) SERVER gsmpp_server OPTIONS (LOCATION 'file:///output_data/', FORMAT 'CSV', ENCODING 
'utf8', DELIMITER E'\x08', QUOTE E'\x1b', NULL '') WRITE ONLY; 
-- Export: each DN writes its share of the rows into its local /output_data/
INSERT INTO foreign_tpcds_reasons SELECT * FROM reasons; 
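
Because local export leaves each DN's slice of the data on that node's own disk, the files usually need to be collected onto one staging host before they can be served to GBase. A minimal sketch, using hypothetical DN hostnames dn1 and dn2:

Shell
 
# Hypothetical hostnames; substitute your cluster's DN hosts and staging path
for host in dn1 dn2; do 
  scp "omm@${host}:/output_data/*" /staging/gaussdb_export/ 
done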


gs_dumpall Export Example

Export the global tablespace and user information shared by all databases (run as the omm administrator user); the exported file is in text format.

Shell
 
gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_globals.sql -p 25308 -g 
gs_dumpall[port='25308'][2018-11-14 19:06:24]: dumpall operation successful 
gs_dumpall[port='25308'][2018-11-14 19:06:24]: total time: 1150 ms


Export the full information of all databases (run as the omm administrator user); the exported file is in text format. The command produces lengthy output; when "total time" finally appears, the export has completed successfully.

Shell
 
gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_backup.sql -p 25308 
gs_dumpall[port='25308'][2017-07-21 15:57:31]: dumpall operation successful 
gs_dumpall[port='25308'][2017-07-21 15:57:31]: total time: 9627 ms 


Export only the object definitions of all databases, without data (run as the omm administrator user; note the -s flag); the exported file is in text format.

Shell
 
gs_dumpall -W Bigdata@123 -U omm -f /home/omm/backup/MPPDB_backup.sql -p 25308 -s 
gs_dumpall[port='25308'][2018-11-14 11:28:14]: dumpall operation successful 
gs_dumpall[port='25308'][2018-11-14 11:28:14]: total time: 4147 ms 
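
Before replaying a dump against GBase 8a, it is worth scanning it for GaussDB-specific DDL that GBase may reject; the clauses grepped for below are illustrative examples, not an exhaustive list:

Shell
 
# Illustrative scan for GaussDB-specific syntax that may need editing before import
grep -nE 'DISTRIBUTE BY|TABLESPACE|WITH \(orientation' /home/omm/backup/MPPDB_backup.sql | head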


GBase 8a MPP Data Import

Execute SQL File To Import Database Definitions

Shell
 
# Replay the SQL exported from GaussDB against GBase; -vvv prints verbose output,
# and -f continues past statement errors (gccli follows mysql-client-style options)
gccli -ugbase -pgbase20110531 -Dtestdb -vvv -f <guessdb_out.sql >>guessdb_out.result  2>guessdb_out.err


Note: The -D parameter must name an existing database in the GBase cluster. The statements in guessdb_out.sql, however, run against whatever databases the file itself specifies, regardless of the database passed with -D.
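
A quick way to confirm that the definitions imported cleanly is to check the error file and list what now exists in the target database (the -e flag for running a single statement is assumed to behave as in the mysql client):

Shell
 
# Any lines in the error file indicate failed statements
cat guessdb_out.err 
# List the tables now present in testdb
gccli -ugbase -pgbase20110531 -Dtestdb -e "SHOW TABLES;"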

GBase 8a MPP Import Text Data

Step 1

The server holding the data exported from GaussDB must run an FTP service. Ensure that every node in the GBase 8a MPP cluster can reach the data files on that server via FTP.
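
A quick reachability check, run from each GBase node (the address and credentials are placeholders):

Shell
 
# Lists the export directory over FTP; any error here means the load will fail too
curl --list-only ftp://ftpuser:ftppass@192.168.0.1/pub/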

Step 2

Identify the characteristics of the data files exported from GaussDB:

  • Encoding format
  • Field delimiter
  • Quote character
  • Null value in data files
  • Escape character (default is double quotes)
  • Whether the data file contains a header row
  • Line break style of the exported data files
  • Date format in date columns, etc.

Step 3

Based on the characteristics identified in Step 2, write and execute the SQL for importing data into GBase 8a MPP.

Syntax format:

Shell
 
LOAD DATA INFILE 'file_list' 
INTO TABLE [dbname.]tbl_name 
[options] 
options: 
[CHARACTER SET charset_name] 
[DATA_FORMAT number [HAVING LINES SEPARATOR]] 
[NULL_VALUE 'string'] 
[FIELDS 
[TERMINATED BY 'string'] 
[ENCLOSED BY 'string'] 
[PRESERVE BLANKS] 
[AUTOFILL] 
[LENGTH 'string'] 
[TABLE_FIELDS 'string'] 
] 
[LINES 
[TERMINATED BY 'string'] 
] 
[MAX_BAD_RECORDS number] 
[DATETIME FORMAT format]
[DATE FORMAT format] 
[TIMESTAMP FORMAT format] 
[TIME FORMAT format] 
[TRACE number] 
[TRACE_PATH 'string'] 
[NOSPLIT] 
[PARALLEL number] 
[MAX_DATA_PROCESSORS number] 
[MIN_CHUNK_SIZE number] 
[SKIP_BAD_FILE number] 
[SET col_name = value[,...]] 
[IGNORE NUM LINES] 
[FILE_FORMAT format] 


Load Examples

Multi-data file load:

Shell
 
gbase> LOAD DATA INFILE 'ftp://192.168.0.1/pub/lineitem.tbl, 
http://192.168.0.2/lineitem.tbl' INTO TABLE test.lineitem FIELDS 
TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY '\n';


Import statement with wildcards for multiple files:

Shell
 
gbase> LOAD DATA INFILE 'ftp://192.168.10.114/data/*' INTO TABLE test.t; 


Import statement with column, row delimiters, and enclosing characters:

Shell
 
gbase> LOAD DATA INFILE 'ftp://192.168.0.1/pub/lineitem.tbl' INTO TABLE test.lineitem FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY '\n';


Import statement with date format:

Shell
 
load data infile 
'ftp://192.168.88.141/load_data/table_fields.tbl' into table test.t 
fields terminated by ',' table_fields 'i, vc, dt date "%H:%i:%s %Y-%m-%d", dt1 date "%Y-%m-%d %H:%i:%s"'; 


Import statement with auto-fill:

Shell
 
load data infile 'ftp://192.168.88.141/load_data/autofill.tbl' into table test.t fields terminated by '|' autofill;


Import statement with constant values:

Shell
 
gbase> Load data infile 'data.tbl' into table t fields terminated by '|' set c='2016-06-06 18:08:08',d='default',e=20.6; 


Import statement ignoring header:

Shell
 
gbase> load data infile 'http://192.168.6.39/test.tbl' into table data_test fields terminated by '|' ignore 3 lines;


Import statement with Blob data:

Shell
 
gbase> load data infile 'http://192.168.6.39/test.tbl' into table 
data_test fields terminated by '|' table_fields 'a,b,c type_text,d'; 
gbase> load data infile 'http://192.168.6.39/test.tbl' into table 
data_test fields terminated by '|' table_fields 'a,b,c type_base64,d'; 
gbase> load data infile 'http://192.168.6.39/test.tbl' into table 
data_test fields terminated by '|' table_fields 'a,b,c type_url,d'; 
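
Finally, tying the two halves of the migration together, here is a sketch of a LOAD DATA statement matching the GDS export settings used earlier (CSV, UTF-8, E'\x08' field delimiter, E'\x1b' quote character, empty string as NULL). The FTP address and table name are placeholders, and the '\x08'/'\x1b' hex-escape spelling is an assumption; if your gccli build does not accept it, re-export from GaussDB with printable delimiters such as '|' and '"' instead:

Shell
 
gbase> load data infile 'ftp://192.168.0.1/output_data/*' into table test.reasons 
character set utf8 null_value '' 
fields terminated by '\x08' enclosed by '\x1b' 
lines terminated by '\n';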



Published at DZone with permission of Cong Li. See the original article here.

Opinions expressed by DZone contributors are their own.
