Compatibility of GitLab on CockroachDB and YugabyteDB (I) — System Initialization
This article compares how well these two databases support GitLab and, to some extent, reflects their compatibility with standard PostgreSQL.
GitLab is a globally popular source code management tool. In earlier versions, users could choose either MySQL or PostgreSQL, but since version 12.1.0, official support for MySQL has been completely dropped.
Many features in newer versions of GitLab are built on PostgreSQL, which makes GitLab a representative example of the products that use PostgreSQL as their underlying data store.
Imagine a scenario where a large group is divided into divisions. Each division, or even a small team, may maintain its own GitLab instance, making it tricky to manage these repositories at the group level. Typical problems include:
- Versioning issues (open source and commercial versions, high and low versions)
- Fine-grained permission control
- Data backups
- Infrastructure utilization
A unified GitLab environment with good scalability and high availability would be the best solution. A traditional standalone PostgreSQL database cannot meet these needs, so can we consider running GitLab on a distributed database?
CockroachDB and YugabyteDB are two relatively well-known, newer open-source distributed databases that implement the PostgreSQL (PG) protocol. Their official websites describe them as follows:
CockroachDB supports the PostgreSQL wire protocol and the majority of PostgreSQL syntax. Existing applications built on PostgreSQL can often be migrated to CockroachDB without changing the application code.
YugabyteDB is a high-performance, cloud-native distributed SQL database that aims to support all PostgreSQL features.
CockroachDB says it supports most PG syntax, and YugabyteDB says it aims to support all PG features. This series of review articles compares how well the two databases support GitLab and, to some extent, reflects their compatibility with standard PostgreSQL.
Test Environment
- CockroachDB
defaultdb=# select version();
                                         version
-----------------------------------------------------------------------------------------
 CockroachDB CCL v21.2.2 (x86_64-unknown-linux-gnu, built 2021/12/01 14:35:45, go1.16.6)
(1 row)
- YugabyteDB
postgres=# select version();
                                                  version
------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.9.1.0-b0 on x86_64-pc-linux-gnu, compiled by gcc (Homebrew gcc 5.5.0_4) 5.5.0, 64-bit
(1 row)
- GitLab
GitLab information
Version:    12.1.0-ee
Revision:   1f2e6f3f6d8
Directory:  /home/git/gitlab
DB Adapter: PostgreSQL
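When the same test is repeated across several database releases, it can help to pull the product name and release number out of the `select version()` banner automatically. The following is a minimal sketch, not part of the original test setup: the helper name `parse_version` is hypothetical, and the two regular expressions are assumptions derived only from the banners shown above, so other builds may format their banners differently.

```python
import re

# Assumed banner shapes, taken only from the two version() outputs in this article.
COCKROACH_RE = re.compile(r"CockroachDB\s+\S+\s+v(\d+(?:\.\d+)*)")
YUGABYTE_RE = re.compile(r"PostgreSQL\s+(\d+(?:\.\d+)*)-YB-(\d+(?:\.\d+)*)")


def parse_version(banner):
    """Return (product, release tuple) for a version() banner (hypothetical helper)."""
    match = COCKROACH_RE.search(banner)
    if match:
        return "CockroachDB", tuple(int(p) for p in match.group(1).split("."))
    match = YUGABYTE_RE.search(banner)
    if match:
        # group(1) is the PostgreSQL level reported by YugabyteDB,
        # group(2) is the YugabyteDB release itself.
        return "YugabyteDB", tuple(int(p) for p in match.group(2).split("."))
    return "unknown", ()


if __name__ == "__main__":
    print(parse_version("CockroachDB CCL v21.2.2 (x86_64-unknown-linux-gnu, built 2021/12/01 14:35:45, go1.16.6)"))
    print(parse_version("PostgreSQL 11.2-YB-2.9.1.0-b0 on x86_64-pc-linux-gnu, compiled by gcc (Homebrew gcc 5.5.0_4) 5.5.0, 64-bit"))
```

Returning the release as a tuple of integers means two releases can be compared with ordinary tuple comparison.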
GitLab deployed with standard PostgreSQL contains the following database schema:
gitlab_production=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 r       |   249
 i       |   903
 S       |   231
(3 rows)
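In `pg_class`, the `relkind` codes in this result stand for ordinary tables (`r`), indexes (`i`), and sequences (`S`). As a rough sketch of how such output could be turned into named counts for later comparison, the snippet below parses the psql text; the function name `parse_relkind_counts` is hypothetical and the parser assumes the default aligned psql layout shown above.

```python
# Mapping of the pg_class.relkind codes that appear in this article.
RELKIND_NAMES = {"r": "table", "i": "index", "S": "sequence"}


def parse_relkind_counts(psql_output):
    """Parse 'relkind | count' rows from psql text output (hypothetical helper)."""
    counts = {}
    for line in psql_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Keep only data rows: a known relkind code followed by an integer.
        if len(parts) == 2 and parts[0] in RELKIND_NAMES and parts[1].isdigit():
            counts[RELKIND_NAMES[parts[0]]] = int(parts[1])
    return counts


baseline_output = """
 relkind | count
---------+-------
 r       |   249
 i       |   903
 S       |   231
(3 rows)
"""

if __name__ == "__main__":
    print(parse_relkind_counts(baseline_output))
```

The header, separator, and row-count lines are skipped because they do not match the "known code plus integer" shape.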
CockroachDB Startup Process
1. Database Initialization
Execute the GitLab setup program to generate the required database schema.
dc@dc-virtual-machine:/home/git/gitlab$ sudo -u git -H bundle exec rake gitlab:setup RAILS_ENV=production
This will create the necessary database tables and seed the database.
You will lose any previous data stored in the database.
Do you want to continue (yes/no)? yes
Dropped database 'gitlab'
Created database 'gitlab'
-- enable_extension("pg_trgm")
rake aborted!
ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR: unimplemented: extension "pg_trgm" is not yet supported
HINT: You have attempted to use a feature that is not yet implemented.
See: https://go.crdb.dev/issue-v/51137/v21.2
: CREATE EXTENSION IF NOT EXISTS "pg_trgm"
/home/git/gitlab/config/initializers/peek.rb:18:in `async_exec_params'
/home/git/gitlab/config/initializers/peek.rb:18:in `exec_params'
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:611:in `block (2 levels) in exec_no_cache'
....
As the output above shows, GitLab initialization relies on PostgreSQL extensions (here, pg_trgm). Unfortunately, CockroachDB does not currently support them, so setup fails at the very first step, before any objects are created in the database.
gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
Empty set
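When comparing several failed runs, it is convenient to reduce each rake log to the PostgreSQL error class and message. The sketch below assumes the `PG::<Class>: ERROR: <message>` shape seen in the output above; the function name `summarize_pg_error` is hypothetical and not part of GitLab or Rails.

```python
import re

# Assumed log shape, based on the rake output in this article:
#   ... PG::<ErrorClass>: ERROR: <message>
PG_ERROR_RE = re.compile(r"PG::(\w+): ERROR:\s+(.*)")


def summarize_pg_error(log_text):
    """Return the first PG error class and message found in a log, or None."""
    match = PG_ERROR_RE.search(log_text)
    if match is None:
        return None
    return {"error_class": match.group(1), "message": match.group(2).strip()}


sample_log = """rake aborted!
ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR: unimplemented: extension "pg_trgm" is not yet supported
HINT: You have attempted to use a feature that is not yet implemented.
"""

if __name__ == "__main__":
    print(summarize_pg_error(sample_log))
```

Because `.` does not match a newline by default, the captured message stops at the end of the error line and leaves the HINT out.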
2. Visit GitLab
When we visit the main GitLab page, it returns a 502 error.
The logs show that the failing SQL statement could not find its target table:
ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR: relation "geo_nodes" does not exist
: SELECT a.attname, format_type(a.atttypid, a.atttypmod),
         pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
         c.collname, col_description(a.attrelid, a.attnum) AS comment
    FROM pg_attribute a
    LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
    LEFT JOIN pg_type t ON a.atttypid = t.oid
    LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
   WHERE a.attrelid = '"geo_nodes"'::regclass
     AND a.attnum > 0 AND NOT a.attisdropped
   ORDER BY a.attnum
3. Update Database Version
The CockroachDB version tested above is not the latest, so is it possible that the latest version already supports extensions? We upgraded to the latest version, v22.1:
defaultdb=# select version();
                                      version
------------------------------------------------------------------------------------
 CockroachDB CCL v22.1.0 (x86_64-pc-linux-gnu, built 2022/05/23 16:27:47, go1.17.6)
(1 row)
Running setup again to create the database produces the same error, "ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR: unimplemented: extension "pg_trgm" is not yet supported", which indicates that extensions are not supported in the new version either.
YugabyteDB Startup Process
1. Database Initialization
Modify the GitLab configuration file to switch the database connection to YugabyteDB, then initialize a new database in the same way.
dc@dc-virtual-machine:/home/git/gitlab$ sudo -u git -H bundle exec rake gitlab:setup RAILS_ENV=production
This will create the necessary database tables and seed the database.
You will lose any previous data stored in the database.
Do you want to continue (yes/no)? yes
Dropped database 'gitlab'
Created database 'gitlab'
-- enable_extension("pg_trgm")
   -> 2.5496s
-- enable_extension("plpgsql")
   -> 0.1143s
-- create_table("abuse_reports", {:id=>:serial, :force=>:cascade})
   -> 0.3709s
-- create_table("appearances", {:id=>:serial, :force=>:cascade})
   -> 0.3022s
... ...
-- create_table("issue_tracker_data", {:force=>:cascade})
   -> 3.7627s
-- create_table("issues", {:id=>:serial, :force=>:cascade})
rake aborted!
ActiveRecord::StatementInvalid: PG::InternalError: ERROR: index method "ybgin" not supported yet
HINT: See https://github.com/YugaByte/yugabyte-db/issues/1337. Click '+' on the description to raise its priority
: CREATE INDEX "index_issues_on_description_trigram" ON "issues" USING gin ("description" gin_trgm_ops)
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'
/home/git/gitlab/vendor/bundle/ruby/2.6.0/gems/peek-pg-1.3.0/lib/peek/views/pg.rb:17:in `async_exec'
From the output above, we can see that setup initially runs normally and creates the extensions and tables, but after about 20 minutes it fails while creating an index. The statement requests a "gin" index, which YugabyteDB maps to its own "ybgin" index method, and this version does not yet support that method.
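The failing statement builds a trigram index (`gin_trgm_ops`) on `issues.description`. To illustrate what such an index stores, here is a simplified Python approximation of how pg_trgm splits text into trigrams and scores similarity. It is a sketch for ASCII text only, with hypothetical function names; the real extension is implemented in C and handles locales and multibyte characters.

```python
import re


def trigrams(text):
    """Approximate pg_trgm trigram extraction for ASCII text (simplified sketch)."""
    result = set()
    # Words are runs of alphanumerics; everything else acts as a separator.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # pg_trgm treats each word as prefixed by two spaces and suffixed by one.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(sorted(trigrams("cat")))
    print(similarity("word", "two words"))
```

A GIN index over these trigram sets lets the database find candidate rows for pattern and similarity searches without scanning every description, which is why the setup step cannot complete without a working GIN implementation.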
Look at the objects generated by the database up to this point:
gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 S       |   113
 i       |   391
 r       |   117
(3 rows)
The situation looks a little better than with CockroachDB, but it still falls far short of the full database schema: only 117 of 249 tables, 391 of 903 indexes, and 113 of 231 sequences were created.
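To put the gap into numbers, the object counts reported so far can be compared against the standard PostgreSQL baseline. This is a small sketch using only the counts from the query results in this article; the dictionary and function names are hypothetical.

```python
# Object counts taken from the query results shown in this article.
BASELINE = {"tables": 249, "indexes": 903, "sequences": 231}          # standard PostgreSQL
YUGABYTE_2_9 = {"tables": 117, "indexes": 391, "sequences": 113}      # YugabyteDB v2.9
COCKROACH_21_2 = {"tables": 0, "indexes": 0, "sequences": 0}          # CockroachDB v21.2 (empty set)


def coverage(actual, baseline):
    """Percentage of baseline objects that exist, per object kind."""
    return {
        kind: round(100.0 * actual.get(kind, 0) / total, 1)
        for kind, total in baseline.items()
    }


if __name__ == "__main__":
    for name, counts in [("YugabyteDB v2.9", YUGABYTE_2_9), ("CockroachDB v21.2", COCKROACH_21_2)]:
        print(name, coverage(counts, BASELINE))
```

With these inputs, YugabyteDB v2.9 reaches a little under half of the baseline for each object kind, while CockroachDB v21.2 is at zero.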
2. Visit GitLab
At this point, the main GitLab page is still inaccessible. The logs show that the cause is again a missing target table.
source=rack-timeout id=7gatOugcqB8 timeout=60000ms state=ready
Started GET "/" for 10.3.74.126 at 2022-05-27 16:05:31 +0800
Processing by RootController#index as HTML
Completed 500 Internal Server Error in 78ms (ActiveRecord: 58.8ms | Elasticsearch: 0.0ms)
ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR: relation "projects" does not exist
LINE 8: WHERE a.attrelid = '"projects"'::regclass
                           ^
: SELECT a.attname, format_type(a.atttypid, a.atttypmod),
         pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
         c.collname, col_description(a.attrelid, a.attnum) AS comment
    FROM pg_attribute a
    LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
    LEFT JOIN pg_type t ON a.atttypid = t.oid
    LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
   WHERE a.attrelid = '"projects"'::regclass
     AND a.attnum > 0 AND NOT a.attisdropped
   ORDER BY a.attnum
):
3. Update Database Version
Similarly, we upgraded YugabyteDB to the latest version to see whether GIN index support has been completed:
postgres=# select version();
                                                  version
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.13.2.0-b0 on x86_64-pc-linux-gnu, compiled by clang version 12.0.1 (https://github.com/yugabyte/llvm-project.git bdb147e675d8c87cee72cc1f87c4b82855977d94), 64-bit
(1 row)
Running the setup program again goes smoothly this time: about 30 minutes later, the program exits normally without errors. We then look at the objects in the database:
gitlab=# select C.relkind,count(C.relname) from pg_class C left join pg_namespace n on n.oid = C.relnamespace where n.nspname = 'public' group by C.relkind;
 relkind | count
---------+-------
 S       |   231
 i       |   903
 r       |   249
(3 rows)
The object counts are exactly the same as in the standard PostgreSQL database. Opening the GitLab homepage in a browser now redirects to the login page, and the logs show no errors.
After we fill out and submit the user registration form, the new user is registered successfully and is redirected to the main page of GitLab.
At first glance, GitLab functionality is not affected by switching databases. More detailed tests will be presented in the next article.
Test Conclusion
1. CockroachDB v21.2 does not support PostgreSQL extensions, so GitLab cannot initialize the database and fails to start. The problem persists after updating to the latest version, v22.1.
2. YugabyteDB v2.9 does not support GIN (Generalized Inverted Index) indexes, so setup fails after creating only part of the schema and GitLab cannot start. After updating to the latest version, v2.13, the problem is solved, and you can access the GitLab page and register users normally.
3. YugabyteDB supports the PostgreSQL extensions that GitLab setup enables (pg_trgm and plpgsql); CockroachDB does not.
The Next Step
Next, we will try to bypass the GitLab database generation step by importing a standard GitLab database, with data, into CockroachDB and YugabyteDB. We will then select some frequently used read and write scenarios and compare how compatible the two databases are.
Published at DZone with permission of he ao. See the original article here.