Avoid BigQuery SQL Injection in Go With saferbq
Learn how saferbq, a Go wrapper, secures Go BigQuery queries by safely handling user-supplied table and dataset names, preventing SQL injection risks.
You can build dynamic queries in BigQuery using the Go SDK. When building applications that allow users to select tables or datasets dynamically, you need to include those identifiers in your SQL queries. I was surprised to find that the BigQuery manual and code examples do not warn about SQL injection vulnerabilities when doing this.
Even more surprising: BigQuery does not provide a built-in mechanism to safely handle user input in table or dataset names. The official SDK supports parameterized queries for data values using @ and ? syntax, but these cannot be used for identifiers that need backtick escaping. You’re forced to use string concatenation, which opens the door to SQL injection. This post explains the problem and introduces a package I wrote to tackle this shortcoming.
See: https://github.com/mevdschee/saferbq
The SQL Injection Problem
When you need to reference a table name that comes from user input, the BigQuery SDK leaves you with no safe option. Consider this common scenario:
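client, err := bigquery.NewClient(ctx, projId)
if err != nil {
	return err
}
tableName := getUserInput() // untrusted input
q := client.Query(fmt.Sprintf("SELECT * FROM `%s` WHERE id = 1", tableName))
_, err = q.Run(ctx) // unsafe: tableName is concatenated into the SQL text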
If a user provides the input logs` WHERE 1=1; DROP TABLE customers; -- then the backtick in the input closes the identifier early, and the resulting query becomes:
SELECT * FROM `logs` WHERE 1=1; DROP TABLE customers; --` WHERE id = 1
This might execute successfully, return all logs, and drop your customers table. You might want to use BigQuery’s named parameters as a mitigation:
q := client.Query("SELECT * FROM @table WHERE id = 1")
q.Parameters = []bigquery.QueryParameter{{Name: "table", Value: tableName}}
But this fails with an error because named parameters cannot be used for identifiers, only for data values. The official documentation and examples consistently show string concatenation for table names without any security warnings.
The saferbq Solution
I implemented a package called saferbq that adds safe identifier handling to the BigQuery SDK. It introduces $identifier syntax specifically for table and dataset names:
client := saferbq.NewClient(ctx, projId)
tableName := getUserInput()
q := client.Query("SELECT * FROM $table WHERE id = 1")
q.Parameters = []bigquery.QueryParameter{{Name: "$table", Value: tableName}}
q.Run(ctx)
If a user tries to inject malicious SQL, the query fails immediately:
Error: identifier $table contains invalid characters: `=;
The package works by intercepting queries before they reach BigQuery. It scans the SQL to identify all $identifier parameters, validates each identifier value character-by-character against BigQuery’s naming rules, and only then wraps them in backticks and substitutes them into the query. Invalid characters like backticks, semicolons, quotes, or slashes cause immediate failure with a detailed error message. The package also validates that all parameters are present and that identifiers don’t exceed BigQuery’s 1024-byte limit.
How It Works
When you execute a query, saferbq intercepts the SQL and parameters before they reach BigQuery. First, it scans through the SQL string to identify all dollar-sign parameters (like $table or $dataset) and extracts their names. Simultaneously, it identifies any native BigQuery parameters that start with @ and positional parameters marked with ?.
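The scanning step can be illustrated with a short sketch. This is not saferbq's actual code: scanPlaceholders and the regular expression are hypothetical, and the sketch assumes placeholder names consist of ASCII letters, digits, and underscores. It also ignores string literals, comments, and system variables, which a real scanner would need to skip.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe matches $name, @name, or a bare ? in the SQL text.
// Assumption: names consist of ASCII letters, digits, and underscores.
var placeholderRe = regexp.MustCompile(`\$[A-Za-z_][A-Za-z0-9_]*|@[A-Za-z_][A-Za-z0-9_]*|\?`)

// scanPlaceholders is a hypothetical helper, not part of saferbq's API.
// It returns the identifier placeholders, the named parameters, and the
// number of positional parameters found in sql.
func scanPlaceholders(sql string) (identifiers, named []string, positional int) {
	for _, m := range placeholderRe.FindAllString(sql, -1) {
		switch m[0] {
		case '$':
			identifiers = append(identifiers, m)
		case '@':
			named = append(named, m)
		default: // the only remaining alternative is "?"
			positional++
		}
	}
	return identifiers, named, positional
}

func main() {
	ids, named, pos := scanPlaceholders("SELECT * FROM $table WHERE id = @id AND kind = ?")
	fmt.Println(ids, named, pos) // prints: [$table] [@id] 1
}
```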
Next, it validates that every parameter in the SQL has a corresponding value in the parameters list, and vice versa, ensuring no parameters are missing or unused. For identifier parameters (those starting with $), it checks that the values are not empty and validates each character.
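A minimal sketch of this two-way check might look as follows. The checkParams helper and its error messages are assumptions made for illustration; the package's real messages and data structures may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// checkParams is a hypothetical helper. placeholders holds the names found in
// the SQL text; values holds the names supplied by the caller.
func checkParams(placeholders []string, values map[string]string) error {
	seen := make(map[string]bool, len(placeholders))
	for _, name := range placeholders {
		seen[name] = true
		if _, ok := values[name]; !ok {
			return fmt.Errorf("parameter %s has no value", name)
		}
	}
	// Collect supplied names that the SQL never references.
	var unused []string
	for name := range values {
		if !seen[name] {
			unused = append(unused, name)
		}
	}
	if len(unused) > 0 {
		sort.Strings(unused) // map order is random; sort for a stable message
		return fmt.Errorf("parameters not used in query: %v", unused)
	}
	return nil
}

func main() {
	err := checkParams(
		[]string{"$table"},
		map[string]string{"$table": "logs", "$dataset": "prod"},
	)
	fmt.Println(err) // prints: parameters not used in query: [$dataset]
}
```

Rejecting unused parameters as well as missing ones catches typos in placeholder names, which would otherwise leave a literal $name in the SQL sent to BigQuery.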
Each identifier value is validated by iterating through its characters. Valid characters include Unicode letters, marks, numbers, underscores, dashes, and spaces. If any invalid character is found (such as backticks, semicolons, quotes, or slashes), the query immediately fails with a detailed error message listing the problematic characters. This prevents any attempt at SQL injection from being executed. The query also fails when BigQuery's 1024-byte limit on identifiers is exceeded.
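The character check described above can be sketched with Go's unicode package. The validateIdentifier function below is a hypothetical illustration rather than the package's implementation; the allowed character classes and the 1024-byte limit are taken from the description in this article.

```go
package main

import (
	"fmt"
	"unicode"
)

// maxIdentifierBytes is the identifier limit stated in the article.
const maxIdentifierBytes = 1024

// isAllowed reports whether r is a letter, mark, number, underscore,
// dash, or space.
func isAllowed(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsMark(r) || unicode.IsNumber(r) ||
		r == '_' || r == '-' || r == ' '
}

// validateIdentifier is a hypothetical helper, not saferbq's actual code.
func validateIdentifier(name, value string) error {
	if value == "" {
		return fmt.Errorf("identifier %s is empty", name)
	}
	if len(value) > maxIdentifierBytes { // len counts bytes, not runes
		return fmt.Errorf("identifier %s exceeds %d bytes", name, maxIdentifierBytes)
	}
	seen := map[rune]bool{}
	var invalid []rune
	for _, r := range value {
		if isAllowed(r) || seen[r] {
			continue
		}
		seen[r] = true // report each offending character once
		invalid = append(invalid, r)
	}
	if len(invalid) > 0 {
		return fmt.Errorf("identifier %s contains invalid characters: %s", name, string(invalid))
	}
	return nil
}

func main() {
	err := validateIdentifier("$table", "logs` WHERE 1=1; DROP TABLE customers; --")
	fmt.Println(err) // prints: identifier $table contains invalid characters: `=;
}
```

Because the check is an allowlist, anything not explicitly permitted is rejected, including bytes that are not valid UTF-8, which Go decodes as U+FFFD.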
After validation succeeds, each identifier is wrapped in backticks and substituted into the SQL in place of its $parameter placeholder. Native BigQuery parameters (@param and ?) are left untouched in the SQL string, but have their names normalized (removing the @ prefix) so BigQuery can process them correctly.
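As a rough sketch of the substitution step, assuming the values have already been validated and therefore contain no backticks, a hypothetical substituteIdentifiers helper could replace each placeholder in a single pass:

```go
package main

import (
	"fmt"
	"regexp"
)

// identRe matches a whole $name placeholder.
// Assumption: names consist of ASCII letters, digits, and underscores.
var identRe = regexp.MustCompile(`\$[A-Za-z_][A-Za-z0-9_]*`)

// substituteIdentifiers is a hypothetical helper. It assumes every value in
// idents has already passed validation and contains no backticks.
func substituteIdentifiers(sql string, idents map[string]string) string {
	return identRe.ReplaceAllStringFunc(sql, func(placeholder string) string {
		value, ok := idents[placeholder]
		if !ok {
			return placeholder // leave unknown placeholders untouched
		}
		return "`" + value + "`"
	})
}

func main() {
	sql := "SELECT * FROM $table WHERE id = @id"
	fmt.Println(substituteIdentifiers(sql, map[string]string{"$table": "logs"}))
	// prints: SELECT * FROM `logs` WHERE id = @id
}
```

Matching whole placeholder names, rather than calling strings.ReplaceAll once per parameter, avoids corrupting a longer name such as $table2 when $table is also present.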
Finally, the transformed SQL and updated parameter list are passed to BigQuery's standard query execution, where the BigQuery SDK itself securely binds the native parameters. This approach ensures identifiers are safely quoted while preserving the security benefits of parameterized queries for data values.
Conclusion
SQL injection remains one of the most critical security vulnerabilities. When official SDKs don’t provide safe mechanisms for common use cases, developers will be tempted to use unsafe string concatenation. The BigQuery SDK supports parameterized queries for data values, but not for identifiers. The saferbq package is a drop-in replacement that maintains the same API while adding that extra safety.
Published at DZone with permission of Maurits Van Der Schee. See the original article here.