fix(spanner_dbapi): replace insecure pickle with json for partition deserialization #17014

Draft
sinhasubham wants to merge 1 commit into main from dbApi-vlnr

@sinhasubham
Contributor

This PR fixes a critical insecure deserialization vulnerability (potential remote code execution) in the spanner_dbapi module [b/510871112]. Previously, the module called pickle.loads() to decode partition IDs supplied by users through the RUN PARTITION statement, which allowed a crafted partition ID to execute arbitrary code during deserialization.
This change removes all pickle usage from the module and migrates partition ID serialization to standard json.
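
To illustrate the class of bug described above, here is a minimal, self-contained sketch. It is not the spanner_dbapi code: the `Payload` class and the harmless `sorted` call are assumptions chosen for demonstration. It shows that `pickle.loads()` invokes a callable named by the serialized bytes, while `json.loads()` rejects the same input.

```python
import json
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data (harmless here)."""

    def __reduce__(self):
        # Whoever builds the pickle picks the callable and its arguments.
        # A real attack would name something far more dangerous than sorted().
        return (sorted, ([3, 1, 2],))


untrusted = pickle.dumps(Payload())

# Unpickling calls sorted([3, 1, 2]) instead of rebuilding a Payload object.
result = pickle.loads(untrusted)

# json.loads can only produce dicts, lists, strings, numbers, booleans, None.
# The pickle bytes are not valid JSON, so decoding fails with a ValueError
# (JSONDecodeError and UnicodeDecodeError are both subclasses of ValueError).
try:
    json.loads(untrusted)
    json_rejected = False
except ValueError:
    json_rejected = True
```

Because the callable is chosen by the producer of the bytes, validating a pickle before loading it is not practical; switching the wire format removes the problem at its source.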

Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request replaces the use of pickle with json for serializing and deserializing partition IDs to mitigate security risks associated with insecure deserialization. It introduces _serialize_value and _deserialize_value helper functions to handle specific types like bytes, datetime, and protobuf messages. Review feedback points out that MessageToDict defaults to camelCase, which could break compatibility with code expecting snake_case, and suggests using preserving_proto_field_name=True. Additionally, the reviewer noted that protobuf messages are currently deserialized as dictionaries rather than original message objects, which may lead to issues with nested field types.
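
A tagged JSON encoding of the kind summarized above might look roughly like the following. This is a sketch under assumptions, not the PR's implementation: the names `encode_value` and `decode_value` are hypothetical, only bytes and datetime are handled, and rejecting unknown tags is a design choice made here.

```python
import base64
import datetime
import json


def encode_value(val):
    """Convert a value into JSON-safe data, tagging types JSON lacks."""
    if isinstance(val, bytes):
        return {"__type__": "bytes", "value": base64.b64encode(val).decode("ascii")}
    if isinstance(val, datetime.datetime):
        return {"__type__": "datetime", "value": val.isoformat()}
    if isinstance(val, dict):
        return {k: encode_value(v) for k, v in val.items()}
    if isinstance(val, (list, tuple)):
        # Tuples come back as lists after a round trip.
        return [encode_value(v) for v in val]
    return val


def decode_value(val):
    """Reverse encode_value, recursing into nested lists and dicts."""
    if isinstance(val, list):
        return [decode_value(v) for v in val]
    if isinstance(val, dict):
        tag = val.get("__type__")
        if tag == "bytes":
            return base64.b64decode(val["value"])
        if tag == "datetime":
            return datetime.datetime.fromisoformat(val["value"])
        if tag is not None:
            # Fail closed on tags this decoder does not know about.
            raise ValueError(f"unsupported type tag: {tag!r}")
        return {k: decode_value(v) for k, v in val.items()}
    return val


original = {
    "token": b"\x00\xff",
    "read_time": datetime.datetime(2024, 1, 2, 3, 4, 5, tzinfo=datetime.timezone.utc),
    "ids": [1, 2],
}
wire = json.dumps(encode_value(original))
restored = decode_value(json.loads(wire))
```

The decoder only ever constructs a fixed set of known types, so untrusted input cannot select arbitrary classes or callables.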

return {
"__type__": "protobuf",
"class": val.__class__.__name__,
"value": MessageToDict(val._pb),

high

MessageToDict converts Protobuf field names to lowerCamelCase by default. If downstream code expects snake_case names (the standard for Python Protobuf objects, and what the previous pickle implementation preserved), lookups can fail with KeyError or AttributeError. Passing preserving_proto_field_name=True keeps the original snake_case field names.
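
The naming difference can be seen with a message type that ships with the protobuf package. The choice of `FieldDescriptorProto` is an assumption made purely for illustration, since it has a snake_case field (`type_name`); the PR itself serializes Spanner messages.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict

# Any message with a multi-word field name would do; this one is bundled
# with protobuf, so the example needs no generated code.
msg = descriptor_pb2.FieldDescriptorProto(name="id", type_name="Example")

# Default behaviour: keys use the JSON (lowerCamelCase) field names.
camel = MessageToDict(msg)

# With the flag set: keys keep the names from the .proto definition.
snake = MessageToDict(msg, preserving_proto_field_name=True)
```

Here `camel` contains the key `typeName`, while `snake` contains `type_name`.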

Suggested change
- "value": MessageToDict(val._pb),
+ "value": MessageToDict(val._pb, preserving_proto_field_name=True),

return {
"__type__": "protobuf",
"class": val.__class__.__name__,
"value": MessageToDict(val),

high

MessageToDict converts Protobuf field names to lowerCamelCase by default. If downstream code expects snake_case names (the standard for Python Protobuf objects, and what the previous pickle implementation preserved), lookups can fail with KeyError or AttributeError. Passing preserving_proto_field_name=True keeps the original snake_case field names.

Suggested change
- "value": MessageToDict(val),
+ "value": MessageToDict(val, preserving_proto_field_name=True),

Comment on lines +67 to +68
elif t == "protobuf":
return _deserialize_value(val["value"])

high

The class name stored during serialization is currently ignored, so Protobuf messages are deserialized as plain dictionaries. This is a breaking change from the previous pickle-based implementation, which restored the original message objects. Furthermore, nested bytes and Timestamp fields within these messages remain strings (base64 and RFC 3339 text, respectively), because MessageToDict performs these conversions and the current _deserialize_value logic does not reverse them. Consider using google.protobuf.json_format.ParseDict if the original message types must be restored.
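
One possible shape for restoring message objects is sketched below. The allowlist `ALLOWED_MESSAGES` and the helper `restore_message` are hypothetical names, and `UninterpretedOption` is used only because it is bundled with protobuf and has a bytes field. The `val._pb` access in the diff suggests proto-plus wrappers, which would need extra handling not shown here.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict, ParseDict

# Hypothetical allowlist: only these class names may be reconstructed,
# so untrusted input cannot choose an arbitrary type.
ALLOWED_MESSAGES = {
    "UninterpretedOption": descriptor_pb2.UninterpretedOption,
}


def restore_message(tagged):
    """Rebuild a message from a {"class": ..., "value": ...} dictionary."""
    cls = ALLOWED_MESSAGES[tagged["class"]]  # KeyError for unknown names
    # ParseDict reverses MessageToDict, including base64 -> bytes.
    return ParseDict(tagged["value"], cls())


original = descriptor_pb2.UninterpretedOption(
    identifier_value="demo", string_value=b"\x00\x01"
)

tagged = {
    "__type__": "protobuf",
    "class": original.__class__.__name__,
    "value": MessageToDict(original, preserving_proto_field_name=True),
}

# In the dictionary form the bytes field is base64 text, not bytes.
encoded_field = tagged["value"]["string_value"]

restored = restore_message(tagged)
```

After the round trip, `restored` is an `UninterpretedOption` again and its `string_value` is bytes, whereas `encoded_field` is a base64 string.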
