
Serializing Python-Requests' Session Objects for Fun and Profit

by Chris Smith · Feb. 27, 12

Originally Authored By Shrikant Sharat

If you haven't checked out @kennethreitz's excellent python-requests library yet, I suggest you go do that immediately. Go on, I'll wait for you.

Had your candy? That is one of the most beautiful pieces of Python code I've read. And it's an excellent library with a very humane API.

Recently, I have been using this library for a few of my company's internal projects and at a point I needed to serialize and save Session objects for later. That wasn't as straightforward as I first thought it'd be, so I am sharing my experience here.

First off, let's make a simple HTTP server which we are going to contact with python-requests. The server should be able to handle cookie-based sessions and also have basic auth, as these things are handled by python-requests' Session objects on the client side. I won't discuss the code for the server here; you can get it from bitbucket.

Once you have the server running, it's time for the client. Let's do requests!

import requests as req

URL_ROOT = 'http://localhost:5050'

def get_logged_in_session(name):
    session = req.session(auth=('user', 'pass'))

    login_response = session.post(URL_ROOT + '/login', data={'name': name})
    login_response.raise_for_status()

    return session

def get_whoami(session):
    response = session.get(URL_ROOT + '/whoami')
    response.raise_for_status()
    return response.text


I defined two functions here. get_logged_in_session creates a new session, logs in to the HTTP server and returns that session. Any subsequent requests using this session will be made as if you have logged in. That's what will be tested with the get_whoami function, which just returns the response from /whoami.

Let's test this out. Make sure server.py is running and, in another terminal,

$ python -i client.py
>>> s = get_logged_in_session('sharat')
>>> get_whoami(s)
u'You are sharat'
>>> get_whoami(req.session(auth=('user', 'pass')))
u'You are a guest'


Works perfectly. If we pass it the logged-in session, it gives us the username, and if we pass it a new session, it gives us a guest.

Now, let's assume we have two functions, serialize_session and deserialize_session, which do exactly what their names say. We can test them out by running a small test.py, as

from client import get_logged_in_session, get_whoami
from serializer import deserialize_session, serialize_session

session = get_logged_in_session('sharat')
dsession = deserialize_session(serialize_session(session))

assert get_whoami(session) == get_whoami(dsession)
print 'Success'


and a dummy serializer.py

def serialize_session(session):
    return session

def deserialize_session(session):
    return session


And with that, of course, the test will not fail

$ python test.py
Success


Serializing

Now, to implement the functions in serializer.py. A simple approach would be to use pickle. Let's try

import pickle as pk

def serialize_session(session):
    return pk.dumps(session)

def deserialize_session(data):
    return pk.loads(data)


If you run test.py now, Python is going to yell at you.

$ python test.py
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    dsession = deserialize_session(serialize_session(session))
[ ... ]
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle lock objects
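
The same failure can be reproduced without requests at all. Here is a minimal sketch, assuming only that some object reachable from the session holds a threading lock (the traceback doesn't say which one). HoldsLock and try_pickle are hypothetical names made up for this illustration, not part of any library.

```python
import pickle
import threading


class HoldsLock(object):
    """Hypothetical stand-in for an object that keeps a lock internally."""

    def __init__(self):
        self.name = 'demo'
        self.lock = threading.Lock()


def try_pickle(obj):
    """Return the pickled data, or the TypeError raised while pickling."""
    try:
        return pickle.dumps(obj)
    except TypeError as exc:
        return exc


# pickle walks every attribute of the instance and gives up on the lock.
result = try_pickle(HoldsLock())
print(type(result).__name__)
```

Plain data such as dicts and strings pickles fine; it only takes one lock anywhere in the object graph to make the whole dump fail.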


Oh well, it was worth a try I suppose.

Update: The Session class can be made to implement the pickle protocol if you want to use pickle.

The next plan I had was to pick up attributes and data from a Session object, just enough to recreate the object using the Session constructor, and serialize those attributes as JSON. After all, the Session's API is very easy to use, how hard can picking attributes from it be? :)

So, I dug in the sessions.py module of python-requests library. And here's what the signature of the constructor for Session objects looks like

def __init__(self,
    headers=None,
    cookies=None,
    auth=None,
    timeout=None,
    proxies=None,
    hooks=None,
    params=None,
    config=None,
    verify=True):
    # ...


So, if I pick up just these values, I should be able to recreate the session object. Sweet.

import json
import requests as req

def serialize_session(session):
    attrs = ['headers', 'cookies', 'auth', 'timeout', 'proxies', 'hooks',
        'params', 'config', 'verify']

    session_data = {}

    for attr in attrs:
        session_data[attr] = getattr(session, attr)

    return json.dumps(session_data)

def deserialize_session(data):
    return req.session(**json.loads(data))


And let's try this out

$ python test.py
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    assert get_whoami(session) == get_whoami(dsession)
[ ... ]
[...]requests/models.py", line 447, in send
    r = self.auth(self)
TypeError: 'list' object is not callable


Okay, that error message is very weird. Why would anyone call a list object?

Go dig in the models.py module. See this

[ ... ]
if isinstance(self.auth, tuple) and len(self.auth) == 2:
    # special-case basic HTTP auth
    self.auth = HTTPBasicAuth(*self.auth)

# Allow auth to make its changes.
r = self.auth(self)
[ ... ]


There. It's not a list that's being called. Not directly at least. The problem here is that the auth we are passing to session() is not a tuple, so it is never wrapped in HTTPBasicAuth and the list itself ends up being called. Duh! While I like it that auth is restricted to be a tuple, I wish there was a better error message for when auth is a list instead of a tuple. I personally wouldn't want it to accept a list for auth though.

So, what went wrong? JSON does not differentiate between a tuple and a list. It only does lists. So, when serializing and deserializing, the auth tuple is turned into a list. Let's turn it back

def deserialize_session(data):
    session_data = json.loads(data)

    if 'auth' in session_data:
        session_data['auth'] = tuple(session_data['auth'])

    return req.session(**session_data)
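
To see the tuple-to-list conversion in isolation, here is a small sketch that uses only the json module. The 'auth' key and the credentials simply mirror the ones used in this post.

```python
import json

auth = ('user', 'pass')

# JSON has arrays but no tuple type, so the tuple is written out as an array...
encoded = json.dumps({'auth': auth})

# ...and it is read back as a list.
decoded = json.loads(encoded)

# Converting it back by hand restores the original value.
restored = tuple(decoded['auth'])

print(encoded)
print(type(decoded['auth']).__name__, type(restored).__name__)
```

This is why deserialize_session has to know which fields were tuples before the round trip; nothing in the JSON text itself records that.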

And

$ python test.py
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    assert get_whoami(session) == get_whoami(dsession)
[ ... ]
  File "/usr/lib/python2.7/string.py", line 493, in translate
    return s.translate(table, deletions)
TypeError: translate() takes exactly one argument (2 given)

Wait. What? Now we have an error from stdlib? This just keeps getting better and better. If this looks like something that can frustrate you, go get some coffee :)

If you look at the complete stack trace, the second file from bottom,

File "[...]site-packages/requests/packages/oreos/monkeys.py", line 470, in set
  if "" != translate(key, idmap, LegalChars):


This thing seems to be calling the translate method incorrectly. With a bit of debugging and yelling at my monitor, I found out the problem and for a moment, lost my grip on reality.

str.translate takes 2 arguments (a table and the characters to delete), but unicode.translate takes only 1. I have no idea why this is done this way but I sure as hell didn't enjoy it. The code in oreos/monkeys.py assumes that the key is a str. However, what json.loads gives you is unicode stuff. So, in the deserialized dict we get from json.loads, we need to convert just the parts that are used by oreos/monkeys.py from unicode to str.
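
The mismatch is easy to show on its own. The sketch below is written so that it runs on Python 2, where b'...' is a plain str, and also on Python 3, where the same split exists between bytes and str. strip_chars is a hypothetical helper written for this illustration; it is not the code from oreos.

```python
def strip_chars(value, unwanted):
    """Delete the characters in `unwanted` via the two-argument translate()."""
    return value.translate(None, unwanted)


# Byte strings accept a table (None here) plus the characters to delete.
print(strip_chars(b'user name', b' '))

# Text strings only accept a table, so the same call blows up.
try:
    strip_chars(u'user name', u' ')
except TypeError as exc:
    print('TypeError: %s' % exc)
```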

Reading a bit more code around the oreos library, it didn't take long to figure out that those were the keys in the cookies dict. Lo

def deserialize_session(data):
    session_data = json.loads(data)

    if 'auth' in session_data:
        session_data['auth'] = tuple(session_data['auth'])

    if 'cookies' in session_data:
        session_data['cookies'] = dict((key.encode(), val) for key, val in
                session_data['cookies'].items())

    return req.session(**session_data)


And so

$ python test.py
Success


!

All the code is on a bitbucket repository.

Update: Pickling can also work

As Daslch pointed out in his comment on reddit, by implementing the pickle protocol on the Session class, we can get pickling to work. From the documentation, we need two methods, __getstate__ and __setstate__.

Adding those methods as follows to the sessions.Session class

def __getstate__(self):
    attrs = ['headers', 'cookies', 'auth', 'timeout', 'proxies', 'hooks',
        'params', 'config', 'verify']
    return dict((attr, getattr(self, attr)) for attr in attrs)

def __setstate__(self, state):
    for name, value in state.items():
        setattr(self, name, value)

    self.poolmanager = PoolManager(
        num_pools=self.config.get('pool_connections'),
        maxsize=self.config.get('pool_maxsize')
    )


With this, and the version of serializer.py that uses pickle, we do get a Success.

The creation of a new poolmanager in __setstate__ is a piece of code copied from __init__ of the same class. This should probably be turned into a method to avoid code repetition.
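
Here is a rough sketch of that refactoring on a stand-in class, so it runs without requests installed. Client, _state_attrs and _init_pool are hypothetical names, and a plain threading lock plays the part of the unpicklable pool manager; this is not how the library itself ended up doing it.

```python
import pickle
import threading


class Client(object):
    """Hypothetical stand-in for a session-like class; not the requests API."""

    # Plain-data attributes that are safe to pickle.
    _state_attrs = ['headers', 'config']

    def __init__(self, headers=None, config=None):
        self.headers = headers or {}
        self.config = config or {}
        self._init_pool()

    def _init_pool(self):
        # Stands in for building the pool manager: a resource that cannot be
        # pickled and has to be rebuilt from config instead.
        self.pool_lock = threading.Lock()
        self.pool_size = self.config.get('pool_maxsize', 10)

    def __getstate__(self):
        # Only the plain data goes into the pickle.
        return dict((attr, getattr(self, attr)) for attr in self._state_attrs)

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)
        # Same helper as __init__, so the setup code lives in one place.
        self._init_pool()


original = Client(headers={'X-Demo': '1'}, config={'pool_maxsize': 4})
clone = pickle.loads(pickle.dumps(original))
print(clone.headers, clone.pool_size)
```

The clone gets its own freshly built lock, while the headers and config survive the round trip unchanged.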

Update 2: Created an issue about this.

Update 3: This has been merged and Session objects are pickleable as of version 0.10.3. See requests history.


Source: http://sharats.me/serializing-python-requests-session-objects-for-fun-and-profit.html
