Python: Scoping Variables to Use with Timeit

I’ve been playing around with Python’s timeit library to help benchmark some Neo4j Cypher queries, but I ran into some problems when trying to give it access to variables in my program.

I had the following Python script, which I ran from the terminal with python top-away-scorers.py:

import query_profiler as qp
 
attempts = [
{"query": '''MATCH (player:Player)-[:played]->stats-[:in]->game, stats-[:for]->team
             WHERE game<-[:away_team]-team
             RETURN player.name, SUM(stats.goals) AS goals
             ORDER BY goals DESC
             LIMIT 10'''}
]
 
qp.profile(attempts, iterations=5, runs=3)

query_profiler initially read like this:

from py2neo import neo4j
import re
import timeit
 
graph_db = neo4j.GraphDatabaseService()
 
def run_query(query, params):
	query = neo4j.CypherQuery(graph_db, query)
	return query.execute(**params).data
 
def profile(attempts, iterations=10, runs=3):
	print ""
 
	for attempt in attempts:
		query = attempt["query"]
		potential_params = attempt.get("params")
 
		params = {} if potential_params is None else potential_params
 
		timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query", number=iterations, repeat=runs)
 
		print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
		print timings

However, when I ran top-away-scorers.py I got the following exception:

$ python top-away-scorers.py 
 
Traceback (most recent call last):
  File "top-away-scorers.py", line 11, in <module>
    qp.profile(attempts, iterations=5, runs=3)
  File "/Users/markhneedham/code/cypher-query-tuning/query_profiler.py", line 19, in profile
    timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query", number=iterations, repeat=runs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 233, in repeat
    return Timer(stmt, setup, timer).repeat(repeat, number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 221, in repeat
    t = self.timeit(number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
NameError: global name 'query' is not defined

As far as I understand, timeit couldn’t work out what ‘query’ is because we didn’t explicitly import it in the setup, and timeit doesn’t automatically pick up values from the scope it’s running in.
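The failure can be reproduced without Neo4j. The following is a minimal sketch rather than the original profiler: profile_once is a name invented for illustration, and run_query is a hypothetical stand-in defined inside the setup string. timeit compiles the statement into a function of its own, so the caller's local variables are never consulted.

```python
import timeit

def profile_once():
    # Locals of this function, just like query and params inside profile().
    query = "RETURN 1"
    params = {}
    try:
        timeit.repeat(
            "run_query(query, params)",
            # Hypothetical stand-in for run_query, defined where timeit can see it.
            setup="def run_query(query, params): return (query, params)",
            number=1,
            repeat=1,
        )
    except NameError as error:
        # timeit ran the statement in its own namespace and could not find query.
        return str(error)
    return None

message = profile_once()
print(message)
```

The message names query, even though a variable with that name exists in the function that called timeit.repeat.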

I tried adding query and params to the list of imports from query_profiler like so:

def profile(attempts, iterations=10, runs=3):
	print ""
 
	for attempt in attempts:
		query = attempt["query"]
		potential_params = attempt.get("params")
 
		params = {} if potential_params is None else potential_params
 
		timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)
 
		print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
		print timings

Unfortunately that didn’t work:

$ python top-away-scorers.py 
 
Traceback (most recent call last):
  File "top-away-scorers.py", line 11, in <module>
    qp.profile(attempts, iterations=5, runs=3)
  File "/Users/markhneedham/code/cypher-query-tuning/query_profiler.py", line 21, in profile
    timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 233, in repeat
    return Timer(stmt, setup, timer).repeat(repeat, number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 221, in repeat
    t = self.timeit(number)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 3, in inner
ImportError: cannot import name query
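The likely reason is that query and params are local variables of profile, so they never become attributes of the query_profiler module, and an import can only find module-level names. A small sketch of that distinction, assuming a hypothetical profile_locals function and using globals() as a proxy for the module's namespace:

```python
def profile_locals():
    # query is local: it exists only while this call is running.
    query = "RETURN 1"
    # The module namespace is where "from module import name" looks.
    return "query" in globals()

found = profile_locals()
print(found)
```

Here found is False: the assignment inside the function leaves the module namespace untouched.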

I eventually came across the global keyword, which lets me scope query and params so that they can be imported from the query_profiler module:

def profile(attempts, iterations=10, runs=3):
	print ""
 
	for attempt in attempts:
		global query
		query = attempt["query"]
		potential_params = attempt.get("params")
 
		global params
		params = {} if potential_params is None else potential_params
 
		timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)
 
		print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
		print timings
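This works because a global declaration makes the assignment bind the name in the module's namespace instead of the function's, and that namespace is where the import in the setup string looks. A minimal sketch, with profile_globals as a hypothetical name:

```python
def profile_globals():
    # With this declaration, the assignment below targets the module namespace.
    global query
    query = "RETURN 1"

profile_globals()
# query is now a module-level name, so importing it from this module would succeed.
found = "query" in globals()
print(found, query)
```

The same mechanism means every pass through the loop overwrites the module-level values, which is one reason to stay cautious about it.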

I’m generally wary of using anything global, but in this case it seems necessary… or I’ve completely misunderstood how you’re meant to use timeit.
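One alternative that may avoid the globals: timeit.repeat also accepts a callable in place of a statement string, and a lambda can close over the local variables. This is a sketch under assumptions rather than a tested change to query_profiler: run_query is replaced by a hypothetical stand-in so the example runs without Neo4j, and profile returns the timings instead of printing them.

```python
import timeit

def run_query(query, params):
    # Hypothetical stand-in for the py2neo-backed run_query.
    return (query, params)

def profile(attempts, iterations=10, runs=3):
    results = []
    for attempt in attempts:
        query = attempt["query"]
        params = attempt.get("params") or {}
        # The lambda closes over query and params, so neither needs to be global.
        timings = timeit.repeat(
            lambda: run_query(query, params),
            number=iterations,
            repeat=runs,
        )
        results.append((query, timings))
    return results

results = profile([{"query": "RETURN 1"}], iterations=5, runs=3)
print(results)
```

The lambda adds one extra function call to every iteration, which should be negligible next to a database round trip. On Python 3.5 and later, timeit.repeat also takes a globals argument that lets a statement string see a namespace you pass in.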

Any Pythonistas able to shed some light?


Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.
