Python: Scoping Variables to Use with Timeit
I’ve been playing around with Python’s timeit library to help benchmark some Neo4j Cypher queries, but I ran into some problems when trying to give it access to variables in my program.
I had the following Python script, which I called from the terminal using python top-away-scorers.py:
    import query_profiler as qp

    attempts = [
        {"query": '''MATCH (player:Player)-[:played]->stats-[:in]->game, stats-[:for]->team
                     WHERE game<-[:away_team]-team
                     RETURN player.name, SUM(stats.goals) AS goals
                     ORDER BY goals DESC
                     LIMIT 10'''}
    ]

    qp.profile(attempts, iterations=5, runs=3)
query_profiler initially read like this:
    from py2neo import neo4j
    import re  # needed for the re.sub calls in profile()
    import timeit

    graph_db = neo4j.GraphDatabaseService()


    def run_query(query, params):
        # Wrap the Cypher text in a query object and return the result rows.
        query = neo4j.CypherQuery(graph_db, query)
        return query.execute(**params).data


    def profile(attempts, iterations=10, runs=3):
        print ""
        for attempt in attempts:
            query = attempt["query"]
            potential_params = attempt.get("params")
            params = {} if potential_params is None else potential_params

            timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query", number=iterations, repeat=runs)

            # Collapse the indentation inside the query before printing it.
            print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
            print timings
but when I ran top-away-scorers.py I got the following exception:
    $ python top-away-scorers.py
    Traceback (most recent call last):
      File "top-away-scorers.py", line 11, in <module>
        qp.profile(attempts, iterations=5, runs=3)
      File "/Users/markhneedham/code/cypher-query-tuning/query_profiler.py", line 19, in profile
        timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query", number=iterations, repeat=runs)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 233, in repeat
        return Timer(stmt, setup, timer).repeat(repeat, number)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 221, in repeat
        t = self.timeit(number)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
        timing = self.inner(it, self.timer)
      File "<timeit-src>", line 6, in inner
    NameError: global name 'query' is not defined
As far as I understand, timeit couldn’t resolve ‘query’ because we didn’t explicitly import it in the setup, and the statement string doesn’t automatically pick up values from the scope it’s called from.
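The behaviour can be reproduced without Neo4j. The snippet below is a minimal sketch, not the original profiler: profile_with_local is a hypothetical function that holds a local variable named query and asks timeit to evaluate a statement string that refers to it. The exact wording of the error differs between Python versions, so the sketch only captures the message.

```python
import timeit


def profile_with_local():
    # 'query' is local to this function, like the variable inside profile().
    query = "RETURN 1"
    try:
        # The statement string is compiled and run inside timeit's own
        # namespace, which knows nothing about this function's locals.
        timeit.timeit("len(query)", number=1)
    except NameError as error:
        return str(error)
    return None


message = profile_with_local()
print(message)  # a NameError message that names 'query'
```

The NameError is raised even though query is clearly in scope at the point where timeit.timeit is called.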
I tried adding query and params to the list of imports from query_profiler like so:
    def profile(attempts, iterations=10, runs=3):
        print ""
        for attempt in attempts:
            query = attempt["query"]
            potential_params = attempt.get("params")
            params = {} if potential_params is None else potential_params

            timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)

            print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
            print timings
Unfortunately that didn’t work:
    $ python top-away-scorers.py
    Traceback (most recent call last):
      File "top-away-scorers.py", line 11, in <module>
        qp.profile(attempts, iterations=5, runs=3)
      File "/Users/markhneedham/code/cypher-query-tuning/query_profiler.py", line 21, in profile
        timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 233, in repeat
        return Timer(stmt, setup, timer).repeat(repeat, number)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 221, in repeat
        t = self.timeit(number)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
        timing = self.inner(it, self.timer)
      File "<timeit-src>", line 3, in inner
    ImportError: cannot import name query
I eventually came across the global keyword, which turns query and params into module-level names so that the setup can import them from the query_profiler module:
    def profile(attempts, iterations=10, runs=3):
        print ""
        for attempt in attempts:
            global query
            query = attempt["query"]
            potential_params = attempt.get("params")
            global params
            params = {} if potential_params is None else potential_params

            timings = timeit.repeat("run_query(query, params)", setup="from query_profiler import run_query, query, params", number=iterations, repeat=runs)

            print re.sub('\n[ \t]', '\n', re.sub('[ \t]+', ' ', query))
            print timings
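The difference between the failing import and the working one comes down to where the assignment lands. A from ... import statement looks names up among a module's global names, and a plain assignment inside a function never creates one. The sketch below illustrates this with two hypothetical functions and a made-up name, demo_query, none of which are part of the profiler:

```python
def assign_local():
    # Plain assignment creates a function-local name that vanishes on return.
    demo_query = "RETURN 1"
    return "demo_query" in globals()


def assign_global():
    # 'global' makes the assignment bind a module-level name instead.
    global demo_query
    demo_query = "RETURN 1"
    return "demo_query" in globals()


local_visible = assign_local()
global_visible = assign_global()
print(local_visible)   # False
print(global_visible)  # True
```

Only the second form leaves something behind at module level for an import to find.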
I’m generally wary of using anything global but in this case it seems necessary…or I’ve completely misunderstood how you’re meant to use timeit.
Any Pythonistas able to shed some light?
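One possible answer, offered as a sketch rather than the definitive way to use timeit: timeit.repeat also accepts a callable in place of a statement string, so a lambda can close over the local query and params and no global is needed. In the example below, run_query is a hypothetical stand-in that doesn’t touch Neo4j, and profile returns the timings instead of printing them so that the snippet runs on its own.

```python
import timeit


def run_query(query, params):
    # Hypothetical stand-in for the Neo4j call: no database is contacted.
    return (query, params)


def profile(attempts, iterations=10, runs=3):
    results = []
    for attempt in attempts:
        query = attempt["query"]
        params = attempt.get("params") or {}

        # The lambda closes over the local query and params, so timeit
        # needs neither a setup import nor module-level variables.
        timings = timeit.repeat(lambda: run_query(query, params),
                                number=iterations, repeat=runs)
        results.append(timings)
    return results


results = profile([{"query": "RETURN 1"}], iterations=5, runs=3)
print(results)
```

Passing a callable adds the cost of one extra function call per iteration, which should be negligible next to a database round trip. On Python 3.5 and later timeit also takes a globals argument, which is another option, but that isn’t available on the Python 2.7 used in this post.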
Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.