Loopy Lattices Redux
Join the DZone community and get the full member experience.
Join For Free
a
graph
can be realized as a
tessellated
surface
. each vertex (or point) on the surface has a set of neighbors. two vertices are neighbors if there is an edge that connects them. for a
discrete
traverser (e.g. a marble), movement in a space requires a decision at each vertex. the traverser must decide whether to go left, up, east, right, west, or down — i.e., it must choose one and only one edge emanating from its current vertex. the edge chosen leads to yet another vertex. at which point, another choice must be made
ad infinitum
.
kregular graphs: kary string generators
a
directed graph
is
k

regular
if each vertex has
k
adjacent vertices. starting from any vertex, there are
k
^{
n
}
possible
n
length
paths
in a
k
regular graph.
for a 2regular graph (
k=2
), the number of unique paths is equivalent to the number of objects that can be represented by a
binary string
. for instance, there are 4,294,967,296 length 32 paths in a 2regular graph (
2
^{
32
}
). this relationship is apparent if every vertex’s incident edges are labeled with either a 0 or 1. if the traversers print the edge label at each step in an
n
step walk, then all possible
n
length binary strings will be generated.
a graph with 2
degree
vertices is rare in
nature
.
a graph with 10+ degree vertices is more common, though such
k
regularity is uncommon. a 10regular graph has 100,000,000,000,000,000,000,000,000,000,000 unique 32length paths (
10
^{
32
}
).
combinatorial explosions
are readily apparent when
computing on natural graphs
. such truths effect the way in which technology/algorithms should be implemented when solving realworld graph problems.
lattice graphs: binary strings with an equal number of 0s and 1s
a
lattice
is a
planar graph
with a rectangular tessellation. the “inside” of a
finite
lattice is regular in that each vertex connects to the same number of vertices, but the “outside” is not regular in that there are no neighbors beyond the border vertices. a
20x20
directed lattice has 441 vertices and 840 edges. as shown analytically in the original
loopy lattices
post, there are 137,846,528,820 (137.85 billion) unique paths from the topleft vertex to the bottomright vertex. a discrete traverser must take 40 steps to make the journey. of those 40 steps, an equal number of 1s and 0s will be “printed.” thus, the problem of determining how many unique paths there are in a directed
20x20
lattice is a question of how many unique 40length binary strings exist such that there is an equal number of 1s and 0s. this constraint yields a number that is much smaller than
k
^{
n
}
. in the previous post,
gremlin
(in its depthfirst, enumerative form) could not calculate the answer due to the explosion of possibilities. therefore, to answer the question, the
closed form solution
below was provided. the solution says “
2n choose n
” or, in particular, “40 choose 20.” the 20 chosen 1 slots in the 40length binary string forces the remaining 20 positions to be 0.
titan vs. faunus: the enumerative/counting distinction
a
graph database
such as
titan
can be used to store a
20x20
lattice. while a 840 edge graph is extremely small for a “
database
,” it is necessary for the experiment to follow. the
gremlin
/
groovy
code to create a
20x20
lattice in titan is provided below.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

def
generatelattice(n) {
g = titanfactory.open(
'bin/cassandra.local'
)
// total number of vertices
max
= math.pow((n+
1
),
2
)
// generate the vertices
vertices = [];
(
1
..
max
).
each
{ vertices.add(g.addvertex()) }
// generate the edges
vertices.
eachwithindex
{ v, i >
right = i +
1
if
(((right % (n +
1
)) >
0
) && (right <=
max
)) {
v.addedge(
'link'
, vertices[right])
}
down = i + n +
1
if
(down <
max
) {
v.addedge(
'link'
, vertices[down])
}
}
g.commit();
return
g
}

traversing a lattice with titan
with titan/gremlin, it is possible to count the number of 1, 2, 3, …, 40length paths in the
20x20
lattice. a traversal of length 40 is a full traversal of the lattice from the topleft to the bottomright. the traversal code and the runtimes are provide below. an
exponentiating
runtime is realized: 10.9 minutes (28 steps), 2.3 hours (29 steps), and 9.8 hours (30 steps) traversals. the calculation was halted on the 31st step. the number of 40length paths could not be reasonably calculated with titan/gremlin. by exhaustively enumerating paths and with the number of paths growing exponentially as the path length increases, a calculation of this form is doomed for large values of
n
.
/usr/local/titanall0.3.1$ bin/gremlin.sh \,,,/ (o o) oooo(_)oooo gremlin> g = generatelattice(20) ==>titangraph[cassandrathrift:127.0.0.1] gremlin> g.v.count() ==>441 gremlin> g.e.count() ==>840 gremlin> (1..40).each{ l > t = system.currenttimemillis(); c = g.v(4).out.loop(1){it.loops <= l}.count() t = system.currenttimemillis()  t; println(l + ":" + c + ":" + t) }
path length  path count  traversal time (ms)  path length  path count  traversal time (ms) 

1  2  9  2  4  2 
3  8  1  4  16  2 
5  32  3  6  64  5 
7  128  10  8  256  17 
9  512  36  10  1024  62 
11  2048  74  12  4096  96 
13  8192  49  14  16384  61 
15  32768  88  16  65536  163 
17  131072  325  18  262144  646 
19  524288  1296  20  1048576  2573 
21  2097150  5172  22  4194258  10306 
23  8388054  20659  24  16772566  41555 
25  33523880  82993  26  66941500  166828 
27  133422540  331954  28  265069020  654070 
29  523921830  8288566  30  1027813650  35512124 
…  …  …  …  …  … 
traversing a lattice with faunus
faunus
is a graph analytics engine that leverages
hadoop
and a
breadthfirst
implementation of
gremlin
. with faunus, paths can be enumerated in a similar fashion to titan/gremlin, but if only counts are required (destinations vs. paths), then it is more efficient to propagate and sum counters.
instead of explicitly storing each and every gremlin at every vertex, faunus stores the
number
of gremlins at each vertex. this is the difference between representing a
list
of length
m
and a
long
with value
m
. a consequence of this model is that it is possible to efficiently compute the number of 40length paths in a
20x20
lattice with faunus. this counter propagation mechanism is analogous to the mechanical technique for computing
binomial coefficients
as proposed by
blaise pascal
via
pascal’s triangle
(in the western world). it is important to note the breadthfirst requirement of this computation.
g = faunusfactory.open('bin/titancassandrainput.properties') t = system.currenttimemillis() // as of faunus 0.3.1, the loop() construct is not supported g.v(4).out.out.out.out.count() // 4steps t = system.currenttimemillis()  t
the number of paths of length
n
in a
20x20
lattice is plotted below in both
y
linear (left) and
y

logarithmic
(right) form. in short, the number of paths grows exponentially as the path length increases. what is interesting to note in the following table of faunus counts and runtimes is that faunus is able to compute the total number of unique 40length paths in the lattice in 10.53 minutes — 137,846,528,820.
path length  path count  traversal time (ms)  path length  path count  traversal time (ms) 

1  2  32781  2  4  48071 
3  8  63537  4  16  78575 
5  32  94078  6  64  109246 
7  128  124574  8  256  139850 
9  512  156223  10  1024  171549 
11  2048  186049  12  4096  201111 
13  8192  218417  14  16384  232642 
15  32768  248019  16  65536  263355 
17  131072  278685  18  262144  296912 
19  524288  308225  20  1048576  324440 
21  2097150  340823  22  4194258  355349 
23  8388054  371546  24  16772566  385755 
25  33523880  402189  26  66941500  416868 
27  133422540  433917  28  265069020  448150 
29  523921830  462522  30  1027813650  478595 
31  1995537270  493224  32  3821729910  508492 
33  7191874140  524791  34  13237415400  539216 
35  23690879520  556108  36  40885872720  568512 
37  67156001220  586697  38  102501265020  601196 
39  137846528820  617152 
titan’s runtime grows exponentially, in proportion to the number of paths computed. on the other hand, faunus’ computation time grows linearly when computing an exponential path count. at step 28, faunus executes the path count faster than titan. this does not mean that titan is inherently less efficient at such computations, it is simply a function of the depthfirst, enumerative nature of gremlin/titan vs. the breadthfirst, counter nature of gremlin/faunus. implementing faunus’ gremlin engine over titan would enable titan to compute such path counts effectively. however, the purpose of faunus is to serve as the global batch processer to titan.
conclusion
a graph is a data structure composed of vertices and edges. the natural interpretation of a computation on a graph is a traversal — i.e., a directed walk over the vertices by means of chosen incident edges. an exhaustive exploration of all paths within a graph is typically not feasible because the number of paths grows exponentially as a function of the path length and the graph’s branch factor . as demonstrated with titan and faunus, the goal of the traversal and the choice of the traversal engine ultimately determines the feasibility of the computation. once again, the loopy lattice exposes a simple truth in the theory and practice of graphs.
Published at DZone with permission of Marko Rodriguez, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Trending

Cypress Tutorial: A Comprehensive Guide With Examples and Best Practices

Building a Robust Data Engineering Pipeline in the Streaming Media Industry: An Insider’s Perspective

Microservices With Apache Camel and Quarkus (Part 2)

Building the World's Most Resilient ToDo List Application With Node.js, K8s, and Distributed SQL
Comments