DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations

Trending

  • Cypress Tutorial: A Comprehensive Guide With Examples and Best Practices
  • Building a Robust Data Engineering Pipeline in the Streaming Media Industry: An Insider’s Perspective
  • Microservices With Apache Camel and Quarkus (Part 2)
  • Building the World's Most Resilient To-Do List Application With Node.js, K8s, and Distributed SQL

Trending

  • Cypress Tutorial: A Comprehensive Guide With Examples and Best Practices
  • Building a Robust Data Engineering Pipeline in the Streaming Media Industry: An Insider’s Perspective
  • Microservices With Apache Camel and Quarkus (Part 2)
  • Building the World's Most Resilient To-Do List Application With Node.js, K8s, and Distributed SQL

Loopy Lattices Redux

Marko Rodriguez user avatar by
Marko Rodriguez
·
Jun. 19, 13 · Interview
Like (0)
Save
Tweet
Share
3.23K Views

Join the DZone community and get the full member experience.

Join For Free


loopy lattices redux a graph can be realized as a tessellated surface . each vertex (or point) on the surface has a set of neighbors. two vertices are neighbors if there is an edge that connects them. for a discrete traverser (e.g. a marble), movement in a space requires a decision at each vertex. the traverser must decide whether to go left, up, east, right, west, or down — i.e., it must choose one and only one edge emanating from its current vertex. the edge chosen leads to yet another vertex. at which point, another choice must be made ad infinitum .

k-regular graphs: k-ary string generators

binary state machine a directed graph is k - regular if each vertex has k adjacent vertices. starting from any vertex, there are k n possible n -length paths in a k -regular graph. 2-regular graph for a 2-regular graph ( k=2 ), the number of unique paths is equivalent to the number of objects that can be represented by a binary string . for instance, there are 4,294,967,296 length 32 paths in a 2-regular graph ( 2 32 ). this relationship is apparent if every vertex’s incident edges are labeled with either a 0 or 1. if the traversers print the edge label at each step in an n -step walk, then all possible n -length binary strings will be generated.

a graph with 2- degree vertices is rare in nature . binary tree a graph with 10+ degree vertices is more common, though such k -regularity is uncommon. a 10-regular graph has 100,000,000,000,000,000,000,000,000,000,000 unique 32-length paths ( 10 32 ). tiny binary machine combinatorial explosions are readily apparent when computing on natural graphs . such truths effect the way in which technology/algorithms should be implemented when solving real-world graph problems.

lattice graphs: binary strings with an equal number of 0s and 1s

lattice graph a lattice is a planar graph with a rectangular tessellation. the “inside” of a finite lattice is regular in that each vertex connects to the same number of vertices, but the “outside” is not regular in that there are no neighbors beyond the border vertices. a 20x20 directed lattice has 441 vertices and 840 edges. as shown analytically in the original loopy lattices post, there are 137,846,528,820 (137.85 billion) unique paths from the top-left vertex to the bottom-right vertex. a discrete traverser must take 40 steps to make the journey. of those 40 steps, an equal number of 1s and 0s will be “printed.” thus, the problem of determining how many unique paths there are in a directed 20x20 lattice is a question of how many unique 40-length binary strings exist such that there is an equal number of 1s and 0s. this constraint yields a number that is much smaller than k n . in the previous post, gremlin (in its depth-first, enumerative form) could not calculate the answer due to the explosion of possibilities. therefore, to answer the question, the closed form solution below was provided. the solution says “ 2n choose n ” or, in particular, “40 choose 20.” the 20 chosen 1 slots in the 40-length binary string forces the remaining 20 positions to be 0.

titan vs. faunus: the enumerative/counting distinction

a graph database such as titan can be used to store a 20x20 lattice. while a 840 edge graph is extremely small for a “ database ,” it is necessary for the experiment to follow. the gremlin / groovy code to create a 20x20 lattice in titan is provided below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
def generatelattice(n) {
g = titanfactory.open( 'bin/cassandra.local' )
// total number of vertices
max = math.pow((n+ 1 ), 2 )
// generate the vertices
vertices = [];
( 1 .. max ). each { vertices.add(g.addvertex()) }
// generate the edges
vertices. eachwithindex { v, i ->
right = i + 1
if (((right % (n + 1 )) > 0 ) && (right <= max )) {
v.addedge( 'link' , vertices[right])
}
down = i + n + 1
if (down < max ) {
v.addedge( 'link' , vertices[down])
}
}
g.commit();
return g
}

traversing a lattice with titan

titan: distributed graph database with titan/gremlin, it is possible to count the number of 1-, 2-, 3-, …, 40-length paths in the 20x20 lattice. a traversal of length 40 is a full traversal of the lattice from the top-left to the bottom-right. the traversal code and the runtimes are provide below. an exponentiating runtime is realized: 10.9 minutes (28 steps), 2.3 hours (29 steps), and 9.8 hours (30 steps) traversals. the calculation was halted on the 31st step. the number of 40-length paths could not be reasonably calculated with titan/gremlin. by exhaustively enumerating paths and with the number of paths growing exponentially as the path length increases, a calculation of this form is doomed for large values of n .

/usr/local/titan-all-0.3.1$ bin/gremlin.sh
 
         \,,,/
         (o o)
-----oooo-(_)-oooo-----
gremlin> g = generatelattice(20)
==>titangraph[cassandrathrift:127.0.0.1]
gremlin> g.v.count()
==>441
gremlin> g.e.count()
==>840
gremlin> (1..40).each{ l ->
  t = system.currenttimemillis();
  c = g.v(4).out.loop(1){it.loops <= l}.count()
  t = system.currenttimemillis() - t;
  println(l + ":" + c + ":" + t)
}


path length path count traversal time (ms) path length path count traversal time (ms)
1 2 9 2 4 2
3 8 1 4 16 2
5 32 3 6 64 5
7 128 10 8 256 17
9 512 36 10 1024 62
11 2048 74 12 4096 96
13 8192 49 14 16384 61
15 32768 88 16 65536 163
17 131072 325 18 262144 646
19 524288 1296 20 1048576 2573
21 2097150 5172 22 4194258 10306
23 8388054 20659 24 16772566 41555
25 33523880 82993 26 66941500 166828
27 133422540 331954 28 265069020 654070
29 523921830 8288566 30 1027813650 35512124
… … … … … …

traversing a lattice with faunus

faunus: graph analytics engine faunus is a graph analytics engine that leverages hadoop and a breadth-first implementation of gremlin . with faunus, paths can be enumerated in a similar fashion to titan/gremlin, but if only counts are required (destinations vs. paths), then it is more efficient to propagate and sum counters. pascal's triangle instead of explicitly storing each and every gremlin at every vertex, faunus stores the number of gremlins at each vertex. this is the difference between representing a list of length m and a long with value m . a consequence of this model is that it is possible to efficiently compute the number of 40-length paths in a 20x20 lattice with faunus. this counter propagation mechanism is analogous to the mechanical technique for computing binomial coefficients as proposed by blaise pascal via pascal’s triangle (in the western world). it is important to note the breadth-first requirement of this computation.

enumeration vs. counting

g = faunusfactory.open('bin/titan-cassandra-input.properties')
t = system.currenttimemillis()
// as of faunus 0.3.1, the loop() construct is not supported
g.v(4).out.out.out.out.count() // 4-steps
t = system.currenttimemillis() - t

the number of paths of length n in a 20x20 lattice is plotted below in both y -linear (left) and y - logarithmic (right) form. in short, the number of paths grows exponentially as the path length increases. what is interesting to note in the following table of faunus counts and runtimes is that faunus is able to compute the total number of unique 40-length paths in the lattice in 10.53 minutes — 137,846,528,820.

20x20 lattice path count

path length path count traversal time (ms) path length path count traversal time (ms)
1 2 32781 2 4 48071
3 8 63537 4 16 78575
5 32 94078 6 64 109246
7 128 124574 8 256 139850
9 512 156223 10 1024 171549
11 2048 186049 12 4096 201111
13 8192 218417 14 16384 232642
15 32768 248019 16 65536 263355
17 131072 278685 18 262144 296912
19 524288 308225 20 1048576 324440
21 2097150 340823 22 4194258 355349
23 8388054 371546 24 16772566 385755
25 33523880 402189 26 66941500 416868
27 133422540 433917 28 265069020 448150
29 523921830 462522 30 1027813650 478595
31 1995537270 493224 32 3821729910 508492
33 7191874140 524791 34 13237415400 539216
35 23690879520 556108 36 40885872720 568512
37 67156001220 586697 38 102501265020 601196
39 137846528820 617152

titan’s runtime grows exponentially, in proportion to the number of paths computed. on the other hand, faunus’ computation time grows linearly when computing an exponential path count. at step 28, faunus executes the path count faster than titan. this does not mean that titan is inherently less efficient at such computations, it is simply a function of the depth-first, enumerative nature of gremlin/titan vs. the breadth-first, counter nature of gremlin/faunus. implementing faunus’ gremlin engine over titan would enable titan to compute such path counts effectively. however, the purpose of faunus is to serve as the global batch processer to titan.

titan and faunus lattice traversal times

conclusion

20x20 lattice a graph is a data structure composed of vertices and edges. the natural interpretation of a computation on a graph is a traversal — i.e., a directed walk over the vertices by means of chosen incident edges. an exhaustive exploration of all paths within a graph is typically not feasible because the number of paths grows exponentially as a function of the path length and the graph’s branch factor . as demonstrated with titan and faunus, the goal of the traversal and the choice of the traversal engine ultimately determines the feasibility of the computation. once again, the loopy lattice exposes a simple truth in the theory and practice of graphs.






Lattice (module) Graph (Unix) Titan (computer)

Published at DZone with permission of Marko Rodriguez, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Trending

  • Cypress Tutorial: A Comprehensive Guide With Examples and Best Practices
  • Building a Robust Data Engineering Pipeline in the Streaming Media Industry: An Insider’s Perspective
  • Microservices With Apache Camel and Quarkus (Part 2)
  • Building the World's Most Resilient To-Do List Application With Node.js, K8s, and Distributed SQL

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com

Let's be friends: