Examining Decentralized Social Networks

Analyze the intricacies of data on decentralized social networks and learn what is good or bad about the underlying architecture of various platforms.

Most companies that create a social network do so with the end goal of collecting the information, interests, and habits of their users in order to monetize that data (usually through advertising). They guard this data heavily, and many of the largest social networks are trusted enough to be identity providers for OAuth-based authentication and single sign-on mechanisms such as "Log In with Facebook/Twitter/LinkedIn/etc."

We have many customers with production apps who use Stream to help develop social applications, and we thought it would be interesting to dig into four "decentralized" or "distributed" social networks/platforms (none of which have been customers of Stream) to see what makes them tick. We'll try to assess what is good or bad about the underlying architecture of each and take a look at why they're succeeding or failing.

Backstory and History

"Social networks" have been around a lot longer than many folks realize. My own first experience was a "bulletin board system" back in the 1980s, reached over a dial-up modem, where you registered on a computer and could interact with, and send messages to, other users on that system. This expanded into more mainstream email accounts and ushered in the era of AOL, CompuServe, and others. These spun off into other community-based websites, whose ideas grew into larger and larger platforms such as Facebook, Twitter, Instagram, Google+, and so on.

These platforms are "centralized" in the sense that only one entity is responsible for your account access and for managing access to any content you choose to upload and share, privately or publicly, on its platform. Many users aren't crazy about the idea of large entities controlling access to their content, which has resulted in many viral ownership claims that get recycled from time to time. It's a valid concern, especially when the platform isn't 100% clear on how it will use your content for additional monetization efforts.

Over time, the idea of ownership of, and granular control over, who can see and use your content has spawned ever-growing interest in decentralized platforms where you can share your data.

One of the biggest contributors to decentralized platforms is StatusNet, based in Montreal, Canada, which started as a microblogging platform back in 2007. Its work was later donated to the Free Software Foundation and, with contributions from other parties, became GNU Social, while StatusNet Inc. (the company) later migrated to pump.io. StatusNet Inc. also developed OStatus as a combination of protocols for social sharing; both GNU Social and OStatus were the inspiration for Mastodon.

GNU Social

GNU Social began in 2010 as a spin-off project from a music community and later merged into StatusNet, which adopted the GNU Social name in 2013. It has a very large list of federated servers (found on other links at gnu.io) and allows members to post content and share that content easily across other systems using a pub/sub model.
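As a rough illustration of the pub/sub idea, here is a minimal in-memory sketch. It is not GNU Social's code: the `Hub` class, the account names, and the `.example` domains are all hypothetical, and a real federated server pushes updates to remote servers over HTTP rather than calling local functions.

```python
from collections import defaultdict


class Hub:
    """Toy in-memory pub/sub hub (hypothetical; real federation uses HTTP)."""

    def __init__(self):
        # topic (here, an account's feed name) -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive every post published to a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, post):
        """Deliver a post to each subscriber and return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(post)
        return len(callbacks)


hub = Hub()
inbox_a, inbox_b = [], []  # stand-ins for two remote servers' inboxes

hub.subscribe("alice@one.example", inbox_a.append)
hub.subscribe("alice@one.example", inbox_b.append)

delivered = hub.publish("alice@one.example", "Hello, fediverse")
ignored = hub.publish("bob@two.example", "Nobody follows me yet")
```

The point of the model is that a post only travels to parties that asked for it: the second `publish` call reaches nobody because nobody subscribed to that feed.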

GNU Social at its roots is a fully open-source PHP project that encourages new contributions. It supports a rich plugin system, allowing users to build up a server with exactly the features they would like in order to support their community. Several supporting projects on its GitLab page outline plugins, features, and alternate user interfaces.


Mastodon

Mastodon is a mash-up of several protocols and platforms, but largely based on the idea of compatibility with GNU Social:

Mastodon is a re-implementation of the GNU Social codebase, which itself is an implementation of the OStatus protocol, originally forking from the GNU FM project and later merging with the StatusNet and FreeSocial projects, from the same people behind Identi.ca, which was later folded into pump.io, which uses the ActivityStreams spec along with protocols like PubSubHubBub, Salmon, WebFinger, and Atom syndication to deliver a federated, open-source Twitter-like experience for the masses. (Source: hackernoon.com)

Mastodon started out as a Ruby on Rails rewrite of GNU Social. Its core UI, written in React and Redux, is reminiscent of TweetDeck. The platform offers some more powerful features than Twitter but also some quirks that perhaps users weren't expecting. Certainly, Mastodon has expanded on Twitter's platform by allowing longer messages and other features, and promises to be advertising-free. No ads — where have we heard that before? Ello, App.net, Diaspora, Path, and more, some of which either charged members to use the platform or moved to "freemium" models.

Mastodon maintains a large list of instances where you can register an account, and Eugen Rochko, the author of Mastodon, has written a post about his challenges scaling the platform. As of late April, he claims almost half a million users across more than 1,200 Mastodon instances running around the world.


Steemit

Steemit is a "distributed social network," written mostly in C++, which rewards its users with a cryptocurrency called Steem for posting and curating quality content. Because it is based on modern blockchain technology, its open-source platform allows for verifiable and indisputable origin of content. That is great if you really want people to know you authored something, but most blockchain instances do not allow for deletion of content, so you lose granularity of control in exchange for authenticity. Also, since every Steem client has access to everything on the blockchain, your posts/votes/comments are effectively public unless you encrypt them for SteemMsg.

You can certainly spin up your own instance, but ultimately all data will still be shared on the blockchain. It appears the only way to have a truly "private" Steemit instance would be to encrypt all transmissions and control a decryption key within client software.
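The trade-off between authenticity and deletion can be seen in a minimal hash-chain sketch. This is not Steem's actual block format: the field names and helper functions are hypothetical, and a real blockchain also uses digital signatures and a consensus protocol, which are omitted here. It only shows why removing or editing an earlier post is detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, author, content, prev_hash):
    """Hash a block's fields together with the hash of the block before it."""
    payload = json.dumps([index, author, content, prev_hash]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_post(chain, author, content):
    """Append a post as a new block linked to the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    block = {
        "index": index,
        "author": author,
        "content": content,
        "prev_hash": prev_hash,
        "hash": block_hash(index, author, content, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check that every block is unmodified and correctly linked."""
    prev_hash = GENESIS
    for position, block in enumerate(chain):
        expected = block_hash(
            block["index"], block["author"], block["content"], block["prev_hash"]
        )
        if (
            block["index"] != position
            or block["prev_hash"] != prev_hash
            or block["hash"] != expected
        ):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_post(chain, "alice", "first post")
append_post(chain, "bob", "a reply")
append_post(chain, "alice", "second post")
valid_original = is_valid(chain)

# "Deleting" the middle post breaks the link between its neighbors.
valid_after_delete = is_valid([chain[0], chain[2]])

# Editing a post's content no longer matches its stored hash.
edited = [dict(block) for block in chain]
edited[1]["content"] = "a different reply"
valid_after_edit = is_valid(edited)
```

Because each block's hash covers the previous block's hash, every node holding a copy can detect a removed or altered post, which is the same property that makes deletion impractical.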


Synereo

Another blockchain-based social network, Synereo, follows a model similar to Steemit's, but its target audience appears to be people who want to monetize their content authorship and reward people who help promote the content. Its invite-only platform is written mostly in Scala, with some Java. Again, it focuses on the authenticity of who authored the content, but the content is still shared across the entire blockchain. The project also exists on GitHub with instructions on setting up your own node, but with warnings that it's experimental in nature.

Where Is Your Content, and Who Are You, Really?

A primary difference between blockchain networks and those using something like GNU Social is where your content lives. If you register on a Mastodon site, your content lives at that site but is distributed via pub/sub to other sites where users follow your account. Even though a site is part of the larger "fediverse" of Mastodon, that one site still controls your content every bit as much as Facebook, Twitter, Instagram, etc. And while each Mastodon partner site has to conform to the overall Mastodon statement of purpose (no ads, etc.), it may not be restricted in how it uses your data. In contrast, in a blockchain system, your content is distributed to every other node using that blockchain, whether someone at that node wants your content or not.
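The contrast can be expressed as a small sketch of which machines end up holding a copy of a post. The function names, server names, and follower data below are hypothetical and greatly simplified; they only model the distribution rule described above, not any real protocol.

```python
def federated_copies(home_server, followers_by_server):
    """Federated model: the home server keeps the post, and it is pushed
    only to servers where at least one user follows the author."""
    remote = {
        server
        for server, followers in followers_by_server.items()
        if followers
    }
    return {home_server} | remote


def blockchain_copies(all_nodes):
    """Blockchain model: every node replicates every post."""
    return set(all_nodes)


# Hypothetical network of four servers/nodes.
network = ["one.example", "two.example", "three.example", "four.example"]

# The author lives on one.example; only three.example has a follower.
followers = {
    "two.example": [],
    "three.example": ["carol"],
    "four.example": [],
}

federated = federated_copies("one.example", followers)
replicated = blockchain_copies(network)
```

Here `federated` contains two servers while `replicated` contains all four, which mirrors the difference between opt-in distribution and full replication.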

A common source of confusion for the rush of new Mastodon users (and, I'm sure, for users on other GNU Social sites) is that you register on only one site. This is good because it means that one site is the only one controlling access to your data, but it's a lot like registering an email address: registering and reserving your username on one Mastodon site (for example) does not reserve that name on any other Mastodon site.

In order to use at-mentions in a message to flag another user, you have to know which site they registered on. When mastodon.social shut down registrations due to scalability issues, I registered an account on mastodon.cloud instead. If you also registered on the ".cloud" domain and write a message mentioning "@iandouglas," I will get a notification. But if you registered on the ".social" domain and mention "@iandouglas," it will ping some other user named "@iandouglas" on that platform (which is not me). So if you're on any other Mastodon node and want to ping me in a message when my account is at mastodon.cloud, you need to mention "@iandouglas@mastodon.cloud" in your message.
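The resolution rule described above can be sketched in a few lines. This is not Mastodon's actual parser: the regular expression and the `resolve_mentions` helper are hypothetical simplifications that assume a bare "@name" refers to the sender's own site and "@name@domain" refers to an account on another site.

```python
import re

# "@user" or "@user@domain"; the lookbehind skips email-like text such as
# "me@example.com" and avoids re-matching the domain part as a username.
MENTION = re.compile(r"(?<![\w@])@(\w+)(?:@([\w.-]+))?")


def resolve_mentions(text, local_domain):
    """Return (username, domain) pairs for each mention in a message.

    A mention without a domain resolves to the sender's own site.
    """
    resolved = []
    for user, domain in MENTION.findall(text):
        # Drop punctuation that ended the sentence rather than the domain.
        domain = domain.rstrip(".-")
        resolved.append((user, domain or local_domain))
    return resolved


# Sent from mastodon.social: a bare mention pings the local account.
local = resolve_mentions("Nice post, @iandouglas", "mastodon.social")

# The same sender must add the domain to reach the mastodon.cloud account.
remote = resolve_mentions(
    "Nice post, @iandouglas@mastodon.cloud.", "mastodon.social"
)
```

In this sketch `local` resolves to the account on mastodon.social while `remote` resolves to the account on mastodon.cloud, even though both messages use the same username.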

This is a serious drawback for commercial groups or high-profile users like celebrities, because failing to reserve your business or celebrity name on a given site runs the risk of infringement and other abuse. It's not unlike the rush to reserve your company's name whenever a new top-level domain (TLD) is released, so you can get YourBusinessName.newTLDsuffix to match your .com, .org, .net, .io, and so on. I imagine some legal teams are lining up to take on legal claims against cyber-squatters who register the names of big companies.

Other Pros and Cons of a Decentralized Platform

A benefit of a decentralized system such as GNU Social or Mastodon is the ability to launch your own version of the software on a private system for you and your friends. This means you do not have to worry about sharing that content outside of your server. Blockchain-based social networks can ensure the authenticity of who authored some content, but keeping your data private would rely on a large network of shared encryption keys.

There are certainly other benefits to decentralized networks, which give end users a lot more flexibility over who controls their content. But unless you're running your own decentralized server, you're ultimately still handing over your content to an unknown entity, one that may not have the same scruples around data security that large networks like Facebook or Twitter (which users love to hate over the "ownership" of content) have had to put in place to protect your data. In the case of blockchain-based groups, it can be difficult or impossible to ever delete your content.

Since each GNU Social or Mastodon site can also establish its own features and rules (or lack of rules), there's no one entity to control or censor your content. This may be great for those who want more freedom in their writing, but other readers may not want to see unmoderated content on a timeline feed.


Should you build a traditional social application, or join a decentralized network to expand the reach and capabilities of your users? That architecture decision is ultimately yours to make, but there are some great options out there based on both GNU Social and blockchain technologies. Decentralized or distributed social platforms are a growing trend and there are pros and cons to the two different approaches we discussed above.

At Stream, we make building social apps and activity feeds a quick and easy process, but there are lots of interesting ways to store the core activity data within your back-end application. Regardless of how you choose to build your social application, Stream offers a convenient, scalable, and highly available solution for applications of all sizes. You can try things out on our interactive tutorial and have a prototype going in minutes. Stream takes the guesswork out of follow relationships and offers aggregated and notification feeds, ranked feeds, and recommendation feeds based on machine learning of activity analytics and other data points. We can't wait to see what you build with Stream!

Also published on Medium.

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
