GitHub Events Are Booming! Are Bots the Reason?

This article dives deeply into GitHub event trending, why GitHub events are surging, and whether GitHub's architecture can handle the increasing load.

By Mia Zhou · Aug. 11, 22 · Analysis

The OSS Insight website displays changes in GitHub event data in real time. GitHub events are activities triggered by user actions on GitHub, such as commenting on or forking a repository. In nearly seven weeks, GitHub events increased by about 150 million, from 4.7 billion to 4.85 billion. GitHub's events are booming!

This post dives deeply into GitHub event trending, why GitHub events are surging, and whether GitHub's architecture can handle the increasing load.

Historical Data Analysis

The OSS Insight database includes all the GitHub events since 2011. When we plot the number of events by year, we can see that since 2018, they have been increasing rapidly.

GitHub event trending

The figure below shows how long GitHub took to accumulate each billion events.

The time to reach a billion GitHub events

It's taking less and less time for GitHub to generate 1 billion events. It took more than 6 years for the first billion events and only 13 months for the last billion!
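As a rough cross-check, the figures quoted at the start of the article (about 150 million new events in nearly seven weeks) can be turned into an average daily rate and a time per billion. This is a back-of-the-envelope sketch, not the method behind the chart: it assumes the window is exactly 49 days and that growth is constant within it, and the function names are made up for illustration.

```python
# Figures quoted in the article's introduction.
EVENTS_START = 4.70e9   # total events at the start of the window
EVENTS_END = 4.85e9     # total events about seven weeks later
WINDOW_DAYS = 7 * 7     # "nearly seven weeks", taken as 49 days (assumption)


def events_per_day(start, end, days):
    """Average number of new events per day over the window."""
    return (end - start) / days


def days_to_add(events, daily_rate):
    """Days needed to add `events` more events at a constant daily rate."""
    return events / daily_rate


rate = events_per_day(EVENTS_START, EVENTS_END, WINDOW_DAYS)
days_per_billion = days_to_add(1e9, rate)

print(f"about {rate / 1e6:.1f} million events per day")
print(f"about {days_per_billion / 30.44:.1f} months per billion at that rate")
```

At that pace a billion events takes a little under 11 months, which is shorter than the 13 months measured for the most recent billion and fits the picture of continuing acceleration.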

The Secret Behind the Exponential Growth of GitHub Events

GitHub Actions was released in October 2018. Since August 2019, it has supported continuous integration and continuous delivery (CI/CD), and it has been free for open-source projects. As a result, projects hosted on GitHub can automate their own development workflows, and a large number of automation-related bot applications have appeared on GitHub Marketplace. Could the growth in GitHub events be related to these?

To find the answer, we split the events into those generated by humans and those generated by bots and plotted them in the following histogram. The blue columns represent the human data, and the yellow columns represent the bot data.

Bot events vs. human events

As you can see, the proportion of GitHub bot events has increased each year. In 2015, they were only 1.23% of all events. In early July of this year, they reached 13.2%. To show the data changes of bot events more clearly, we made the following line chart.

Bot event trending

This figure shows that since 2019, bot events have grown faster than before. As Mini256, a TiDB community contributor, said in Love, Code, and Robot — Explore robots in the world of code:

For now, rough statistics find that there are more than 95,620 bots on GitHub. The number doesn't seem like so much, but wait...

These 95 thousand bot accounts generated 603 million events. These events account for 12.82% of all public events on GitHub, and these GitHub robots have served over 18 million open source repositories.

Bots are playing an increasingly important role on GitHub. Many projects are handing over automated work to bots. We expect that GitHub events will grow faster in the future.
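The human/bot split used in this section can be sketched in a few lines. The article does not say how OSS Insight identifies bot accounts, so the rule below is an assumption: it treats a login as a bot if it ends in "[bot]" (the suffix GitHub App accounts typically carry) or in "-bot". The sample logins are made up for illustration and are not real data.

```python
from collections import Counter


def is_bot(login: str) -> bool:
    """Heuristic (assumption): logins ending in "[bot]" or "-bot" are bots."""
    name = login.lower()
    return name.endswith("[bot]") or name.endswith("-bot")


def bot_share(actor_logins):
    """Return (human_count, bot_count, bot_percentage) for a list of actor logins."""
    counts = Counter("bot" if is_bot(login) else "human" for login in actor_logins)
    total = counts["bot"] + counts["human"]
    pct = 100.0 * counts["bot"] / total if total else 0.0
    return counts["human"], counts["bot"], pct


# Made-up sample of event actors, one entry per event.
sample = ["alice", "dependabot[bot]", "bob", "carol", "ci-bot", "dave", "erin", "frank"]
humans, bots, pct = bot_share(sample)
print(humans, bots, f"{pct:.1f}%")
```

A suffix rule like this will miss bots that run under ordinary user accounts, so any share computed this way should be read as a lower bound.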

When Will GitHub Reach 10 Billion Events?

How many GitHub events will there be by the end of 2022? We fitted trend curves to GitHub's historical human and bot event data and used them to make predictions.

Human event fit (left) vs. bot event fit (right)

It's estimated that by the end of 2022, GitHub events will reach 5.36 billion.

GitHub event prediction

According to this prediction, GitHub events will exceed 10 billion in February 2025.

GitHub events will exceed 10 billion in 2025
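The article does not state which model was fitted, so the following is only a sketch of the general approach: fit a growth curve to a historical series, then solve for the time at which it crosses a target. It assumes a simple exponential model fitted by least squares on the logarithm of the counts, and it runs on a synthetic series rather than the real GitHub data. The function names are hypothetical.

```python
import math


def fit_exponential(ts, ys):
    """Least-squares fit of ln(y) = a + b*t; returns (a, b)."""
    n = len(ts)
    logs = [math.log(y) for y in ys]
    t_mean = sum(ts) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(ts, logs))
    b = sxy / sxx
    a = l_mean - b * t_mean
    return a, b


def time_to_reach(target, a, b):
    """Solve exp(a + b*t) = target for t."""
    return (math.log(target) - a) / b


# Synthetic series that doubles every 2 time units (not real GitHub data).
ts = [0, 1, 2, 3, 4]
ys = [2 ** (t / 2) for t in ts]

a, b = fit_exponential(ts, ys)
t_target = time_to_reach(8.0, a, b)  # 8 = 2**3, so the crossing is near t = 6
print(f"growth rate b = {b:.4f}, target reached near t = {t_target:.2f}")
```

To mirror the human/bot split in the figure above, the same fit could be run on each series separately and the two fitted curves summed before solving for the crossing point.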

Can MySQL Sharding Support Such a Huge Amount of Data?

GitHub uses MySQL as the main storage for all non-Git repository data. The rapid growth of data volume poses a great challenge to GitHub's high availability. In March 2022, GitHub had 3 service disruptions, each lasting 2-5 hours. The official investigation report shows the MySQL database caused the outages. During peak load periods, the load on the GitHub mysql1 database (the main database cluster in GitHub) increased. As a result, database access reached the maximum number of connections. This affected the performance of many GitHub services and features.

In fact, over the past few years GitHub has optimized its databases. For example, it added clusters to support platform growth and partitioned the main database. But these improvements did not fundamentally solve the problem. In the near future, GitHub events will exceed 5 billion, or even 10 billion. Can MySQL sharding support such a data surge?
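One reason sharding is hard to grow with the data can be shown with a generic example. The sketch below is not GitHub's scheme. It assumes the simplest possible routing rule, a stable hash of a key modulo the number of shards, with hypothetical key names, and measures how many keys would have to move when one shard is added.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (CRC32) modulo shard_count."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def moved_fraction(keys, old_count, new_count):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)


# Hypothetical keys, for illustration only.
keys = [f"repo-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys change shard when going from 4 to 5 shards")
```

With plain modulo routing, most keys land on a different shard after the change, so adding capacity means migrating most of the data. Schemes such as consistent hashing or directory-based routing reduce that cost but add operational complexity.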

Data Sources

All the analysis data in this article comes from OSS Insight, a TiDB-based tool for analyzing and gaining insights from GitHub event data.

Published at DZone with permission of Mia Zhou. See the original article here.

Opinions expressed by DZone contributors are their own.
