Why and How to Use Git LFS

Learn how Git LFS, an open-source Git extension, will help you handle large repositories.

Gunter Rotsaert
Nov. 13, 18 · Tutorial
Git is well known as a version control system, but many Git users are unfamiliar with Git LFS (Large File Storage). In this post, I explain why and when Git LFS should be used and how to use it. The source code of this post can be found on GitHub.

What Is It?

Git LFS is an open-source project and an extension to Git. Its goal is to let you work more efficiently with large files and binary files in your repository. Without it, such files cause several problems:

  • Large files will grow the history of your repository every time they are updated;
  • Large files will make fetching and pulling slower;
  • Git treats an update of a binary file as a change to the complete file. This differs from a plain text file, where the differences between versions can be stored compactly. If binary files change frequently, your Git repository will grow in size, and after a certain amount of time Git commands will become slower.

So, if your repository contains large files, a lot of binaries, or both, it is advisable to use Git LFS. When files or file types are marked as LFS files, Git LFS stores small pointers in the repository instead of the actual files. When a Git LFS file is pulled to your local repository, it is sent through a filter that replaces the pointer with the actual file. The actual files are stored on the remote server, and the ones you have pulled are kept in a cache in your local repository. Your local repository therefore stays limited in size, while the remote repository contains all versions of the actual files.
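To make the pointer idea concrete, here is a minimal sketch that prints pointer text for a file in the layout described by the Git LFS pointer specification (a version line, a SHA-256 object ID, and the size in bytes). The helper name make_pointer and the throwaway demo file are assumptions for illustration, and the sketch assumes GNU coreutils (sha256sum), as found on Ubuntu.

```shell
# Sketch: print the pointer text that stands in for a large file.
# make_pointer is a hypothetical helper, not part of Git LFS itself.
make_pointer() {
  file="$1"
  # Object ID: SHA-256 hash of the file content
  oid=$(sha256sum "$file" | cut -d ' ' -f 1)
  # Size of the real file in bytes
  size=$(wc -c < "$file" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
}

# Demo on a throwaway file
demo_file=$(mktemp)
printf 'hello\n' > "$demo_file"
make_pointer "$demo_file"
rm -f "$demo_file"
```

Once Git LFS is installed, git lfs pointer --file=<path> prints the pointer that Git LFS itself would generate for a file, which is the more reliable way to inspect one.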

Installation

We perform the installation on Ubuntu and assume that Git is already installed. As said before, Git LFS is an extension to Git and therefore needs to be installed separately:

$ sudo apt install git-lfs

First, create a new, empty Git repository:

$ mkdir mygitlfsplanet 
$ cd mygitlfsplanet 
$ git init 
Initialized empty Git repository in /home/user/mygitlfsplanet/.git/

Navigate to your Git repository (where the .git directory is located) and execute the following command in order to activate Git LFS:

$ git lfs install
Updated git hooks.
Git LFS initialized.

Now take a look at the .gitconfig file in your home directory. The following section has been added:

[filter "lfs"]
    clean = git-lfs clean -- %f
    smudge = git-lfs smudge -- %f
    process = git-lfs filter-process
    required = true

Navigate to the directory mygitlfsplanet/.git/hooks. The following hooks have been added/updated and contain git-lfs commands which will be executed when the hook is triggered:

  • post-checkout
  • post-commit
  • post-merge
  • pre-push

A directory mygitlfsplanet/.git/lfs has also been added. This is the local cache mentioned earlier.
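If you want to verify the hooks without opening each file, a small check like the following can help. It is only a sketch: check_lfs_hooks is a hypothetical helper name, and it assumes the hooks live in the default .git/hooks directory of the repository you pass in.

```shell
# Sketch: report which of the four Git LFS hooks are present in a repository.
# check_lfs_hooks is a hypothetical helper name.
check_lfs_hooks() {
  repo="${1:-.}"
  missing=0
  for hook in post-checkout post-commit post-merge pre-push; do
    # A hook counts as present when the file exists and mentions git-lfs
    if grep -q 'git-lfs' "$repo/.git/hooks/$hook" 2>/dev/null; then
      echo "ok      $hook"
    else
      echo "missing $hook"
      missing=1
    fi
  done
  return "$missing"
}

check_lfs_hooks . || echo "Run 'git lfs install' in this repository."
```

The function returns a non-zero status when at least one hook is missing, so it can also be used as a guard in a setup script.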

Configuration

Now that we have installed Git LFS for our repository, it is time to configure which file types we want to associate with Git LFS. This information is stored in a .gitattributes file in your repository. It is advisable to commit and push this file so that every developer works with the same Git LFS configuration. The easiest way to associate a file type with Git LFS is the git lfs track command. Let’s associate all jpg files with Git LFS:

$ git lfs track "*.jpg"
Tracking "*.jpg"

The .gitattributes file is created and contains the following information:

*.jpg filter=lfs diff=lfs merge=lfs -text

What if we have a directory largefiles in our repository with large xml files, and we want to associate only the xml files in that particular directory with Git LFS rather than all xml files? We can include the directory largefiles in the pattern, so that only the xml files in that directory are associated with Git LFS:

$ git lfs track "largefiles/*.xml"
Tracking "largefiles/*.xml"

The only thing left to do is to commit the .gitattributes file to our local repository.
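Before committing, you can ask Git which paths the patterns actually match. The sketch below does this with git check-attr and then commits the .gitattributes file. It runs in a throwaway repository so it is safe to try; the temporary directory, the identity values, and the commit message are placeholders, not part of the original setup.

```shell
# Sketch: confirm which paths the patterns match, then commit .gitattributes.
# Runs in a throwaway repository; the identity values are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Same content that the two "git lfs track" calls above write
cat > .gitattributes <<'EOF'
*.jpg filter=lfs diff=lfs merge=lfs -text
largefiles/*.xml filter=lfs diff=lfs merge=lfs -text
EOF

# "lfs" means the path is handled by Git LFS, "unspecified" means plain Git
git check-attr filter -- root.jpg root.xml largefiles/largefile.xml

git add .gitattributes
git -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "Track jpg and largefiles xml with Git LFS"
```

The check-attr call reports filter: lfs for root.jpg and largefiles/largefile.xml and filter: unspecified for root.xml, which matches the intent of the two patterns. In your own repository, a plain git add .gitattributes followed by git commit is all that is needed.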

Git LFS in Action

Now that the preparation work is done, it is time for some action. We add the files root.jpg, root.xml and root.txt to the root of our repository. We also add largefile.jpg, largefile.xml and largefile.txt to the directory largefiles. After committing those files, we can verify which files are being tracked as Git LFS files with the following command:

$ git lfs ls-files
0282cb373a * largefiles/largefile.jpg
fc3b142235 * largefiles/largefile.xml
72d5491269 * root.jpg

This result is exactly what we expected: all jpg files and the xml file inside the largefiles directory are tracked by Git LFS, while root.xml and the two txt files are not. When you look at the files on your file system, you won’t see any difference between files tracked by Git LFS and files that are not. That is because the Git LFS filters replace the pointer files with the actual content. This way, the usage of Git LFS is transparent to you as a user.
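Although the working directory shows the real content, the version stored inside the Git repository is the pointer. The following sketch checks whether data on standard input looks like a Git LFS pointer by testing for the version line of the pointer format. The helper name is_lfs_pointer is an assumption for illustration.

```shell
# Sketch: tell whether the data on stdin looks like a Git LFS pointer.
# is_lfs_pointer is a hypothetical helper name.
is_lfs_pointer() {
  # A pointer starts with the spec version line
  IFS= read -r first_line
  [ "$first_line" = "version https://git-lfs.github.com/spec/v1" ]
}

# In a repository with committed LFS files you could run, for example:
#   git cat-file -p HEAD:root.jpg | is_lfs_pointer && echo "stored as pointer"
if printf 'version https://git-lfs.github.com/spec/v1\n' | is_lfs_pointer; then
  echo "looks like a pointer"
fi
```

The commented git cat-file call prints the blob as Git stored it, so for an LFS-tracked file it shows the small pointer rather than the image data you see on disk.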

Now push everything to the remote repository. When you click on a Git LFS file in GitHub, the file is displayed normally, but a note at the top indicates that it has been stored as a Git LFS file.

Git LFS With an Existing Repository

Up to now, we have shown how to enable Git LFS when starting a new repository, knowing in advance which files we want to associate with Git LFS. But what if you want to enable Git LFS for an existing repository? You can do so in the same way as for a new repository. From that moment on, new files and updates to files will be tracked by Git LFS. Commits made before you enabled Git LFS will not be migrated automatically. There is, however, a way to migrate your entire repository. Run the following command for each existing branch you want to migrate:

$ git lfs migrate import --include="*.jpg,largefiles/*.xml" --include-ref=refs/heads/master

The example above shows the command to use if we had forgotten to associate any file types in our previously created repository. The include option specifies which file types have to be migrated, and the include-ref option specifies the branch you want to migrate. Afterwards, the history of that branch will have been migrated to LFS. But be careful: the migrate command rewrites your history! Your repository history will have different commit hashes, and therefore every developer should clone the repository anew after this action. Think carefully about the consequences before you execute this migration.
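Before running a migration, it can help to know which files actually take up space in the history, so that you can choose sensible patterns for the include option. The following sketch uses only plain Git commands to list the largest blobs across all branches. The helper name largest_blobs and its default of 10 entries are assumptions for illustration.

```shell
# Sketch: list the largest blobs in the history of all branches.
# largest_blobs is a hypothetical helper; the count argument defaults to 10.
largest_blobs() {
  count="${1:-10}"
  # List every object with its path, look up type and size,
  # keep only blobs, and sort by size (largest first)
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" { sub(/^blob /, ""); print }' |
    sort -n -r |
    head -n "$count"
}

# Only run the listing when the current directory is inside a Git repository
if git rev-parse --git-dir >/dev/null 2>&1; then
  largest_blobs 5
fi
```

Each output line shows the size in bytes followed by the path. File types that dominate this list are natural candidates for Git LFS.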

Tips

  • The local Git LFS cache is not cleaned up automatically. Just as you have to prune remote branches on a regular basis, you also have to prune your Git LFS content with the following command: git lfs prune
  • Ensure that all the developers have Git LFS installed. When someone without Git LFS installed commits a file which should be associated with Git LFS, you will get some strange errors. They can be fixed, but it is better to prevent this from happening.
  • We have also mentioned committing binaries to a Git repository, but is it advisable to do so? My first answer would be no, but sometimes you just don’t have valid alternatives. Think about the following when you consider committing binaries:
    • Is it really necessary to put the binary under version control?
    • Is there a text-based alternative for the binary? E.g. assume you want to commit MS Word files, is it possible to convert them to plain text or are there valid arguments not to do so?
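One way to act on the second tip is to let a setup or build script fail early when Git LFS is missing. The following is a minimal sketch under that assumption: require_git_lfs is a hypothetical helper name, and the check only looks for the git-lfs executable on the PATH.

```shell
# Sketch: fail early when Git LFS is not installed.
# require_git_lfs is a hypothetical helper, e.g. for a setup or build script.
require_git_lfs() {
  if command -v git-lfs >/dev/null 2>&1; then
    return 0
  fi
  echo "Git LFS is not installed; see the Installation section." >&2
  return 1
}

# Print the installed version when Git LFS is available
if require_git_lfs; then
  git lfs version
fi
```

A check like this does not prevent every mistake, but it surfaces a missing installation before someone commits a file that should have been handled by Git LFS.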

Summary

In this post we explained what Git LFS is, how you can install it and how you can use it. We also explained how you can apply Git LFS to an existing repository and gave some tips.

Published at DZone with permission of Gunter Rotsaert, DZone MVB. See the original article here.
