Master Git by learning advanced tools and techniques that can help you resolve tricky issues with the revision control system
Git is a distributed revision control system. We learned in Understanding Git - DZone that Git stores different objects (commits, blobs, trees, and tags) in its repository, i.e., inside the .git folder. The repository is just one of the four areas that Git uses to store objects. In this article, we'll explain the four areas in Git and delve deeper into each of them, uncovering their significance in tracking changes made to files, maintaining a history of revisions, and collaborating with other developers. Understanding these areas empowers you to harness Git's capabilities to the fullest.
The Four Areas
Git stores objects in four areas illustrated below. These four areas represent the flow of changes in a typical Git workflow.
The working area is where you edit, create, delete, and modify files as you work on your project. It represents the current state of the project and contains all the files and directories that make up your codebase. It is important to remember that the modifications made in the working area are temporary unless changes are committed.
The repository area is where all the project's history, metadata, and versioned files are stored. In Git, the repository is represented by the .git folder, which is typically located at the root of the project directory. The Git objects, which are immutable, are stored in the objects subdirectory.
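As a minimal sketch of the point above, the following script writes a single blob into a throwaway repository and then locates it under the objects subdirectory. The temporary directory and the sample content are assumptions made for illustration only.

```shell
# Create a throwaway repository (the path is arbitrary, chosen by mktemp).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Write a blob object straight into the repository, bypassing the index.
blob_id=$(echo "Flash 9000" | git hash-object -w --stdin)

# Ask Git what kind of object it stored and read the content back.
blob_type=$(git cat-file -t "$blob_id")
blob_content=$(git cat-file -p "$blob_id")

# A loose object lives at .git/objects/<first 2 hex chars>/<remaining chars>.
object_path=".git/objects/$(echo "$blob_id" | cut -c1-2)/$(echo "$blob_id" | cut -c3-)"

echo "$blob_type $blob_id"
echo "$blob_content"
ls "$object_path"
```

Because the object name is derived from the content, storing the same content again yields the same object, which is one way to see why Git objects can be treated as immutable.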
In order to gain a deeper understanding of Git, we should be able to answer the following questions when issuing a Git command:
- How does this command move information across the four areas?
- How does this command change the Repository area?
We'll explain these by taking illustrative Git commands. Let's review how Git maintains a project's history.
The Git objects, linked together, represent a project's history. Each commit is a snapshot of the project at a certain point in time.
Branches are entry points to history. A branch refers to a commit, and HEAD points to the current branch. Pictorially, this is represented as shown below.
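The chain from HEAD to a branch to a commit can be inspected directly. The sketch below assumes a fresh throwaway repository with one commit; the user name and email are placeholders, and the default branch name depends on your Git configuration.

```shell
# Throwaway repository with a single commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
git add storage_insights.txt
git commit -q -m "Add storage_insights.txt"

# HEAD is a symbolic reference to the current branch...
current_ref=$(git symbolic-ref HEAD)          # e.g. refs/heads/main or refs/heads/master
# ...and the branch is a reference to a commit.
branch_commit=$(git rev-parse "$current_ref")
head_commit=$(git rev-parse HEAD)

echo "HEAD -> $current_ref -> $branch_commit"
```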
This brings us to the third area in Git - the Index.
The Index, also known as the Staging Area, is an intermediate step between the Working Area and the Repository. It helps in preparing and organizing changes before they are committed to the repository.
Let's see a basic Git workflow that touches the three areas that we've encountered so far - the working area, the index, and the repository.
Files in the working directory can be in different states, including untracked, modified, or staged for commit. We use the git add command to move changes from the working directory to the staging area. The git commit command saves the changes in the Git repository by creating Git objects, e.g., a commit. This is illustrated below.
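To see the differences between these areas, we can use the git diff command as shown below. Without options, git diff compares the working area with the index; with --cached, it compares the index with the repository.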
$ echo "DS 8000" >> storage_insights.txt
$ git diff
diff --git a/storage_insights.txt b/storage_insights.txt
index 972d3ae..210df65 100644
--- a/storage_insights.txt
+++ b/storage_insights.txt
@@ -2,3 +2,4 @@ Flash 9000
 Storwize
 XIV
 SVC
+DS 8000
$ git add storage_insights.txt
$ git diff --cached
diff --git a/storage_insights.txt b/storage_insights.txt
index 972d3ae..210df65 100644
--- a/storage_insights.txt
+++ b/storage_insights.txt
@@ -2,3 +2,4 @@ Flash 9000
 Storwize
 XIV
 SVC
+DS 8000
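The file states mentioned earlier (untracked, staged, modified) can also be observed with git status. The following sketch walks one file through those states in a throwaway repository; the identity settings are placeholders, and the two-letter codes come from the short porcelain format, where the first column describes the index and the second the working area.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
untracked=$(git status --porcelain)          # "?? storage_insights.txt"

git add storage_insights.txt
staged=$(git status --porcelain)             # "A  storage_insights.txt"

git commit -q -m "Add storage_insights.txt"
clean=$(git status --porcelain)              # empty: all three areas agree

echo "DS 8000" >> storage_insights.txt
modified=$(git status --porcelain)           # " M storage_insights.txt"

printf '%s\n' "$untracked" "$staged" "$modified"
```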
The git checkout command, which is used to move to a specific branch, copies data from the repository area to both the working area and the index.
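To make the checkout behavior concrete, here is a small sketch that creates two branches with different file content and compares the working area and the index after each checkout. The branch name feature and the identity settings are assumptions for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
git add storage_insights.txt
git commit -q -m "Add Flash 9000"
main_branch=$(git symbolic-ref --short HEAD) # whatever the default branch is called

# Hypothetical branch "feature" with one extra commit.
git checkout -q -b feature
echo "DS 8000" >> storage_insights.txt
git commit -q -am "Add DS 8000"

# Checking out a branch rewrites both the working area and the index.
git checkout -q "$main_branch"
main_worktree=$(cat storage_insights.txt)
main_index=$(git show :storage_insights.txt) # ":path" names the index copy

git checkout -q feature
feature_worktree=$(cat storage_insights.txt)
feature_index=$(git show :storage_insights.txt)
```

After each checkout the working-area copy and the index copy of the file match the commit the branch points to.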
The git rm command is used to remove file(s) from both the working directory and the index so that Git is aware of their removal and the change can be committed. The --cached option removes the file(s) from the index but leaves them in the working directory, effectively untracking them without deleting them locally.
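The difference between the two forms of git rm can be sketched as follows. The second file, notes.txt, is a hypothetical name introduced only for this example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
echo "scratch" > notes.txt                   # hypothetical second file
git add storage_insights.txt notes.txt
git commit -q -m "Add two files"

# Plain git rm: gone from the working area and from the index.
git rm -q storage_insights.txt

# git rm --cached: gone from the index only; the file stays on disk.
git rm -q --cached notes.txt

tracked=$(git ls-files)                      # files currently in the index
git status --short
```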
To rename a file, we can use the git mv command. This renames the file in both the working area and the index. It does not touch the repository.
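A short sketch of the rename follows. The new file name storage_systems.txt is an assumption; the last command lists the files in the latest commit to show that the repository still holds the old name until the rename is committed.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
git add storage_insights.txt
git commit -q -m "Add storage_insights.txt"

# Rename in the working area and the index in one step.
git mv storage_insights.txt storage_systems.txt   # hypothetical new name

in_index=$(git ls-files)                     # what the index tracks now
in_commit=$(git ls-tree --name-only HEAD)    # what the last commit contains

echo "index:  $in_index"
echo "commit: $in_commit"
```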
git reset Command
The git reset command is used to move the HEAD pointer and the current branch reference to a specific commit, effectively rewinding the state of the repository to a previous point in its history.
The options are:
- Soft Reset (--soft): A soft reset moves HEAD and the branch reference to a different commit while keeping the changes in the staging area. It does not move any data between areas.
- Mixed Reset (default behavior, --mixed): A mixed reset moves HEAD and the branch reference and also unstages the changes by copying the target commit to the index. The changes remain in your working directory.
- Hard Reset (--hard): A hard reset moves HEAD and the branch reference and discards all changes in both the staging area and the working directory. Commits that are no longer reachable from the branch drop out of its history, although Git does not delete them immediately.
Thus, a reset moves the current branch, and optionally copies data from the repository area to the other areas as illustrated above.
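The three reset modes can be compared side by side. This sketch assumes a throwaway repository with two commits and rewinds the branch by one commit in each mode, restoring the starting point in between; the status codes show where the undone change ends up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
git add storage_insights.txt
git commit -q -m "Add Flash 9000"
echo "DS 8000" >> storage_insights.txt
git commit -q -am "Add DS 8000"
tip=$(git rev-parse HEAD)                    # remember the newest commit

# --soft: only the branch moves, so the change is still staged.
git reset -q --soft HEAD~1
after_soft=$(git status --porcelain)         # "M  storage_insights.txt"
git reset -q --hard "$tip"                   # back to the starting point

# --mixed: the index is reset too, so the change is only in the working area.
git reset -q --mixed HEAD~1
after_mixed=$(git status --porcelain)        # " M storage_insights.txt"
git reset -q --hard "$tip"

# --hard: index and working area are both reset, so nothing is left over.
git reset -q --hard HEAD~1
after_hard=$(git status --porcelain)         # empty

# The commit we moved away from still exists as an object.
tip_type=$(git cat-file -t "$tip")
```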
Now that we've covered the three areas and illustrated by way of examples how Git moves data between these areas, it is time to introduce the fourth area - the Stash.
git stash is a handy Git command that allows you to temporarily set aside changes in your working directory and index without committing them. This is useful when you need to switch to a different branch, work on something else, or pull changes from a remote repository while preserving your current changes. The stashed changes can later be reapplied or discarded as needed.
git stash --include-untracked
This command moves all data from the working area and index to the stash and checks out the current commit.
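A round trip through the stash might look like the sketch below. It uses git stash push, the explicit form of the command above, and a hypothetical untracked file notes.txt; git stash pop then brings the changes back and drops the stash entry.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
git add storage_insights.txt
git commit -q -m "Add Flash 9000"

echo "DS 8000" >> storage_insights.txt       # a tracked, modified file
echo "scratch" > notes.txt                   # hypothetical untracked file

# Move everything into the stash; working area and index match HEAD again.
git stash push --quiet --include-untracked
after_stash=$(git status --porcelain)        # empty
stash_entries=$(git stash list)              # one entry

# Bring the changes back and remove the stash entry.
git stash pop --quiet
after_pop=$(git status --porcelain)
remaining=$(git stash list)                  # empty again

echo "$after_pop"
```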
Working With Paths
A commit typically includes changes from multiple files. We've worked with commits so far in our journey. It is possible to operate at a more granular level than a commit, e.g., a file. The following examples illustrate how to work with individual files rather than commits.
To restore a file from the repository to the index, we use the git reset command at the file level.
$ git reset HEAD storage_insights.txt
Unstaged changes after reset:
M storage_insights.txt
This command copies the committed version of storage_insights.txt from the repository to the index and leaves the working area untouched. At the file level, the --hard option is not supported.
To restore a file from the repository to both the working area and the index, we use the git checkout command at the file level.
$ git checkout HEAD storage_insights.txt
Updated 1 path from 43134cc
Parts of a File
Git can operate on things that are smaller than a file. The --patch option in Git refers to an interactive mode that allows you to selectively stage changes within individual files or even specific lines of code, giving you fine-grained control over what gets committed. It's commonly used with commands like git add.
We'll illustrate how to use this option with the git add command. We have two local changes in the file, and we want to stage only one of them. Each such change is called a hunk.
$ git add --patch storage_insights.txt
diff --git a/storage_insights.txt b/storage_insights.txt
index 972d3ae..16a4557 100644
--- a/storage_insights.txt
+++ b/storage_insights.txt
@@ -1,4 +1,6 @@
 Flash 9000
+DS 8000
 Storwize
 XIV
 SVC
+Spectrum Scale
(1/1) Stage this hunk [y,n,q,a,d,s,e,?]? s
Split into 2 hunks.
@@ -1,4 +1,5 @@
 Flash 9000
+DS 8000
 Storwize
 XIV
 SVC
(1/2) Stage this hunk [y,n,q,a,d,j,J,g,/,e,?]? y
@@ -2,3 +3,4 @@
 Storwize
 XIV
 SVC
+Spectrum Scale
(2/2) Stage this hunk [y,n,q,a,d,K,g,/,e,?]? n
$ git diff --cached
diff --git a/storage_insights.txt b/storage_insights.txt
index 972d3ae..30280ea 100644
--- a/storage_insights.txt
+++ b/storage_insights.txt
@@ -1,4 +1,5 @@
 Flash 9000
+DS 8000
 Storwize
 XIV
 SVC
--patch allows us to process changes, not on a file-by-file basis, but on a hunk-by-hunk basis.
Switch and Restore
The git switch and git restore commands are used to perform specific operations related to branches and file management. They help to switch between branches and restore files to specific states.
The following command will move to a different branch.
git switch <branch-name>
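As a brief sketch, the script below creates a branch with git switch -c and then switches back. It assumes a Git version that ships git switch (2.23 or later); the branch name feature is arbitrary.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
git add storage_insights.txt
git commit -q -m "Add Flash 9000"
main_branch=$(git symbolic-ref --short HEAD) # whatever the default branch is called

# Create a new branch and switch to it in one step.
git switch -q -c feature                     # hypothetical branch name
on_feature=$(git symbolic-ref --short HEAD)

# Switch back to the original branch.
git switch -q "$main_branch"
back_on=$(git symbolic-ref --short HEAD)

echo "$on_feature -> $back_on"
```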
git restore will allow us to restore files in the working directory to a specified state. It's used to undo changes, either by discarding modifications made to files or by reverting them to a previous commit.
The following command replaces the modified file in the working area with the committed version.
git restore --source=HEAD storage_insights.txt
The following command removes the changes from the staging area while leaving them in the working directory, effectively unstaging them.
git restore --staged storage_insights.txt
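The two restore forms can be chained to undo a staged change completely. This is a sketch under the same assumptions as before (throwaway repository, placeholder identity, Git 2.23 or later): first unstage, then discard the working-area modification.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "Flash 9000" > storage_insights.txt
git add storage_insights.txt
git commit -q -m "Add Flash 9000"

echo "DS 8000" >> storage_insights.txt
git add storage_insights.txt
staged=$(git status --porcelain)             # "M  storage_insights.txt"

# Step 1: reset the index copy; the edit survives in the working area.
git restore --staged storage_insights.txt
unstaged=$(git status --porcelain)           # " M storage_insights.txt"

# Step 2: overwrite the working-area copy with the committed version.
git restore --source=HEAD storage_insights.txt
restored=$(git status --porcelain)           # empty
content=$(cat storage_insights.txt)

echo "$content"
```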
It is crucial to understand the four areas that Git uses to move data around in order to implement a revision control system. In this article, we reviewed the four areas and illustrated data movements by taking examples of different Git commands. For every Git command that we use, if we can explain the data movements through these areas, it will help us with a deeper understanding of Git workflows.