Understanding Git Basics and Terminology
Introduction
Git is a distributed version control system that tracks changes in source code during software development. Before diving into practical Git usage, it's essential to understand the fundamental concepts and terminology that form the foundation of how Git works.
This tutorial will cover the core concepts that every Git user should understand, providing you with a solid foundation for working with Git effectively.
What is Version Control?
Version control is a system that records changes to files over time so you can recall specific versions later. It allows you to:
- Track changes to your code
- Collaborate with other developers
- Revert to previous versions when needed
- Understand what changed, when, and who made the changes
- Maintain different versions of your project simultaneously
Git vs. Other Version Control Systems
Centralized vs. Distributed
Traditional version control systems (like CVS, Subversion) are centralized:
- Single central server stores all versions
- Developers check out files from the central repository
- If the server goes down, collaboration stops
Git is distributed:
- Every developer has a complete copy of the project history
- Can work offline and sync changes later
- No single point of failure
- Multiple backup copies exist naturally
Core Git Concepts
Repository (Repo)
A repository is a storage space where your project lives. It contains:
- All your project files
- Complete history of changes
- Branches and tags
- Configuration settings
Types of repositories:
- Local repository: On your computer
- Remote repository: On a server (like GitHub, GitLab)
Working Directory
The working directory is the folder on your computer where you're currently working on your project files. It's where you edit, create, and delete files.
Staging Area (Index)
The staging area is a file that stores information about what will go into your next commit. It's like a preview of your next commit.
Think of it as a shopping cart:
- You add items (changes) to your cart (staging area)
- When ready, you checkout (commit) everything in your cart
Commit
A commit is a snapshot of your project at a specific point in time. Each commit contains:
- A unique identifier (hash)
- Author information
- Timestamp
- Commit message describing the changes
- Pointer to the previous commit
Branch
A branch is a lightweight movable pointer to a specific commit. It allows you to:
- Work on different features simultaneously
- Experiment without affecting the main codebase
- Collaborate with others on separate features
The default branch is usually called main
or master
.
HEAD
HEAD is a pointer that refers to the current branch you're working on. It tells Git which commit you're currently looking at.
Git Workflow States
Git files can exist in three main states:
1. Modified
- Files have been changed but not committed
- Changes exist only in your working directory
2. Staged
- Files are marked to go into the next commit
- Changes are in the staging area
3. Committed
- Files are safely stored in your local repository
- Changes are part of the project history
The Three Areas of Git
Understanding these three areas is crucial for Git mastery:
Working Directory → Staging Area → Repository
(modify) (stage) (commit)
Working Directory
- Where you edit files
- Contains one version of the project
- Files can be modified, added, or deleted
Staging Area
- Stores information about what will go into the next commit
- Also called the "index"
- Allows you to craft exactly what goes into each commit
Repository
- Where Git stores metadata and object database
- Contains all versions of your project
- The
.git
folder in your project root
Essential Git Terminology
Clone
Creating a local copy of a remote repository on your computer.
Fork
Creating a personal copy of someone else's repository on a hosting service like GitHub.
Pull
Fetching changes from a remote repository and merging them into your current branch.
Push
Uploading your local commits to a remote repository.
Merge
Combining changes from different branches into a single branch.
Rebase
Moving or combining commits from one branch to another, creating a linear history.
Tag
A reference to a specific commit, usually used to mark release points.
Remote
A version of your repository hosted on a server, used for collaboration.
Origin
The default name for the remote repository you cloned from.
Upstream
The original repository that you forked from (in fork-based workflows).
Git Object Types
Git stores everything as objects in its database:
1. Blob (Binary Large Object)
- Stores file contents
- Doesn't contain filename or directory structure
2. Tree
- Represents directories
- Contains references to blobs and other trees
- Stores filenames and permissions
3. Commit
- Points to a tree object
- Contains metadata (author, timestamp, message)
- References parent commit(s)
4. Tag
- Points to a commit
- Contains additional metadata
- Usually used for releases
Common Git Commands Overview
Here are the most frequently used Git commands and their purposes:
Repository Operations
git init
- Initialize a new repositorygit clone
- Copy a repository from remote to localgit status
- Check the status of your working directory
Basic Workflow
git add
- Stage changes for commitgit commit
- Save changes to the repositorygit push
- Upload changes to remote repositorygit pull
- Download changes from remote repository
Branch Operations
git branch
- List, create, or delete branchesgit checkout
- Switch branches or restore filesgit merge
- Merge changes from one branch into another
Information Commands
git log
- View commit historygit diff
- Show changes between commits, branches, etc.git show
- Display information about commits
Best Practices for Understanding Git
1. Think in Snapshots
Git doesn't store differences; it stores snapshots of your entire project at each commit.
2. Commits Are Cheap
Don't be afraid to commit often. Small, focused commits are easier to understand and manage.
3. Use Meaningful Commit Messages
Write clear, descriptive commit messages that explain what changed and why.
4. Understand the Three States
Always be aware of which state your files are in: modified, staged, or committed.
5. Branch Early and Often
Use branches for features, experiments, and bug fixes. They're lightweight and easy to work with.
Summary
Understanding Git's core concepts and terminology is essential for effective version control. Key takeaways:
- Git is distributed: Every copy is a complete repository
- Three states: Modified, staged, committed
- Three areas: Working directory, staging area, repository
- Commits are snapshots: Not differences, but complete project states
- Branches are pointers: Lightweight references to commits
- HEAD tracks location: Shows where you are in the project history
With these fundamental concepts understood, you're ready to start using Git effectively. The next tutorial will guide you through creating your first Git repository and performing basic operations.
Next Steps
Now that you understand Git basics and terminology, you can proceed to: