I created a query language for .git files (GQL)

Amr Hesham
Published in ITNEXT
Jun 8, 2023

Hello everyone. Last month I got interested in the Rust programming language and wanted to learn more about it, so I started with the basics and began reading open source projects written in Rust. I also opened one PR in the rust-analyzer project; it relied less on my knowledge of Rust than on my general knowledge of compilers and static analysis. As usual, I like to learn new things by building projects around ideas that interest me.

The idea

I started by thinking about small tools that I would love to use, for example a faster search CLI or some utility apps. But then I got a cool new idea.

While reading Building Git (a book about building Git from scratch), I learned what each file inside the .git folder does and how Git stores commits, branches and other data and manages its own database. So what if we had a query language that runs on those files?

The Git Query Language (GQL)

I decided to implement this query language, and I named it GQL. I was very excited to start this project because it was my first time implementing a query language. I decided to implement it from scratch rather than converting the .git files into an SQLite database and running normal SQL queries. I also thought it would be cool if, in the future, I could use the GQL engine as part of a Git client or analyzer.

The implementation of GQL

The goal is to implement it in two parts. The first part converts the GQL query into an AST of nodes. The second is an engine that walks the AST and executes it as an interpreter; in the future, the AST could instead be compiled to GQL bytecode instructions for a virtual machine.
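
To make the two-part split concrete, here is a minimal sketch, not the actual GQL source. It assumes a tiny subset of the language (select with a field list, a table and an optional limit) and runs against in-memory tables instead of .git files. The names Statement, Row, parse and execute are hypothetical and chosen only for this illustration.

```rust
use std::collections::HashMap;

/// AST for the tiny subset handled here: `select <fields> from <table> [limit N]`.
/// (Hypothetical type, not taken from the GQL code base.)
#[derive(Debug, PartialEq)]
pub enum Statement {
    Select { fields: Vec<String>, table: String, limit: Option<usize> },
}

/// A row is a plain string map, in the spirit of GQLObject.
pub type Row = HashMap<String, String>;

/// Part one: turn the query text into an AST node.
pub fn parse(query: &str) -> Result<Statement, String> {
    let tokens: Vec<&str> = query.split_whitespace().collect();
    if !tokens.first().map_or(false, |t| t.eq_ignore_ascii_case("select")) {
        return Err("expected `select`".to_string());
    }
    let from = tokens
        .iter()
        .position(|t| t.eq_ignore_ascii_case("from"))
        .ok_or("expected `from`")?;
    // Everything between `select` and `from` is a comma-separated field list.
    let fields: Vec<String> = tokens[1..from]
        .join(" ")
        .split(',')
        .map(|f| f.trim().to_string())
        .filter(|f| !f.is_empty())
        .collect();
    if fields.is_empty() {
        return Err("expected at least one field".to_string());
    }
    let table = tokens.get(from + 1).ok_or("expected a table name")?.to_string();
    // Optional `limit N`; anything after it is ignored in this sketch.
    let limit = match tokens.get(from + 2) {
        None => None,
        Some(t) if t.eq_ignore_ascii_case("limit") => {
            let n = tokens.get(from + 3).ok_or("expected a number after `limit`")?;
            Some(n.parse::<usize>().map_err(|e| e.to_string())?)
        }
        Some(t) => return Err(format!("unexpected token `{t}`")),
    };
    Ok(Statement::Select { fields, table, limit })
}

/// Part two: walk the AST and evaluate it against in-memory tables.
pub fn execute(stmt: &Statement, tables: &HashMap<String, Vec<Row>>) -> Result<Vec<Row>, String> {
    let Statement::Select { fields, table, limit } = stmt;
    let rows = tables
        .get(table)
        .ok_or_else(|| format!("unknown table `{table}`"))?;
    let select_all = fields.iter().any(|f| f == "*");
    let projected = rows
        .iter()
        .take(limit.unwrap_or(rows.len()))
        .map(|row| {
            // Keep only the requested columns (or all of them for `*`).
            row.iter()
                .filter(|(key, _)| select_all || fields.contains(key))
                .map(|(key, value)| (key.clone(), value.clone()))
                .collect::<Row>()
        })
        .collect();
    Ok(projected)
}
```

Keeping parse and execute separate means the same AST could later be handed to a different back end, such as a bytecode compiler, without touching the parser.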

The engine works with the .git files through git2, the Rust bindings for the libgit2 library, so it can perform select, update and delete tasks. It also stores the selected data in a data structure so that we can filter or sort it.

To simplify the implementation, I created a struct called GQLObject that can represent a commit, branch, tag or any other object in this engine. It also makes it easy to perform sorting, searching and filtering with single functions that deal with this one type.

use std::collections::HashMap;

// One generic record type for commits, branches, tags, and so on.
pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}

The GQLObject is just a map with string keys and string values, so it is general enough to hold the info of any type. Features like comparison, filtering or sorting can now be implemented easily on this string map.
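
As a rough illustration of why the string map keeps these operations simple, here is a sketch under my own assumptions, not the project's real code. The helper names object, filter_eq, filter_contains, sort_by_field and paginate are hypothetical; only the GQLObject struct comes from the article.

```rust
use std::collections::HashMap;

pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}

/// Hypothetical helper that builds an object from (key, value) pairs.
pub fn object(pairs: &[(&str, &str)]) -> GQLObject {
    let attributes = pairs
        .iter()
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect();
    GQLObject { attributes }
}

/// `where field = "value"`: keep objects whose field equals the value.
pub fn filter_eq(objects: Vec<GQLObject>, field: &str, value: &str) -> Vec<GQLObject> {
    objects
        .into_iter()
        .filter(|o| o.attributes.get(field).map(String::as_str) == Some(value))
        .collect()
}

/// `where field contains "needle"`: keep objects whose field includes the substring.
pub fn filter_contains(objects: Vec<GQLObject>, field: &str, needle: &str) -> Vec<GQLObject> {
    objects
        .into_iter()
        .filter(|o| o.attributes.get(field).map_or(false, |v| v.contains(needle)))
        .collect()
}

/// `order by field`: lexicographic sort; objects missing the field come first.
/// Because every value is a string, numeric fields would need parsing to sort numerically.
pub fn sort_by_field(objects: &mut [GQLObject], field: &str) {
    objects.sort_by(|a, b| a.attributes.get(field).cmp(&b.attributes.get(field)));
}

/// `limit n offset m`: skip `offset` objects, then keep at most `limit`.
pub fn paginate(objects: Vec<GQLObject>, offset: usize, limit: usize) -> Vec<GQLObject> {
    objects.into_iter().skip(offset).take(limit).collect()
}
```

Each function works on any kind of object, because it only looks at attribute names and values and never at a concrete commit, branch or tag type.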

The current state

Over the last week, I implemented selecting with conditions, filtering and sorting, with optional limit and offset, so you can write queries like these:

select * from commits
select name, email, title, message, time from commits
select * from commits order by name limit 1
select * from commits order by name limit 1 offset 1
select * from branches where ishead = "true"
select * from tags where name contains "2023"

Version 0.1.0 and 0.2.0 Updates

After publishing this article and sharing the project, I got amazing feedback and feature requests from many people, so I started to implement many of them, with the goal of supporting more SQL features.

Now we have group by, aggregation functions and column name aliases, so you can write more advanced queries, for example selecting the names of the top n contributors and their number of commits:

SELECT name, count(name) AS commit_num FROM commits GROUP BY name ORDER BY commit_num DESC LIMIT 10
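
To show what the query above computes, here is a small sketch of grouping, counting, descending order and limit over GQLObject values. It is an illustration under my own assumptions, not the engine's implementation; the function top_contributors is hypothetical, and breaking ties by name is a choice made here only to keep the output deterministic.

```rust
use std::collections::HashMap;

pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}

/// Roughly: SELECT name, count(name) FROM commits GROUP BY name
/// ORDER BY count DESC LIMIT n, evaluated over in-memory objects.
pub fn top_contributors(commits: &[GQLObject], n: usize) -> Vec<(String, usize)> {
    // GROUP BY name + count(name): one counter per distinct name.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for commit in commits {
        if let Some(name) = commit.attributes.get("name") {
            *counts.entry(name.as_str()).or_insert(0) += 1;
        }
    }
    let mut rows: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(name, count)| (name.to_string(), count))
        .collect();
    // ORDER BY count DESC; ties are broken by name so the result is stable.
    rows.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    // LIMIT n
    rows.truncate(n);
    rows
}
```

Commits without a name attribute are skipped, which mirrors how count(name) in SQL ignores missing values.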

The next step

The next step is to optimize the code and start supporting more features. For example, imagine a query for deleting all branches except master:

delete * from branches where name != "master"

Or pushing all or some branches to a remote repository with a single query, or grouping and analyzing how many commits each user made this month, and many other things.

The GQL project is free and open source, so everyone is most welcome to contribute, suggest features or report bugs.

I am looking forward to your opinion and feedback 😋.

I hope you enjoyed my article. You can find me on GitHub, LinkedIn, and Twitter.

Enjoy Programming 😋.
