Virtual Filesystem in Go — Creating our Foundation

AlysonN
Published in ITNEXT
10 min read · Feb 21, 2021

The official first article on the filesystem’s implementation.

We finally made it to day one, reader. I hope you’re psyched to start building a virtual filesystem in Go that is incredibly pretty, if not super fast, because we’re building and planning at the same time. You can follow along with the project at my GitHub repo, https://github.com/AlysonBee/GoVirtualFilesystem, and let me know what you think or share any ideas you have along the way.

And with that brief introduction out of the way, before we go into the actual filesystem, let’s make sure we have everything we’ll need for this project. It’s prerequisite time!

Disclaimer: I’ll be coding this project on Windows 10, but will be testing it out on both Windows 10 and the Ubuntu subsystem that comes with it to ensure it’s supported by both operating systems. It shouldn’t be an issue but it’s my first rodeo with Go so safety first.

Setting It All Up

Edit: so it turns out I was using an obsolete version of Go and updated from 1.14 to 1.16.

The first and obvious item on our checklist is making sure you have Go installed (I’m working with Go version 1.16 for Windows). You can download it at https://golang.org/dl/ for your specific operating system. Installation is a simple unzip and double-click but for specific instructions, you’ll find these on the main Go website as well (https://golang.org/doc/install).

That’s actually pretty much it for the things you need right away. We’re keeping this as Go-only as possible.

Each step will be summarized and added to my GitHub repo a few days after each article. You can follow along with the project there if you just want the code on its own, while these articles go into the ‘how’s and ‘why’s of my design decisions. Note that the repo divides each part into numbered folders (01, 02…). Each of these has the code up to the point we were at when that folder was made and no further. None of these folders depend on each other, so feel free to treat them as different, iterative versions of the same project. You can just keep all your work in one folder if you’re following along.

And that’s it for our prep. Let’s get started.

The Design

Figure: A design overview of the project.

This is the overall structure of everything and how the parts relate to one another. It’s very simplistic with straightforward relationships between the four components. Each component can be described as follows:

  • User Object: this represents the user and all access permissions associated with that specific user’s identity. Think Google Cloud IAM, but bare and heavily underdeveloped. This has to be well fleshed out and seamless, as it’s the part the user will experience the app through and it determines what they can and can’t do.
  • Interactive Shell: the shell that the user will use to interact with the filesystem. This structure contains the library of functions that will be used to interact with the virtual space. open, close, mkdir and ls are just a few of the utilities we’ll be making virtual filesystem equivalents for.
  • Virtual Filesystem: this is the collection of files and directories pulled into RAM from your native filesystem, starting from whichever directory you started the app from and traversing downwards through nested subdirectories. There’s a lot of space here for creativity, as most of our performance bottlenecks and problems like lag and freezes will come from how this is eventually implemented.
  • Native Directory: the directory that the filesystem will clone itself from. This will also be the location that your edited copy is saved to when you’re done editing. As far as our interactions with the native directory go, we’ll be implementing checks in the virtual filesystem that verify the depth of traversal as well as the memory consumed while cloning files, to guard against out-of-memory violations and so on.

In this article, we’ll be aiming to have a basic layout of each of these components for now. Each piece needs to be as standalone as possible to make debugging easy to work with while also ensuring that adding functionality later on won’t require the headache of a massive refactor; ideally no refactor at all but it’s too soon to get ahead of ourselves.

The User Object

The user object is the easiest component to get started on right away, but it’s also the one that can expand the furthest and become the most complex, as this is where security and ease of use have to go hand in hand. Features like setting permissions, granting permissions to other users and inheriting access rights will all happen here. These features won’t be in the very first version, but assuming we get that far, adding them shouldn’t force us to dismantle most of our prior work, so we’ll be spending a lot of time here in future.

The code (user.go):

package main

import (
	"math/rand"
)

// The main user object.
type user struct {
	userID     uint64         // A randomized integer representing the user's unique ID.
	username   string         // The user's onscreen name.
	accessList map[uint64]int // Maps each file's unique 64-bit hash to the user's access rights for it.
}

// generateRandomID generates a random userID value by joining two
// 32-bit halves. Before Go 1.20, math/rand repeats the same sequence
// on every run unless it is seeded with rand.Seed.
func generateRandomID() uint64 {
	return uint64(rand.Uint32())<<32 + uint64(rand.Uint32())
}

// createUser creates a user object.
func createUser(username string) *user {
	return &user{
		userID:     generateRandomID(),
		username:   username,
		accessList: make(map[uint64]int), // Empty for now; writing to a nil map would panic.
	}
}

// updateUsername updates the name of the current user.
func (currentUser *user) updateUsername(username string) {
	currentUser.username = username
}

Initially, this structure’s impact will only be felt at the shell level, by allowing the user to set their own username, which will mark the shell prompt. For example, the prompt could look like AlysonV$> or VivianS$> and so on. The first version of our filesystem won’t have access permissions defined, so the accessList field will go unused for now. Still, let’s talk about that last detail a bit because it’s an important filesystem concept in general.
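To make the prompt idea concrete, here is a minimal sketch of how the username could be turned into the prompt string. The prompt method is hypothetical (it isn’t in the repo), and the user struct is cut down to the one field that matters here.

```go
package main

import "fmt"

// user is a trimmed-down copy of the struct from user.go.
type user struct {
	username string // The user's onscreen name.
}

// prompt is a hypothetical helper that builds the shell prompt
// from the current username.
func (currentUser *user) prompt() string {
	if currentUser == nil || currentUser.username == "" {
		return "$>" // Fall back to the anonymous prompt.
	}
	return currentUser.username + "$>"
}

func main() {
	u := &user{username: "AlysonV"}
	fmt.Println(u.prompt()) // prints "AlysonV$>"
}
```

The shell loop could then print the result of prompt() instead of a hard-coded "$>", which keeps the formatting in one place if the prompt style changes later.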

Some Technical Background — The Inode

Most Unix-like filesystems make use of data structures known as inodes. These store the metadata for the files and directories that populate the system: access permissions, ownership, file size, file type, timestamps and where the file’s data lives on disk. (Which files belong to which directory is recorded by the directories themselves, which map names to inodes.) A small design idea we’ll be borrowing from the inode concept is the accessList member. The idea is that each unique file created will have, among other things, an accompanying hash value attached to it. If the current user has access restrictions in place for a target file, this info will be stored in the accessList variable: the key is the file’s unique hash (an unsigned 64-bit number) and the value is the permissions the user has on the file that hash belongs to.

That’s the eventual idea for handling permissions. For now, everyone will have access to everything, no matter who it is, so we’ll park it there.
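As a rough illustration of where this could go, the sketch below keys the access list by a 64-bit file hash and stores permissions as bit flags. Everything specific here is an assumption on my part: the permRead and permWrite constants, the can method, and the use of FNV-1a over the file path to produce the hash (picked only because it ships with Go’s standard library).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Hypothetical permission bits; the article does not define these.
const (
	permRead  = 1 << iota // 1
	permWrite             // 2
)

type user struct {
	username   string
	accessList map[uint64]int // File hash -> permission bits.
}

// hashPath derives a 64-bit hash from a file's path using FNV-1a.
// This hashing scheme is an assumption, not the repo's method.
func hashPath(path string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(path)) // Write on a hash.Hash never returns an error.
	return h.Sum64()
}

// can reports whether the user holds every bit in want for the file.
// A file with no entry is unrestricted, matching the "everyone has
// access to everything" behaviour of the first version.
func (u *user) can(fileHash uint64, want int) bool {
	perms, restricted := u.accessList[fileHash]
	if !restricted {
		return true
	}
	return perms&want == want
}

func main() {
	notes := hashPath("/docs/notes.txt")
	u := &user{username: "AlysonV", accessList: map[uint64]int{notes: permRead}}

	fmt.Println(u.can(notes, permRead))                        // true
	fmt.Println(u.can(notes, permWrite))                       // false
	fmt.Println(u.can(hashPath("/docs/other.txt"), permWrite)) // true: no entry
}
```

Treating a missing entry as "unrestricted" is just one possible default; a stricter design would deny access unless an entry grants it.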

The Interactive Shell

The shell is a structure that will house all the functions the user will use to interact with their files. It’s divided into two parts: the library functions needed to interact with the system, and the shell loop where the user will input their commands.

The library (lib.go):

package main

import (
	"fmt"
)

// The base library object.
type library struct {
}

// initLibrary initializes the library functions.
func initLibrary() *library {
	fmt.Println("Importing library.")
	return &library{}
}

// open will allow for opening files in virtual space.
func (session *library) open() error {
	fmt.Println("open() called")
	return nil
}

// close closes open virtual files.
func (session *library) close() error {
	fmt.Println("close() called")
	return nil
}

// mkDir makes a virtual directory.
func (session *library) mkDir() error {
	fmt.Println("mkDir() called")
	return nil
}

// removeFile removes a file from the virtual filesystem.
func (session *library) removeFile() error {
	fmt.Println("removeFile() called")
	return nil
}

// removeDir removes a directory from the virtual filesystem.
func (session *library) removeDir() error {
	fmt.Println("removeDir() called")
	return nil
}

// listDir lists a directory's contents.
func (session *library) listDir() error {
	fmt.Println("listDir() called")
	return nil
}

Above is the basic skeleton. Functions can be added and removed with ease without breaking everything else. Below is how the functions work within the shell. I’m hoping there’s a way to make calling functions easier than one long switch statement with a case per command, but for now this will work.

The shell loop code (shell.go):

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// shellLoop runs the main shell loop for the filesystem.
func shellLoop() {
	lib := initLibrary()
	reader := bufio.NewReader(os.Stdin)
	for {
		fmt.Printf("$>")
		input, err := reader.ReadString('\n')
		if err != nil {
			return // Stop on EOF or a read error instead of looping forever.
		}

		// TrimSpace strips "\r\n" on Windows and "\n" on Linux alike.
		input = strings.TrimSpace(input)
		if input == "" {
			continue
		}

		switch input {
		case "open":
			lib.open()
		case "close":
			lib.close()
		case "remove":
			lib.removeDir()
		case "ls":
			lib.listDir()
		default:
			fmt.Println(input, ": Command not found")
		}
	}
}

This loop will be subject to possible changes solely because of all the work it’ll be expected to do. This version is just to give an overview of the basic control flow it will have; it will read input from the user and based on the command passed in will either run one of the library functions, fail or, in the instance of no command passed in, will simply continue. Details like command history and autocomplete will be added in to make it more Unix-like.
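One possible answer to the earlier wish for something tidier than a long switch is a lookup table from command name to handler. The sketch below is only that, a sketch: commandTable and dispatch are names I made up, and the library type is stubbed down to two methods so the block runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// library stands in for the type from lib.go; only two stubs are shown.
type library struct{}

func (session *library) open() error    { fmt.Println("open() called"); return nil }
func (session *library) listDir() error { fmt.Println("listDir() called"); return nil }

// commandTable maps what the user types to the method that handles it.
// Both this function and dispatch are hypothetical helpers.
func commandTable(lib *library) map[string]func() error {
	return map[string]func() error{
		"open": lib.open,
		"ls":   lib.listDir,
	}
}

// dispatch trims the raw input line, looks the command up and runs it.
// The first return value is false when the command is unknown.
func dispatch(commands map[string]func() error, line string) (bool, error) {
	name := strings.TrimSpace(line)
	handler, ok := commands[name]
	if !ok {
		return false, nil
	}
	return true, handler()
}

func main() {
	commands := commandTable(&library{})
	for _, line := range []string{"ls\r\n", "open\n", "format\n"} {
		if found, _ := dispatch(commands, line); !found {
			fmt.Println(strings.TrimSpace(line), ": Command not found")
		}
	}
}
```

With this shape, adding a command means adding one map entry rather than another case, though a switch may still read better while the command list is this short.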

The Filesystem

The filesystem only really has two main features that could be implemented in a number of ways: its initialization and its teardown. I suspect I’ll have the most trouble making these fast enough not to stall the system on startup and exit. The initialization will involve recursively traversing the directory the app is started in and all its subdirectories, cloning every file it comes across for access in virtual space. This step will need some robustness around detecting things like failed copies, out-of-memory violations and max open file descriptor violations (if you don’t know what this stuff means, that’s okay for now; I’ll explain when they become necessary to directly worry about). So many violations, so little time.

There’s a lot of creativity that could go into this step in terms of optimization. This will be covered in more detail as we begin narrowing our focus to one component at a time.

The code (filesystem.go):

package main

import (
	"fmt"
)

// A global list of all files created and their respective names for
// ease of lookup.
var globalFileTable map[uint64]string

// The data structure for each file.
type file struct {
	name         string // The name of the file.
	pathFromRoot string // The absolute path of the file.
	fileHash     uint64 // The unique file hash assigned to this file on creation.
	fileType     string // The type of the file.
	content      []byte // The file's content in bytes.
	size         uint64 // The size in bytes of the file.
}

// The core struct that represents one directory in the filesystem.
type fileSystem struct {
	directory   string       // The name of the current directory we're in.
	files       []file       // The list of files in this directory.
	directories []fileSystem // The list of directories in this directory.
	prev        *fileSystem  // A reference pointer to this directory's parent directory.
}

// Root node.
var root *fileSystem

// initFilesystem scans the current directory and builds the VFS from it.
func initFilesystem() *fileSystem {
	// Recursively grab all files and directories from this level downwards.
	fmt.Println("Welcome to the tiny virtual filesystem.")
	return root
}

// reloadFilesys resets the VFS and scraps all changes made up to this point
// (basically like a rerun of initFilesystem()).
func (root *fileSystem) reloadFilesys() {
	fmt.Println("Refreshing...")
}

// tearDown gracefully ends the current session.
func (root *fileSystem) tearDown() {
	fmt.Println("Teardown")
}

// saveState saves the state of the VFS at this time.
func (root *fileSystem) saveState() {
	fmt.Println("Save the current state of the VFS")
}

We have placeholders for where the initialization and teardown code will go, as well as the save and refresh code. A system for saving and loading hasn’t been designed at the time of writing, but since its main impact should be on disk space, it might not be that complicated.

The Native Directory (Honorable Mention)

So this doesn’t have its own code, but it’s worth thinking about in relation to the virtual filesystem itself. Because our application by nature interacts with files, it’s important that it never does anything potentially destructive to the original files it works with, so a lot of care has to go into the filesystem portion. Take note that the filesystem code is likely to take the most time to complete and might need the most revisions.

Conclusion — onwards to part 2

This will be the overall layout of our filesystem. Your directory structure doesn’t have to be anything crazy; a simple structure that looks like this will suffice.

./src/
    filesystem.go
    user.go
    lib.go
    shell.go

If you’re interested in seeing how each of these individual structures would run at this point, I’ll be writing unit tests for my code on GitHub and will add instructions for testing each file and its functions. These will give you a better understanding of what execution will look like.

And that’s it for our introductory groundwork. Next up, we’ll start with working on the user object and how it interacts with the shell. I’ll be spending downtime working on implementation details of the filesystem portion before publishing further on it. I imagine it’ll take up most of our time on these.

Edit: also, I’ll be sticking to just abbreviating “virtual filesystem” to VFS in future; I’ve typed that phrase so many times, it now sounds weird saying it out loud.

Till next time.

The project repo: https://github.com/AlysonBee/GoVirtualFilesystem

I’m a software developer by day and tinkerer by night. Working on getting into opensource stuff with a focus on C and Python. I’m also a Ratchet and Clank fan.