Monday, March 10, 2014

The Location Graph

At the macro level, the filesystem abstraction (VFS) is inflexible and not user-friendly. Unless you fall back on symlink or mount tricks, a path ties content to a physical location, because the same path is also used to address that content in scripts and bookmarks.

Directory names are also used to categorize content at the macro level. Unlike tags and sets, a directory hierarchy lets you categorize along only one axis: every item has exactly one parent.

To separate those concerns (addressing, physical placement and categorization), I suggest working in terms of a Location Graph that runs from the Workshop (the root of your digital world) down to individual modules (known locations; repositories).

Graph Utilities

I will use the term Module here for leaves in the location graph. They hold data content and are the units used to configure data management (VCS, replication). Other nodes serve to give structure to the graph and can be equipped with further meaning by adding tags.

All nodes in the graph are addressable by IDs that are always unique within a particular Workshop. Project and module names can be made universally unique by adding qualification where necessary (for example, to identify forks). Because addressing does not depend on a node's position in the hierarchy, you can restructure the graph without invalidating bookmarks or scripts.
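The following is a minimal sketch of that idea, not an existing tool. The names `Node`, `LocationGraph`, `resolve`, `path_of` and `move` are hypothetical and chosen only for illustration. It shows that a bookmark holding a node ID still resolves after the node has been moved, while the path derived from the hierarchy changes.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    node_id: str  # stable ID, assumed unique within one Workshop
    name: str     # display name, used only to derive a path
    children: list["Node"] = field(default_factory=list)


class LocationGraph:
    """Hypothetical in-memory location graph, addressed by node ID."""

    def __init__(self, root: Node) -> None:
        self._index: dict[str, Node] = {root.node_id: root}
        self._parent: dict[str, Node] = {}

    def add(self, parent_id: str, node: Node) -> None:
        if node.node_id in self._index:
            raise ValueError(f"duplicate node ID: {node.node_id}")
        parent = self._index[parent_id]
        parent.children.append(node)
        self._index[node.node_id] = node
        self._parent[node.node_id] = parent

    def resolve(self, node_id: str) -> Node:
        # Lookup never consults the hierarchy, so it survives restructuring.
        return self._index[node_id]

    def path_of(self, node_id: str) -> str:
        # The path is derived from the current structure on demand.
        parts = []
        node = self._index[node_id]
        while node is not None:
            parts.append(node.name)
            node = self._parent.get(node.node_id)
        return "/".join(reversed(parts))

    def move(self, node_id: str, new_parent_id: str) -> None:
        node = self._index[node_id]
        new_parent = self._index[new_parent_id]
        # Refuse to move a node below itself, which would create a cycle.
        probe = new_parent
        while probe is not None:
            if probe is node:
                raise ValueError("cannot move a node into its own subtree")
            probe = self._parent.get(probe.node_id)
        self._parent[node_id].children.remove(node)
        new_parent.children.append(node)
        self._parent[node_id] = new_parent


graph = LocationGraph(Node("ws", "workshop"))
graph.add("ws", Node("wb1", "bench"))
graph.add("ws", Node("p1", "project"))
graph.add("wb1", Node("m1", "notes"))

bookmark = "m1"              # a script or bookmark stores only the ID
graph.move("m1", "p1")       # restructure the graph
print(graph.path_of(bookmark))  # the derived path changed, the ID did not
```

A real implementation would also have to persist the index and map module IDs to physical storage locations; both are left out here.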

To be able to put individual resources under a different data-management scheme without breaking applications, there are Data-Spaces and Content-Units. A Data-Space is a virtual parent node (or a tag) used to find all the modules that contribute to a particular resource group. Inside a module, where regular filesystem semantics are used, Content-Units describe the resources contained in a directory. This works similarly to, for example, the well-known .directory files or comparable special files.
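As a rough sketch of how these two concepts could be modelled, assume that a Data-Space is just a tag on a module and that a Content-Unit descriptor is a small JSON document mapping a unit name to the resources it covers. The JSON layout and the names `Module`, `ContentUnit`, `modules_in` and `parse_content_units` are assumptions made for this example, not a defined format.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContentUnit:
    name: str
    paths: tuple[str, ...]  # resources, relative to the describing directory


@dataclass
class Module:
    module_id: str
    data_spaces: set[str] = field(default_factory=set)  # Data-Space tags


def modules_in(space: str, modules: list[Module]) -> list[Module]:
    """Find every module that contributes to the given Data-Space."""
    return [m for m in modules if space in m.data_spaces]


def parse_content_units(text: str) -> list[ContentUnit]:
    """Read a hypothetical descriptor: {"unit name": ["path", ...], ...}."""
    raw = json.loads(text)
    return [ContentUnit(name=name, paths=tuple(paths))
            for name, paths in raw.items()]


modules = [
    Module("mod-config", {"workshop-data"}),
    Module("mod-media", {"workshop-data", "archive"}),
    Module("mod-src", {"project-data"}),
]
print([m.module_id for m in modules_in("workshop-data", modules)])

descriptor = '{"settings": ["config.ini"], "cache": ["tmp/a.bin", "tmp/b.bin"]}'
for unit in parse_content_units(descriptor):
    print(unit.name, unit.paths)
```

Since an application only asks for a Data-Space or a Content-Unit by name, the modules behind them can be managed differently (versioned, replicated, or neither) without the application noticing.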

Note that, in what follows, the project hierarchy is relatively stable, while most other aspects can be grouped in Views that each serve a specific purpose. You can, for example, view the contributed nodes of an extension-project separately, or merged with the project it extends.

An Example Graph


This picture schematically shows an example of the location graph reduced to the essential parts. At the top, you can see the Workshop node with the global Workshop Data-Space (the yellow oval) arranged in three modules (the blue rectangles).

Below that is one of possibly many Workbench nodes with its accompanying Data-Space.

The workbench is equipped with one project that has been augmented (by the user) with one extension-project (the smaller dark-green oval). The extension-project adds one data-module and one project-repository (both in dark blue), displayed here inline with the main project. The additions of a simple extension-project are just inserted into the main project hierarchy on a node-by-node basis. An extension-project can also be fully specified and/or override the hierarchy of the main project.
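The node-by-node insertion can be sketched as a recursive merge of two trees. In this illustration, inner nodes are plain dicts and modules are strings holding a module ID; both the representation and the function name `merge_view` are assumptions. Where the extension names a node that already exists as a module, the extension's entry wins, which is one possible reading of "override".

```python
def merge_view(main: dict, extension: dict) -> dict:
    """Return a merged view; neither input tree is modified."""
    merged = dict(main)  # shallow copy of this level only
    for name, ext_node in extension.items():
        main_node = merged.get(name)
        if isinstance(ext_node, dict) and isinstance(main_node, dict):
            # Both sides have an inner node here: descend and merge.
            merged[name] = merge_view(main_node, ext_node)
        else:
            # A new node, or an override of an existing one.
            merged[name] = ext_node
    return merged


main_project = {
    "project": {
        "src": "mod:src",
        "docs": "mod:docs",
    }
}
extension_project = {
    "project": {
        "data": "mod:ext-data",
        "ext-repo": "mod:ext-repo",
    }
}

merged = merge_view(main_project, extension_project)
print(sorted(merged["project"]))
print(sorted(main_project["project"]))  # the main project is unchanged
```

Because the merge produces a new tree, the separate view and the merged view of the same extension-project can exist side by side.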

You can see that the main project is actually a meta-project that references another project. The sub-project has its own area of responsibility (separated by a dashed line). Meta-projects resolve configuration conflicts between sub-projects and add resources to combine all parts into a whole.

The Workbench Meta-Project

The Workbench is also a place for local experimentation and in this regard acts like a meta-project that is not published. You can assemble multiple projects and modules in a workbench, and also add a project-repository to contribute configuration or process resources.

If you think your setup is worth publishing, you can wrap selected top-level nodes and the project-repository into a new meta-project. You can see an example transition in the picture below. In the initial state, the project-repository is shown as a child of the Workbench Data-Space. This is again just a matter of the grouping done in a View, which can be made more concise when the goal is to create a new meta-project.

In a similar way, you can move top-level modules from your workbench to a [meta-]project or draft comprehensive changes to a project in an extension-project and ask the project to merge the changes.

You can see that hierarchy-independent addressing of nodes again plays a valuable role.

Miscellaneous Points

An interesting advantage of the physical decoupling of locations is that you can move the module you are currently working with the most onto the fastest storage device without disrupting anything.

By adding tags to Workbenches it is possible to group Workbenches according to varying and overlapping aspects.

There can also be Module-Groups to have an additional option for structuring the graph. A refinement could be Fork-Groups that enforce the necessary requirements for selecting one of many clones.

Content-Units can be saved/restored in/from an archive. That would make it possible to get the configuration of uninstalled applications out of the way, but still be able to restore it on demand.

For complex setups (e.g., a meta-project that combines multiple workplaces, such as develop and design, of one large project), it might be necessary to recognize when the same module or project is used by more than one workplace. References could accomplish that. On the other hand, it might be necessary to instantiate the same module separately with different configurations. That should not be a problem.

Modules can be unmanaged known locations, and projects can acquire further locations (e.g., for build or install tasks). I wanted to emphasize the role of known locations as data management units here.

Occasionally, it might be necessary to prioritize the modules of a Data-Space against each other, or to activate or deactivate selected modules.
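One way this could work is sketched below, assuming each module in a Data-Space carries a numeric priority and an active flag, and that the highest-priority active module wins when several provide the same resource. The names `SpaceModule` and `lookup` and the "higher number wins" rule are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpaceModule:
    module_id: str
    resources: dict[str, str]  # resource name -> location inside the module
    priority: int = 0          # assumption: a higher value takes precedence
    active: bool = True


def lookup(resource: str,
           modules: list[SpaceModule]) -> Optional[tuple[str, str]]:
    """Return (module_id, location) from the best active provider, if any."""
    candidates = [m for m in modules
                  if m.active and resource in m.resources]
    if not candidates:
        return None
    best = max(candidates, key=lambda m: m.priority)
    return best.module_id, best.resources[resource]


space = [
    SpaceModule("defaults", {"theme": "themes/plain"}, priority=0),
    SpaceModule("personal", {"theme": "themes/dark"}, priority=10),
]

print(lookup("theme", space))   # the higher-priority module provides it
space[1].active = False         # deactivate the personal module
print(lookup("theme", space))   # the lookup falls back to the defaults
```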

Finally

As always: crush it, chew it, see how it tastes (?!) and what can be improved.