Monday, March 17, 2014

The Workbench Network

(This is a marked-up, but otherwise exact, copy of the feature-tracker issue that you might know from sf.net.)

It would open a lot of opportunities to be able to easily instantiate workbenches in a VM or in the cloud. The benefits range from having a clearly visible and understandable security barrier to having a fully laid-out cloud workplace for newcomers to explore. A cloud workspace would also be useful in an educational setting, or for collaboration in areas other than software development. Seeing the global resources as nodes that can be synced, or replicated read-only, to other systems supports this interesting perspective.

I'd like to paint a coarse picture of what a workbench network could look like. The tool possibilities this enables are vast, and I'll only scratch the surface of that topic.

But first, a name change: I will refer to what was previously known as Workbench Slaves as Workbench Bots. I've created a wiki page (Glossary) for naming ideas and discussions, to keep that out of the drafting process.

The Workshop Node

The global resources belong to a Workshop Node (WSN). This is your personal work hub, the nerve center for all your workbenches. The resources managed by the WSN can be roughly separated into three categories: Configuration & Data, Tools, and Active Parts.

Configuration & Data

This is subject to syncing and/or replication. Configuration is a core responsibility of the workbench application, and the facilities that will be provided for it could certainly fill a separate topic.

Tools

Global tool usage could simply be recorded and synced with the other data. Instead of syncing the tool fleet to every node, it would be possible to browse the tools used on other nodes in a general tool-installation dialog.

Active Parts

Although the whole WSN would reside in a single location, and it would therefore be possible to switch between several of them, you will certainly want to use just one (no configuration duplication). The configuration for active parts (starting/stopping workbenches, notification handling, cron jobs, etc.) could be arranged in profiles that can themselves be started and stopped. This would also allow you to switch, for example, between private and work profiles.
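
A minimal sketch of what such startable/stoppable profiles could look like. All class and profile names here are hypothetical illustrations, not part of any existing implementation:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: an active part is anything the WSN runs on your
# behalf (workbench autostart, notification handling, cron jobs, ...).
@dataclass
class ActivePart:
    name: str
    running: bool = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

# A profile bundles active parts so the whole set can be started and
# stopped as a unit, e.g. to switch between work and private contexts.
@dataclass
class Profile:
    name: str
    parts: list = field(default_factory=list)

    def start(self):
        for part in self.parts:
            part.start()

    def stop(self):
        for part in self.parts:
            part.stop()

work = Profile("work", [ActivePart("start-workbenches"), ActivePart("cron-jobs")])
private = Profile("private", [ActivePart("notification-handling")])

work.start()     # activate the work profile
work.stop()      # ...then switch
private.start()  # to the private profile
```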

Drones

The master record of your WSN could live somewhere online, which would allow you to fully sync the WSN to multiple devices without much hassle. In addition to the WSN, it would be possible to instantiate parts of the WSN read-only on Drone nodes. These nodes can run in VMs or in the cloud and would offer the aforementioned easy-to-understand security container, where only notifications are routed back to your WSN; no automatic syncing of other resources would take place. Note that these nodes are usually accessible from a device where you have a fully capable WSN running -- in another browser window, or on the host of the VM. Drones are container nodes where you want to do actual work, but where the global resources are injected from your full WSN.

Drones could be managed with profiles that describe which tools to install initially, which parts of the configuration to push, and which notifications to forward to the WSN. A special VM image could be kept up to date; when you want to explore a project that you cannot fully trust, the image could be cloned for that purpose.
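
Such a drone profile could be as simple as a small declarative record. The field names and values below are purely illustrative assumptions about what it might contain:

```python
# Hypothetical drone profile: a declarative description of what a
# drone gets (tools, pushed config) and what flows back (notifications).
drone_profile = {
    "name": "untrusted-project-explorer",
    "tools": ["git", "vim"],                  # installed on first boot
    "push_config": ["editor", "shell"],       # read-only config pushed from the WSN
    "forward_notifications": ["build-status", "review-status"],
    "base_image": "drone-base",               # kept up to date, cloned per project
}

def clone_for_project(profile, project):
    """Derive a per-project drone instance from the base image."""
    return {**profile, "name": f"{profile['name']}-{project}"}

drone = clone_for_project(drone_profile, "some-untrusted-project")
```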

The WSN would know what drones and workbenches are available and could display status info about them.

Your tool fleet might include patched tools, or extensions for tools, managed with a workbench (e.g. a patched Vim, or a workbench of Vim-Vundles (Vim extensions distributed through git)). To make those tools available on a Drone, it would be possible to push Workbench Bots there (remember: lightweight, raw workbenches that are completely independent and are also used for automated tasks and deployment pipelines).

Cloud Workplaces

Projects could prepare a cloud workplace with web-based tools (or configuration for tool categories with a default tool selection). Users could explore the project without even having the workbench application installed locally. Workbench application users, on the other hand, could use the WSN to push their tool configuration to that cloud workplace, manage credentials, and receive notifications (e.g. a review status change could be routed through the cloud workbench back to your WSN).

The WSN is not a workbench

It has different, and far fewer, responsibilities. However, it would certainly be advantageous to have some facilities of the workbench there too, like the navigation parts and the k-loc managers (VCS/sync control of known locations). So it probably is a special, stripped-down workbench. The default would be to not version-control any parts of the WSN; the online storage used to sync the WSN could do incremental backups, for example. Advanced users could gradually put k-locs under version control.

More on security

The WSN will probably manage credentials too. After all, you want to be able to instantiate multiple workbenches of the same project. You would not want your whole workbench life to fall into the wrong hands, so maybe further means of compartmentalization can be incorporated -- for example, a credentials archive for projects the user is not currently actively involved in.

The devices the WSNs are on can provide additional means of security to shield against unauthorized access (locking, encryption). In addition, a mobile device could act as a notification receiver without accessing the WSN at all.

How fine-grained the permissions/capabilities of Drones have to be remains to be seen. Viewing drones as accessible from a device with a full WSN, and having notification receivers separate from that, will cover a lot. The possibility of fine-grained permissions can be another guiding principle for structuring the layout of the WSN.

Conclusion

So, what I'm saying is that the global node can be organized as well. That again offers a lot of opportunities in the tool space.

Monday, March 10, 2014

The Location Graph

At the macro level, the filesystem abstraction (VFS) is very inflexible and not user-friendly at all. Without falling back to symlink or mount tricks, paths tie content to a physical location, because they are also used to address that content in scripts or with bookmarks.

Directory names are used to categorize content at the macro level, and, unlike with tags and sets, you can only categorize one way.

I suggest working in terms of a Location Graph, from the Workshop (the root of your digital world) down to individual modules (known locations; repositories), to separate those concerns.

Graph Utilities

I will use the term Module here for the leaves of the location graph. They hold the data content and are the units used to configure data management (VCS, replication). The other nodes serve to give structure to the graph and can be given further meaning by adding tags.

All nodes in the graph are addressable by IDs that are always unique within a particular Workshop. Project and module names can be made universally unique by adding qualification where necessary (for example, to identify forks). This means that you can restructure the graph without invalidating bookmarks or scripts.
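
A minimal sketch of this hierarchy-independent addressing, with all names purely illustrative: nodes are looked up by a stable ID, so restructuring the graph does not invalidate anything that references a node by that ID:

```python
# Sketch: bookmarks and scripts address nodes by stable ID, not by path,
# so moving a node within the graph never breaks them.
class LocationGraph:
    def __init__(self):
        self.nodes = {}                        # id -> node record

    def add(self, node_id, parent=None):
        self.nodes[node_id] = {"id": node_id, "parent": parent}

    def move(self, node_id, new_parent):
        # Restructuring only changes the parent link, never the ID.
        self.nodes[node_id]["parent"] = new_parent

    def resolve(self, node_id):
        # Lookup by ID is independent of the node's place in the hierarchy.
        return self.nodes[node_id]

g = LocationGraph()
g.add("workshop")
g.add("project-a", parent="workshop")
g.add("module-1", parent="project-a")

bookmark = "module-1"                          # a saved bookmark
g.move("module-1", "workshop")                 # restructure the graph
assert g.resolve(bookmark)["id"] == "module-1" # bookmark still valid
```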

To be able to put individual resources under a different DM scheme without breaking applications, there are Data-Spaces and Content-Units. A Data-Space is a virtual parent node (or a tag) used to find all the modules that contribute to a particular resource group. Inside a module, where regular filesystem semantics are used, Content-Units describe the resources contained in a directory. This works similarly to, for example, the well-known .directory files or comparable special files.
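
A Content-Unit descriptor could then be a small per-directory record, in the spirit of a .directory file. The fields and values here are hypothetical:

```python
# Hypothetical Content-Unit descriptor: a per-directory record that names
# the resources it contains, the Data-Space they belong to, and the
# data-management scheme applied to them.
content_unit = {
    "unit": "vim-config",
    "data_space": "editor-settings",   # virtual parent/tag grouping modules
    "dm_scheme": "vcs",                # e.g. "vcs", "replication", "archive"
    "resources": [".vimrc", ".vim/"],
}
```

Because the DM scheme is recorded per unit, changing how a resource group is managed does not require moving the files themselves.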

Note in the following that, while the project hierarchy is relatively stable, most other aspects can be grouped in Views that serve a specific purpose. You can, for example, view the contributed nodes of an extension-project separately, or merged with the project it extends.

An Example Graph


This picture schematically shows an example of the location graph reduced to the essential parts. At the top, you can see the Workshop node with the global Workshop Data-Space (the yellow oval) arranged in three modules (the blue rectangles).

Below that is one of possibly many Workbench nodes with its accompanying Data-Space.

The workbench is equipped with one project, which has been augmented (by the user) with one extension-project (the smaller dark-green oval). The extension-project adds one data-module and one project-repository (both in dark blue), displayed here inline with the main project. The additions of a simple extension-project are just inserted into the main project hierarchy on a node-by-node basis. An extension-project can also be fully specified and/or override the hierarchy of the main project.

You can see that the main project is actually a meta-project that references another project. The sub-project has its own area of responsibility (separated by a dashed line). Meta-projects resolve configuration conflicts between sub-projects and add resources to combine all parts into a whole.

The Workbench Meta-Project

The Workbench is also a place for local experimentation and in this regard acts like a meta-project that is not published. You can assemble multiple projects and modules in a workbench, and also add a project-repository to contribute configuration or process resources.

If you think your setup is worth publishing, you can wrap selected top-level nodes and the project-repository into a new meta-project. You can see an example transition in the picture below. In the initial state, the project-repository is shown as a child of the WB Data-Space. This is, again, just a matter of the grouping done in a View, which can be more concise in the context of creating a new meta-project.

In a similar way, you can move top-level modules from your workbench to a [meta-]project or draft comprehensive changes to a project in an extension-project and ask the project to merge the changes.

You can see that the hierarchy-independent addressing of nodes again plays a very valuable role.

Diverse Points

An interesting advantage of the physical decoupling of locations is that you can move the module you are currently working with the most onto the fastest storage device without disrupting anything.

By adding tags to Workbenches, it is possible to group them according to varying and overlapping aspects.

There can also be Module-Groups as an additional option for structuring the graph. A refinement could be Fork-Groups, which enforce the requirements necessary for selecting one of many clones.

Content-Units can be saved to and restored from an archive. That would make it possible to get the configuration of uninstalled applications out of the way, but still be able to restore it on demand.
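
Sketched in code, parking and restoring a Content-Unit could look like this (all names are illustrative):

```python
# Sketch: move a Content-Unit's configuration into an archive when the
# application is uninstalled, and restore it on demand later.
archive = {}

def archive_unit(units, name):
    archive[name] = units.pop(name)   # configuration out of the way...

def restore_unit(units, name):
    units[name] = archive.pop(name)   # ...but restorable on demand

units = {"old-app-config": {"resources": ["~/.oldapprc"]}}
archive_unit(units, "old-app-config")
assert "old-app-config" not in units      # gone from the active graph
restore_unit(units, "old-app-config")
assert "old-app-config" in units          # back, unchanged
```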

For complex setups (e.g. a meta-project of multiple workplaces (develop, design, etc.) of one large project), it might be necessary to recognize when the same module or project is used by more than one workplace. References could accomplish that. On the other hand, it might be necessary to instantiate the same module separately with different configurations. That should not be a problem.

Modules can be unmanaged known locations, and projects can acquire further locations (e.g. for build or install tasks). I wanted to emphasize here the role of known locations as data-management units.

Occasionally, it might be necessary to prioritize the modules of a Data-Space against each other, or to de/activate selected modules.

Finally

As always: crush it, chew it, see how it tastes (?!) and what can be improved.