Mirror of https://github.com/microsoft/vcpkg (synced 2024-11-21 16:09:03 -07:00)
[vcpkg RFC] initial registries RFC (#12881)
parent 9740611cab
commit d73ead568b
1 changed files with 285 additions and 0 deletions
docs/specifications/registries.md
@@ -0,0 +1,285 @@
# Package Federation: Custom Registries

As it is now, vcpkg has over 1400 ports in the default registry (the `/ports` directory).
For the majority of users, this repository of packages is enough. However, many enterprises
need to more closely control their dependencies for one reason or another, and this document
lays out a method which we will build into vcpkg for exactly that reason.

## Background

A registry is simply a set of packages. In fact, there is already a registry in vcpkg: the default one.
Package federation, implemented via custom registries, allows one to add new packages,
edit existing packages, and have as much or as little control as one likes over the dependencies that one uses.
It gives the control over dependencies that an enterprise requires.

### How Does the Current Default Registry Work?

Of course, the existing vcpkg tool does have packages in the official,
default registry. The way we describe these packages is in the ports tree –
at the base of the vcpkg install directory, there is a directory named `ports`,
which contains on the order of 1300 directories, one for each package. Then,
in each package directory, there are at least two files: a `CONTROL` or
`vcpkg.json` file, which contains the name, version, description, and features
of the package; and a `portfile.cmake` file, which contains the information on
how to download and build the package. There may be other files in a
package directory, like patches or usage instructions, but only those two files are
needed.

### Existing vcpkg Registry-like Features

There are some existing features in vcpkg that act somewhat like a custom
registry. The most obvious feature that we have is overlay ports – this
feature allows you to specify any number of directories as "overlays", which
either contain a package definition directly, or which contain some number of
package directories; these overlays will be used instead of the ports tree
for packages that exist in both places, and are specified exclusively on the
command line. Unfortunately, if one installs a package from
overlay ports that does not exist in the ports tree, one must pass these
overlays to every vcpkg installation command.

There is also the less obvious "feature" which works by virtue of the ports
tree being user-editable: one can always edit the ports tree on their own
machine, and can even fork vcpkg and publish their own ports tree.
Unfortunately, this then means that any updates to the source tree require
merges, as opposed to being able to fast-forward to the newest sources.

### Why Registries?

There are many reasons to want custom registries; however, the most important reasons are:

* Legal requirements – a company like Microsoft or Google
  needs the ability to strictly control the code that goes into their products,
  making certain that they are following the licenses strictly.
  * There have been examples in the past where a library which is licensed under certain terms contains code
    which is not legally allowed to be licensed under those terms (see [this example][legal-example],
    where a person tried to merge Microsoft-owned, Apache-licensed code into the GPL-licensed libstdc++).
* Technical requirements – a company may wish to run their own tests on the packages they ship,
  such as [fuzzing].
* Other requirements – an organization may wish to strictly control its dependencies for a myriad of other reasons.
  * Newer versions – vcpkg may not necessarily always be up to date for all libraries in our registry,
    and an organization may require a newer version than we ship;
    they can very easily update this package and have the version that they want.
  * Port modifications – vcpkg has somewhat strict policies on port modifications,
    and an organization may wish to make different modifications than we do.
    It may allow that organization to make certain that the package works on triplets
    that our team does not test as extensively.
  * Testing – just like port modifications, if a team wants to do specific testing on triplets they care about,
    they can do so via their custom registry.

Then, there is the question of why vcpkg needs a new solution for custom registries,
beyond the existing overlay ports feature. There are two big reasons –
the first is to allow a project to define the registries that they use for their dependencies,
and the second is the clear advantage in the user experience of the vcpkg tool.
If a project requires specific packages to come from specific registries,
they can do so without worrying that a user accidentally misses the overlay ports part of a command.
Additionally, beyond a feature which makes overlay ports easier to use,
custom registries allow for more complex and useful infrastructure around registries.
In the initial custom registry implementation, we will allow overlay ports style paths,
as well as git repositories, which means that people can run and use custom registries
without writing their own infrastructure around getting people that registry.

It is the intention of vcpkg to be the most user-friendly package manager for C++,
and this allows us to fulfill that intention even further.
As opposed to having to write `--overlay-ports=path/to/overlay` for every command one runs,
or adding an environment variable `VCPKG_OVERLAY_PORTS`,
one can simply write `vcpkg install` and the registries will be taken care of automatically.
As opposed to having to use git submodules, or custom registry code for every project,
one can write and run the infrastructure in one place,
and every project that uses that registry requires only a few lines of JSON.

[legal-example]: https://gcc.gnu.org/legacy-ml/libstdc++/2019-09/msg00054.html
[fuzzing]: https://en.wikipedia.org/wiki/Fuzzing

## Specification

We will be adding a new file that vcpkg understands: `vcpkg-configuration.json`.
The way that vcpkg will find this file is different depending on what mode vcpkg is in.
In classic mode, vcpkg finds this file alongside the vcpkg binary, in the root directory.
In manifest mode, vcpkg finds this file alongside the manifest. For the initial implementation,
this is all vcpkg will look for; however, in the future, vcpkg will walk the tree and include
configuration all along the way: this allows for overriding defaults.
The specific algorithm for applying this is not yet defined, since currently only one
`vcpkg-configuration.json` is allowed.

The only thing allowed in a `vcpkg-configuration.json` is a `<configuration>` object.

A `<configuration>` is an object:
* Optionally, `"default-registry"`: A `<registry-implementation>` or `null`
* Optionally, `"registries"`: An array of `<registry>`s

Since this is the first RFC that adds anything to this file,
these are, as of now, the only properties that can live in that object.

A `<registry-implementation>` is an object matching one of the following:
* `<registry-implementation.builtin>`:
  * `"kind"`: The string `"builtin"`
* `<registry-implementation.directory>`:
  * `"kind"`: The string `"directory"`
  * `"path"`: A path
* `<registry-implementation.git>`:
  * `"kind"`: The string `"git"`
  * `"repository"`: A URI
  * Optionally, `"path"`: An absolute path into the git repository
  * Optionally, `"ref"`: A git reference

A `<registry>` is a `<registry-implementation>` object, plus the following properties:
* Optionally, `"scopes"`: An array of `<package-name>`s
* Optionally, `"packages"`: An array of `<package-name>`s

The `"packages"` and `"scopes"` fields of distinct registries must be disjoint,
and each `<registry>` must have at least one of the `"scopes"` and `"packages"` properties,
since otherwise the registry would never be used.

As an example, a package which uses a different default registry, and a different registry for boost,
might look like the following:

```json
{
  "default-registry": {
    "kind": "directory",
    "path": "vcpkg-ports"
  },
  "registries": [
    {
      "kind": "git",
      "repository": "https://github.com/boostorg/vcpkg-ports",
      "ref": "v1.73.0",
      "scopes": [ "boost" ]
    },
    {
      "kind": "builtin",
      "packages": [ "cppitertools" ]
    }
  ]
}
```

With this configuration, a package such as `fmt`, which matches no `"packages"` or `"scopes"` entry,
is installed from the default registry at `<directory-of-vcpkg.json>/vcpkg-ports`;
`cppitertools` comes from the registry that ships with vcpkg,
and any `boost` dependencies come from `https://github.com/boostorg/vcpkg-ports`.
Notably, this does not replace behavior up the tree -- only the `vcpkg-configuration.json`s
for the current invocation do anything.

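
The two constraints on `<registry>` objects (at least one of `"packages"` and `"scopes"`, and no overlap between distinct registries) can be checked mechanically. The following is a minimal Python sketch, not vcpkg's implementation: the function name `check_registries` is hypothetical, and it assumes that "disjoint" means no package name appears in two registries' `"packages"` arrays and no scope appears in two registries' `"scopes"` arrays.

```python
from typing import Dict, List, Tuple


def check_registries(config: dict) -> List[str]:
    """Return a list of problems found in the "registries" array (empty if none).

    Hypothetical helper: it only illustrates the two rules stated in the RFC text.
    """
    problems: List[str] = []
    # Maps ("packages" | "scopes", name) to the index of the registry that claimed it.
    claimed: Dict[Tuple[str, str], int] = {}

    for index, registry in enumerate(config.get("registries", [])):
        # Rule 1: a registry needs at least one of "packages" and "scopes".
        if "packages" not in registry and "scopes" not in registry:
            problems.append(f'registry {index} has neither "packages" nor "scopes"')

        # Rule 2 (as interpreted here): the same name may not be claimed
        # by two distinct registries within the same field.
        for field in ("packages", "scopes"):
            for name in registry.get(field, []):
                owner = claimed.setdefault((field, name), index)
                if owner != index:
                    problems.append(
                        f"{field} entry {name!r} appears in registries {owner} and {index}"
                    )
    return problems
```

Run against the example configuration above, this sketch reports no problems; a configuration listing `fmt` in the `"packages"` array of two registries would produce one.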
### Behavior

When a vcpkg command requires the installation of dependencies,
it will generate the initial list of dependencies from the package,
and then run the following algorithm on each dependency:

1. Figure out which registry the package should come from by doing the following:
    1. If there is a registry in the registry set which contains the dependency name in the `"packages"` array,
       then use that registry.
    2. For every scope, in order from most specific to least,
       if there is a registry in the registry set which contains that scope in the `"scopes"` array,
       then use that registry.
       (For example, for `"cat.meow.cute"`, check first for `"cat.meow.cute"`, then `"cat.meow"`, then `"cat"`).
    3. If the default registry is not `null`, use that registry.
    4. Otherwise, report an error.
2. Then, add that package's dependencies to the list of packages to find, and repeat for the next dependency.

vcpkg will also rerun this algorithm whenever an install is run with a different configuration.

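
The lookup order above can be sketched in Python as follows. This is an illustration under stated assumptions rather than vcpkg's code: `select_registry`, `resolve_all`, and the `dependencies_of` callback are hypothetical names; scopes are taken to be the dot-separated prefixes of a package name, as in the `"cat.meow.cute"` example; and an absent `"default-registry"` is assumed to mean the builtin registry, which the text does not state.

```python
from typing import Callable, Dict, Iterable


def select_registry(package: str, config: dict) -> dict:
    """Pick the registry for one package, following steps 1.1 to 1.4."""
    registries = config.get("registries", [])

    # 1.1: an exact match in some registry's "packages" array wins.
    for registry in registries:
        if package in registry.get("packages", []):
            return registry

    # 1.2: try scopes from most specific to least specific,
    # e.g. "cat.meow.cute", then "cat.meow", then "cat".
    parts = package.split(".")
    for length in range(len(parts), 0, -1):
        scope = ".".join(parts[:length])
        for registry in registries:
            if scope in registry.get("scopes", []):
                return registry

    # 1.3 / 1.4: fall back to the default registry unless it is null.
    # Assumption: a missing "default-registry" key means the builtin registry.
    default = config.get("default-registry", {"kind": "builtin"})
    if default is None:
        raise LookupError(f"no registry provides {package!r}")
    return default


def resolve_all(
    roots: Iterable[str],
    config: dict,
    dependencies_of: Callable[[str], Iterable[str]],
) -> Dict[str, dict]:
    """Step 2: keep adding each package's dependencies until none are left."""
    chosen: Dict[str, dict] = {}
    pending = list(roots)
    while pending:
        package = pending.pop()
        if package in chosen:
            continue  # already resolved; also guards against dependency cycles
        chosen[package] = select_registry(package, config)
        pending.extend(dependencies_of(package))
    return chosen
```

Checking `"packages"` before `"scopes"` means a single package can be pinned to one registry even when a broader scope would otherwise send it elsewhere.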
### How Registries are Laid Out

There are three kinds of registries, but they only differ in how the registry gets onto one's filesystem.
Once the registry is there, package lookup runs the same, with each kind having its own way of defining its
own root.

In order to find a port `meow` in a registry with root `R`, vcpkg first sees if `R/meow` exists;
if it does, then the port root is `R/meow`. Otherwise, see if `R/m-` exists; if it does,
then the port root is `R/m-/meow`. (Note: this algorithm may be extended further in the future.)

For example, given the following registry root:

```
R/
  abseil/...
  b-/
    boost/...
    boost-build/...
    banana/...
  banana/...
```

The port root for `abseil` is `R/abseil`; the port root for `boost` is `R/b-/boost`;
the port root for `banana` is `R/banana` (although this duplication is not recommended).

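
The lookup described above amounts to two directory checks. The sketch below is one way to write it in Python; the function name `find_port_root` is hypothetical, and returning `None` when neither directory exists is an assumption, since the text does not say what happens in that case.

```python
from pathlib import Path
from typing import Optional


def find_port_root(registry_root: Path, port: str) -> Optional[Path]:
    """Locate the directory for `port` under `registry_root`.

    Returns None when neither R/<port> nor R/<first letter>- exists
    (an assumption; the RFC text leaves this case open).
    """
    if not port:
        raise ValueError("port name must not be empty")

    # First choice: the port lives directly under the registry root.
    direct = registry_root / port
    if direct.is_dir():
        return direct

    # Second choice: a bucket directory named after the first letter plus "-".
    # Port names never end in "-", so a bucket cannot collide with a port.
    bucket = registry_root / f"{port[0]}-"
    if bucket.is_dir():
        # As in the text, the port root is R/<letter>-/<port>; the caller
        # still has to check that this directory actually holds a port.
        return bucket / port

    return None
```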
The reason we are making this change to allow more levels in the ports tree is that ~1300
ports are hard to look through in a tree view, and this allows us to see only the ports we're
interested in. Additionally, no port name may end in a `-`, so this means that these port subdirectories
will never intersect with actual ports. Additionally, since we use only ASCII for port names,
we don't have to worry about graphemes vs. code units vs. code points -- in ASCII, they are equivalent.

Let's now look at how different registry kinds work:

#### `<registry.builtin>`

For a `<registry.builtin>`, there is no configuration required.
The registry root is simply `<vcpkg-root>/ports`.

#### `<registry.directory>`

For a `<registry.directory>`, it is again fairly simple.
Given `$path`, the value of the `"path"` property, the registry root is one of the following:

* If `$path` is absolute, the registry root is `$path`.
* If `$path` is drive-relative (only important on Windows), the registry root is
  `(drive of vcpkg.json)/$path`.
* If `$path` is relative, the registry root is `(directory of vcpkg.json)/$path`.

Note that the path to vcpkg.json is _not_ canonicalized; it is used exactly as it is seen by vcpkg.

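
The three cases can be expressed with Python's pure path classes, which never touch the filesystem and so never canonicalize anything. This is a sketch only: `directory_registry_root` is a hypothetical name, and "drive-relative" is interpreted here as a rooted Windows path without a drive letter, such as `\ports`.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath
from typing import Type


def directory_registry_root(
    path: str,
    manifest: str,
    flavour: Type[PurePath] = PurePosixPath,
) -> PurePath:
    """Compute the registry root for a "directory" registry.

    `path` is the "path" property; `manifest` is the path to vcpkg.json
    exactly as vcpkg saw it. Pass PureWindowsPath for Windows semantics.
    """
    registry_path = flavour(path)
    manifest_path = flavour(manifest)

    # Case 1: an absolute path is used as-is.
    if registry_path.is_absolute():
        return registry_path

    # Case 2: rooted but without a drive (possible only on Windows):
    # borrow the drive of vcpkg.json.
    if registry_path.root:
        return flavour(manifest_path.drive) / registry_path

    # Case 3: a plain relative path is resolved against the manifest directory.
    # Pure paths keep ".." components, so nothing is canonicalized.
    return manifest_path.parent / registry_path
```

For instance, under these assumptions `directory_registry_root("\\ports", "C:\\proj\\vcpkg.json", PureWindowsPath)` yields `C:\ports`, while a relative `"vcpkg-ports"` next to `/home/me/proj/vcpkg.json` yields `/home/me/proj/vcpkg-ports`.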
#### `<registry.git>`

This registry is the most complex. We would like to cache existing registries,
but we don't want to ignore new updates to the registry.
It is the opinion of the author that we want to find more updates than not,
so we will update the registry whenever the `vcpkg.json` or `vcpkg-configuration.json`
is modified. We will do so by keeping a SHA-512 hash of the `vcpkg.json` and `vcpkg-configuration.json`
inside the `vcpkg-installed` directory.

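
One way to detect such modifications is to compare a freshly computed hash against a stored one. The Python sketch below illustrates the idea only: the function names and the stamp file location are hypothetical, and hashing both files into a single digest is an assumption, since the text does not say whether one hash or two are kept.

```python
import hashlib
from pathlib import Path

# The two files whose modification should trigger a registry update.
TRACKED_FILES = ("vcpkg.json", "vcpkg-configuration.json")


def configuration_digest(project_dir: Path) -> str:
    """Return one SHA-512 hex digest covering both tracked files."""
    digest = hashlib.sha512()
    for name in TRACKED_FILES:
        file = project_dir / name
        # Mix in the file name so content cannot shift between the two files unnoticed.
        digest.update(name.encode("utf-8") + b"\0")
        # A missing file contributes no content bytes.
        digest.update(file.read_bytes() if file.is_file() else b"")
    return digest.hexdigest()


def registry_needs_update(project_dir: Path, stamp: Path) -> bool:
    """Compare against the stored digest; store the new one if it changed."""
    current = configuration_digest(project_dir)
    previous = stamp.read_text() if stamp.is_file() else None
    if previous == current:
        return False
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(current)
    return True
```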
We will download the specific ref of the repository to a central location (and update as needed),
and the root will be either: `<path to repository>`, if the `"path"` property is not defined,
or else `<path to repository>/<path property>` if it is defined.
The `"path"` property must be absolute, without a drive, and will be treated as relative to
the path to the repository. For example:

```json
{
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "/ports"
}
```

is the correct way to refer to the registry built in to vcpkg, at the latest version.

The following are all incorrect:

```json
{
  "$reason": "path can't be drive-absolute",
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "F:/ports"
}
```

```json
{
  "$reason": "path can't be relative",
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "ports"
}
```

```json
{
  "$reason": "path _really_ can't be relative like that",
  "kind": "git",
  "repository": "https://github.com/microsoft/vcpkg",
  "path": "../../meow/ports"
}
```

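
The rule for the `"path"` property of a git registry can be sketched as a small check. This is an illustration, not vcpkg's implementation: `git_registry_root` and the checkout location used in the usage note are hypothetical, and the check covers only what the text states (the path must start at the repository root and carry no drive).

```python
from pathlib import PurePosixPath
from typing import Optional


def git_registry_root(checkout: str, path: Optional[str] = None) -> PurePosixPath:
    """Compute the registry root inside a local checkout of the repository.

    `checkout` is wherever the repository was downloaded to;
    `path` is the optional "path" property of the git registry.
    """
    repository = PurePosixPath(checkout)
    if path is None:
        # No "path" property: the repository root is the registry root.
        return repository

    # The property must be absolute and must not start with a drive, so it has
    # to begin with "/". This rejects "F:/ports", "ports", and "../../meow/ports".
    if not path.startswith("/"):
        raise ValueError(f'"path" must be absolute and have no drive: {path!r}')

    # The absolute path is then treated as relative to the repository root.
    return repository / path.lstrip("/")
```

With a checkout at the hypothetical location `/cache/vcpkg`, the valid example above resolves to `/cache/vcpkg/ports`, and each of the three incorrect examples raises `ValueError`.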