Rebuilding Composer in Rust - Fetching package information from Packagist

Welcome back! If this is the first post you're reading, I recommend starting from the beginning of the series so you don't get lost.

This post is going to focus on interacting with the Packagist API to retrieve information about a given package.

To grab information about a package from Packagist, you can use a public API endpoint as below.

https://repo.packagist.org/p2/ryangjchandler/blade-cache-directive.json

This returns some minified JSON that contains information about the Composer package. Things like the name, require, autoload, etc.
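To make the shape of that URL concrete, here is a minimal sketch that builds it from a package name. The `metadata_url` function and its basic `vendor/package` check are hypothetical and exist only for illustration; neither Packagist nor Composer provides them.

```rust
/// Builds the Packagist metadata URL for a `vendor/package` name.
/// Returns `None` when the name is not in `vendor/package` form.
/// (Hypothetical helper, named for this example only.)
pub fn metadata_url(package: &str) -> Option<String> {
    // Split on the first slash; both halves must be non-empty and the
    // package half must not contain another slash.
    let (vendor, name) = package.split_once('/')?;
    if vendor.is_empty() || name.is_empty() || name.contains('/') {
        return None;
    }

    Some(format!("https://repo.packagist.org/p2/{vendor}/{name}.json"))
}
```

Calling `metadata_url("ryangjchandler/blade-cache-directive")` gives back the URL shown above wrapped in `Some`, while a name without a slash gives `None`.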

To grab this information from inside the project, I need to make an HTTP request to the endpoint. Like the previous episodes, I'm going to abstract this out into a new package in the workspace. I'm also going to add the reqwest package so I can make HTTP requests.

cargo init crs-package-metadata --lib
cargo add reqwest --package crs-package-metadata --features blocking

Serialising and deserialising JSON in Rust is fairly straightforward. There's a package called serde that lets you use macros on various structures that represent the structure of the JSON.

cargo add serde --package crs-package-metadata --features derive
cargo add serde_json --package crs-package-metadata

This lets me write a struct to represent the information returned from the Packagist API and deserialise a string of JSON into an instance of the struct.

The actual structure of the JSON is something like this:

{
    "packages": {
        "[vendor]/[package]": [
            {
                "name": "[vendor]/[package]",
                "description": "[description]",
                "version": "[version1]",
                // ...
            },
            {
                "version": "[version2]",
                // ...
            }
            // ...
        ]
    },
    "minified": "composer/2.0"
}

I don't care about the minified key, since I already know the response is minified. The packages key contains a single object with a single key (the name of the package). That key maps to an array of objects, one per version of the package, that hold all of the information for that version.

The JSON is minified, which means that any keys that are common between the various versions of a package are omitted. Composer themselves have a package composer/metadata-minifier that can be used in PHP to expand the minified versions, but since I'm writing Rust, I don't have the luxury of using existing code.
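For the curious, the expansion itself is not complicated. The sketch below assumes the format works as I understand it from composer/metadata-minifier: the first entry is complete, each later entry lists only the keys that changed, and the string `__unset` marks a key that was removed. To avoid any dependencies it models each version as a map of strings, which is a simplification (real entries contain nested JSON), and the names `VersionEntry` and `expand` are made up for this example.

```rust
use std::collections::BTreeMap;

/// One version entry, simplified to string keys and string values.
/// Real Packagist entries hold nested JSON; this is an assumption made
/// to keep the sketch dependency-free.
pub type VersionEntry = BTreeMap<String, String>;

/// Marker assumed to signal that a key was removed in this version.
const UNSET: &str = "__unset";

/// Expands a minified list of versions into complete entries.
pub fn expand(minified: &[VersionEntry]) -> Vec<VersionEntry> {
    let mut expanded = Vec::with_capacity(minified.len());
    let mut current: Option<VersionEntry> = None;

    for diff in minified {
        let next = match current.take() {
            // The first entry is already complete.
            None => diff.clone(),
            // Later entries are applied on top of the previous full entry.
            Some(mut previous) => {
                for (key, value) in diff {
                    if value.as_str() == UNSET {
                        previous.remove(key);
                    } else {
                        previous.insert(key.clone(), value.clone());
                    }
                }
                previous
            }
        };

        expanded.push(next.clone());
        current = Some(next);
    }

    expanded
}
```

Given a first entry with `version`, `license` and `homepage`, and a second entry that only sets a new `version` and marks `homepage` as `__unset`, the second expanded entry keeps `license`, takes the new `version`, and has no `homepage`.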

That's not a big problem right now since I don't need all of that information. I only care about the following things:

  • The source of a version.
  • The distributable artefacts (dist).
  • The raw version of the package.
  • The version_normalized field.

That translates to the following structs.

use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct PackageMetadata {
    versions: Vec<PackageVersionMetadata>,
}

#[derive(Debug, Deserialize)]
pub struct PackageVersionMetadata {
    version: String,
    version_normalized: String,
    dist: PackageDistMetadata,
    source: PackageSourceMetadata,
}

#[derive(Debug, Deserialize)]
pub struct PackageDistMetadata {
    url: String,
    r#type: String,
    shasum: String,
    reference: String,
}

#[derive(Debug, Deserialize)]
pub struct PackageSourceMetadata {
    url: String,
    r#type: String,
    reference: String,
}

INFO

Rust has a set of reserved keywords – things you'd use to create structures, types, etc. It's possible to still use those reserved keywords by prefixing them with r# to create a "raw identifier".
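A small standalone sketch of a raw identifier in use. The `Source` struct and `describe` function are hypothetical and only mirror the shape of the structs above.

```rust
/// A cut-down struct with a field named after the reserved word `type`.
#[derive(Debug)]
pub struct Source {
    pub url: String,
    // Plain `type` would be a syntax error here; `r#type` is allowed.
    pub r#type: String,
}

/// Reads the field back; the `r#` prefix is needed at every use site.
pub fn describe(source: &Source) -> String {
    format!("{} ({})", source.url, source.r#type)
}
```

The `r#` prefix is not part of the name itself, which is why serde's derive still matches the field against the JSON key `type`.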

With the rough structure defined, I can start to write the deserialisation logic.

use reqwest::{blocking::get, StatusCode};
use serde_json::Value;

impl PackageMetadata {
    pub fn for_package(package: &str) -> Result<Self, PackageMetadataError> {
        let url = format!("https://repo.packagist.org/p2/{package}.json");
        let response = get(&url).map_err(|_| PackageMetadataError::FailedToFetchPackageMetadata(package.to_string()))?;

        // If the package was not found, return an error.
        if response.status() == StatusCode::NOT_FOUND {
            return Err(PackageMetadataError::PackageNotFound(package.to_string()));
        }

        // Grab the raw JSON from the response.
        let json = response.text().map_err(|_| PackageMetadataError::FailedToReadPackageMetadata(package.to_string()))?;

        // Convert the raw JSON into an untyped JSON value.
        let metadata: Value = serde_json::from_str(&json).map_err(|_| PackageMetadataError::FailedToParsePackageMetadata(package.to_string()))?;
        let versions = metadata["packages"][package].clone();

        Ok(PackageMetadata {
            versions: serde_json::from_value(versions).map_err(|_| PackageMetadataError::FailedToParsePackageMetadata(package.to_string()))?
        })
    }
}
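The function above returns a PackageMetadataError, which isn't shown in this post. A minimal sketch of what such a type could look like follows. The variant names are taken from the code above, but the String payloads, the messages, and the trait implementations are assumptions.

```rust
use std::fmt;

/// Errors that can occur while fetching package metadata.
/// Each variant carries the name of the package being fetched.
#[derive(Debug, PartialEq, Eq)]
pub enum PackageMetadataError {
    FailedToFetchPackageMetadata(String),
    PackageNotFound(String),
    FailedToReadPackageMetadata(String),
    FailedToParsePackageMetadata(String),
}

impl fmt::Display for PackageMetadataError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::FailedToFetchPackageMetadata(package) => {
                write!(f, "failed to fetch metadata for {package}")
            }
            Self::PackageNotFound(package) => {
                write!(f, "package {package} was not found")
            }
            Self::FailedToReadPackageMetadata(package) => {
                write!(f, "failed to read metadata for {package}")
            }
            Self::FailedToParsePackageMetadata(package) => {
                write!(f, "failed to parse metadata for {package}")
            }
        }
    }
}

// Implementing `Error` lets the type work with `Box<dyn Error>` and `?`.
impl std::error::Error for PackageMetadataError {}
```

Keeping the package name inside each variant means the caller can print a useful message without holding on to the original argument.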

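I think the serde_json API is really nice. Rust supports operator overloading through the trait system, and serde_json's Value implements the Index trait, so you can access keys inside an object in a similar way to JavaScript. Indexing a key that doesn't exist gives back Value::Null instead of panicking, which is why the chained lookup above is safe.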
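To show what that operator overloading looks like under the hood, here is a toy version of the idea using only the standard library. The `Json` enum is a made-up stand-in, not serde_json's actual `Value` type, and it only supports string keys.

```rust
use std::collections::HashMap;
use std::ops::Index;

/// A tiny stand-in for a JSON value (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
pub enum Json {
    Null,
    Text(String),
    Object(HashMap<String, Json>),
}

/// Shared value returned for missing keys and for non-object values.
static NULL: Json = Json::Null;

impl<'a> Index<&'a str> for Json {
    type Output = Json;

    /// Makes `value["key"]` work. Missing keys yield `Json::Null`
    /// rather than panicking, so lookups can be chained.
    fn index(&self, key: &'a str) -> &Json {
        match self {
            Json::Object(fields) => fields.get(key).unwrap_or(&NULL),
            _ => &NULL,
        }
    }
}
```

With this in place, `doc["packages"]["name"]` walks two levels of nesting, and a lookup such as `doc["missing"]["deeper"]` quietly ends up at `Json::Null`.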

The code above does a bit of manual handling to grab the array of version objects. Since the serde macros rely on things being defined at compile-time, you have to do some trickery to make it read from nested keys that have variable names. In this case, that would be the package name being used as the key.

It's not impossible to do, but I find it much nicer to take the explicit route of grabbing that value from the JSON manually and then deserialising it further into the typed PackageVersionMetadata instances.

Time to write some tests. Thankfully, I know for a fact that at least one of my packages will be available at this endpoint. So I can use that as my test case.

#[cfg(test)]
mod tests {
    use super::PackageMetadata;

    #[test]
    fn it_can_fetch_metadata_for_a_package() {
        let metadata_result = PackageMetadata::for_package("ryangjchandler/blade-cache-directive");

        assert!(metadata_result.is_ok());

        let metadata = metadata_result.unwrap();

        assert!(!metadata.versions.is_empty());
        assert!(metadata.versions.iter().any(|version| version.version == "v1.0.0"));
    }

    #[test]
    fn it_returns_an_error_when_a_package_is_not_found() {
        let metadata = PackageMetadata::for_package("ryangjchandler/this-package-does-not-exist");

        assert!(metadata.is_err());
    }
}

More progress. Nice.

The next post in the series is going to focus on parsing out semantic version strings into nice little VersionConstraint values that will let me easily test a specific version against a constraint.

Until next time.
