
Hardening the pAI-OS API

· 4 min read
Sam Johnston
Product Manager

One of the key design decisions we made in developing pAI-OS was to keep the frontend(s) (by default, a single-page web application written in React) and the backend (a Python Flask app) as separate services. This separation of concerns gives us finer-grained control over API security and lets us use the tools and techniques we already know best for each layer. Because we are strict about not using private APIs, you can be sure that every feature and function you see in one interface is available to all of them.

As part of a multi-layer defense strategy, it is increasingly important to ensure that all requests and responses comply with a predefined specification. At pAI-OS, we have adopted OpenAPI and "pattern" regexps with Connexion to achieve this without having to rely on third-party services (though you can upload your OpenAPI spec to providers like Cloudflare, who will enforce it before requests even reach your server!).

Why OpenAPI and Connexion?

OpenAPI is a powerful specification for defining APIs, allowing us to describe our API endpoints, request/response formats, and validation rules in a standardized way. Connexion is a Python framework that automates the validation of requests and responses against an OpenAPI specification. By integrating these tools, we ensure that our API adheres to strict validation rules, reducing the risk of security vulnerabilities and improving overall reliability. This can exact a small performance penalty, but given that this is an administrative interface rather than one that's involved in serving user requests, the trade-off is worth it.

Using "pattern" Regular Expressions for Validation

One of the key features we leverage in our OpenAPI spec is the use of "pattern" regexps. These regular expressions allow us to define precise validation rules for various parts of our API, such as path parameters, query parameters, and request bodies. By specifying patterns, we can enforce constraints on the data being sent to and from our API, ensuring it meets our security and format requirements.

Examples

Filenames

pAI-OS is designed to be cross-platform, but different platforms accept different characters in filenames (and use different path formats too, which is why we use pathlib.Path internally). We also use the following regexp to ensure that filenames are valid on macOS, Windows, and Linux:

fileName:
  type: string
  description: A filename that is valid on macOS, Windows, and Linux
  example: Mistral-7B-Instruct-v0.3-Q8_0.gguf
  pattern: '^[^<>:;,?"*|/]+$'
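As a rough illustration of what this check amounts to at runtime, here is a minimal Python sketch that applies the same pattern with the standard re module. The helper name is_valid_filename is hypothetical and not part of pAI-OS; in the real service the check is driven by the spec rather than written by hand.

```python
import re

# Same pattern as the fileName schema above.
FILENAME_PATTERN = re.compile(r'^[^<>:;,?"*|/]+$')


def is_valid_filename(name: str) -> bool:
    """Return True if name is non-empty and avoids every character the pattern excludes."""
    return FILENAME_PATTERN.search(name) is not None


if __name__ == "__main__":
    for candidate in (
        "Mistral-7B-Instruct-v0.3-Q8_0.gguf",  # the example from the spec
        "../../etc/passwd",                     # contains '/', so it is rejected
        "model?.gguf",                          # contains '?', so it is rejected
    ):
        print(candidate, is_valid_filename(candidate))
```

Because the pattern is a single negated character class, any input containing a forward slash fails before it can be interpreted as a path.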

UUIDs

For scalability and security, we don't let users define the id of objects they create (via POST) or update (via PUT). Instead, we generate UUIDs for them. This way we can ensure that the id of an object is a valid UUID, that it follows the correct format, and that it's unique even if it's generated by one of several servers.

Rather than hoping users only ever send us valid UUIDs, we can be sure that they will by using the pattern regexp:

uuid4:
  type: string
  format: uuid
  example: 7bea4732-214f-40e7-9161-4e7241a2b97e
  pattern: ^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$
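To make the two halves of this concrete (the server generates the id, and incoming ids are checked against the pattern), here is a small Python sketch using only the standard library. The function names new_object_id and is_uuid4 are illustrative assumptions, not pAI-OS code.

```python
import re
import uuid

# Same pattern as the uuid4 schema above: lowercase hex, version nibble 4,
# variant nibble one of 8, 9, a, b.
UUID4_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
)


def new_object_id() -> str:
    """Generate an id on the server; str(uuid.uuid4()) is lowercase and hyphenated."""
    return str(uuid.uuid4())


def is_uuid4(value: str) -> bool:
    """Check a client-supplied id against the pattern."""
    # fullmatch is used so that a trailing newline is rejected; with search,
    # Python's `$` would also match just before a final newline.
    return UUID4_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    print(is_uuid4(new_object_id()))
    print(is_uuid4("not-a-uuid"))
```

Note that the pattern is stricter than "any UUID": version 1 UUIDs and uppercase hex are both rejected.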

Emails

For a more complex example one need look no further than an almost 100% RFC-compliant email regex:

email:
  type: string
  format: email
  example: [email protected]
  pattern: (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

Path Traversal Example

For example, here Connexion has intercepted an attempt to exploit a path traversal vulnerability before the request reached the rest of our code:

[Image: Path Traversal Example]

Conclusion

In summary, by using OpenAPI and Connexion, we can ensure that every request and response is validated against our API specification. This approach gives us finer-grained control over API security, albeit with a small performance penalty. Because we are strict about not using private APIs, you can be sure that every feature and function you see in one interface is available to all of them.

Developers! Developers! Developers!

· One min read
Sam Johnston
Product Manager

If you've been waiting to kick the tires, now's a good time: git clone https://github.com/pAI-OS/paios, then run python -m paios (after running python3 paios/scripts/setup_environment.py the first time, and activating the virtual environment every time with source paios/.venv/bin/activate), and visit http://localhost:3080.

One of the biggest changes is that it's now a single server (instead of one for the frontend and one for the backend), with everything served out of that same URL (the API at /api/v1 and its docs at /api/v1/ui). I've also eliminated the Node.js dependency for early adopters and users by having the frontend built for the canary branch by GitHub Actions on every commit to main. You do still need to have Python installed, but it comes with Linux, and there are installers for macOS and Windows (you don't need to install Docker or anything like that).

I've also been migrating to a modern asynchronous architecture so we don't have to look back once we start building on it. I'm in the process of migrating the abilities infrastructure in backend/api.py to backend/managers/AbilitiesManager.py (and DownloadsManager.py), so you can't download and run models for the moment, but I've broken the back of it now.

pAI-OS Development Update (May 2024)

· 2 min read
Sam Johnston
Product Manager

I started working on bringing the front- and backend into the same server environment and was initially going to find a cross-platform way to deploy nginx as a reverse proxy for various pAI-OS components, but the result was likely to be brittle. I then moved on to having a single server instance that also serves the frontend/dist files as static files, meaning you can start one server and immediately access pAI-OS.

Uvicorn (https://www.uvicorn.org/) is an ASGI (the asynchronous equivalent of WSGI) web server implementation for Python that happens to be the one Connexion uses. Connexion, which reads the pAI-OS OpenAPI spec and automatically binds URLs to Python functions, has recently had a major release that moves from synchronous (WSGI) to asynchronous (ASGI) by default (https://connexion.readthedocs.io/en/latest/v3.html), and it's as good a time as any to jump into the future.
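To give a feel for what "binding URLs to Python functions" involves, here is a simplified sketch of resolving a dotted name (the kind of value an OpenAPI operationId can hold) to a callable. It is not Connexion's implementation: resolve_operation, dispatch, and the ROUTES table are hypothetical names, and standard-library functions stand in for real handlers.

```python
import importlib
from typing import Any, Callable


def resolve_operation(operation_id: str) -> Callable[..., Any]:
    """Turn a dotted name such as 'package.module.function' into the callable it names."""
    module_name, _, function_name = operation_id.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {operation_id!r}")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Hypothetical route table standing in for what a spec would declare.
ROUTES = {
    ("GET", "/sqrt"): "math.sqrt",
    ("GET", "/dumps"): "json.dumps",
}


def dispatch(method: str, path: str, *args: Any) -> Any:
    """Look up the handler for a (method, path) pair and call it."""
    handler = resolve_operation(ROUTES[(method, path)])
    return handler(*args)


if __name__ == "__main__":
    print(dispatch("GET", "/sqrt", 9.0))
```

The appeal of this style is that the spec stays the single source of truth: adding an endpoint means declaring it and naming its function, with no separate routing code to keep in sync.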

This probably would have been unnecessary if the OS was only involving itself in configuration/deployment/etc. but as it already has awareness of assets I figured it will likely end up in the flow of user queries before long, and many of those will likely be WebSockets which are kryptonite for synchronous servers (WSGI). As such, building a common server for front- and backend has turned into a migration from WSGI to ASGI which is now nearly complete on the feature/uvicorn branch: https://github.com/pAI-OS/paios/tree/feature/uvicorn
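The concern about long-lived connections can be illustrated with a toy asyncio sketch. It does not use WebSockets or Uvicorn; handle_connection and serve_many are made-up names, and asyncio.sleep stands in for a connection that sits idle waiting for its next message.

```python
import asyncio
import time


async def handle_connection(conn_id: int, idle_seconds: float) -> int:
    """Simulate a long-lived connection that mostly waits for the next message."""
    await asyncio.sleep(idle_seconds)  # yields to the event loop while idle
    return conn_id


async def serve_many(count: int, idle_seconds: float) -> float:
    """Run `count` simulated connections concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handle_connection(i, idle_seconds) for i in range(count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(serve_many(count=100, idle_seconds=0.1))
    print(f"100 idle connections finished in {elapsed:.2f}s")
```

A single synchronous worker that blocked on each connection in turn would need roughly count × idle_seconds, whereas here all the waits overlap on one thread.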

I'd suggest holding off a few more days on testing the code on main as I'll be merging soon.

pAI-OS Early Demo

· One min read
Sam Johnston
Product Manager

Here's a quick demo of the Personal Artificial Intelligence Operating System (pAI-OS) developed for Kwaai, a California 501(c)(3) nonprofit organisation.

pAI-OS, like Linux, allows users to bring together the best free and open-source software to securely use artificial intelligence agents with direct access to all of their own data.

Kwaai's mission is to empower users to retain and own their personal data and knowledge model, enhancing their digital abilities and competitiveness in the modern economy.