Easily Run AI Models Locally on Windows 11 with Microsoft’s New Features

Copilot+ PCs are the first Windows computers designed to run Small Language Models (SLMs) directly on the device. Running models locally can return results faster than the cloud-based Copilot app for tasks such as image and text generation. Microsoft has now released the AI Dev Gallery, a tool meant to simplify adding on-device AI capabilities to any application.

The AI Dev Gallery is aimed at developers who want to try out different models before adding AI features to their applications. It offers more than 25 samples that can be downloaded and run on your device, and projects or source code can be exported straight into your own application. It works on both Windows 10 and 11 and supports x64 and ARM64 architectures.

Windows Latest cloned the AI Dev Gallery from its GitHub repository to try it out. For now, the project has to be built in Visual Studio before it can run. It also needs at least 20GB of storage and a multi-core CPU. A GPU with 8GB of VRAM is recommended, but it is only essential for the more demanding models.
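Before cloning and building, it may help to confirm that a machine meets the storage and CPU requirements quoted above. The following is a minimal Python sketch, not part of the AI Dev Gallery: the function name `check_prerequisites` and the constants are hypothetical, "multi-core" is read as two or more logical cores, and 20GB is treated as 20 × 10^9 bytes of free space.

```python
import os
import shutil

# Thresholds taken from the requirements quoted in the article.
# The names below are hypothetical and not part of the AI Dev Gallery.
MIN_FREE_BYTES = 20 * 1000**3  # 20 GB of free storage (assumed decimal GB)
MIN_CPU_CORES = 2              # "multi-core" read as two or more logical cores


def check_prerequisites(path=".", free_bytes=None, cpu_cores=None):
    """Return a list of problems found; an empty list means the checks passed.

    free_bytes and cpu_cores can be passed explicitly (useful for testing);
    otherwise they are read from the current machine.
    """
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    if cpu_cores is None:
        cpu_cores = os.cpu_count() or 1  # cpu_count() may return None

    problems = []
    if free_bytes < MIN_FREE_BYTES:
        problems.append(
            f"only {free_bytes / 1000**3:.1f} GB free, need at least 20 GB"
        )
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"only {cpu_cores} CPU core(s), need a multi-core CPU")
    return problems
```

Calling `check_prerequisites("C:\\")` on the target drive returns an empty list when both checks pass. The sketch does not check for a GPU, since the article describes it as recommended rather than required.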

We began testing on a Windows 11 virtual machine with a 4-core CPU and 4GB of RAM. The app has two modes: Sample and Models. We used Sample mode to explore the available models, which are grouped into five categories: Text, Image, Code, Audio and Video, and Smart Controls.

AI Dev Gallery app interface in Windows 11

Evaluating the Models

The image and video generation models are relatively large, with sizes approaching 5GB. We started with a smaller upscaling model of under 100MB, took a screenshot, and upscaled it on the CPU. The app lets you switch between CPU and GPU for processing.

Upscaling finished in under 30 seconds on this modest virtual machine, with RAM usage temporarily peaking at 1GB. The app then displayed an upscaled image with a resolution of 9272×4900. However, image quality suffered badly, and text in particular became unreadable.
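To put the 1GB peak in context, here is a rough back-of-the-envelope calculation of how much memory a single uncompressed image of that output resolution occupies. It is only a sketch: the article does not say how the model stores its data internally, so the 32-bit float format is an assumption, and the helper name `buffer_mib` is hypothetical.

```python
WIDTH, HEIGHT = 9272, 4900  # output resolution reported in the article


def buffer_mib(width, height, channels, bytes_per_channel):
    """Size in MiB of one uncompressed image buffer."""
    return width * height * channels * bytes_per_channel / 2**20


pixels = WIDTH * HEIGHT
rgb8 = buffer_mib(WIDTH, HEIGHT, 3, 1)     # 8-bit RGB, as typically displayed
rgb_f32 = buffer_mib(WIDTH, HEIGHT, 3, 4)  # 32-bit float RGB (assumed model format)

print(f"{pixels:,} pixels")             # 45,432,800 pixels
print(f"8-bit RGB:   {rgb8:.0f} MiB")   # 130 MiB
print(f"float32 RGB: {rgb_f32:.0f} MiB")  # 520 MiB
```

Under these assumptions, one float buffer plus the displayable copy already accounts for roughly 650 MiB, which is in the same range as the observed peak. The actual breakdown inside the app may differ.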

Enhancing image AI model in the AI Dev Gallery app

Unfortunately, there is no way to preview the generated image at a larger size or in full screen, and no option to save it to disk.

We then tested another model, Detect Human Pose, which identifies the position of people in an image. It correctly recognized a simple walking figure, but it also drew position markers over screenshots of our desktop with several applications open.

Detect Human Pose model demonstration in AI Dev Gallery app

It is still unclear exactly how these models are meant to be integrated into applications, but some features can clearly run locally. Even so, PCs will need substantial storage space for the models, along with capable CPUs and at least 16GB of RAM.

What’s your take? Is it worth downloading a hefty 5GB model to turn a text prompt into an image, or is it more practical to wait 30 seconds for a web-based application? Many of these features clearly target specific use cases and environments, and they may not appeal to the broader Windows 11 user base.


© 2021 The Filibuster Blog