Release 2.4

Introduce RAM-Based Intermediate Cache to Reduce GPU Cache Thrashing

To improve performance under heavy load, an intermediate RAM-based cache layer has been introduced between GPU memory and disk storage.

The previous GPU caching system could experience slowdowns when GPU memory was exhausted. Frequently accessed images were evicted and later reloaded from disk, causing unnecessary I/O overhead and latency.

With the new approach:

When the GPU cache reaches capacity, evicted images are stored in RAM instead of being discarded

If the same image is requested again, the system checks the RAM cache before loading from disk

Serving an evicted image from RAM avoids a disk read, which reduces reload time and improves responsiveness.
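The lookup order described above can be illustrated with a minimal sketch. It is not the actual implementation: the class name `TwoTierCache`, the `load_from_disk` callback, and the LRU eviction policy in both tiers are assumptions made for illustration, and plain dictionaries stand in for GPU and RAM memory.

```python
from collections import OrderedDict


class TwoTierCache:
    """Hypothetical sketch: an LRU "GPU" tier that spills evictions into an LRU "RAM" tier."""

    def __init__(self, gpu_capacity, ram_capacity, load_from_disk):
        self.gpu = OrderedDict()  # stands in for GPU memory
        self.ram = OrderedDict()  # stands in for the intermediate RAM cache
        self.gpu_capacity = gpu_capacity
        self.ram_capacity = ram_capacity
        self.load_from_disk = load_from_disk  # assumed callback: key -> image
        self.disk_reads = 0

    def get(self, key):
        # 1. GPU hit: mark as most recently used and return.
        if key in self.gpu:
            self.gpu.move_to_end(key)
            return self.gpu[key]
        # 2. RAM hit: promote back to the GPU tier without touching disk.
        if key in self.ram:
            image = self.ram.pop(key)
        else:
            # 3. Miss in both tiers: fall back to disk.
            image = self.load_from_disk(key)
            self.disk_reads += 1
        self._put_gpu(key, image)
        return image

    def _put_gpu(self, key, image):
        self.gpu[key] = image
        if len(self.gpu) > self.gpu_capacity:
            # Evict the least recently used GPU entry into RAM instead of discarding it.
            old_key, old_image = self.gpu.popitem(last=False)
            self.ram[old_key] = old_image
            if len(self.ram) > self.ram_capacity:
                # The RAM tier is bounded too; its oldest entry is dropped.
                self.ram.popitem(last=False)
```

In this sketch, an image evicted from the GPU tier can be requested again without increasing `disk_reads`, as long as it has not also been evicted from the RAM tier.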

Benefits:

Reduced latency for frequently accessed data

Lower disk I/O under high concurrency

Improved GPU cache efficiency

More stable performance during peak workloads

Support SAM2 from Roboflow for images

Support for SAM2 models from Roboflow has been added for image annotation.

This enables:

Faster object segmentation

Improved annotation accuracy

Seamless integration with existing ML workflows

Improve Roboflow configuration UI

The Roboflow configuration interface has been simplified.

Changes include:

Clearer configuration options

Reduced number of setup steps

More intuitive workflow when connecting models

Import files from cloud storage (S3 / GCS / Azure)

It is now possible to add files directly from connected cloud storage.
In projects configured with S3, GCS, or Azure buckets:

A new option is available in the “Add files” menu to browse remote storage

Users can select files directly from the bucket

Selected files are loaded into the standard upload flow

Supported file types:

Images

Video files

Image sequences (as pre-cut videos)

Additional improvements:

File list pagination has been added to prevent UI freezes when working with large datasets

Improved stability when loading large file collections
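As a rough illustration of why pagination helps with large buckets, the sketch below returns one page of file names at a time together with a cursor for the next page. The function name `paginate`, the integer cursor, and the in-memory list of keys are hypothetical; the release does not specify how the file list is actually paged.

```python
def paginate(keys, page_size, cursor=0):
    """Return (page, next_cursor); next_cursor is None once the list is exhausted."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    page = keys[cursor:cursor + page_size]
    next_cursor = cursor + page_size
    return page, (next_cursor if next_cursor < len(keys) else None)


def iter_pages(keys, page_size):
    """Yield pages one by one so a caller never has to render the full list at once."""
    cursor = 0
    while cursor is not None:
        page, cursor = paginate(keys, page_size, cursor)
        yield page
```

A caller would render each page as it arrives, so the amount of data handled per step stays bounded by `page_size` regardless of how many files the bucket contains.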

Environment-based Roboflow account isolation

Roboflow accounts are now isolated per environment.
If the same ML server is shared across multiple environments (e.g. staging, test, production):

Accounts configured in one environment are no longer visible in others

This prevents mixing models and data between environments
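One way to picture this isolation is a registry that keys accounts by environment, so a lookup in one environment cannot return accounts stored under another. This is only a sketch under assumptions: `AccountRegistry`, its methods, and the account fields are hypothetical names and do not describe the actual ML server.

```python
from collections import defaultdict


class AccountRegistry:
    """Hypothetical store that scopes Roboflow accounts to a single environment."""

    def __init__(self):
        # environment name -> {account name -> API key}
        self._accounts = defaultdict(dict)

    def add(self, environment, name, api_key):
        self._accounts[environment][name] = api_key

    def list_accounts(self, environment):
        # Only accounts registered under this environment are visible.
        return sorted(self._accounts.get(environment, {}))


registry = AccountRegistry()
registry.add("staging", "team-a", "key-staging")
registry.add("production", "team-b", "key-production")
```

Here `registry.list_accounts("staging")` sees only `team-a`, even though both environments share the same registry instance.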